500 Themes

In Macroanalysis: Digital Methods and Literary History (UIUC Press, 2013), I explain how I extracted 500 themes from a corpus of 19th-century novels using Latent Dirichlet Allocation. On this page you can select any one of the 500 themes to see a cloud visualization of the key words and then a series of plots showing the prevalence of the theme in relation to time, author-gender, and author-nationality.

Assigning labels to topic clusters is a subjective process. The labels I have assigned here are most frequently derived from the topic headwords. Some may find the labels unhelpful or even controversial. My goal was not to label the topics in a way that would satisfy all tastes or interpretations, but instead to create a workable title by which I could easily refer to a given topic. By default the modeling process assigns each topic a number (e.g., topic 1, topic 2). While referring to topics by number is certainly less controversial, it’s not a very useful way to talk about them. These labels should be read as “general terms of convenience” and not as definitive statements on the ultimate meaning of the word cluster.
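As a rough illustration of what deriving a label from headwords can look like, here is a minimal Python sketch. The toy words and weights, and the helper name headword_label, are invented for this example and are not drawn from the 500-theme model. Joining the top-weighted words is only a mechanical stand-in for what is, as noted above, a subjective choice.

```python
# Toy topic-word weights. The words and numbers are invented for illustration
# and are NOT taken from the 500-theme model.
topics = {
    1: {"ship": 0.09, "sea": 0.07, "captain": 0.05, "deck": 0.03, "sail": 0.02},
    2: {"church": 0.08, "sermon": 0.06, "parish": 0.04, "clergy": 0.02},
}


def headword_label(word_weights, n=3):
    """Return a label built from the n highest-weighted words of one topic."""
    ranked = sorted(word_weights.items(), key=lambda kv: kv[1], reverse=True)
    return " / ".join(word for word, _ in ranked[:n])


# Map each numeric topic id to a headword-based label of convenience.
labels = {topic_id: headword_label(weights) for topic_id, weights in topics.items()}

for topic_id, label in labels.items():
    print(f"topic {topic_id}: {label}")
```

A label produced this way is easier to refer to than a bare topic number, but it remains a term of convenience; a person would normally still review and rename it.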

3 thoughts on “500 Themes”

  1. […] 5: An Interlude on Scale: Micro, Meso, and Macro. For another, Jockers has put a topic tool online, 500 Themes from a corpus of 19th-Century Fiction. Those are the topics he discusses in this […]

  2. […] access and I use it all the time. So, when I got to Chapter 8, “Theme,” I also accessed the topic browser that Jockers had put on the web. Through this browser I could explore the topic model Jockers used in the book and, in particular, […]

  3. […] Tapor visualization tools Jockers’ macro-themes […]
