For my final Digital Humanities assignment this year, I was asked to experiment with some visualisation tools and then see what I could learn from a text by visualising it through an analytical process. The text could be anything publicly available, such as a book, film script or document. The tool I decided to use is Voyant 2.0.

I chose to analyse Dickens’ Pickwick Papers not only because it is such an enjoyable read, but also because Dickens is a master at shaping characters through their use of language. Every character has his or her own particular way of speaking; so much so that certain colloquialisms in the English language are easily recognised and associated with Dickensian characters – “Bah! Humbug!”, for instance. I therefore thought that analysing patterns and word recurrences in Pickwick Papers would prove both stimulating and fruitful, and would help to answer questions such as:

  • Out of the numerous characters in such a large text, which are the main ones?
  • How does Dickens develop or cast off his idea of the Pickwickian club as the novel continues?
  • What were the differences in language use between the upper- and working-classes in the 19th century?


A quick run of Pickwick Papers through Voyant 2.0 showed that “mr”, “pickwick”, “said”, “sam” and “sir” are the most frequent words in the text. This suggests that the two main characters are Mr. Pickwick and his servant, Sam Weller. The former is hardly surprising since he is, after all, the eponymous character. However, I would have expected Mr. Pickwick’s three Pickwickian companions (Messrs. Winkle, Snodgrass and Tupman) to rank higher in the word count – in fact, “snodgrass” occurs only about as often as “bardell” (Mr. Pickwick’s landlady). I was also surprised to see that “sam”, “sammy” and “weller” feature prominently in the Cirrus word cloud, which suggests that Mr. Pickwick’s servant is the second most important character. This observation is consistent with the view that Dickens was a “champion of the poor” (University of California, 2015) who frequently wrote about the lives of the lower classes.
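For anyone curious how such a frequency list comes about, here is a minimal Python sketch of the idea. It is not Voyant's own code: the tokenising rule, the tiny stopword list and the sample sentence are all my own assumptions, and Voyant's real stopword list is far longer.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption, not Voyant's list).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "his", "he", "was", "that", "it"}

def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def top_terms(text, n=5, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens with their counts."""
    counts = Counter(t for t in tokenize(text) if t not in stopwords)
    return counts.most_common(n)

# Made-up sample sentence, not a quotation from the novel.
sample = "Mr. Pickwick smiled. 'Sam,' said Mr. Pickwick. 'Sir,' said Sam."
print(top_terms(sample, n=4))
```

Running the same function on the full text of the novel would give a ranked list comparable to the one described above, although the exact counts depend on how the text is tokenised and which stopwords are removed.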


Note – green; Club – blue; Pickwickian – red

Trends, a function in Voyant 2.0, is a line graph that shows the relative frequencies of words across a document divided into ten equal segments. Here, I analysed how often the words “note”, “club” and “pickwickian” appear in each segment of the book. The graph clearly shows that they are used frequently at the beginning, but that the Pickwickian club idea is dispensed with as the novel continues. Originally, Mr. Pickwick’s travels with his companions were intended for the purpose of gathering information, on which they would take notes. However, after a few chapters, the club and the note-taking are forgotten. With the exception of a short mention in Chapter 11, they do not reappear until the end of the book, when Mr. Pickwick disbands the club (CliffsNotes, 2016). The graph of the term “club” in particular reflects this observation: in the final third of the book, it is hardly mentioned.
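The calculation behind a graph like this can be sketched in a few lines of Python. This is only my approximation of what Trends does, assuming the text is cut into near-equal slices by token count; the function names and the toy text are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def segment_trend(tokens, term, segments=10):
    """Relative frequency of `term` in each of `segments` near-equal slices."""
    total = len(tokens)
    trend = []
    for i in range(segments):
        # Integer arithmetic keeps the slices contiguous and near-equal in size.
        chunk = tokens[i * total // segments:(i + 1) * total // segments]
        trend.append(chunk.count(term) / len(chunk) if chunk else 0.0)
    return trend

# Made-up miniature "novel": the club is mentioned early and then dropped.
toy_text = "club " * 3 + "journey " * 17
print(segment_trend(tokenize(toy_text), "club", segments=2))
```

Plotting the returned list for “note”, “club” and “pickwickian” against the segment number would produce a line graph of the same kind as the one above.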

The Trends tool reminds me of a similar tool, the Dáil Éireann Ngram Viewer. It, too, displays the frequencies of search terms chronologically in a visually appealing and comprehensible manner. The Ngram Viewer is, however, more detailed – for instance, one can hover over any point on the graph and see the month and year in which the topic was discussed. Perhaps a similar feature could be added to the Trends tool.


Pickwick – pink; Sam – turquoise; Wot – dark blue; Wery – yellow

I was intrigued to examine the characters’ use of language. I chose to focus on Sam, whose colloquialisms stand out in particular. Dickens further accentuates Sam’s cockney accent by phonetically spelling some of the words he uses – such as “wot” for “what” and “wery” for “very”. I analysed this using the Bubblelines and Streamgraph tools. Bubblelines (above) visualises the frequency and repetition of a term’s use in a corpus by representing the term as a bubble; the larger the bubble, the higher the frequency. The Streamgraph (below) depicts how the frequency of the words changes throughout the text (Sinclair and Rockwell, 2016).


Sam – blue; Wery – yellow; Ain’t – green

The Bubblelines visualisation shows that the frequency and size of the turquoise bubbles (“sam”) correlate strongly with those of the yellow (“wery”) and dark blue (“wot”) ones. On the other hand, there appears to be no direct correlation with the pink bubbles (“pickwick”). This suggests that the terms “wot” and “wery” are not used by members of the upper class, such as Mr. Pickwick, but by those of the lower class, such as Sam.

From the Streamgraph tool, we see that the frequencies of “wery” and “ain’t” closely correlate with that of “sam”. This again supports the observation about Sam’s particular use of language. However, the conclusion can arguably be taken even further: the pattern illustrates not only Sam’s vocabulary, but also that of the general lower-class population in the 19th century, at least as Dickens portrays it.
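The correlations described here were judged by eye from the visualisations. One way to put a number on them would be a Pearson correlation between the per-segment counts of two terms. The sketch below shows the calculation; the counts in it are invented purely for illustration and are not taken from the novel.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-segment counts (NOT real figures from Pickwick Papers).
sam_counts = [1, 2, 3, 4, 5]
wery_counts = [2, 4, 6, 8, 10]   # rises and falls together with "sam"
other_counts = [5, 4, 3, 2, 1]   # moves in the opposite direction

print(pearson(sam_counts, wery_counts))
print(pearson(sam_counts, other_counts))
```

A value close to 1 would support the impression that “wery” appears wherever Sam does, while a value near 0 would indicate no linear relationship. With only ten segments, such a figure should be read as a rough indication rather than proof.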

I also analysed the terms “wery”, “good” and “samivel” (“Samivel” is how Sam’s father, Mr. Tony Weller, pronounces his son’s name) using the Links tool. I was interested to note how closely they are linked, which suggests that these three words frequently occur together in Mr. Weller’s vocabulary. I was impressed by the ability of this tool to reflect a character’s manner of speaking – for example, the terms “wery” and “good” (in red) are both linked to “samivel”. As in Sam’s case, examining Mr. Weller’s language gives us an insight into the linguistic mannerisms of the 19th century working class.
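As I understand it, links of this kind are based on collocation, that is, on which words appear near a keyword. A simple version of that idea is to count the words that fall within a fixed window on either side of the keyword. The sketch below assumes a window of five words and uses a made-up sentence; it is not Voyant's implementation.

```python
from collections import Counter

def collocates(tokens, keyword, window=5):
    """Count the words that occur within `window` tokens of each `keyword` occurrence."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the keyword occurrence itself
                counts[tokens[j]] += 1
    return counts

# Made-up token sequence in the style of Mr. Weller, not a quotation.
toy_tokens = "wery good samivel wery good".split()
print(collocates(toy_tokens, "samivel", window=2).most_common())
```

Words with high counts for a keyword would be drawn as its neighbours in a network of links.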


I therefore believe that Voyant 2.0 is an invaluable tool for text analysis. In analysing novels, it allows us to identify the protagonists without having to read the entire text. It can also give us some information about characters, their use of language, places, events and even the author. Text analysis can even help us to answer questions about the society and culture depicted in the novel, and possibly to apply this to our knowledge of other subjects.


