Analyzing Textual Data with Voyant Tools

The second half of this assignment for my Visualizing Historical Research class is a textual analysis of a book that pertains to my research, using a program called Voyant Tools. Voyant Tools is a web-app that uses interactive analytical tools to facilitate a more efficient reading of groups of text. By showing the frequency of the use of certain words as well as where they appear in the text, one can gather a general idea of what a corpus is about without having to read everything from cover to cover. I wanted to analyze a text I have not yet read, and that would further my understanding of Niagara’s economic history. The book that I chose to analyze is titled The correspondence of Lieut. Governor John Graves Simcoe: with allied documents relating to his administration of the government of Upper Canada.

John Graves Simcoe was the first Lieutenant Governor of Upper Canada from 1791-1796, and this book reveals much of his influential work as the British administrative foundations of the Niagara region took shape. It is a collection of primary sources from 1789-1793, compiled by Canadian historian Ernest Cruikshank in 1923. Cruikshank was a Brigadier General in World War I, influential in the establishment of Ontario’s Bureau of Archives, and originally from Fort Erie. Placed in charge of the province’s military documents for a few years, but never formally trained as an historian, he wrote a number of brief histories about the Niagara region during the Loyalist era. Cruikshank’s archival expertise is evident in this work, as he selected particular letters either written by or addressed to Simcoe, to be included in the corpus.

I downloaded the full text, copied and pasted it into Voyant Tools, and clicked “Reveal” to produce the initial output. In the following paragraphs, I will explain my analysis of this text using Voyant Tools.

1. Cirrus & Terms

Below, you can see the word cloud that Voyant Tools has created based on the words in this text. The more frequently a word appears in the text, the larger it appears in the word cloud. Digital humanists Geoffrey Rockwell and Stefan Sinclair, the creators of Voyant Tools, describe the Cirrus tool as follows:

“One gets the impression of a birds-eye view of all the important words. Words appear next to other words serendipitously, which can rightly or wrongly suggest combinations to explore. The word cloud provides a different visual synthesis of the information. It has different affordances for interpretation.” [1]

So what does this visualization tell us about Simcoe’s correspondence? Well, the word “simcoe” is the most frequently used term in this corpus. This is not surprising, considering the text we are working with, so we can move on to other words. To maximize the efficiency of this tool, one of my suggestions for best practice is to switch the view from “Cirrus” to “Terms.” This lists the terms in order from most used to least used.
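For readers who would like to see the idea behind a ranked term list outside of Voyant, here is a minimal Python sketch. It is not Voyant's implementation: the tokenizing rule, the tiny stopword list, and the sample sentence are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, very short stopword list; real tools ship much longer ones.
STOPWORDS = {"the", "of", "and", "to", "a", "in", "that", "is", "for", "with"}

def term_frequencies(text, stopwords=STOPWORDS):
    """Return a Counter of lowercase word tokens, skipping stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

# Invented sample sentence, not a quotation from the Simcoe correspondence.
sample = (
    "The Indians met with Simcoe. Simcoe wrote to the Indian department "
    "about presents for the Indians."
)

freqs = term_frequencies(sample)

# List terms from most used to least used, as a ranked term list does.
for term, count in freqs.most_common():
    print(term, count)
```

Note that this sketch counts “indian” and “indians” as separate terms, which matches how the two words show up as separate entries in the ranked list discussed in this post.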

A few of the terms that immediately stand out to me are “Indians” (#4) and “Indian” (#16). The words clearly hold value to Simcoe; he is either writing about Indigenous peoples or he is receiving information about them. The frequency of the words alone does not tell us their meaning in context, but some assumptions that could be made are that:

  1. Lt. Governor Simcoe is discussing land agreements. The American Revolution had ended less than a decade prior, and the British government ceded traditional lands of their Six Nations allies, mainly the Mohawk, Cayuga, Onondaga and Seneca nations, to the new United States. As white Loyalist families were resettled in the Niagara region and elsewhere in British North America, Simcoe and British officials also tried to resettle their Indigenous allies into new regions around Upper Canada.
  2. Lt. Governor Simcoe is discussing military alliances. Although the British were not at war between 1789-1793, Simcoe might have thought it wise to discuss the strategic benefits of placating specific Indigenous people groups in order to further their military alliances.

Another best practice that I suggest for using Voyant Tools is to look for the names of people and places as a way of determining a few key themes from the text. Thus, I typed in some of the names of individuals that pertain to my study. The name “Hamilton” appears a total of 83 times, but considering the popularity of this surname, I cannot assume that all mentions are with specific reference to Robert Hamilton the Niagara merchant. Alternatively, “Robert Hamilton” appears 14 times throughout the corpus. Similarly, the name “Richard Cartwright,” one of Kingston’s most prominent merchants, appears 14 times, and the full name of Detroit merchant “John Askin” appears 59 times. This is good news for me, since I now know that there are references to some of the individuals that were heavily involved in the economic development of Niagara during the Loyalist era.
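The difference between searching for a surname and searching for a full name can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `count_phrase` is hypothetical, and the sample text is invented rather than taken from the corpus.

```python
import re

def count_phrase(text, phrase):
    """Count case-insensitive, whole-word occurrences of a word or phrase.

    Words in the phrase may be separated by any whitespace, so a name
    split across a line break in the source text is still counted.
    """
    words = [re.escape(w) for w in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Invented sample text for illustration only.
sample = (
    "Robert Hamilton wrote to Mr. Hamilton of Queenston; Robert\n"
    "Hamilton replied."
)

print(count_phrase(sample, "Hamilton"))         # every mention of the surname
print(count_phrase(sample, "Robert Hamilton"))  # only the full name
```

The surname count will always be at least as large as the full-name count, which is why the narrower search gives a safer lower bound when a surname is common.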

Some of the most common places mentioned in this text include: Detroit (289), Quebec (231), Britain (208), Niagara (207), America (125), Erie (125), Ohio (99), Philadelphia (95), Kingston (81), and Ontario (58). What stands out to me here is the emphasis on the southwestern peninsula of Upper Canada, the doubly frequent mention of Lake Erie compared to Lake Ontario, and the prevalence of American places like Detroit and Ohio. Lt. Governor Simcoe was a staunch British patriot who at one point had tried to make London the capital of Upper Canada, and was interested in facilitating trade towards the interior regions of the continent with the ultimate goal of colonial expansion. [2] The frequent mention of these central locations further substantiates their importance to Simcoe.

2. Contexts

While the Cirrus and Terms tools can show the frequency of word usage, they do not reveal the contexts in which these terms are used. In this way, Voyant Tools’ Contexts pane takes textual analyses to the next level. By viewing certain words in their contexts we can determine their meaning within the corpus. For example, the words “Indian” and “Indians” actually hold very different meanings when contextualized. “Indian” is often used as a reference to government roles like an Indian Agent, the Indian department or the Superintendent of Indian affairs. This shows the intended use of the word as a reference to a British institution, rather than an example of actions by or references to Indigenous peoples.

The term “Indians,” however, holds a much different meaning. The term clearly incites fear in a few individuals. For example, we read:
“of fighting in which the indians Excell [sic]”
“expert & more savage than the indians themselves”
“the Wabash or other hostile indians”

At times we see mentions of violence towards them:
“he will get the indians out of the way”
“of awing and curbing the indians in that corner”
“hope we shall give the indians a thorough drubbing this summer”

The term is also often used in the context of British paternalism:
“distribution of Presents to the indians”
“deficiency of Presents for the indians in Public stores”
“British officers in furnishing the indians with arms & ammunition”

Due to the disjointed format of the text, we are seeing divided opinions about Indigenous peoples. One particularly intriguing sentence was written to Lt. Gov. Simcoe by none other than Robert Hamilton as he discussed the impact of the American Revolution on colonial trade as a new national border was created and both European settlers and Indigenous peoples were shifted into new territories. Suggesting what was likely an unpopular opinion at the time, he writes:

“In extending their Territory in this quarter, some degree of moderation and justice has been shown in the purchase of the lands from the Native Indians, however inadequate the sum paid may be to the value.” [3]

In a similar vein, the word “women” is only mentioned 21 times throughout the text, and the Contexts tool reveals that this is almost always in conjunction with “and children.” In this particular analysis, showing the context of certain words demonstrates the official attitudes towards particular groups in society. The mentions of Indigenous peoples and women reveal their relative value to the colonial administration, and how they are made useful in the bigger picture of expanding empire.

In addition to Voyant’s Contexts, tapor.ca is a site that holds hundreds of web tools for similar textual analyses. If you are not satisfied with the Voyant Tools version of contextual visualization, TAPoR 3.0 lists a number of alternate tools such as KWIC (Key Word in Context) and concordance TAPoRware that perform similar functions in detecting specific words anywhere in an HTML document. The concordance is not a new concept, as Rockwell & Sinclair discuss in their book Hermeneutica: Computer-Assisted Interpretation in the Humanities. The concordance is actually one of the innovations that influenced the development of their tools, as “its roots reach back to the Bible.” [4] They argue that context analyses developed by focusing on keywords as opposed to the earlier focus on key concepts, which was popularized by biblical indexation.
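The keyword-in-context idea is simple enough to sketch in Python. What follows is a rough illustration, not the code behind Voyant or any TAPoR tool: the function name `kwic`, the `width` parameter, and the sample text (invented sentences built around two fragments quoted earlier in this post) are all my own assumptions.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, match, right) tuples for each exact-word hit.

    `width` is the number of words of context kept on each side.
    """
    tokens = text.split()
    target = keyword.lower()
    rows = []
    for i, tok in enumerate(tokens):
        # Strip surrounding punctuation before comparing to the keyword.
        if re.sub(r"^\W+|\W+$", "", tok).lower() == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

# Invented sample text; only the short fragments come from this post.
sample = (
    "the distribution of Presents to the indians was delayed. "
    "The Indian department reported on the Wabash or other hostile indians."
)

rows = kwic(sample, "indians", width=3)
for left, match, right in rows:
    print(f"{left:>20} | {match} | {right}")
```

Because the match is on the exact word, a search for “indians” skips “Indian department,” which is the same singular and plural distinction that the Contexts pane made visible above.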

3. Bubblelines & Trends

Bubblelines is a useful comparison tool that visualizes the frequency of use of certain terms throughout a corpus, also showing the places in the text where they are mentioned. Bubblelines does not work well with corpora that have multiple documents because there are too many trends to accurately chart, so considering the fact that this text is a compilation of letters written by different individuals, this tool is not the best choice for analyzing this specific type of text.

Click “Separate Lines for Terms” to see a better output.

The results are similarly skewed in the Trends tool. As a best practice, I suggest using tools like Bubblelines and Trends only when analyzing one or two specific sources.
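To make the idea behind these two tools more concrete, here is a small Python sketch that divides a text into equal segments and counts a term in each one. It is only an approximation of the concept under my own assumptions (the function name `segment_counts`, the tokenizing rule, and the invented sample text), not a description of how Voyant computes its charts.

```python
import re

def segment_counts(text, term, segments=3):
    """Split the tokens into roughly equal segments and count `term` in each."""
    if segments < 1:
        raise ValueError("segments must be at least 1")
    tokens = re.findall(r"[a-z']+", text.lower())
    # Ceiling division so that every token falls inside some segment.
    size = max(1, -(-len(tokens) // segments))
    counts = []
    for start in range(0, size * segments, size):
        chunk = tokens[start:start + size]
        counts.append(chunk.count(term.lower()))
    return counts

# Invented sample: nine words, so three segments of three words each.
sample = "niagara " * 3 + "detroit " * 3 + "niagara detroit detroit"

niagara_counts = segment_counts(sample, "niagara")
detroit_counts = segment_counts(sample, "detroit")
print(niagara_counts)
print(detroit_counts)
```

Each list reads like one row of a chart: a term's count in the beginning, middle, and end of the text. With a compilation of many letters by different authors, segments cut across letter boundaries, which may be one reason the output is hard to interpret for a text like this one.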

Conclusions

Voyant Tools has other options that could be used to explore this text, but so far I have learned that Lt. Governor Simcoe is focused on establishing trade networks in the southwestern regions of Upper Canada, whilst also managing the sensitive relationships between British officials, Loyalist immigrants, Indigenous peoples, and their new American neighbours. By doing this distant reading, one gathers a general sense of how Simcoe and those writing to him felt about certain groups of people. Many of these authors did not fully respect Indigenous people as human beings, but rather saw them as tools of empire or impediments to progress.

Voyant Tools can also be used to compare two or more texts. This means that this analysis could go a step further by comparing this text with, for example, a compilation of primary sources from the succeeding Lt. Governor of Upper Canada, Peter Hunter, who served from 1799-1805, or Francis Gore, who served from 1806-1811. This is, of course, assuming that such compilations even exist. One could compare their correspondence with Simcoe’s, exploring what was important to the British government at different times, what legislation was implemented, who held power in colonial relationships, and how society was being formed from the “top down.” Voyant Tools provides useful visualization techniques for digital humanists. While the traditional close reading of a book holds its own value, it is important to recognize alternative forms of scholarly interpretation. Why not choose a text and try it yourself?

Notes:
[1] Geoffrey Rockwell and Stefan Sinclair, Hermeneutica: Computer-Assisted Interpretation in the Humanities, (Cambridge, MA: MIT Press, 2016), 35.
[2] Mary Beacock Fryer and Christopher Dracott, John Graves Simcoe 1752-1806: A Biography, (Toronto: Dundurn Press, 1998), 120.
[3] Ernest A. Cruikshank, The Correspondence of Lieut. Governor John Graves Simcoe: With Allied Documents Relating to his Administration of the Government of Upper Canada, vol 1, (Toronto: Ontario Historical Society, 1923), 98, accessed from the Internet Archive, https://archive.org/details/correspondenceof01simc.
[4] Rockwell & Sinclair, Hermeneutica, 47.

Visualizing Historiographical Data

Hi there, it’s been a while. This semester is coming to a close and thank goodness we are finally getting some spring weather!

This post and the next one are a little different from all of my posts so far in that they are also assignments for a required course I am taking at Brock as part of my Master’s thesis. The course is entitled Visualizing Historical Research and the aim is to work with different tools of data visualization to engage with history in a way that we as historians are not quite as familiar with. This course fits neatly with my current research as I work to visualize the spatial relationships between colonial settlers in the Niagara region, and I have learned a few useful things from this course this past semester.

If you’ve been following this blog, you’ll know that over the past six months I have been studying the scholarship of Canadian economic history, and now I need to organize the historiography in a clear manner. Of course, I could do this textually by simply writing down names and titles of books, describing the themes and categories that have appeared over the past century, but another helpful way of organizing such information is by using visualizations. This first blog post will discuss the benefits and limitations of the Timeline and the Venn Diagram when presenting historiographical information.

In his 2006 paper on the history of data visualization, American psychologist and statistician Michael Friendly states that the timeline was first used as an educational tool by natural philosophers and physicists of the 18th century, namely men like Joseph Priestley and Jacques Barbeu-Dubourg. [1] Early timelines were used to chart the progression of an individual’s biography, indicating the most noteworthy moments in the person’s life. Timelines are a good way of showing influential moments, and thus I thought it might be a good idea to create one that shows the different categories of historiography that appeared over time, pertaining to my area of research. Using Microsoft PowerPoint and aided by Carl Berger’s The Writing of Canadian History: Aspects of English-Canadian Historical Writing since 1900, I organized some of the historians I felt were the most influential into distinct categories. The result looked like this.


Timeline Overview
As you can see, I began with the 1930s and Harold Innis, a scholar that I have written about multiple times already in this blog. I grouped Innis, Creighton, Lower, and Careless into the category of “traditional economic history,” since the staples thesis and the Laurentian thesis largely form the basis for contemporary studies of Canadian economic development. Economic history became overshadowed by political biographies, and eventually became popular again by the 1960s when historians like W. L. Morton began to look at economic developments as regional studies, understanding that patterns of growth and decline are subject to their own environments. This is clearly important for my study, since I am putting a regional focus on these questions of enterprise and transfers of commodities. Out of that came work influenced by the Annales school, and a re-emerging interest in political economy, and eventually social history. Histories involving a closer look at ethnicity, gender, sexuality, labour, and religion gave another dimension to how we view Canada’s past. However, as Canada entered into a new millennium, fragmentation within the study of Canadian history had reached a crisis point. Ian McKay eventually wrote the essay “The Liberal Order Framework” which argues that historians should approach Canada “not as ‘an essence we must defend or an empty homogenous space we must possess,’ but rather as an ongoing ‘project of liberal rule.’”[2] In other words, instead of looking at Canada within its geographical boundaries, this framework investigates how liberalism as a specific worldview affected the way in which colonial peoples interacted, made decisions, and saw the world. Finally, one of the most popular ways that we approach history today is with a post-colonial consensus that Indigenous people are integral to any study of Canadian history; that we should not just view them as victims but rather try to understand how they displayed agency through their daily choices.

Issues
Although I used colour coding techniques to match the authors with their categories and produced a timeline that I felt adequately reflected some of the most basic moments in the historiography of Canadian economic development, I found the timeline visualization to be problematic when demonstrating the existing scholarship of my more specific topic. This timeline shows the viewer a basic categorization of developments over time, but it is far too broad to help me visualize the nuances of my Loyalist-Era, Niagara based project. One problem is that placing an historian into rigid, one-dimensional categories assumes that they are incapable of exploring more than one topic in their writing; an absurd presumption. For example, I placed Allan Greer under the category of “Annales school” even though he could also fit under the umbrella of “Regionalism.” I began to realize that imposing a specific beginning or end date to these categories does not accurately reflect the hundreds of people who might adhere to tenets of “Regional” or “Social” or “Traditional” histories outside of the boundaries I had prescribed here. Am I not currently in 2019 working on a regionally focused history of my own? Am I not also basing some of my assumptions on “traditional” theories?

A timeline’s singular categories do not permit engagement with multiple groups, and they also do not take into account the wide variety of economic and communication theories that historians have created and adapted over time. Scholars placed in different categories, while focusing on different topics, can still share theoretical approaches to studies of economy. For example, both Ian McKay and Allan Greer display Marxist approaches to their writing of history. This timeline does not show these authors’ theories about how trade functioned, who held the power in economic relationships, and what drove the business networks in a certain place at a certain point in history. These categories alone show nothing of historians’ engagement with theories of environmental determinism, materialism, Marxism, economic determinism, or liberalism.

Another issue that arose was with the broad categorization of “social history.” From around the 1960s onward, gender history, Indigenous peoples’ histories, labour history, histories of religion, and more were all becoming more prominent in academia and despite their vast differences are all grouped under the same category. Ultimately, I realized that the timeline is far too general, squeezing historians into one-dimensional categories and ignoring their multi-faceted approaches to history that encompass a variety of geographical areas and time periods. Because of this, I wondered if there could ever be an ideal way of visually presenting historiographical information.

Solution
However, dealing with the issue of overlapping categories made me consider the solution of using a Venn Diagram. I wanted to show how my thesis fit into existing scholarship, so I substantially narrowed my focus. While researching the historiography of my topic, I realized that historians have studied Canadian economic development, the Niagara region, and the Loyalist era before, but few have studied all three simultaneously. This diagram shows the three areas that my project covers in terms of space, time period, and category of analysis. Canadian historians have always been fascinated by Loyalist history, many publishing studies of loyalism in Ontario, but these studies are mostly socio-political in nature, discussing the structural development of Upper Canadian government, the Family Compact and the tensions leading to the 1837-38 Rebellions. A general trajectory of Canadian economic history has developed over time, encompassing the growth of trade networks, migration patterns and industrialization throughout the large geographic area, but it does not accurately reflect the economic development of Niagara itself. Finally, those historians that do look at the economic history of Niagara in most cases study the area in its early industrial years, focusing on the building of the Welland Canals and the railway system. These historians are building their studies upon scholarship that has a weak substructure. There is a clear need for more in-depth studies of the very economic foundations of the Niagara region.

Placing the work of Canadian historians within classifications of:
1) Space (Niagara)
2) Time (Loyalist era)
3) Category of Analysis (Economic)

This Venn diagram eliminates the issue of singularly categorizing historians, allowing them to fill as many as three categories here. By looking at this diagram, you can see that there are a lot of Canadian historians who have studied Canadian economic history in the colonial period, but studying the Niagara region through a more specific lens is less common. You can also see that there are a couple of historians that do analyze all three areas. Bruce Wilson especially has contributed to this area of study in his 1983 book about the enterprises of Robert Hamilton, who was Niagara’s most prominent merchant in this time period. There are still issues with a Venn diagram, like the fact that it only allows for three categories. However, it is possible to make more complicated Venn diagrams with four or five circles if you want to get really specific.
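The logic of a three-circle Venn diagram can also be written out with Python sets, which may help when sorting a longer list of historians before drawing anything. This is a sketch under stated assumptions: “Historian A” through “Historian D” are hypothetical placeholders, not real classifications, and only Bruce Wilson's placement in all three categories comes from this post.

```python
# Placeholder names are hypothetical; only Bruce Wilson's placement
# in all three categories is taken from the post itself.
niagara = {"Bruce Wilson", "Historian A"}
loyalist_era = {"Bruce Wilson", "Historian B", "Historian C"}
economic = {"Bruce Wilson", "Historian C", "Historian D"}

def venn_regions(space, time, category):
    """Return the seven regions of a three-set Venn diagram as a dict."""
    return {
        "space only": space - time - category,
        "time only": time - space - category,
        "category only": category - space - time,
        "space and time": (space & time) - category,
        "space and category": (space & category) - time,
        "time and category": (time & category) - space,
        "all three": space & time & category,
    }

regions = venn_regions(niagara, loyalist_era, economic)
for name, members in regions.items():
    print(f"{name}: {sorted(members)}")
```

An empty region in the output points to a combination that nobody in the list has covered, which is one quick way to spot a gap in the scholarship.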

What is a Mind Map? Taken from iMindMap.com

Other Ideas
There are many other ways that historiography could be visualized. Mind mapping is another effective way of organizing one’s thoughts, showing the relative importance of each point based on its size or location on the page, and showing how the points relate to one another.

Check out this video featuring Tony Buzan, the inventor of the Mind Map, as he explains some of the best practices for creating your own.


Notes:
[1] Michael Friendly, “A Brief History of Data Visualization,” in Handbook of Computational Statistics: Data Visualization, eds. Chen, Härdle & Unwin (Berlin: Springer-Verlag, 2006), 7.
[2] Jean-François Constant and Michel Ducharme, “Introduction: A Project of Rule Called Canada,” in Constant and Ducharme eds., Liberalism and Hegemony: Debating the Canadian Liberal Revolution, (Toronto: University of Toronto Press, 2009), 4.

Mapping New Knowledges Conference

Last Thursday, Brock University held its 14th annual Mapping the New Knowledges Student Research Conference, where graduate students from all departments are welcome to share their research in either oral or poster presentations throughout the day.

I was pleased to present my poster, share my research, and gather advice throughout the day, making it a successful first-ever conference appearance for me.

My fellow MA history students shared their work as well. It was a great way to meet new people and form new connections in a trans-disciplinary atmosphere. Thank you to Brock’s Faculty of Graduate Studies and the Graduate Students’ Association for organizing the conference!