Getting Started with the Google nGram Viewer

Getting started with the Google nGram Viewer couldn’t be easier: just go to the nGram Viewer site, replace the default names of “Albert Einstein, Sherlock Holmes, Frankenstein” with a word or phrase of your choice, and a graph soon appears, charting the fluctuating use of the term over time (more precisely, the number of times the word appears in a given year relative to the entire body of words in books published in that year).
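To make the idea of relative frequency concrete, here is a minimal Python sketch of the arithmetic described above. The function name and the counts are invented purely for illustration; they are not real Google Books figures, and the Viewer's own calculation may differ in its details (smoothing, for instance).

```python
def relative_frequency(term_count: int, total_words: int) -> float:
    """Return the share of all words printed in a year that are the given term."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return term_count / total_words


# Hypothetical counts, for illustration only: (occurrences of the term, all words that year).
yearly_counts = {
    1900: (1_200, 1_000_000_000),
    2000: (9_000, 5_000_000_000),
}

for year, (hits, total) in sorted(yearly_counts.items()):
    share = relative_frequency(hits, total)
    print(f"{year}: {share:.7%} of all words")
```

In this invented example the raw count grows 7.5-fold, but because far more words were printed in the later year, the relative frequency grows only 1.5-fold. This is why the graph plots proportions rather than raw counts.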

The results are very often intriguing. Type in “embarrassing,” for example, and you discover a sudden surge in usage starting from the end of the twentieth century:

Once you’ve tried out a few words or phrases (as Google’s default example of Einstein, Holmes, and Frankenstein shows, you can also insert multiple words separated by commas), check out these short videos to get more of a feel for the sorts of questions that the nGram Viewer can be used to study.

These early articles in the Atlantic magazine and Slate offer more detailed information about both the history of the nGram Viewer and its possible uses. This 2013 follow-up Atlantic article underscores the new possibilities opened up by the introduction of wildcard searches into the Viewer.

Because of its power and ease of use, the nGram Viewer should be part of the repertory of all students of culture. It is important, however, to be mindful of its limitations. In particular, it is critical to understand that the database is far from comprehensive, and subject to the selection bias of those books that Google has scanned. Moreover, the scanned pages had to be converted into text using OCR technology, which inevitably introduces some erroneous readings, a problem particularly noticeable in pre-1800 texts. The current iteration of the nGram Viewer also allows for searches on texts in French, Spanish, Italian, German, Chinese and Hebrew, but the cautions about selection limitations for texts in English apply all the more for searches in these languages. See this article for its limited usefulness for Chinese texts.

Given these limitations, the graphs generated by the nGram Viewer should generally not be viewed as conclusive proof or definitive evidence. Their chief value, rather, is heuristic: the nGram Viewer is a wonderful way of generating research questions, of discovering fresh puzzles to investigate.

The above example of “embarrassing,” for instance, makes us wonder: What is the meaning of this recent intensified reliance on this adjective? If we reset the date range (upper left corner) from the default 1800-2019 to 1700-2019, we discover a development that is perhaps even more suggestive and intriguing: the word seems to have first come to be adopted in the latter half of the eighteenth century. Why? What cultural changes might account for talk of “embarrassing” coming into vogue starting from around 1750?
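The same search and date range can also be captured in a link, which is handy for sharing or revisiting a graph. The Python sketch below builds such a link; the base address and the parameter names (content, year_start, year_end) are assumptions based on how the Viewer's address bar typically looks after a search, and the helper name ngram_url is hypothetical. Check the result against the live site before relying on it.

```python
from urllib.parse import urlencode

# Assumed address of the Viewer's graph page; verify against the live site.
BASE_URL = "https://books.google.com/ngrams/graph"


def ngram_url(terms, year_start=1800, year_end=2019):
    """Build a Viewer link for one or more terms over a date range.

    The query parameter names used here are assumptions, not documented API.
    """
    if year_start > year_end:
        raise ValueError("year_start must not be later than year_end")
    query = urlencode({
        "content": ",".join(terms),  # multiple terms are separated by commas
        "year_start": year_start,
        "year_end": year_end,
    })
    return f"{BASE_URL}?{query}"


# Widen the range to 1700-2019, as in the "embarrassing" example above.
print(ngram_url(["embarrassing"], year_start=1700, year_end=2019))
```

Passing several terms in the list, for example "embarrassing" together with a related word, produces a single link that plots them on the same graph, mirroring the comma-separated input in the search box.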

One of the most powerful, though often neglected, features of the nGram Viewer is the possibility of pursuing such questions by directly perusing the sources from which the graph is generated. Just click on the array of years below the graph to look inside the books published during a particular span that contain the word.

Clicking on 1800-1829 thus turns up uses of “embarrassing” between January 1, 1800 and December 31, 1829. Click on this button at the top of the page to customize the dates and examine texts containing the word during any range. In this case, for example, we might wish to look at texts from 1750 to 1800 for clues about why “embarrassing” came to be widely adopted in this period.

In addition to studying the contexts in which “embarrassing” was first invoked, a complete analysis of the vagaries of the adjective would almost certainly need to investigate its trajectory alongside the trajectories of related terms, such as “mortifying” and “shameful”:

While it presents a suggestive sketch of a term’s history of usage, the nGram Viewer generally gives no precise information about when the term first appears. Guidance about the question of first use should be sought rather in etymological reference texts. The classic reference for checking the earliest use of an English word is the Oxford English Dictionary [note: access requires Harvard ID login]; but many may find the publicly accessible Online Etymology Dictionary both easier to use and more useful for discovering unexpected connections between words.

