Voyant Tools

Voyant Tools is a powerful, modular online tool set for visualizing one or many documents. It works particularly well across multiple documents, and we will be exploring some of those capacities here. We will be adapting the tutorial materials supplied by the Voyant Tools help pages.

We tend to view workshops on Voyant as serving two purposes:

  1. how to use Voyant and get a grasp on what’s available (how to use it)
  2. how to think differently about texts with tools such as Voyant (why use it)

Basics of Voyant

The About page contains a useful introduction to Voyant: what it is, who has contributed, and some of the core design principles. It’s worth reading as time permits.

Some points on Voyant

OK, that’s all somewhat abstract and conceptual, so let’s jump into using Voyant.

Create a Corpus

There’s more detail in the Creating a Corpus page, but there are four main ways of creating and using a corpus in Voyant (“corpus” is another word for a set of documents):

  1. open an existing corpus (click on the “Open” button under the text box)
  2. type or paste text into the main text box (this creates a corpus with one document)
  3. type or paste one or more URLs into the main text box (one URL per line)
  4. click the “Upload” button under the text box to use files from your computer

Take a moment to try each kind of corpus source (open, text, URLs, upload) in the box below (or in a new window).

Some tips:

If an error occurs during corpus creation you may see an error message appear. Unfortunately, sometimes the system just fails silently (especially if you’re creating a big corpus and there’s a server timeout). One advantage of using a local instance of VoyantServer is that you may see some helpful errors reported in the VoyantServer window.

Voyant can be used with a corpus of variable size, from one document to many. Some of the functionality depends on multiple documents and some tools work less well when there are hundreds or more documents.

It’s good to think of a corpus as a fluid concept: you may be able to change what counts as a document as you proceed. For instance, if you have thousands of tweets, you could treat each one as a separate document, or combine tweets by time, by author, or by some other criterion. Sometimes you might want to edit a corpus, and sometimes you might want to create a new corpus for comparison.
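To make the tweet example concrete, here is a minimal sketch of regrouping the same material into different corpora before loading it into Voyant. The sample tweets, the tuple layout, and the `group_tweets` helper are all hypothetical and only illustrate the idea; nothing here is part of Voyant itself.

```python
from collections import defaultdict

# Hypothetical sample data: (author, date, text) tuples stand in for real tweets.
tweets = [
    ("alice", "2020-01-01", "Reading Austen again"),
    ("bob", "2020-01-01", "Word clouds are fun"),
    ("alice", "2020-01-02", "Trying Voyant on my corpus"),
]

def group_tweets(tweets, key_index):
    """Combine tweet texts into one document per key (0 = author, 1 = date)."""
    documents = defaultdict(list)
    for tweet in tweets:
        documents[tweet[key_index]].append(tweet[2])
    # Join each group's texts into a single document string.
    return {key: "\n".join(texts) for key, texts in documents.items()}

by_author = group_tweets(tweets, 0)  # one document per author
by_date = group_tweets(tweets, 1)    # one document per day

# Three tweets become two documents under either grouping.
print(len(tweets), len(by_author), len(by_date))
```

Each grouped string could then be saved as its own file and uploaded, giving a corpus whose "documents" are authors or days rather than individual tweets.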

We won’t experiment for now with more advanced options for Corpus creation, but there are several options available to tweak the processing of text, XML, and even spreadsheets.

Explore a Corpus

Once you create a corpus you will arrive at the default “skin” or arrangement of tools. There is a lot happening at once, but we’ll start by describing the tools that you see and how they can interact.

Default Skin

At first you will see three tool panels along the top and two tool panels along the bottom:

Tool Interactions

An essential part of Voyant is that events in one tool can cause changes in other tools (the exact interactions depend on a number of factors, including which tools are visible). For instance, try the following sequence in the window above:

Cirrus

Now let’s zoom in for a moment on Cirrus. We might ask ourselves several questions about what we’re seeing. For instance, what does size represent? What about colour? What about the placement of words? This is a text in English, but the most commonly occurring words (like “the” and “a”) aren’t there. Why not? For a tool like Cirrus the answers to some of these questions may be obvious, but for all tools it’s good to continue asking ourselves what we’re seeing, what we’re not seeing, and why.

Options

Figure: Word Cloud options from Voyant Tools

Some options in Cirrus (like other tools) are available directly within the tool panel, like the “Scale” button and the “Terms” slider (experiment with both of these to see what they do). Other options are accessible from the options dialog that appears when you click on the slider icon in the grey bar of the tool (like the image to the left). The full list of options is described in the Cirrus documentation page, but let’s have a quick look at the Stopwords option by clicking the “Edit List” button.

This will show a dialog box with a long list of words that are excluded from the top frequency words in Cirrus. These are generally “function” words, or words that carry less meaning, but you may be surprised to see some of the words included (like “must” or “nobody”). Likewise, you may want to add a word that’s not currently in the list (like “said”). Either way it’s good to know what’s there and to confirm that’s what you want. You can find more information on stopwords.
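To see why editing the stopword list changes what a word cloud shows, here is a minimal sketch of counting term frequencies with stopwords removed. The tiny `STOPWORDS` set, the sample sentence, and the `top_terms` helper are assumptions made for illustration; Voyant’s own list is much longer and its tokenization may differ.

```python
import re
from collections import Counter

# Illustrative stopword list only; Voyant's own list is much longer.
STOPWORDS = {"the", "a", "and", "of", "to", "must", "nobody"}

def top_terms(text, stopwords, limit=5):
    """Return the most frequent words in text, skipping any stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(limit)

text = "The cat said the dog must go, and the dog said nobody must stay."

# With the default list, "said" is among the top terms.
print(top_terms(text, STOPWORDS))

# Adding "said" to the list removes it from the results.
print(top_terms(text, STOPWORDS | {"said"}))
```

Comparing the two printed lists mirrors what happens in Cirrus when you add a word such as “said” to the stopword list.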

Summary

One last thing to point out from the Summary tool (bottom left): the “Distinctive words” list. Assuming your corpus has multiple documents, this shows the words that are not only high frequency in a document but also relatively distinctive to that document (the frequency is weighted by how often the word is found in other documents, using a measure called TF-IDF).
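As a rough illustration of the idea behind TF-IDF, the sketch below scores each term in a document by its raw count multiplied by log(N / df), where N is the number of documents and df is the number of documents containing the term. This is one common textbook formulation, not necessarily the exact variant Voyant uses; the toy corpus and the `distinctive_terms` helper are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: document name -> list of already-tokenized words.
corpus = {
    "doc1": "whale sea ship whale captain".split(),
    "doc2": "garden sea house garden".split(),
    "doc3": "ship house captain letter".split(),
}

def distinctive_terms(corpus, doc_name, limit=3):
    """Rank a document's terms by TF-IDF (raw count times log(N / df))."""
    n_docs = len(corpus)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for words in corpus.values() for term in set(words))
    term_freq = Counter(corpus[doc_name])
    scores = {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in term_freq.items()}
    return sorted(scores, key=scores.get, reverse=True)[:limit]

# "whale" is frequent in doc1 and absent elsewhere, so it ranks first.
print(distinctive_terms(corpus, "doc1"))
```

A word that appears in every document gets a score of zero under this formulation, which is why very common words drop out of a “distinctive” list even when they are frequent.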



WORK TIME!



Additional Tools

Voyant’s default view (or skin) shows a collection of five tools, but in fact there are many more tools available in Voyant. You may have already discovered one way of accessing some of them: by clicking on a tab. For instance, the Cirrus tool can be replaced by the Terms or Links tools simply by clicking on the corresponding tab (and of course you can click on the Cirrus tab to return to the default view).

Figure: How to change tools in Voyant Tools

The tabs are pre-programmed alternatives, but you can also choose from a much longer list of tools by clicking on the little window icon that appears when hovering over the header (either the blue header at the top that replaces all of the tools in the window or the grey header in each tool panel that replaces just that tool). Additional tools are organized into the following categories (tools can appear in multiple categories):

Another convenient way of browsing tools is to consult the list of tools, especially as there is a small thumbnail image and short description for each tool.