Overview Docs
Overview Docs is a tool used primarily by journalists to visualize and analyze large corpora of texts. The site includes a few default text sets to get you started, but you can also upload your own.
Get Started with Overview
- Sign up for an account.
- Upload a set of texts.
  - You may use your own texts for this. Overview has a number of options for uploading them.
  - You can also use the CSV file of Shakespearean sonnets that we provided.
- Explore the different views provided.
  - The default view is a Word Cloud (you should be familiar with this!).
  - Add an Entities view and a “Tree” view.
    - The Entities view explores unique words (much like Voyant). It can also extract cities, corporations, and other data from the text files, which lets you quickly filter your corpus.
    - The Tree view (you can title it whatever you like; I just call it “Tree” to reduce my own confusion):
      - This view is a basic form of topic modeling.
      - A topic model organizes a corpus of documents using the word counts in the good ol’ bag of words, combined with an algorithm that looks at the proximity of those words to one another.
      - Explore your corpus by clicking through the various groupings this view provides.
![Tree view of a Topic Model of Shakespearean Sonnets](https://dcnb.github.io/text-analysis-for-writers/images/overview-Tree.png)
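If the "bag of words" idea feels abstract, a tiny example may help. The sketch below is not Overview's actual algorithm; it only illustrates the general idea of turning each text into word counts and then comparing those counts. The function names (`bag_of_words`, `cosine_similarity`) and the choice of cosine similarity as the comparison are assumptions made for illustration, and three single lines from the sonnets stand in for whole documents.

```python
from collections import Counter
import math
import re


def bag_of_words(text):
    """Lowercase the text, split it into words, and count each word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Compare two bags of words: 0.0 means no shared words, 1.0 means identical proportions."""
    shared = set(a) & set(b)
    dot = sum(a[word] * b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative mini-corpus: one line each from three sonnets.
docs = {
    "sonnet_116": "love is not love which alters when it alteration finds",
    "sonnet_147": "my love is as a fever longing still",
    "sonnet_2": "when forty winters shall besiege thy brow",
}

# Word order is thrown away; only the counts remain.
bags = {name: bag_of_words(text) for name, text in docs.items()}

# Compare every pair of texts by their word counts.
names = sorted(bags)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        score = cosine_similarity(bags[first], bags[second])
        print(f"{first} vs {second}: {score:.2f}")
```

The two lines that share "love" and "is" score higher with each other than either does with the third line. Grouping texts by scores like these is, very roughly, the kind of work a view like Tree does for you across a whole corpus.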
For more on Overview Docs, check out their extensive blog.
The topic model feature is a good tool for writers who want to put together sections of a book or to spot themes that might otherwise go unnoticed.