DMDH: Play With Your Data materials!

Today we're experimenting with a new kind of event: one where people can get together, try out various data visualization tools, and get a sense of what it's like to work with them, while Brian, Sarah, and I are on hand to provide tech support.

To do this, we're providing links to several different tools and the tutorials that accompany them.

Gephi is a network visualization tool — it allows you to see the connections between various individuals. Here’s a good tutorial for getting started with it. Gephi also provides a number of test datasets that you can work with.

Edited to add: if you're working with Gephi on a MacBook, without a mouse, then you can hold down the Command key while navigating on your trackpad to move the graph around.
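
If you'd rather try Gephi with your own connections than with a test dataset, one option is to hand it a simple edge list. The sketch below is only an illustration: it assumes Gephi's spreadsheet import, which looks for columns named Source and Target, and the character pairs are made-up sample data rather than anything from the workshop materials.

```python
import csv
import io


def edge_list_csv(edges):
    """Return (source, target) pairs as CSV text with a Source,Target header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: the column names are an assumption about what Gephi's
    # spreadsheet import expects for an edge table.
    writer.writerow(["Source", "Target"])
    for source, target in edges:
        writer.writerow([source, target])
    return buffer.getvalue()


# Hypothetical sample data: who is connected to whom.
sample_edges = [
    ("Marlow", "Kurtz"),
    ("Marlow", "Manager"),
    ("Kurtz", "Intended"),
]

print(edge_list_csv(sample_edges))
```

Save the printed text as a .csv file and import it into Gephi as an edges table; each row becomes one connection in the graph.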

ManyEyes is a text visualization tool created by IBM. Unfortunately, it only runs in browsers with Java enabled, so you'll need to run it in Safari or Firefox. There are a number of different visualizations that you can run, and a huge number of data sets to work with: just search for a particular text. We have one sample data set below and will have more available during the workshop; because of copyright issues, those can't be linked here.

Joseph Conrad’s Heart of Darkness

MIT's Simile Exhibit allows you to create a timeline using a Google Spreadsheet. It's a really useful tool, though it's a little more work than some of the others. See an example of its use here. There's a tutorial here, written by David Karger of MIT. We think you might also find it useful to have a version of the HTML code with extra comments for explanation of what each part is doing.

Commented version of HTML code: simile demo — you'll want to right-click this, and open it in a text editor.
Accompanying CSS stylesheet: egypt-styles

Finally, we think you might enjoy playing with the TAPoR (Text Analysis Portal) tool suite. TAPoR has hundreds of tools for visualization, and most of them allow you to simply copy and paste plain text in, or upload a plain text file. Some of these tools are still in development, meaning that they're a little bit crash-prone, so you just have to keep experimenting with them. Two tools that work well, however, are Textometrica and Voyant Cirrus.

If you want to find texts to play around with, Project Gutenberg is a great source — just look for the plain text file with UTF-8 encoding.
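
Project Gutenberg plain text files usually wrap the book itself in a header and a license footer, which can skew word counts in the text analysis tools. Here is a minimal sketch of one way to trim them before pasting the text in. It assumes the file uses the common "*** START OF THE PROJECT GUTENBERG EBOOK ..." and "*** END OF ..." marker lines; the wording varies between files, so check yours, and note that the function name and sample text are ours, purely for illustration.

```python
import re

# Marker lines as they appear in many (not all) Project Gutenberg files.
# The exact wording is an assumption; adjust the patterns if your file differs.
START_MARKER = re.compile(
    r"^\*\*\* ?START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)
END_MARKER = re.compile(
    r"^\*\*\* ?END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)


def strip_gutenberg_boilerplate(text):
    """Return only the text between the start and end marker lines.

    If a marker is missing, that side of the text is left untouched.
    """
    start = START_MARKER.search(text)
    begin = start.end() if start else 0
    # Look for the end marker only after the start marker.
    end = END_MARKER.search(text, begin)
    stop = end.start() if end else len(text)
    return text[begin:stop].strip()


# Tiny made-up sample standing in for a downloaded file.
sample = (
    "Title and license header\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK HEART OF DARKNESS ***\n"
    "\n"
    "The Nellie, a cruising yawl.\n"
    "\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK HEART OF DARKNESS ***\n"
    "License footer\n"
)

print(strip_gutenberg_boilerplate(sample))
```

For a real file, read it with open(path, encoding="utf-8") and pass the contents to the function, then paste or upload the result.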

We've uploaded a few more text files that you might want to play with:

The Communist Manifesto: communist_manifesto
Northanger Abbey: northanger_abbey

