Bea Alex, Uta Hinrichs, and I have written a guest post, “Bringing Kew’s Archive Alive,” for Kew Gardens’ Library, Art and Archives blog.
The post looks at how digital data produced by Kew’s Directors’ Correspondence team can be used as a source for visualising the British Empire’s nineteenth-century trade networks.
You can read the post in full here: http://www.kew.org/news/kew-blogs/library-art-archives/bringing-kews-archive-alive.htm
Here is one of the videos I created for the blog post:
Bringing Kew’s Archive alive from Jim Clifford on Vimeo.
I’ve been informed that my original download missed many of the files. I’m going to recreate the two graphs below over the next few days with the missing data and rework this post.
I’m working with Bea Alex on a blog post for the Kew Gardens Directors’ Correspondence project. They shared their metadata collection with Trading Consequences, and Bea reformatted it into a directory of 7438 XML files (one for every letter digitized to date by the project). The metadata includes all the information found on the individual letter webpages (sample). Bea and the rest of the team in Edinburgh focused on extracting commodity-place relationships from the description field. We’re currently working with the data for coffee, cinchona, rubber, and palm to create an animated GIS time-map for the blog post we are writing. However, because this is one of the smallest collections we are processing in the Trading Consequences project, I decided to play around with the data a little more.
XML files are pretty ubiquitous once you start working with large data sets. They are generally easier to read and more portable than standard relational databases, among other advantages. The syntax is familiar if you know HTML, but I’ve still found it challenging to learn how to pull information out of these files. As with most things, coding in Mathematica, instead of Python, makes it easier. It turned out to be relatively straightforward to import all 7438 XML files, have Mathematica recognize the pattern of the “Creator” field, and pull out a list of all of the letter authors. From there, it was easy to tally up the duplicates, sort them in order of frequency (borrowing a bit of code from Bill Turkel), and graph the top ten (of the 1689 total authors).
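For readers who want to try this in Python rather than Mathematica, the workflow above can be sketched roughly as follows. This is a minimal illustration, not the project’s actual code: the flat directory layout and the element name `Creator` are assumptions based on the metadata fields mentioned in the post, and the sample letters here are mocked up for demonstration.

```python
# Sketch of the author-tallying workflow: parse each XML file,
# pull out the Creator element(s), and count how often each appears.
import os
import tempfile
import xml.etree.ElementTree as ET
from collections import Counter

def tally_creators(directory):
    """Count how often each author appears across a directory of XML files."""
    counts = Counter()
    for name in os.listdir(directory):
        if not name.endswith(".xml"):
            continue
        tree = ET.parse(os.path.join(directory, name))
        # iter() handles letters that list more than one creator
        for creator in tree.getroot().iter("Creator"):
            if creator.text:
                counts[creator.text.strip()] += 1
    return counts

# Small demonstration with mock letter records (names are placeholders):
tmp = tempfile.mkdtemp()
letters = ["Hooker, J.D.", "Hooker, J.D.", "Markham, C.R."]
for i, author in enumerate(letters):
    with open(os.path.join(tmp, f"letter_{i}.xml"), "w") as f:
        f.write(f"<letter><Creator>{author}</Creator></letter>")

top = tally_creators(tmp).most_common(10)
print(top)  # most frequent authors first
```

From `most_common(10)` it is a short step to plotting the top ten with any charting library, which mirrors the frequency graph described above.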
Timothy Bristow, a digital humanities librarian and Trading Consequences team member, and I are hosting a one-day workshop on text mining in the humanities in the library at York University:
A macroscope is designed to capture the bigger picture, to render visible vastly complex systems. Large-scale text mining offers researchers the promise of such perspective, while posing distinct challenges around data access, licensing, dissemination, and preservation, digital infrastructure, project management, and project costs. Join our panel of researchers, librarians, and technologists as they discuss not only the operational demands of text mining the humanities, but also how Ontario institutions can better support this work.
Co-Authored with Beatrice Alex
Trading Consequences is a Digging Into Data-funded collaboration between commodity historians, computational linguists, computer scientists, and librarians. We have been working for a year to develop a system that will text mine more than two million pages of digitized historical documents to extract relevant information about the nineteenth-century commodity trade. We are particularly interested in identifying some new environmental consequences of the growing quantity of natural resources imported into Britain during the century.
During our first year we’ve gathered the digitized text data from a number of vendors, honed our key historical questions, created a list of more than four hundred commodities imported into Britain, and developed an early working prototype. In the process we’ve learned a lot about each other’s disciplines, making it increasingly possible for historians, computational linguists, and visualization experts to discuss and solve research challenges.
Our initial prototype has limited functionality and focuses on a smaller sample of our corpus of documents. In the months ahead it will become increasingly powerful and populated with more and more data. Late last year, we completed the first prototype. Here’s a picture of the overall architecture:
[This is my first post for The Otter since I passed on the editorial duties to Josh MacFadyen in the summer]
One of the major weaknesses of using GIS for historical research is its limited ability to show change over time. GIS was designed with geography in mind, and until recently historians needed to adapt the technology to meet our needs. Generally this meant creating a series of maps to show change over time or, as Dan MacFarlane did last week, including labels identifying how different layers represent different time periods. More recently, ArcGIS and Quantum GIS introduced features that recognize a time field in the data, making it possible to include a timeline slider bar or animate the time-series data in a video.
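Conceptually, what these time features do is bin records that carry a time field into per-period layers, which the slider or animation then steps through. The sketch below illustrates that idea in plain Python; it is not ArcGIS or QGIS API code, and the field names and sample import records are hypothetical.

```python
# Conceptual sketch of a GIS time slider: group records by their
# time field so a map can display one "frame" of data per year.
from collections import defaultdict

# Hypothetical trade-import records with a time field ("year"):
records = [
    {"place": "London", "tonnes": 120, "year": 1865},
    {"place": "Odessa", "tonnes": 80, "year": 1865},
    {"place": "Buenos Aires", "tonnes": 95, "year": 1880},
]

def frames_by_year(rows):
    """Group records into per-year layers for sequential display."""
    frames = defaultdict(list)
    for row in rows:
        frames[row["year"]].append(row)
    # Return frames in chronological order, as a slider would step through them
    return dict(sorted(frames.items()))

for year, layer in frames_by_year(records).items():
    print(year, [r["place"] for r in layer])
```

A GIS package does far more than this (rendering, interpolation, symbology), but the underlying model — a time column driving which features are drawn — is the same.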
UK Tallow Imports, 1865-1904 from Jim Clifford on Vimeo.
[This post first appeared on ActiveHistory.ca]
When did the modern environmental movement begin? Did one event mark its beginning? Earlier this year we commemorated the fiftieth anniversary of Rachel Carson’s Silent Spring, which is often identified as bringing about the environmental movement. While this book’s importance is without question, focusing on it as the birth of environmentalism ignores the importance of urban environmental problems, from unsafe drinking water to severe air pollution, in raising people’s environmental awareness.
Ten years before Carson’s book, a great smog blanketed Greater London. From Friday, December 5 through Tuesday, December 9, 1952, the thick air pollution disrupted daily life and killed thousands of people. In the aftermath of the Great Smog, the British passed the Clean Air Act (1956).