Python and the Natural Language Toolkit

A graph showing the use of different words in a corpus of the U.S. Presidential Inaugural Addresses.

Since the end of the academic year, I’ve been able to focus a lot more attention on my post-doc research. This included a research trip to archives in London and a week-long course on databases at the Digital Humanities Institute in Victoria. Now I’ve started learning a programming language called Python. In the short term, I don’t need advanced computer skills for Trading Consequences, as we have a team of highly skilled computer and linguistic experts. However, I do need a basic understanding of what we are actually doing when we text mine historical documents, and, looking ahead to the end of the grant, I would like to be able to continue working with the database. I would also like to develop the skills to continue this kind of research on my own in the future.

I always intended to start with the Programming Historian, but the new version will not come out for another few weeks, so instead I began working through Learn Python the Hard Way over a few days in May. This was interesting, but it focused solely on teaching programming and was not particularly connected to the kind of research I would like to do. A few days ago I took a closer look at Natural Language Processing with Python, written by Steven Bird, Edward Loper and one of the Trading Consequences team members, Ewan Klein. From the preface, it became clear that the book was accessible to people with no background in programming. The early chapters included an introduction to both computational linguistics and Python.