The first seminar I attended in the academic year 2012-2013 wasn't a medieval one but one organised by KCL's Department of Digital Humanities, which featured Alan Liu from UC Santa Barbara, a veteran of the digital humanities. (He started his Voice of the Shuttle catalogue of websites in 1994). Alan was talking on "The Meaning of Digital Humanities", and arguing that issues about the meaning of the digital humanities are really about the wider question of the meaning of the humanities themselves, and about how you get from numbers to meaning.
The talk was part of his work on an introductory essay on the digital humanities for the PMLA journal. Alan's background is in English literature, so one of the interesting things for me was hearing about literary attitudes to digital methods. He contrasted history, which is relatively used to working with big data, with literary studies, which isn't. For history he mentioned GIS projects, such as the Stanford Spatial History Project, but also pointed out that there was a much longer cliometric tradition, especially in economic history. Historians don't think that counting things necessarily diminishes the humanities.
Alan also touched on the different techniques that digital projects could use: one aspect is quantification, with its inevitable problems of losing context. But he also pointed out the possible use of digital models and visualisation, as a way of reducing dimensions to see patterns (such as generating social network diagrams that aren't incomprehensible blurs).
As an example of the use of digital methods in the study of literature, both in its methodology and its problems, he mentioned a project from the Stanford Literary Lab: Ryan Heuser and Long Le-Khac, A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method. This focuses on finding clusters of words with the same usage trends, and Alan was discussing the possibility of the hypothesis-free initiation of analysis. That is, it's possible to use algorithms to "play games" and find patterns within your data, and only bring in the human interpreters and close reading at a later point, when you've got material that looks statistically significant.
The problem is that you may have to set up the parameters in a way that has already potentially rigged the game. For example, what this project did was start from "seed words", such as "land" or "country", and generate sets of terms that had similar usage trends to these over time. In practice, they had an oscillating dialogue between the empirical data of words that behaved similarly and words that humans thought were semantically linked. Alan suggested this hybridity of methods may be necessary, and quoted Stephen Ramsay: "the best digital humanities is the hermeneutics of screwing around". He also pointed out that one of the big problems of such projects is that they're often insufficiently documented: researchers need to provide more details of both how the data corpus is formed and how it is cleaned. (Cleaning up data at the inputting stage is a major issue for most projects).
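To make the seed-word idea concrete, here is a minimal sketch of that step: given per-decade relative frequencies for each word, find the words whose usage trend correlates strongly with a seed word's trend. All the figures below are invented for illustration, and this is only a toy analogue of Heuser and Le-Khac's actual method, which worked over frequencies drawn from the 2,958-novel corpus.

```python
# Toy sketch of the "seed word" step in a semantic-cohort-style analysis.
# The frequency data here is entirely made up for illustration.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented per-decade relative frequencies, 1800s to 1890s.
trends = {
    "land":    [9.0, 8.5, 8.1, 7.6, 7.0, 6.4, 5.9, 5.3, 4.8, 4.2],
    "country": [8.8, 8.4, 8.0, 7.4, 6.9, 6.5, 5.8, 5.2, 4.9, 4.1],
    "soil":    [3.1, 3.0, 2.9, 2.7, 2.6, 2.4, 2.3, 2.1, 2.0, 1.9],
    "railway": [0.1, 0.1, 0.3, 0.9, 1.8, 2.6, 3.1, 3.4, 3.6, 3.7],
}

def cohort(seed, trends, threshold=0.95):
    """Words whose trend correlates with the seed's above the threshold."""
    base = trends[seed]
    return sorted(w for w, t in trends.items()
                  if w != seed and pearson(base, t) >= threshold)

# "country" and "soil" decline alongside "land"; "railway" rises instead.
print(cohort("land", trends))
```

The threshold parameter is exactly the kind of setting that can "rig the game": set it lower and semantically unrelated words join the cohort, set it higher and genuine cohort members drop out, which is why the project needed that oscillating dialogue with human judgement.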
A project like that of Heuser and Le-Khac suggests the possibilities of digital humanities as one tool for scholars. But there's still a question remaining once you've generated this data: what is the meaning of changes in word-use frequency? And this links into the problem of the meaningfulness of the humanities as a whole: where is the residual space for humans in a world of scientific golems? Do we need Raymond Williams on culture if we've got the Google Ngram Viewer (http://books.google.com/ngrams)?
Alan concluded by saying that those working in the humanities have to come up with wider answers about the significance of the humanities and also about new forms of digital pedagogy (especially with the rise of MOOCs). In the discussion afterwards, he referred to the possibility of humanists as the repositories of the meaning that can't be extracted from the texts. It's a line that has interesting implications for the current Making of Charlemagne's Europe project and the tension between charter as data-source and charter as one-off textual and material object. Digital humanities sometimes oversells itself as a new paradigm for all humanities research, but it is making us think about how and what we study in some very interesting ways.