While digital humanities is often committed to the realism and rationality of big data when its practitioners build critical machines, recent work has called the necessity of this association into question. Postcolonial Digital Humanities, with its stated goal of examining the intersections between cultures of data and broader colonial projects; "The Manifesto of Modernist Digital Humanities," with its call to "decouple methodological strategies from the content of the objects we study"; and the many essays in the special issue of differences on "The Dark Side of the Digital Humanities" all represent a turn to a new kind of critical digital humanities method: questioning why DH must absorb the logic of data into its practices while still continuing the project of a digital humanities as such.
Borrowing the term "critical informatics" from analysis of the work of information scientist Rob Kling, this panel seeks to document emergent methods for DH that are critical of the rational, realist, and colonial bent that a culture of data often imposes on DH practices. The critical informatics turn in DH, so far, represents a new engagement with the rich methodological tradition of humanities criticism within the arena of the digital. To capture this new sense of the critical, we seek theoretical approaches, real-world examples, methodological treatises, rants, screeds, or manifestos that imagine or enact a critical informatics of the digital humanities.
It is not enough to train algorithms to make inferences about vast corpora. Algorithms that evaluate and interpret those inferences must also be designed. The future of scholarship and science in general, according to many visions, is automated at the level of hypothesis generation and interpretation. Citation analysis of humanities scholarship reveals a very strong Matthew effect: a few papers are cited very often, and most are not cited at all. Citation may not equal readership, but it bears at least some relation to it. It therefore seems safe to claim that the majority of humanities scholarship is read by very few people outside the editorial process. (This dynamic is even more prominent in other areas of research.)
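The "rich get richer" dynamic behind such a Matthew effect can be sketched in a few lines. The simulation below is purely illustrative: the paper counts, citation counts, and preferential-attachment rule are invented for demonstration, not drawn from any real bibliometric dataset.

```python
import random

random.seed(0)

def simulate_citations(n_papers=1000, n_citations=5000):
    """Toy preferential-attachment model: each new citation goes to
    paper i with probability proportional to (citations_i + 1)."""
    counts = [0] * n_papers
    for _ in range(n_citations):
        i = random.choices(range(n_papers),
                           weights=[c + 1 for c in counts])[0]
        counts[i] += 1
    return counts

counts = sorted(simulate_citations(), reverse=True)
top_decile_share = sum(counts[:100]) / sum(counts)
uncited = sum(1 for c in counts if c == 0)
print(f"top 10% of papers hold {top_decile_share:.0%} of citations")
print(f"{uncited} of 1000 papers are never cited")
```

Even under these arbitrary parameters, the distribution quickly becomes heavy-tailed: a small top decile accumulates a disproportionate share of citations while a long tail goes uncited, which is the shape the citation analysis above describes.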
Does a critical digital humanities have a responsibility to the great unread, not of primary literature but of scholarship? This paper argues that it does. Machine inference techniques such as topic modeling can potentially reveal affinities between heavily cited and largely ignored scholarship. Individual scholars can use these techniques to explore existing research databases in a more egalitarian way, but I will also speculate about the future of automated scholarship with reference to Stanislaw Lem's "Golem XIV" and Summa Technologiae.
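The kind of affinity measure such techniques rely on can be shown with a minimal stand-in. Full topic modeling (e.g., LDA) needs a library such as gensim or scikit-learn; the sketch below substitutes bag-of-words cosine similarity over a tiny invented corpus, which captures the same basic idea of surfacing an uncited paper that is lexically close to a heavily cited one. All document names and texts here are hypothetical.

```python
import math
from collections import Counter

# Invented toy corpus: one heavily cited paper, two uncited ones.
docs = {
    "heavily_cited": "archive colonial data empire metadata archive power",
    "ignored_a": "colonial archive empire records data catalog",
    "ignored_b": "meter rhyme sonnet prosody scansion verse",
}

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm

for name in ("ignored_a", "ignored_b"):
    sim = cosine(docs["heavily_cited"], docs[name])
    print(f"{name}: similarity {sim:.2f} to the heavily cited paper")
```

Run over a real research database, a measure like this (or a proper topic model) would rank "ignored_a"-style papers as neighbors of canonical work, which is the egalitarian exploration the paper proposes.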
The new wave of digital humanities (DH) research can be defined in one word: data. This trend involves applying methods borrowed from the sciences (forming a hypothesis, testing through experimentation, observing, analyzing the results, and reproducing the experiment) to provide evidence for assumptions made in humanistic inquiry. This in turn raises the question: what is the role of evidence in the humanities? Unlike the sciences, which rely on what Richard Haswell terms "RAD research" (replicable, aggregable, and data-supported), the humanities has long staked its claims in the authority and reliability of the author. This presentation seeks to identify what can be gained by applying algorithmic analysis to large-scale corpora, and likewise what may be lost. The presenter will showcase a series of projects that use data to reveal unethical practices or unfair bias in our field as a way to investigate the potential of "critical informatics" to address inequality in our traditional modes of practice.
As scholars are increasingly encouraged to employ digital tools and databases for research, what remains under-studied is the role of such entities in the mediation, or indeed production, of history and cultural heritage. This paper explores one such digitization effort, the Early English Books Online-Text Creation Partnership (EEBO-TCP), a collaboration between the University Libraries of Michigan and Oxford, the Council on Library and Information Resources, and the commercial publisher ProQuest. Over the last decade, the searchable text files of EEBO-TCP have emerged as a valuable resource for historians and scholars of early modern England and English literature.
In this initiative of EEBO-TCP, English texts dating from the 15th to the 18th centuries are transcribed and encoded by workers in various Asian countries, and, more recently, India in particular. Given the British subjugation of the Indian subcontinent until 1947, the use of workers in the same region to provide digital access to English intellectual traditions seems worthy of further scrutiny. How might these circumstances of production influence our present-day investigations of that which is called English history? Furthermore, what are the local effects of such transcription and encoding work? Namely, what are the implications for workers in Southeast Asia and India as they grapple with the pre-colonial English mores and morality transmitted in these early modern texts? This paper will raise such questions and likely answer none, but nevertheless embarks upon a consideration of how our initiatives in the digital humanities are complicit in a cultural imperialism that is expanding the reach of the West simultaneously backwards and forwards in time.
Artist Julian Oliver has observed that "Infrastructure must not be a ghost. Nor should we have only mythic imagination at our disposal in attempts to describe it." In our efforts to make the digital conceptually accessible, we speak of clouds and cookies, make virality a virtue, and imagine ourselves as resources to be (data) mined. Not only do figures such as these often construct the digital in ways that may not be in our best interests, but they often erase the signs of their construction as well. Specific technological choices and configurations become naturalized. As the digital humanities position themselves to work with big data and its methods, we should not lose sight of what Evan Selinger has called the contextual integrity of that data: the degree to which "datafication" often militates against the contexts that make that data valuable in the first place. To that end, this presentation suggests that we need to develop a productive concept of friction, a counterpoint to discussions of virality or rhetorical velocity.