SP-26: Tools Interfaces and Infrastructures
Friday, 12/Jul/2019:
11:00am - 12:30pm

Session Chair: Simon Gabay
Location: Cloud Nine
250 pax

Complexities And Compromise – User-Centred Interfaces For Public Humanities projects

Monika Renate Barget, Susan Schreibman, Pádraig MacCarron

Maynooth University, Ireland

Drawing on recent experiences of the Letters 1916-1923 team in re-designing their project website, this paper will elaborate on how changing user expectations, academic standards and special requirements of source material can be reconciled in the creation of database-driven interfaces designed for public humanities projects. Digital Humanities scholars agree that interfaces are “part of the design” and need to visually tell the project’s story. But despite extensive theoretical discourse on user-centred designs, many DH projects still tend to fall short in practice. This paper will explore these issues theoretically and practically by describing some of the initial problems and design choices made in the course of the recent Letters 1916-1923 re-launch, thus contributing to an on-going discourse. The results of onsite user testing, in particular, have shown how user expectations have transformed since the first release of the website in 2015, and these findings may benefit similar public humanities projects.

Improving OCR of Black Letter in Historical Newspapers: The Unreasonable Effectiveness of HTR Models on Low-Resolution Images

Phillip Benjamin Ströbel, Simon Clematide

University of Zurich, Switzerland

We showcase the usefulness of Handwritten Text Recognition (HTR) models when it comes to the recognition of black letter in historical newspapers. We illustrate how simple the production of a ground truth, the training and the evaluation of such HTR models are with the help of the integrated platform Transkribus. Our paper highlights that a model trained on only a limited amount of data achieves state-of-the-art performance and beats commercial software like ABBYY FineReader, often by large margins. We are particularly interested in how HTR models trained on medium-resolution data perform on high-resolution images and we are able to show that the performance is comparable, which means that costly and time-consuming re-digitisation processes are not required in order to improve OCR quality. Moreover, we investigate the transferability to other newspapers. In short, our findings demonstrate how digital humanists can improve their source material for text mining with reasonable effort.

Kraken - a Universal Text Recognizer for the Humanities

Benjamin Kiessling1,2

1Université PSL, France; 2Leipzig University

Kraken is a language-agnostic optical character recognition engine that can be applied to both printed and handwritten texts with relatively modest training effort. It includes a number of features making it of special interest to digitization work in the humanities.

One More Time With Feeling: Revisiting XPointers to Address the Complexities of Promptbook Encoding

Joey Takeda1, Jennifer Roberts-Smith2

1University of British Columbia, Canada; 2University of Waterloo, Canada

TEI has long supported the use of XPointers (Grosso et al. 2003), but they are seldom implemented or recommended as a method of linking TEI documents (Cayless 2013). We make the case that they may still be a viable option for TEI projects, by means of a real-world example in which XPointers are necessary: the Waterloo-based Stratford Festival Online (SFO) project, which aims to encode the Festival’s world-class collection of theatrical promptbooks (Malone 2013; 2018). To represent the complex ontologies of the contents of promptbooks, our research team is developing an approach that uses two data files linked by stand-off markup and XPointers, one for the verbal text that a stage manager uses as a timeline during a performance, and the other for the non-verbal events the stage manager enacts or monitors (Roberts-Smith, Kaethler, Malone et al. forthcoming). This short paper is illustrated by a sample implementation in XSLT.

A Digital Laboratory for the Study of Paratexts: The Example of Tacitus On Line

Anne Garcia-Fernandez1, Isabelle Cogitore1,2

1Univ. Grenoble Alpes, CNRS, Litt&Arts, 38000 Grenoble, France; 2Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, Grenoble INP, Sciences Po Grenoble, MSH-Alpes, 38000 Grenoble, France

We present a data model and visualisation tools for the study of paratexts. Drawing on the corpus of Justus Lipsius’s commentaries on the Annals of Tacitus, we argue for the value of solutions tailored to the project’s scientific objectives that nonetheless respect standards and allow the tools to be documented and reused. Our approach rests on the following principles: a preliminary examination of the nature of the object of study and its definition; a commitment to serving, above all, the project’s scientific objectives; and the implementation of solutions that enable the reuse of the data as well as of the tools and methods.

