ID: 110
/ PS: 1
Poster
Keywords: TEI encoding, newspapers and periodicals, best practices, standardization, digitization
Establishing Best Practices for TEI Encoding of Newspapers: A Case Study of the Darmstädter Tagblatt
K. Kuck, S. Kalmer
Centre for Digital Editions in Darmstadt (CEiD), ULB, TU Darmstadt, Germany
This poster addresses the reuse of periodicals and newspapers and proposes best practices for TEI encoding in digitization projects, focusing on the Darmstädter Tagblatt. As part of the newspaper working group of DHd (Association for Digital Humanities in the German-Speaking Areas), we recognize the growing need for standardized TEI encoding to facilitate data reuse across newspaper projects. Drawing on a recently initiated series of workshops on the reuse of newspaper data, the first of which takes place in Darmstadt in May 2024, we explore the potential of TEI to enhance access to historical newspaper data while fostering collaboration among researchers. Concepts are currently being developed and will be refined after the initial workshop. One idea is a universal TEI header suitable for editions of newspapers as well as periodicals. An ongoing digitization project at ULB Darmstadt, funded by the DFG (German Research Foundation), will serve as the scientific basis. The poster will provide an overview of the project “A Darmstadt Newspaper in Three Centuries - Digitisation of the Darmstädter Tagblatt, 1740–1986” (Thomas Stäcker, Marcus Müller, Dario Kampkaspar, et al.) and highlight its significance. As one of the longest-running periodicals in Germany, the Darmstädter Tagblatt represents a heterogeneous data set and provides an excellent case study for establishing best practices. The poster will also address the challenges and opportunities in TEI encoding of newspapers and propose best practices tailored to the needs of diverse projects. Engaging with the international TEI community, we seek to foster discussion of standardization, collaborative markup, and the revitalization of the SIG "Newspapers and Periodicals."
By establishing standardized TEI encoding practices for newspapers, we aim to facilitate collaboration, enhance access to historical resources, and advance research in digital humanities.
ID: 122
/ PS: 2
Poster
Keywords: library, standardisation, digitisation, metadata
Developing a Base Format for Heterogeneous Texts According to the TEI P5 Guidelines
S. Kalmer
University and State Library Darmstadt, Germany
The University and State Library Darmstadt (Germany) is collecting and digitising texts such as scientific journal articles, monographs, and digital scholarly editions in order to make them available in open access. Since not only the text types but also the file formats differ greatly (for example, JATS, TEI, or non-XML formats), the goal is to transform these heterogeneous formats into one standardised format. For this, a so-called base format is being developed using the TEI P5 Guidelines. The base format is essentially a subset of the TEI, selected with consideration of what information is needed, what information is present, and how it has to be represented.
The basic structure consists of a <teiHeader>, a <standOff> for entities, and a <text> containing at least a <body> and, where applicable, also a <front> and a <back>. While the entire content of the input file is mapped to and converted into TEI, the main focus lies on the metadata in the <teiHeader>. Problems arise in the details, for example, how to represent a <persName> (does the source material separate surname and forename?) or the title (is it split into main title and subtitle?). Moreover, different text types need different TEI modules; for example, an edition additionally requires a <msDesc>.
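The structure described above can be illustrated with a minimal Python sketch using only the standard library. It is not the library's actual base format or conversion code: the function name `build_skeleton`, its parameters, and the placeholder header content are assumptions made for illustration, and the resulting skeleton would still need real content (for example inside <standOff> and <body>) before it could validate against a TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_skeleton(title, with_front=False, with_back=False):
    """Build teiHeader + standOff + text(front?, body, back?) (hypothetical helper)."""
    root = ET.Element(tei("TEI"))

    # Minimal header: fileDesc with its three mandatory parts.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    pub_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub_stmt, tei("p")).text = "Placeholder publication statement"
    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = "Placeholder source description"

    # Container for entities (persons, places, ...), left empty in this sketch.
    ET.SubElement(root, tei("standOff"))

    # The text always has a body; front and back are optional.
    text = ET.SubElement(root, tei("text"))
    if with_front:
        ET.SubElement(text, tei("front"))
    ET.SubElement(text, tei("body"))
    if with_back:
        ET.SubElement(text, tei("back"))
    return root


if __name__ == "__main__":
    skeleton = build_skeleton("Example article", with_back=True)
    print(ET.tostring(skeleton, encoding="unicode"))
```

Keeping <front> and <back> optional in one builder mirrors the idea that all text types share a single structure, while module-specific additions (such as a <msDesc> for editions) would be layered on top.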
This base format guarantees not only a structured format that is the same for all texts but also that the handling of the metadata meets the criteria of the library, since the metadata of all texts will ultimately be put into the library catalogue. The texts will be made available and searchable on the library’s publishing platform for digital texts, ‘TUeditions’ (https://tueditions.ulb.tu-darmstadt.de), at the Centre for Digital Editions. The base format itself will receive a RELAX NG schema for validation, as well as documentation in DITA format.
ID: 117
/ PS: 3
Poster
Keywords: Calderón de la Barca, literary corpus, DraCor, theatre, quantitative analysis, network analysis
TEI Encoding and Network Analysis: On the Calderón Drama Corpus (CalDraCor) v.2.0
H. Ehrlicher1, A. Rojas Castro1, S. Padó2, K. Jung2, A. Keith2
1Eberhard Karls Universität Tübingen, Germany; 2Universität Stuttgart, Germany
Following its complete open-access publication in DraCor (Fischer et al. 2019) in 2022 under the name CalDraCor, the TEI encoding of a corpus of 205 plays by Calderón de la Barca opened new avenues for research on the Spanish Golden Age with digital methods, including quantitative analysis (e.g., Ehrlicher et al. 2020). However, the development of new research questions has created the need to enrich the corpus beyond the original encoding. By way of example, this poster focuses on TEI encoding for social network analysis. We will present the methodology we developed for segmenting acts into scenes on the basis of stage directions that indicate characters' entrances and exits, in order to capture interaction, and we will discuss the added value of TEI encoding with examples from Calderón's theatre.
Bibliography
Ehrlicher, Hanno, Jörg Lehmann, Nils Reiter, and Marcus Willand. “La poética dramática desde una perspectiva cuantitativa: la obra de Calderón de la Barca”. Revista de Humanidades Digitales 5 (25 November 2020): 1-25. https://doi.org/10.5944/rhd.vol.5.2020.27716.
Fischer, Frank, Ingo Börner, Mathias Göbel, Angelika Hechtl, Christopher Kittel, Carsten Milling, and Peer Trilcke. “Programmable Corpora: Introducing DraCor, an Infrastructure for the Research on European Drama”. In Digital Humanities 2019: Book of Abstracts. Utrecht, 2019. https://doi.org/10.5281/zenodo.4284002.
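A segmentation of this kind might be sketched in Python as follows. This is not the CalDraCor methodology itself: it assumes that entrances and exits are encoded as <stage> elements with type="entrance" or type="exit" and that speeches carry a who attribute, and the sample act with characters a, b, and c is invented for illustration. The helper names `segment_act` and `copresence_edges` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

TEI = "{http://www.tei-c.org/ns/1.0}"
BOUNDARY_TYPES = {"entrance", "exit"}  # assumed values of stage/@type

# Invented sample act; the characters a, b, c are placeholders.
SAMPLE_ACT = """
<div xmlns="http://www.tei-c.org/ns/1.0" type="act">
  <stage type="entrance">A and B enter.</stage>
  <sp who="#a"><l>(verse)</l></sp>
  <sp who="#b"><l>(verse)</l></sp>
  <stage type="entrance">C enters.</stage>
  <sp who="#c"><l>(verse)</l></sp>
  <sp who="#a"><l>(verse)</l></sp>
  <stage type="exit">A and B leave.</stage>
  <sp who="#c"><l>(verse)</l></sp>
</div>
"""


def segment_act(act):
    """Split an act into segments; each entrance or exit opens a new segment."""
    segments = [[]]
    for child in act:
        if child.tag == TEI + "stage" and child.get("type") in BOUNDARY_TYPES:
            if segments[-1]:  # avoid empty segments for consecutive directions
                segments.append([])
        elif child.tag == TEI + "sp":
            segments[-1].append(child.get("who", "").lstrip("#"))
    return [segment for segment in segments if segment]


def copresence_edges(segments):
    """Count how often two speakers occur in the same segment."""
    edges = Counter()
    for segment in segments:
        for pair in combinations(sorted(set(segment)), 2):
            edges[pair] += 1
    return edges


if __name__ == "__main__":
    act = ET.fromstring(SAMPLE_ACT)
    segments = segment_act(act)
    print(segments)
    print(copresence_edges(segments))
```

The edge counts could then be passed to a network library of choice; counting speakers per segment rather than per act is what makes the finer segmentation matter for the resulting network.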
ID: 119
/ PS: 4
Poster
Keywords: stage direction, drama, theater
Enter the <stage>. A Review of the Encoding of Stage Directions within the TEI
J. N. Jokisch
Max Planck Institute for Empirical Aesthetics, Germany
Stage directions and other non-dialogical aspects of dramatic texts have received only minor scholarly attention. It is therefore not surprising that they remain undertheorized in the TEI as well. This gives rise to a variety of problems. For one, certain not uncommon structures of drama cannot be encoded within the TEI. For example, the blend of speech prefix and genuine stage direction in “Pendant ce temps, ADRIEN.” has no obvious or trivial markup within the TEI. More generally, the current guidelines tend to mislead practitioners, resulting in questionable markup choices. Perhaps the most aggravating example occurs whenever a dialog is introduced not with a speech prefix but with a stage direction. In these cases, the dialog is habitually not encoded as a speech, or the stage direction is changed to conform to our idea of a classic speech prefix. These problems are compounded by the fact that the term “stage direction” lacks a stable equivalent in many languages. The poster explores “impossible” and unconventional stage directions from French, Spanish, and German plays from the 17th and 18th centuries as well as their encoding within the TEI(-adjacent) projects Théâtre Classique, EMOTHE, Deutsches Textarchiv, and TextGrid Repository. It aims at a productive criticism of the current TEI schema and guidelines and suggests ways to overcome existing shortcomings.
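To make the second problem concrete, the following Python sketch flags places in an encoded play where a speech is opened by a <stage> rather than a <speaker>, or where dialogue follows a stage direction without being wrapped in <sp> at all. It is only one possible way of surfacing such cases, not a method taken from the poster; the sample markup and the function name `classify` are invented for illustration.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sample showing three encoding situations.
SAMPLE = """
<div xmlns="http://www.tei-c.org/ns/1.0">
  <sp who="#a"><speaker>A.</speaker><p>First speech.</p></sp>
  <sp who="#b"><stage>B, entering.</stage><p>Second speech.</p></sp>
  <stage>C enters.</stage>
  <p>Speech left outside any sp element.</p>
</div>
"""


def local_name(element):
    """Strip the TEI namespace from an element's tag."""
    return element.tag.replace(TEI, "")


def classify(div):
    """Label each speech-like child of a division (hypothetical helper)."""
    report = []
    previous = None
    for child in div:
        name = local_name(child)
        if name == "sp":
            first = local_name(child[0]) if len(child) else None
            if child.find(TEI + "speaker") is not None:
                report.append("sp with speaker")
            elif first == "stage":
                report.append("sp opened by stage")
            else:
                report.append("sp without speaker")
        elif name in ("p", "l", "lg") and previous == "stage":
            # Dialogue that follows a stage direction but is not marked as speech.
            report.append("dialogue outside sp after stage")
        previous = name
    return report


if __name__ == "__main__":
    for label in classify(ET.fromstring(SAMPLE)):
        print(label)
```

A report like this only locates candidate passages; deciding whether a given <stage> also functions as a speech prefix remains an editorial judgment.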
ID: 126
/ PS: 5
Poster
Keywords: global, multilingual, antiracist, decolonial, inclusivity
The Adaptive TEI Network: Antiracist, Decolonial, and Inclusive Markup Interventions
S. Revilla-Sanchez, D. Orizaga Doguim
University of British Columbia, Canada
This poster presentation introduces the “PhD CoLab” project (University of British Columbia, 2024-26, with the collaboration of the SFU Digital Humanities Innovation Lab, DHIL), which brings together graduate students, faculty, and staff from various fields in humanities, languages, and literatures. While based in Vancouver, within an English-speaking North American education system, this multifaceted project is concerned with the continuities and limitations of text encoding across languages (English, Spanish, German, and Russian), geographic regions, and literary genres. This poster will provide concrete examples to illustrate the larger objectives of the PhD CoLab.
One of the encompassed projects is NovElla, which focuses on making short prose fiction by early modern Spanish writers visible and accessible. It includes a catalog of annotated bibliographic resources to help promote future research by both students and scholars. Another example, related to Latin America, is the Unión Cívica Project, which focuses on the newspaper Unión Cívica, published by the eponymous political movement founded in 1961 in the aftermath of the Rafael L. Trujillo dictatorship (1930-1961) in the Dominican Republic. We will offer high-resolution digital reproductions of 140 issues, with annotations, to provide political and historical context.
Furthermore, the very structure of the Adaptive TEI Network, rooted in a team-oriented ethos, disrupts the traditional mode of solitary, humanistic research. PhD students collaborate in a transdisciplinary, team-based, project-oriented environment where we learn from and with one another while we propose a new TEI schema for text-encoding projects that consider antiracist, decolonial, inclusive, and feminist markup practices.
In short, the TEI schema aims to address some of the projects’ research questions, such as: Can we adapt current TEI modules, or does an antiracist, decolonial, and feminist engagement with the literary text necessitate new TEI markup standards or new modules? Is the TEI robust enough to handle multilingual texts?
ID: 165
/ PS: 6
Poster
Keywords: tei transformation, JSON, DOCX, HTML, XSLT
A Python Library for TEI Conversion into Edition Formats
A. Kostyanitsyna1, D. Skorinkin2
1Independent researcher; 2Digital Humanities Potsdam
TEI/XML was designed to encode the structure and purpose of document parts, and not to represent them visually. Unlike HTML or DOCX, TEI is generally not tied to any particular software that would offer a standard way of rendering it. At the same time, TEI is not as easily interoperable as more popular data exchange formats like CSV or JSON, for which there are tools ranging from Excel to OpenRefine to programming libraries like Pandas. Developers used to dealing with JSON might get confused when confronted with the task of using TEI.
All this creates the need for easy-to-use conversion tools from TEI to these more common formats. Traditionally, this has been accomplished with XSLT stylesheets (Rahtz, 2006). However, this approach has limitations. As of 2024, XSLT is not the most widespread technology in the developer world, and it is not part of most programming courses. For continued support of TEI in the future, it seems crucial to produce tools that wrap TEI transformation into more widely known data processing ecosystems than bare XSLT (even if XSLT is used under the hood).
With that in mind, we have developed a TEI transformation library for Python (pypi.org/project/TEItransformer/). Currently, the library converts TEI/XML into three formats: JSON, DOCX, and HTML. Each format is created by a separate class with its own settings. For the DOCX and HTML transformations, the user can choose a transformation scenario. The client interface is implemented by the TEITransformer class. Under the hood, the conversion is divided into three main steps: validation, transformation, and stylization. The library allows the user to specify an XML schema and to supply ODD or CSS files for output customization. Since ODD is XML-based, the user does not need any knowledge of XSLT to adjust the transformation.
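The general idea of a TEI-to-JSON conversion can be sketched with the Python standard library alone. The code below deliberately does not use or reproduce the TEItransformer API, which is not documented in this abstract: the function names `validate`, `transform`, and `tei_to_json` are hypothetical, "validation" is reduced to a well-formedness and root-element check rather than schema validation, and the stylization step is omitted.

```python
import json
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sample document.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
  <text><body><p>First paragraph.</p><p>Second <hi>paragraph</hi>.</p></body></text>
</TEI>"""


def validate(xml_string):
    """Stand-in for validation: parse the XML and check the root element."""
    root = ET.fromstring(xml_string)  # raises ParseError if not well-formed
    if root.tag != TEI + "TEI":
        raise ValueError("root element is not TEI in the TEI namespace")
    return root


def transform(root):
    """Reduce the document to a title and a list of plain-text paragraphs."""
    title = root.findtext(f".//{TEI}titleStmt/{TEI}title", default="")
    body = root.find(f"{TEI}text/{TEI}body")
    paragraphs = []
    if body is not None:
        for p in body.iter(TEI + "p"):
            # Flatten inline markup and normalise whitespace.
            paragraphs.append(" ".join("".join(p.itertext()).split()))
    return {"title": title, "paragraphs": paragraphs}


def tei_to_json(xml_string):
    """Run validation and transformation, then serialise to JSON."""
    return json.dumps(transform(validate(xml_string)), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    print(tei_to_json(SAMPLE))
```

Flattening inline elements such as <hi> loses information that a real converter would probably want to keep; the sketch only shows why a JSON view is convenient for developers who are not familiar with XML tooling.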
ID: 147
/ PS: 7
Poster
Keywords: collation, svg, xslt, javascript, interface
Visualizing the Frankenstein Variorum
E. Beshero-Bondar1, R. Viglianti2, Y. Jin3
1Penn State Erie, United States of America; 2University of Maryland, United States of America; 3Northeastern University, United States of America
The Frankenstein Variorum team has completed work on a digital scholarly edition that compares five distinct versions of the novel Frankenstein. While we have presented this project at several conferences in recent years, we propose to share a new view of the project at TEI 2024: a visual summary of our publication method, and a visual survey of our collation data. Now that we have fully published the Frankenstein Variorum's TEI edition files, we are analyzing what we have learned about how much the novel changed over five distinct instantiations. These include:
The 1816 manuscript notebook,
The 1818 first anonymous publication,
The Thomas copy's marginal handwritten revisions that were later lost,
The 1823 edition produced by Mary Shelley's father,
The 1831 edition, substantially revised.
For the TEI 2024 conference, we propose a poster to display two things: 1) our publishing process applying JavaScript-based static site generation to publish the edition, and 2) a "big picture" view of Frankenstein’s changes over time drawn in Scalable Vector Graphics (SVG) from our TEI data. The poster will show how our project's TEI grounds the edition's interactive visualizations of the novel's transformation over time.
Our TEI standoff spine and edition files store the collation data of the novel's five versions, and the pipeline processing and algorithmic refinement of those files have been the subject of several past conference papers and presentations. Now that the edition is complete, the TEI data invites us to analyze the edition's moments of alignment, divergence, and gaps where material was missing or removed. For this poster and for our edition, we illuminate the Variorum interface design and visualize Frankenstein's transformations, each built directly from the TEI. If successful, our poster will welcome discussion of our publishing architecture and invite a holistic, nonlinear exploration of the digital variorum edition.
ID: 155
/ PS: 8
Poster
Keywords: French, Caribbean, Digital Edition, TEI
The Revue des Colonies Scholarly Edition and Translation: a Distributed and Bilingual TEI Project
M. Beliaeva Solomon1, G. Pierrot2, R. Viglianti1
1University of Maryland, United States of America; 2University of Connecticut, United States of America
The Revue des Colonies Scholarly Edition and Translation project marks the first effort to digitally annotate and translate a landmark abolitionist periodical published in Paris between 1834 and 1842. Led by an international team of scholars, the project aims to make the Revue’s invaluable store of journalistic and literary contents accessible to the public. This poster will outline the project’s operations and workflows and describe the TEI encoding strategies that we have adopted and plan to adopt.
The project’s operation depends on a dynamic interplay between the encoding team and the editorial team, both geographically distributed and each with distinct roles and expertise contributing to the project's overarching goal. The encoding team includes graduate students currently completing training modules in transcription, TEI encoding, and research in online databases.
The editorial team is composed of scholars covering a range of relevant disciplines who bring a deep understanding of the historical and cultural contexts in which the Revue des Colonies operated. These scholars compose meticulously researched annotations for the named entities identified in the text. Their annotations thus shed light on otherwise overlooked individuals, events, organizations, and historical documents, ensuring that the journal's original contributions to global discourse are recognized and contextualized within relevant scholarly fields.
Collaboration between these teams is facilitated by an online content management system that exports TEI data and adjusts the project customization ODD as content is added. The project’s TEI customization focuses on the encoding of named entities to enable the creation of bilingual, substantial, and cross-navigable entries. Translation of the Revue itself is undertaken by members of the editorial team with significant experience in professional French–English translation, ensuring both the accessibility of the text to the widest audience and its fidelity to the rhetorical and stylistic features of the original.
ID: 149
/ PS: 9
Poster
Keywords: publication, computer-assisted annotation, entity extraction, FLOSS
TEI Publisher 9: Going beyond TEI and Publication
M. Turska1, H. Bermúdez Sabel2
1e-editiones, Switzerland; 2Jinntec, Germany
The philosophy of TEI Publisher revolves around modularity, reusability, and sustainability through the use of standards. TEI Publisher was created to facilitate the production of digital editions, so that humanists can create scholarly products that meet their research objectives with little or no programming. This is achieved through a modular design that allows functionalities to be freely organised and recombined. This approach lets people with technical profiles make adjustments easily, while users without programming skills can take advantage of default settings adapted to different edition models.
TEI Publisher supports different formats, for both input and output. Besides TEI, TEI Publisher can be used to publish documents in other standards such as DocBook, MS Word (DOCX), or JATS. Source documents do not have to conform to a specific schema, and they are easily transformed into a variety of output formats for publication: from a web interface to an e-book, a PDF file, or its LaTeX source.
The most recent versions of TEI Publisher respond to requests from the user community for help with converting and incorporating data in different formats, as well as with enriching the annotation of sources without having to edit the XML directly. TEI Publisher is thus more than a toolbox for publishing and exploiting XML documents: it has become an instrument that enables the automatic generation of TEI documents and automated annotation. This poster presents the most important features of TEI Publisher, with emphasis on the new features introduced in version 9.