keylog.js: An Open Source Pedagogical Tool for DH and Data Studies
Taylor ARNOLD
University of Richmond, United States of America
Presents the pedagogical tool keylog.js, a minimal JavaScript-based tool that provides privacy-focused, client-side keylogging software served through a static website to address questions about the ethics, privacy, and accessibility of technologies and algorithms.
HTR of a historical manuscript with multiple languages, scripts, and hands
Martina Scholger1, Elisabeth Steiner1, Melanie Frauendorfer1, Sabrina Strutz1, Hans-Jörg Döhla2, Henning Klöter3
1University of Graz, Austria; 2University of Tübingen, Germany; 3Humboldt-Universität zu Berlin, Germany
This contribution investigates the application of Handwritten Text Recognition (HTR) to the automated transcription of a multilingual historical manuscript.
DoTS: FAIRly publishing your textual data with the DTS API
Philippe Pons, Vincent Jolivet, Jean-Victor Boby, Lucas Terriel
École des chartes - PSL, France
This poster aims to present DoTS, a comprehensive and functional suite of tools for publishing corpora in compliance with the DTS specification, integrating backend, API responses, and frontend for the creation of adaptable websites.
User Experience and Accessibility in Digital Humanities Projects: A Survey
Kathie Gossett1, Liza Potts2
1Brigham Young University, United States of America; 2Michigan State University, United States of America
This poster investigates how digital humanities (DH) scholars world-wide implement user experience (UX) practices in their projects. Through a survey, the authors will explore the integration of UX methodologies, identify barriers to adoption, and aim to promote more accessible, user-centered digital tools, ultimately broadening DH’s reach and engagement.
Trauma Writing and Climate Migration Narratives
Parham Aledavood
Université de Montréal, Canada
This research examines a corpus of contemporary migration novels to explore trauma. Using a mixed methodology, it investigates how these narratives depict human and non-human migrations, challenging anthropocentric environmental discourse while revealing the cultural imagination of climate change through recurring motifs, emotional arcs, and anticipatory memory.
Beyond the Rugged Consumer: Enabling Communal Experiences in Digital Cultural Heritage
Jonatan Jalle Steller
Academy of Sciences and Literature Mainz, Germany
The poster presents five strategies to produce more communal experiences in cultural-heritage software, exemplified using the Cultural Heritage Framework. The strategies are developed against the background of experts or 'rugged consumers' being the de-facto target audience of many editions, dictionaries, repositories and similar offerings.
Development of a Commentary Generation System for Western Classical Texts
Ikko Tanaka1, Jun Ogawa2, Naoya Iwata3
1J.F. Oberlin University; 2National Institute of Informatics; 3Nagoya University
We present Humanitext Antiqua, a system employing Retrieval-Augmented Generation (RAG) and advanced large language models (LLMs) to generate scholarly interpretations of classical texts. Using Plato’s Republic as a case study, the system integrates primary texts, commentaries, and secondary literature, addressing challenges in traditional referencing and text segmentation for enhanced academic research.
Oracle Bone Reassembly Based on Diffusion Model
Guang Yang
BNU-HKBU United International College, China, People's Republic of
This paper introduces a machine learning approach to reassemble fragmented oracle bones, which are important materials for understanding early Chinese history. Specifically, we propose a model based on the Diffusion Model, a generative deep learning framework that has demonstrated remarkable performance in computer vision tasks in recent years.
Which chatbot generated the most racial and ethnic stereotypes?
Aleksandra Rykowska
Jagiellonian University in Kraków, Poland
This study compares three of the most popular chatbots: ChatGPT, Claude AI and Google Bard. Racial, ethnic and gender stereotypes were examined in short stories generated by each chatbot. The study used the stylo package for R and its oppose() function, as well as topic modelling.
Webs of Cruelty: Network Analysis of Carceral Institutions for Girls and Women in 19th Century Indiana
Brianna Jean McLaughlin
Indiana University, United States of America
I have created a network that I have used to track influence, cashflow, and cruelty across 8 carceral institutions over approximately 50 years. In doing so, I can prove that there was an intricately woven web of custodial cruelty among "fallen" girls and women in 19th Century Indiana.
Nature versus Artefacts: Places and Objects in Nineteenth-Century Novels from Spain and Latin America
Ulrike Henny-Krahmer, Caroline Müller
Universität Rostock, Germany
Space, places, and spatial objects have been of interest for literary historical research for a long time. We take up this state of research by analyzing natural and artificial spaces, places, and spatial objects in nineteenth-century novels from Spain and Latin America.
Towards the “Model Building in the Humanities through Data-Driven Problem Solving” based around the Japanese Literary Studies
Nobuhiko Kikuchi
National Institute of Japanese Literature, Japan
This paper introduces a large-scale DH research project in Japan that the National Institute of Japanese Literature is undertaking over the next ten years. The aim of this project is to construct big data on Japanese pre-modern texts and to promote data-driven humanities.
Programming Pedagogies: Exploring GitHub as a Platform for Coding Training in DH
Owen Monroe, Zoe LeBlanc
University of Illinois Urbana-Champaign, United States of America
This poster examines GitHub as a pedagogical platform in DH, analyzing its role in fostering coding literacy. By identifying pedagogical activities and practices, we explore how GitHub data can address gaps in training and inform the development of inclusive and effective programming pedagogies for the field.
The Social Sciences and Humanities Open Marketplace: contextualising digital resources in a registry
Clara Boavida2, Elena Battaner Moro3, Laure Barbot1, Michael Kurzmeier1
1DARIAH, Germany; 2Iscte-Instituto Universitário de Lisboa; 3Universidad Rey Juan Carlos
The SSH Open Marketplace is a discovery portal which pools and contextualises resources for Social Sciences and Humanities research communities: tools, services, training materials, datasets, publications and workflows. This poster presents how this service can provide insights into the use of tools, methods and standards in the DH research communities.
Controlled Vocabularies for a Knowledge Graph on Open Educational Resources
Petra C. Steiner1, Jonathan Geiger2, Frank Lange3, Abdelmoneim Amer Desouki1
1Technical University of Darmstadt; 2Academy of Sciences and Literature Mainz; 3RWTH Aachen University
The DALIA project aims to make open educational resources (OER) on data literacy accessible and interoperable. A knowledge graph is developed to link the materials, using the DALIA Interchange Format (DIF) to ensure transparency and interoperability. This poster focuses on picklists for DIF and invites feedback from the professional community.
Scholarly Navigation on an Open Science Platform: A Computational Study of OpenEdition’s Server Logs
Mohsine Aabid1,2, Simon Dumas Primbault1, Patrice Bellot2
1OpenEdition (CNRS / AMU), France; 2Laboratoire d'informatique et des Systèmes (LIS), France
This study analyzes OpenEdition’s server logs to uncover user navigation patterns across its platforms. Using methods like transition analysis, clustering, and topological modeling, it reveals platform fidelity, distinct user profiles, and shared interests. Future work aims to expand the scope with action-based analysis for deeper insights.
Mapping Collaborations in Performing Arts: Building the Festival d’Avignon Digital Corpus
Nicolas Foucault, Jeanne Fras, Clarisse Bardiot
Université Rennes 2, France
This poster presents the p2AFA corpus, a digital resource of Festival d’Avignon programs and playbills (1947–2024) for studying performing arts collaborations. Combining OCR, machine learning, and diplomatic transcription, it enables network visualization and historiographical analysis.
Intangible and Tangible heritage data integration. Models for management, visualization and knowledge. [INTHEDATA]
Patricia Ferreira-Lopes, Francisco Pinto-Puerto, Elena González-Gracia
Departamento de Expresión Gráfica Arquitectónica, Universidad de Sevilla, Spain
This poster presents the INTHEDATA project: the current state of knowledge and best practice in cultural heritage and semantic knowledge modelling, the project's objectives, its methodology, which implements the CIDOC-CRM standard, and its first results.
Mirror, Mirror on the Wall: Enabling Computational Research on Beauty Ideals
Tim Gollub1, Pierre Achkar2,3, Martin Potthast4, Benno Stein1
1Bauhaus-Universität Weimar, Germany; 2Leipzig University; 3Fraunhofer Institute Leipzig; 4University of Kassel, hessian.AI, and ScaDS.AI
We present our work on the development of machine learning classifiers trained to assess whether a given input image aligns with a specific beauty ideal. The work is part of our effort toward enabling large-scale computational research on beauty ideals, a subject that is both culturally significant and socially impactful.
Ghost City: Augmented Reality Restoration of Two Hundred Lost Mosques in Belgrade
Uliana Pyadushkina
Faculty of Liberal Arts and Sciences Montenegro, Russian Federation
This project aims to recover the cityscape of lost Muslim heritage in Belgrade by superimposing 200 destroyed mosques onto the modern cityscape at their original locations, using 3D-models in augmented reality, with textures based on the restored appearances of the mosques derived from old photographs, documents and sketches from travelogues.
Development and Evaluation of the Information Retrieval System for Humanities Archives using LLM
Kenshin Kobayashi1, Koki Itagaki2, Tomoaki Tsutsumi2, Atsushi Matsumura2, Norihiko Uda2
1GLOBAL SECURITY EXPERTS Inc., Japan; 2University of Tsukuba, Japan
This study aims to establish an effective information-provision method for humanities research. As part of this effort, we developed an information retrieval system using two recently prominent technologies: Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). This paper describes the developed system and its performance evaluation.
Minimal Computing Meets Public History: The Stadt.Geschichte.Basel Approach to Open Research Data with CollectionBuilder
Moritz Twente1, Moritz Mähr1,2
1Universität Basel, Switzerland; 2Universität Bern, Switzerland
This poster highlights how Stadt.Geschichte.Basel created an Open Research Data Platform using CollectionBuilder. By applying minimal computing principles, the platform addresses challenges of accessibility, sustainability, and inclusivity in digital history. It provides adaptable, FAIR solutions that enhance interdisciplinary research, support marginalized perspectives, and foster long-term usability of historical data.
CLARIAH-ES: A Distributed Research Infrastructure for the Digital Humanities
Elena Battaner Moro1, Ainara Estarrona Ibarloza2, Aritz Farwell2
1Universidad Rey Juan Carlos, Spain (URJC); 2Euskal Herriko Unibertsitatea (UPV/EHU)
CLARIAH-ES is a Spanish national research infrastructure that strengthens research and facilitates innovative approaches within the digital humanities. By integrating language technologies, text analysis, cultural heritage, and multilingual resources, CLARIAH-ES offers a unique ecosystem for scholars interested in exploring the Spanish, Catalan, Galician, and Basque languages and cultures.
Romani Language in Google Translate: Ethical Considerations
Olga Shablykina, Leonardo Melis, Murad Mustafayev, Shayan Ahmed Shariff
IDMC, Université de Lorraine, France
Google's inclusion of Romani in its MT engine raises ethical concerns regarding linguistic preservation and cultural respect. The issues include a lack of transparency, poor translation quality, and possible negative implications for speakers of the language. It appears that BigTech companies prioritize quantity over quality when it comes to supporting lower-resource languages.
READ-COOP and Transkribus: cooperative approaches to sustainable and responsible digital infrastructure
Melissa Terras1, Bettina Anzinger2, Guenter Muehlberger3, C. Annemieke Romein4, Andy Stauder2, Florian Stauder2
1University of Edinburgh, United Kingdom; 2READ-COOP, Innsbruck, Austria; 3Leopold Franzens Universität für Innsbruck, Austria; 4University of Twente, the Netherlands
How can we sustainably build digital scholarship infrastructures that best serve their communities, encouraging co-ownership and input into their development? This poster examines the cooperative business model underpinning READ-COOP (https://readcoop.eu) and Transkribus (https://transkribus.org), an Automated Text Recognition platform, providing a blueprint for the establishment of responsible, democratic, cooperative digital infrastructures.
Engaging Researchers for Improving Services and Training: Insights from the ATRIUM Survey and Researcher Forum
Tomasz Umerle1, Tiziana Lombardo2, Iulianna van der Lek3, Maria Ilvanidou4, Carol Delmazo5
1Digital Humanities Centre IBL PAN; 2Net7; 3CLARIN ERIC; 4Athens University of Economics and Business; 5OPERAS
The ATRIUM project enhances access to digital research infrastructures in Arts, Humanities, and Social Sciences by improving services and creating a tailored curriculum for the research community. The poster showcases how, through a survey and workshops, ATRIUM integrates community feedback to bridge skills gaps and deliver impactful open training resources.
Longevity, Accessibility, and Multilingual Micro-editions at Scholarly Editing: A Multimedia, Open-access Journal for Recovery Practitioners
Raffaele Viglianti1, Noelle A. Baker2
1University of Maryland, United States of America; 2Independent Scholar
Scholarly Editing is an open-access, peer-reviewed journal that welcomes contributions that feature rare or marginal texts and small-scale editions for the discoverability of underrepresented stories and artifacts. This poster will introduce the journal’s purpose and present the journal’s strategies to ensure the longevity of its digital content.
The Multilingualism of Scientific Output in Digital Humanities over the Last 5 Years: An Analysis Based on the Web of Science Core Collection
Maria Filipa Torres1, Maria Manuel Borges2
1Univ. Coimbra, FLUC; 2Univ Coimbra, CEIS20, FLUC
Multilingualism should assert itself in the Digital Humanities (DH). The aim of this work is to analyze whether scientific output in DH indexed in the Web of Science Core Collection reflects this. It is a bibliometric study with a retrospective cross-sectional design (2020-2024). It concludes that English predominates, but that there is room for other languages.
Memory of 518: A Web-Based Platform Connecting Literature, Archival Records, and User-Generated Data
Chaeyeon Jeong, Moonui Kim, Jihyo Jeon
Korea University, Korea, Republic of (South Korea)
This project builds a web-based literary tour platform called ‘Memory of 518’, integrating literary works, factual records, and user-generated data related to the Gwangju May 18 Democratic Uprising. Using maps, 360-degree images, and user contributions, it documents and visualizes the fictional, historical, and everyday aspects of 518 Gwangju.
Geo-Databases on Paper - Structured Data from Historical Maps
Anastasia Bauch, Klaus Stein, Carmen Enss
UrbanMetaMappingTransfer, University of Bamberg
This poster introduces a workflow for extracting data from historical maps into a structured format by manually digitising scanned maps with the open-source GIS software QGIS. We present our work in progress on a set of maps from our research in the UrbanMetaMapping project.
Bootstrapping Corpora Building of Low-Resourced Language Texts Using the Universal Declaration of Human Rights
David Bainbridge1, Sulhan Algee1, J. Stephen Downie2, Hemi Whanga3
1University of Waikato, New Zealand; 2University of Illinois, United States of America; 3University of Massey, New Zealand
Digital Humanities scholars need NLP tools to create new corpora of low-resourced languages, but such tools need to be trained on “non-existent” corpora, creating a classic bootstrapping problem. We use the text from the Universal Declaration of Human Rights, along with a lexicon-based iterative search strategy, to overcome this problem.
Visualising Africa in Chinese Media: A Preliminary Computer-Assisted Study of 1950s-1980s Representation in Journal Illustrations and Book Covers
Jodie Yuzhou Sun (co-first author)1, Fudie Zhao (co-first author)2, Qilin Hu1
1Fudan University, China; 2University of Oxford, United Kingdom
This study explores the visual representation of Africa in Chinese media (1950s-1980s), creating a digital archive and applying AI tools, including large language models and Contrastive Language-Image Pre-training (CLIP), for bidirectional text-image retrieval, offering fresh insights into Sino-African relations and cross-cultural visual studies.
Customizing Omeka S for Linguistic Linked Open Data: A Case Study of the NINDA Language Resource Archive
So Miyagawa1,2, Yifan Wang1,3, Takanori Ito4, Tomokazu Takada1
1National Institute for Japanese Language and Linguistics (NINJAL), Japan; 2University of Tsukuba, Japan; 3University of Tokyo, Japan; 4Institute of Science Tokyo, Japan
NINDA (NINJAL Digital Archive) adapts Omeka S to manage linguistic resources, particularly for Japonic languages. It implements IIIF for multimedia content and OntoLex Lemon for lexical data structuring, supporting FAIR principles. The system handles annotated recordings, interlinear texts, and lexical databases, making linguistic resources more accessible to researchers and communities.
Integrity in Digital Scholarly Editing: The GreekSchools Case
Simone Zenzaro1, Angelo Mario Del Grosso1, Federico Boschetti1, Graziano Ranocchia2
1Istituto di Linguistica Computazionale "A. Zampolli" - CNR, Italy; 2Università di Pisa
Textual scholarship aims to reconstruct and publish texts through critical apparatuses. The DSL-based Digital Scholarly Editions (DSE) method merges traditional editing with computational techniques, enhancing workflows and adhering to open science principles. The GreekSchools project exemplifies this approach, and the CoPhiEditor implements it as a software solution.
Quil2Vec: A Tool for Vector Manipulation of Medieval Latin Script
Herman Gerrit Makkink
University of Vienna, Austria
This poster will present “Quil2Vec”, a tool currently being developed for image vectorization of medieval script. The tool is intended to expedite the production of image vectors as ground truth for multiple text-based machine learning research applications.
Enhancing Open Science through the SCIROS Project
Gabriela Manista, Maciej Maryl, Tomasz Umerle, Cezary Rosiński, Marta Świetlik, Magdalena Wnuk, Mateusz Franczak, Piotr Wciślik
Institute of Literary Research Polish Academy of Science, Poland
The SCIROS project aims to enhance Open Science in the humanities and social sciences by tackling theoretical, practical, and infrastructural challenges with 6 international partners. By fostering interdisciplinary collaboration and sharing insights via the blog, the project supports the widespread adoption of OS practices.
Building a Peer Review Framework for Non-Traditional Research Outputs
Françoise Gouzi1, Anne Baillot1, Sarah Bénière2, Carol Delmazo3, Toma Tasovac1
1DARIAH-EU; 2INRIA; 3OPERAS
This poster aims to present our ongoing work on developing the evaluation framework for open peer review assessment of non-traditional research outputs as a contribution toward maximising the quality and impact of Arts and Humanities research in Europe in the context of the Coalition for Advancing Research Assessment (CoARA).
Disputes over Cultural Power in Digital Repatriation: Insufficient Interpretations of Cultural Objects in Cross-cultural Contexts
Yujue Wang, Jingya Fan, Hanying Wen
Wuhan University, the People's Republic of China
After digital repatriation, cultural institutions often still retain digital replicas. This study compares metadata records of ancient Chinese paintings across various museums, revealing that interpretations in cross-cultural contexts are influenced by cultural backgrounds, and suggests improving original communities' control over digital replicas in legal, ethical, and technical respects.
Private Letters as Marginalized Cultural Heritage
Debby Trzeciak1,2
1TU Darmstadt, Germany; 2Hochschule Darmstadt, Germany
The collection and preservation of marginalized cultural assets such as private letters have so far not been subject to uniform archiving practices or standards. The dissertation project addresses the question of how sustainable and permanent cataloguing according to international standards can succeed amid the tensions between the FAIR, CARE, and Open principles, and how the machine-readable and interoperable digitization of this cultural heritage can be made possible.
“HumAInities: Exploring the Impact of AI on Humanities disciplines”
Michael Sinatra1, Dominic Forest1, Jean-Philippe Magué2
1Université de Montréal, Canada; 2ENS Lyon
Our poster will present the partnership development grant “HumAInities: Exploring the Impact of AI on Humanities disciplines”, its goals and expected results. Our project seeks to understand the changes brought about by the impact of AI on the production and dissemination of knowledge within the humanities.
Vedic Sanskrit OCR as a Bridge between Text and Image Platforms
Yuzuki Tsukagoshi, Ikki Ohmukai
The University of Tokyo, Japan
This study develops a Vedic Sanskrit OCR model to bridge the gap between text and image platforms. We fine-tune TrOCR on Vedic, aligning images with texts using eScriptorium as a tool for creating ground truth, and suggest a cyclic process to create text and image correspondences and to improve the performance of OCR.
A Multimodal Approach to Historical Sources in the 18th–19th Century Balkans
Kristiyan Sergeev Simeonov1, Maria Baramova2
1Sofia University "St. Kliment Ohridski", Bulgaria; 2Sofia University "St. Kliment Ohridski", Bulgaria
This poster proposes a multimodal approach to historical research, utilizing HTR, NLP pipelines, and GIS in a user-friendly manner. By integrating advanced computational methods with traditional humanities research, we aim to create a model that can be replicated for other underrepresented regions and languages.
From Late-Antique Text to 21st Century Literature Database: Babylonian Talmud Stories as a Case Study
Itay Marienberg-Milikowsky
Ben-Gurion University of the Negev, Israel
This poster explores the challenges and opportunities of digitizing late-antique literature, focusing on the Babylonian Talmud. By creating a comprehensive database of Talmudic stories, it aims to expand computational literary studies. The poster will discuss the methodological challenges involved in building this database, including text extraction, annotation, and modeling.
Detecting divergent language use in Russian Media during the Russo-Ukrainian War: Steps towards interpretable propaganda detection and analysis
Anastasiia Vestel, Stefania Degaetano-Ortlieb
Saarland University, Germany
This study examines divergent language use in Russian state-controlled media and social media during the Russo-Ukrainian war using the WarMM-2022 corpus and Kullback-Leibler Divergence (KLD). KLD offers interpretability advantages over more opaque machine learning techniques, allowing a deeper understanding of how propaganda techniques are linguistically construed and evolve over time.
The Commitment to Open Science: Collections Management at Fiocruz
Mônica Garcia1, Maria Manuel Borges2, Maria Cristina Soares Guimarães3
1Univ. Coimbra, FLUC, Portugal; 2Univ. Coimbra, CEIS20, FLUC, Portugal; 3Fundação Oswaldo Cruz, Brasil
The proposal aims to develop a management model for scientific collections aligned with international Open Access (OA) guidelines, particularly considering Plan S and the transformations in the scholarly communication system.
Creating Open Source, Multilingual DH Tools with Rust
Ian Patrick Goodale
University of Texas at Austin, United States of America
This poster highlights three open source software packages I created in the programming language Rust. The packages include lemmatizing, readability, and stylometry algorithms, and were intentionally designed to create new resources in the Rust ecosystem that facilitate analysis of and engagement with multilingual and non-English texts.
Doing Literature: A Multimedial Index of Research Outputs
Stefanie Messner1, Viktor J. Illmer2, Mark Schwindt2
1fortext lab, Technische Universität Darmstadt, Germany; 2EXC 2020 ‘Temporal Communities’, Freie Universität Berlin, Germany
Doing Literature is a web portal designed to collect and curate research contributions of the humanities in multimedia formats. It aims to develop an innovative framework that engages diverse audiences, thereby enhancing Digital Public Humanities and emphasising their collaborative character as well as their potential in knowledge creation.
Making cultural heritage open: a semantic portal for Germanic Cultural Heritage in Veneto
Chiara De Bastiani
Università Ca' Foscari Venezia, Italy
This poster presents a user interface developed within the OntoVE project. The poster focuses on the search interface, built with the Sampo Model (Ikkala et al. 2022), and its search perspectives, which allow users to explore data through the faceted search paradigm (Tunkelang 2009).
Computer-Assisted Hermeneutics of Philip K. Dick's Corpus: Constructing a Personal Knowledge Base with SpaCy and Obsidian for Literary Analysis
Yann Audin
Université de Montréal, Canada
This proposal showcases a Python library designed to interface with the text editor Obsidian to create a literary database of a corpus. We use Philip K. Dick's science fiction as the exemplary corpus, and showcase how classical Natural Language Processing can be used in computer-assisted literary hermeneutics.