Conference Agenda
All times are shown in the time zone of the conference (WEST).
SP-21: Handwritten Text Recognition and Artificial Intelligence
Time:
Thursday, 17/July/2025:
11:00am - 12:30pm
Session Chair: Mikhail Biriuchinskii, Sorbonne Université
Location: Aud B2 (TB), 152 places. Zoom link to be included.
Presentations
Visualizing the 'New Woman': Analyzing Visual Content in The Delineator Using CLIP.
Luana Moraes Costa
University of Göttingen, Germany
This study explores how the American magazine The Delineator reflects the evolving representation of the 'New Woman' from 1894 to 1914 through its visual content. By employing artificial intelligence techniques, particularly CLIP, the research shifts focus from text to visual analysis, revealing insights into societal perceptions of femininity.
Using ChatGPT for generating SKOS thesauri from handwritten sketches
Felix Kraus, Nicolas Blumenröhr
Karlsruhe Institute of Technology (KIT), Germany
This paper demonstrates how ChatGPT simplifies the creation of SKOS thesauri from hand-drawn sketches or digital drafts, improving efficiency over traditional editors. Tests with DH and fictional taxonomies show high accuracy with only minor errors. While less suited to large thesauri, the method promotes FAIR data practices and facilitates the development of SKOS thesauri.
Towards an automatic transcription of Catalan notarial manuscripts from the Late Middle Ages
Mariona Coll Ardanuy, Ramon Sarobe, Joan Giner-Miguelez, Felipe Gómez, Paolo Marangio, Mercè Crosas, Coral Cuadrada
Barcelona Supercomputing Center (BSC)
This paper introduces an interdisciplinary pilot project centered on the automatic transcription of Catalan manuscripts from the Late Middle Ages, focusing on notarial documentation. We describe the creation of a new dataset for our initial experiments. The resulting datasets, models, and code will be made publicly available.
Progress of The New Spain Fleets Project: accurate Handwritten Text Recognition models for 16th-17th-century Spanish calligraphies.
Rodrigo Vega-Sánchez1, Edna Brito-Ramos2, Francisco Cruz-Ríos3, Fryda Montiel-Alejos4, Andrea González-Aceves2, Abril Hernández-Ronquillo2, Martín Díaz-Vázquez2, Ricardo Valadez-Vázquez5, Lidia Camacho-Gamez6, Guillaume Candela7, Mariana Favila-Vázquez8, Flor Trejo-Rivera9, Alexander Sánchez-Díaz10, Patricia Murrieta-Flores1
1 Lancaster University, United Kingdom; 2 Escuela Nacional de Antropología e Historia, México; 3 Independent researcher; 4 Archivo General de la Nación, México; 5 Universidad Nacional Autónoma de México; 6 Universidad de Guadalajara, México; 7 University of Leeds, United Kingdom; 8 Centro de Investigaciones y Estudios Superiores en Antropología Social, México; 9 Subdirección de Arqueología Subacuática-INAH, México; 10 Universidad de Alicante, España
We describe advances and results in developing four accurate Handwritten Text Recognition models for the automatic transcription of Itálica cursiva, Procesal simple, Redonda, and Procesal encadenada calligraphies, the most prevalent in 16th-17th-century Spanish American historical documents.
Using LLMs for post-OCR correction on historical French texts: A case study using synthetic data
Mikhail Biriuchinskii, Motasem Alrahabi, Glenn Roe
ObTIC, Sorbonne University
This study explores the use of large language models (LLMs) for correcting OCR errors in 19th-century French texts. Despite their advanced capabilities, the fine-tuned models generalized poorly and increased error rates. The findings highlight the limitations of LLMs in character-level OCR correction and point to future research directions.