“Un Manuscrit Naturellement”: Rescuing a library buried in digital sand
1CNRS (Centre National de la Recherche Scientifique), France; 2Agence Limonade & Co
This long story about the preservation of human thought began in the Middle Ages, with the creation of manuscripts by copyist monks.
An agreement was signed with the Ministry of Culture and the IRHT to digitize all the manuscripts stored in French public libraries. As of 2018, this corpus is among the most important digitized medieval sources, comprising more than 6000 manuscripts, and it is still a work in progress.
The fantasy of digital immortality is widely shared, but in reality digital resources are highly fragile. In short, over many years we have built a very safe and costly digital necropolis, progressively covered by layers of digital sand, rather than a clean, organized library. This paper presents the successive operations carried out during the preservation project for this very valuable collection of manuscripts.
A Database of Islamic Scientific Manuscripts — Challenges of Past and Future
Max Planck Institute for the History of Science, Germany
I will present the database of the Islamic Scientific Manuscripts Initiative (ISMI), which aims to make accessible information on all Islamic manuscripts in the exact sciences (astronomy, mathematics, optics, mathematical geography, and related disciplines), whether in Arabic, Persian, Turkish, or other languages, from the 9th to the 19th century.
The first version of the database was built in 2006 using a flexible graph-like data model that developed and expanded over time.
The database and its web presentation are now being migrated to new standard tools: a Drupal web frontend, a CIDOC-CRM-based data model, and a ResearchSpace-based backend.
The new Drupal frontend is already online, offering access to more than 6900 witnesses of 2300 texts, and an experimental area with access to the graph database.
The ISMI project aims to be a continuing and growing resource in the future and we invite all interested to participate.
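To illustrate what a flexible graph-like data model can look like, here is a minimal sketch in which texts and witnesses are typed nodes joined by named relations. It is not ISMI's actual model: the class `ManuscriptGraph`, the node types `TEXT` and `WITNESS`, the relation name `is_exemplar_of`, and all identifiers and labels are assumptions made for this example.

```python
from collections import defaultdict


class ManuscriptGraph:
    """Tiny typed graph: nodes carry a type and label, edges carry a relation name."""

    def __init__(self):
        self.nodes = {}                      # node id -> (node type, label)
        self.incoming = defaultdict(list)    # (relation, target id) -> source ids

    def add_node(self, node_id, node_type, label):
        self.nodes[node_id] = (node_type, label)

    def add_edge(self, source, relation, target):
        # Refuse edges that point at unknown nodes so the graph stays consistent.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.incoming[(relation, target)].append(source)

    def sources(self, relation, target):
        """Return ids of all nodes linked to `target` by `relation`."""
        return list(self.incoming.get((relation, target), []))


# Hypothetical sample data: one text attested by two witnesses.
graph = ManuscriptGraph()
graph.add_node("text-1", "TEXT", "Example treatise")
graph.add_node("witness-1", "WITNESS", "Example manuscript copy A")
graph.add_node("witness-2", "WITNESS", "Example manuscript copy B")
graph.add_edge("witness-1", "is_exemplar_of", "text-1")
graph.add_edge("witness-2", "is_exemplar_of", "text-1")

print(graph.sources("is_exemplar_of", "text-1"))  # ['witness-1', 'witness-2']
```

Because node types and relation names are plain strings, new kinds of entities or relations can be added without changing a fixed schema, which is one way such a model can develop and expand over time.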
Analytical Edition Detection In Bibliographic Metadata
1University of Helsinki, Finland; 2University of Turku
Analytical bibliography aims to understand books and other printed objects as artifacts, including how they were produced. Bibliographic metadata can reveal important historical trends and help resolve issues such as the ordering of editions.
In this paper, we present a state-of-the-art analytical approach for determining editions and their ordering. By providing harmonized data and information on historical developments in book production, this approach will be a great aid for projects aiming to do large-scale text mining. Contemporary text mining approaches do not utilize edition-level information to the fullest extent and are therefore limited in their scope.
Using the ESTC metadata, we have developed harmonizing techniques that convert free-form text into more coherent entries for statistical analysis. Furthermore, a new gold standard was developed for validation purposes, with multiple layers of information. The use of this data would significantly enhance the understanding of early modern publishing.
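As a rough illustration of what converting free-form text into more coherent entries can involve, the sketch below maps free-form edition and imprint statements to integers and then orders records of a single work. It is a simplified assumption, not the authors' pipeline: the function names, the record fields `edition` and `publication`, the ordinal word list, and the sample records are all hypothetical.

```python
import re
from typing import Optional

# Illustrative, non-exhaustive mapping from ordinal words to edition numbers.
ORDINAL_WORDS = {
    "first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5,
    "sixth": 6, "seventh": 7, "eighth": 8, "ninth": 9, "tenth": 10,
}


def harmonize_edition(statement: str) -> Optional[int]:
    """Map a free-form edition statement to an integer, or None if unrecognized."""
    text = statement.lower()
    # Numeric forms such as "2nd edition", "3d ed." or "4th. edition".
    match = re.search(r"\b(\d{1,2})(?:st|nd|rd|d|th)?\.?\s+ed", text)
    if match:
        return int(match.group(1))
    # Spelled-out ordinals such as "The second edition, corrected."
    for word in re.findall(r"[a-z]+", text):
        if word in ORDINAL_WORDS:
            return ORDINAL_WORDS[word]
    return None


def harmonize_year(statement: str) -> Optional[int]:
    """Extract the first four-digit year between 1400 and 1999, or None."""
    match = re.search(r"\b(1[4-9]\d{2})\b", statement)
    return int(match.group(1)) if match else None


def order_editions(records):
    """Sort records of one work by harmonized year, then by edition number."""
    def key(record):
        year = harmonize_year(record["publication"])
        edition = harmonize_edition(record["edition"])
        # Unparsed values sort last instead of failing on None comparisons.
        return (year is None, year or 0, edition is None, edition or 0)
    return sorted(records, key=key)


# Hypothetical records for a single work.
records = [
    {"edition": "The third edition", "publication": "London, 1755"},
    {"edition": "2nd edition", "publication": "[1750?]"},
    {"edition": "", "publication": "London, printed in the year 1748."},
]
ordered_years = [harmonize_year(r["publication"]) for r in order_editions(records)]
print(ordered_years)  # [1748, 1750, 1755]
```

Rules of this kind only cover the patterns they were written for, so a manually checked gold standard is what makes it possible to measure how often they fail.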
The Emerging Paradigm of Bibliographic Data Science
1University of Helsinki, Finland; 2University of Turku, Finland
In order to facilitate research use of library catalogues, we recently proposed the concept of bibliographic data science. This aims to improve data reliability and completeness through systematic and reproducible harmonization, deriving from the paradigms of open science and data science. We have constructed a comprehensive bibliographic data science ecosystem that facilitates semi-automatic harmonization and enrichment of bibliographic entries. The work is based on an iterative process where research use often leads to new enhancements in data processing. The overall ecosystem integrates a number of distinct workflows that are dedicated to harmonizing specific subsets of the data collections. Further algorithmic tools support the integration with other data sources, such as full text collections, and final statistical analysis, visualization, and summarization of the data. As such, bibliographic data science can advance the methodological and conceptual basis in book history and digital humanities.
Syriac Persons, Events, and Relations: A Linked Open Factoid-based Prosopography
Texas A&M University, United States of America
This paper explores the development of a prosopographical database for the field of Syriac studies called SPEAR: Syriac Persons, Events, and Relations. Syriac is a dialect of Aramaic that was used in the Near East between the 3rd and 8th centuries and continues to be used liturgically by Christians in the Middle East and India, as well as by expatriate communities in Europe and North America. This project employs a factoid-based approach to prosopography. Where most factoid-based prosopographies organize data in a relational database, SPEAR encodes prosopographical data from primary source texts in TEI XML, using a customized schema designed to facilitate linking this prosopographical data to other linked data resources and serializing it into RDF. SPEAR shows how a prosopography project can employ TEI, field-specific scholarly standards, and Linked Open Data to produce a highly structured and semantically rich database that maintains close ties to the texts from which it is derived.
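The general idea of encoding a factoid in TEI XML and serializing it as RDF can be sketched as follows. This is an illustration under stated assumptions, not SPEAR's customized schema: the TEI-like fragment, the `example.org` URIs, the predicate names under `VOCAB`, and the function `factoid_to_ntriples` are all hypothetical. Only the TEI and XML namespace URIs are standard.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"                  # standard TEI namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"    # expanded name of xml:id
BASE = "https://example.org/factoid/"                  # hypothetical base URI
VOCAB = "https://example.org/vocab/"                   # hypothetical vocabulary

# Hypothetical, simplified factoid: one person, one asserted occupation, one source.
SAMPLE = """
<ab xmlns="http://www.tei-c.org/ns/1.0" type="factoid" xml:id="factoid-1">
  <listPerson>
    <person>
      <persName ref="https://example.org/person/1">Example Person</persName>
      <occupation ref="https://example.org/keyword/bishop">bishop</occupation>
    </person>
  </listPerson>
  <bibl><ptr target="https://example.org/work/1"/></bibl>
</ab>
"""


def factoid_to_ntriples(xml_text):
    """Turn one factoid element into N-Triples lines centred on the factoid URI."""
    root = ET.fromstring(xml_text.strip())
    factoid = BASE + root.get(XML_ID)
    person = root.find(f".//{TEI}persName").get("ref")
    occupation = root.find(f".//{TEI}occupation").get("ref")
    source = root.find(f".//{TEI}bibl/{TEI}ptr").get("target")
    triples = [
        (factoid, VOCAB + "about", person),
        (factoid, VOCAB + "assertsOccupation", occupation),
        (factoid, VOCAB + "source", source),   # keeps the tie to the source text
    ]
    return [f"<{s}> <{p}> <{o}> ." for s, p, o in triples]


lines = factoid_to_ntriples(SAMPLE)
print("\n".join(lines))
```

Attaching the occupation to the factoid rather than directly to the person keeps each assertion tied to the source that makes it, which is the point of a factoid-based approach.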