Conference Agenda

Session
Opening Keynote: Libraries, Copyright, and Language Models
Time: Wednesday, 09/Apr/2025, 9:50am - 10:45am

Session Chair: Andrew Jackson, Digital Preservation Coalition
Location: Målstova (upstairs, 1 level up from ground floor)

Streamed to Store Auditorium.

Session Abstract

Javier de la Rosa will present the groundbreaking findings of the Mímir Project, a collaborative effort between the National Library of Norway, the University of Oslo, and the Norwegian University of Science and Technology. This initiative explores a pressing issue in AI development: the role of copyrighted materials in training large language models (LLMs).

In this keynote, De la Rosa will delve into how incorporating publisher-controlled copyrighted corpora—specifically, books and newspapers—affects the performance of Norwegian LLMs. By empirically testing various data mixtures, the Mímir Project provides critical insights into how copyrighted content improves model capabilities in tasks like sentiment analysis, reading comprehension, and translation. At the same time, the research raises profound ethical and legal questions, highlighting the ongoing tension between AI innovation and intellectual property rights.

Through this session, attendees will gain a deeper understanding of how copyright influences AI training, why certain datasets enhance (or hinder) model performance, and what this means for policy and fair compensation schemes for authors.




 
Conference: IIPC WAC 2025