Conference Agenda

Overview and details of the sessions of this conference.

Session Overview
Thursday, 11/May/2023:
11:00am - 12:30pm

Session Chair: Ditte Laursen, Royal Danish Library
Location: Theatre 1

These presentations will be followed by a 10 min Q&A.

11:00am - 11:20am

Through the ARCHway: Opportunities to Support Access, Exploration, and Engagement with Web Archives

Samantha Fritz

Archives Unleashed Project, University of Waterloo, Canada

For nearly three decades, memory institutions have consciously archived the web to preserve born-digital heritage. Now, web archive collections range into the petabytes, significantly expanding the scope and scale of data for scholars. Yet research communities face many acute challenges, including the limited availability of analytical tools, a lack of community infrastructure, and inaccessible research interfaces. The core objective of the Archives Unleashed Project is to lower these barriers and burdens for conducting scalable research with web archives.

Following a successful series of datathon events (2017-2020), Archives Unleashed launched the cohort program (2021-2023) to facilitate opportunities to improve access, exploration and research engagement with web archives.

Borrowing from the hacking genre of events often found within the tech industry, Archives Unleashed datathons were designed to provide an immersive and uninterrupted period of time for participants to work collaboratively on projects and gain hands-on experience working with web archive data. The datathon series cultivated community formation and empowered scholars to build the confidence and skills needed to work with web archives. However, the short-term nature of datathons meant that the focused energy and time devoted to research projects diminished once meetings concluded.

Launched in 2021, the Archives Unleashed cohort program was developed as a matured evolution of the datathon model to support research projects. The program ran two iterative cycles and hosted 46 international researchers from 21 unique institutions. Programmatically, researchers engaged in a year-long collaboration project, with web archives featured as a primary data source. The mentorship model has been a defining feature, including direct one-on-one consultation from Archives Unleashed, connections to field experts, and opportunities for peer-to-peer support.

This presentation will reflect on the experiences of engaging with scholars to build scalable analytical tools and deliver a mentorship program to facilitate research with web archives. The cohort program asked researchers to step into an unfamiliar environment with complex data, and they did so with curiosity while embracing opportunities to access, explore, and engage with web archive collections. While the program highlights a broad range of use cases, we seek to inspire wider adoption of web archives for scholarly inquiry across disciplines.

11:20am - 11:40am

‘Research-ready’ collections: challenges and opportunities in making web archive material accessible

Leontien Talboom¹, Mark Simon Haydn²

¹Cambridge University Libraries, United Kingdom; ²National Library of Scotland, United Kingdom

The Archive of Tomorrow is a collaborative, multi-institutional project, led by the National Library of Scotland and funded by the Wellcome Trust, that collects information and misinformation around health in the online public space. One of the aims of this project is to create a ‘research-ready’ collection which would make it possible for researchers to access and reuse the themed collections of materials for further research. However, there are many challenges to making this a reality, especially the legislative framework governing collection of and access to web archives in the UK, and technical difficulties stemming from the emerging platforms and schemas used to catalogue websites.

This talk would primarily address IIPC 2023's Access and Research themes, while also touching on the Collections and Operations strands in its discussion of a short-term project promising to deliver technical improvements and expanded access to web archives collections by 2023. The presentation aims to explore the difficulties the project encountered by offering different ways into the material, including exposing insights that can be generated from working with metadata exports outside of collecting platforms; detailing the project’s work in surfacing web archives in traditional library discovery settings through metadata crosswalks; and exploring further possibilities around the use of Jupyter Notebooks for data exploration and the documentation and dissemination of datasets.

The intended deliverables of this session are to present the tools developed within the project to make web archive material suitable and useful for research; to share frameworks used by the project’s web archivists when navigating the challenges of archiving personal and political health information online; and to discuss the barriers to access around collecting web archive and social media material in a UK context.

11:40am - 12:00pm

Developing new academic uses of web archives collections: challenges and lessons learned from the experimental service deployed at the University of Lille during the ResPaDon Project

Jennifer Morival¹, Sara Aubry², Dorothée Benhamou-Suesser²

¹Université de Lille, France; ²Bibliothèque nationale de France, France

2022 marks the second year of the ResPaDon project, undertaken by the BnF (National Library of France) and the University of Lille, in partnership with Sciences Po and Campus Condorcet. The project brings together researchers and librarians to promote and facilitate a broader academic use of web archives by demonstrating the value of web archives and by reducing the technical and methodological barriers researchers may encounter when discovering this source for the first time or when working with such complex materials.

One way to meet these challenges and support new ways of doing research is the implementation of an experimental remote access point to the web archives at the University of Lille. The project team has renewed its offer of tools and conducted outreach to new groups of potential web archive users.

The remote access point to web archives has been deployed in two university libraries in Lille. This service allows both for consultation of the web archives in their entirety (44 billion documents, 1.7 PB of data) and for exploring a collection, "The 2002 presidential and local elections", which was the first collection constituted in-house by the BnF 20 years ago. This collection is now accessible through various tools for data mining, analysis, and data visualization. The use of those tools is accompanied by guides, reports, examples, and use cases: multiple types of supporting documentation that will also be evaluated for their usefulness as part of the experimentation.

The presentation will focus on the implementation of this access point from both technical and practical aspects. It will address the training of the team of 6 mediators responsible for accompanying the researchers in Lille, as well as the collaboration between the teams in Lille and at the BnF. It will also tackle the challenges of outreach and the path we have taken to communicate within the academic community to find researcher-testers.

We will share the results and lessons learned from this experimentation: the first tests conducted with the researchers have allowed us to obtain feedback on the tools deployed and the improvements to be made to this experimental service.