Conference Agenda


Please note that all times are shown in the time zone of the conference (CEST).

 
Session Overview
Session: OL-SES-06: Q&A: COLLABORATIVE WEB ARCHIVING
Time: Wednesday, 03/May/2023, 6:40pm - 7:10pm

Session Chair: Lauren Ko, University of North Texas
Virtual location: Online


Presentations

Empowering Bibliographers to Build Collections: The Browsertrix Cloud Pilot at Stanford Libraries

Quinn Dombrowski, Ed Summers, Laura Wrubel, Peter Chan

Stanford University, United States of America

The purview of subject-area librarians has expanded in the 21st century from primarily focusing on books and print subscriptions to a much larger set of materials, including digital subscription packages and data sets (distributed using a variety of media, for purchase or lease). Through this process, subject-area librarians are increasingly exposed to complex issues around copyright, license terms, and privacy/ethical concerns, where both norms and laws can vary significantly among different countries and communities. While it is nearly impossible for subject-area librarians in any field to treat “data” as outside the scope of their collecting efforts in 2022, the same does not hold true for web archives. Many libraries have at least some access to web archiving tools, although this access may primarily be in the hands of a limited number of users, sometimes associated with library technical services or special collections / university archives (e.g. for institutions whose focus of web archiving is primarily their own digital resources).

In late 2022, the web archiving task force at Stanford Libraries – a cross-functional team that brought together the web archivist, technical staff, and embedded digital humanities staff – set out to shift this dynamic by empowering disciplinary librarians to add web archiving to their toolkit for building the university’s collections. By partnering with Webrecorder, Stanford Libraries set up an instance of Browsertrix Cloud and provided access to a group of bibliographers and other subject-matter experts as part of a short-term pilot. The goals of this pilot were to see how, and how much, bibliographers would engage with web archiving for collection-building if given unfettered access to easy-to-use tools. What materials would they prioritize? What challenges would they encounter? What technical (e.g. storage) and support (e.g. training, debugging, community engagement) resources would be necessary for them to be successful? This pilot was also intended to inform the strategic direction for web archiving at Stanford moving forward.

In this talk, we will briefly present how we designed the pilot, hear perspectives from bibliographers who participated, and share the pilot outcomes and future directions.



What next? An update from SUCHO

Quinn Dombrowski (1), Anna Kijas (2), Sebastian Majstorovic (3), Ed Summers (1), Andreas Segerberg (4)

(1) Stanford University, United States of America; (2) Tufts University, United States of America; (3) Austrian Center for Digital Humanities and Cultural Heritage, Austria; (4) University of Gothenburg, Sweden

Saving Ukrainian Cultural Heritage Online (SUCHO) made headlines as an international, volunteer-run initiative archiving Ukrainian cultural heritage websites in the wake of Russia’s invasion in February 2022. Through SUCHO, over 1,500 volunteers around the world – from technologists and librarians, to retirees and children – were involved in a large-scale, rapid-response web archiving effort that developed a collection of over 5,000 websites and 50 TB of data. As a non-institutional project with the primary goal of digital repatriation, creating this collection and ensuring its security through a network of mirrors was not enough. The motivation for SUCHO was not to create a permanent archive of Ukraine that could be used as research data for scholars as the country was destroyed; instead, the hope was to hold onto the data only until the cultural heritage sector in Ukraine was ready to rebuild.

The initial web archiving phase of SUCHO’s work happened between March and August 2022. The archives came from a variety of sources: some were created on volunteers’ laptops using the command-line Browsertrix software, some were created using Browsertrix Cloud, and some were uploads of individual, highly interactive page archives made using the Browsertrix Chrome plugin. In addition, while the project mostly worked from a single list of sites, the work was done in haste, and status metadata (e.g. “in progress”, “done”, “problem”) was not always accurately documented. Furthermore, while the project had full DNS records for these sites, that metadata was stored separately from the spreadsheet – as was information about site uptime and downtime over the course of the project. Creating the web archives was challenging, but it quickly became apparent that the bigger challenge would be curation.

This talk will follow up on our 2022 IIPC presentation on SUCHO, confronting the question of “What next?” for SUCHO. It will bring together a number of volunteers to discuss different facets of this curation process, including reuniting archives with different kinds of metadata, our efforts in extracting data from the archives that could be used as the foundation for rebuilding websites, and other work to curate and present what our volunteer community accomplished.



 
Conference: IIPC WAC 2023