Conference Agenda


Please note that all times are shown in the time zone of the conference (CEST).

 
Session: POS-2: LIGHTNING & DROP-IN TALKS
Time: Thursday, 11/May/2023, 5:30pm - 6:10pm

Session Chair: Martin Klein, Los Alamos National Laboratory
Location: Theatre 2


One-minute drop-in talks will immediately follow the lightning talks. After the session ends, lightning talk presenters will be available for questions in the atrium, where their posters will be on display.

Drop-in talk schedule:

Persistent Web IDentifier (PWID) also as URN
Eld Zierau, Royal Danish Library

Crowdsourcing German Twitter
Britta Woldering, German National Library

At the end of the rainbow. Examining the Dutch LGBT+ web archive using NER and hyperlink analyses
Jesper Verhoef, Erasmus University Rotterdam

Presentations

Sunsetting a digital institution: Web archiving and the International Museum of Women

Marie Chant

The Feminist Institute, United States of America

The Feminist Institute’s (TFI) partnership program helps feminist organizations sunset mission-aligned digital projects, using web archiving technology and ethnographic preservation to contextualize and honor the labor contributed to ephemeral digital initiatives. In 2021, The Feminist Institute partnered with Global Fund for Women to preserve the International Museum of Women (I.M.O.W.). This digital social-change museum built award-winning digital exhibitions that explored women’s contributions to society. I.M.O.W. initially aimed to build a physical space but shifted to a digital-only presence in 2005, opting to democratize access to the museum’s work. I.M.O.W.’s first exhibition, Imagining Ourselves: A Global Generation of Women, engaged and connected more than a million participants worldwide. After launching several successful digital collections, I.M.O.W. merged with Global Fund for Women in 2014. The organization did not have the means to continually migrate and maintain the websites as the underlying technology became obsolete, leaving gaps in functionality and access. Working directly with stakeholders from Global Fund for Women and the International Museum of Women, TFI developed a multi-pronged preservation plan that included capturing I.M.O.W.’s digital exhibitions using Webrecorder’s Browsertrix Crawler, harvesting and converting Adobe Flash assets, conducting oral histories with I.M.O.W. staff and external developers, and providing access through the TFI Digital Archive.



Visualizing web harvests with the WAVA tool

Ben O'Brien (1), Frank Lee (1), Hanna Koppelaar (2), Sophie Ham (2)

(1) National Library of New Zealand, New Zealand; (2) National Library of the Netherlands, Netherlands

Between 2020 and 2021, the National Library of New Zealand (NLNZ) and the National Library of the Netherlands (KB-NL) developed a new harvest visualization feature within the Web Curator Tool (WCT). This feature was demonstrated at the 2021 IIPC WAC in a presentation titled "Improving the quality of web harvests using Web Curator Tool". During development it was recognised that the visualization tool could benefit the web archiving community beyond WCT. This was also reflected in feedback received after the 2021 IIPC WAC.

The feature has now been ported to an accompanying stand-alone application called the WAVA tool (Web Archive Visualization and Analysis). This is a stripped-down version that contains the web harvest analysis and visualization without the WCT-dependent functionality, such as patching.

The WCT harvest visualization has been designed primarily for performing quality assurance on web archives. To avoid the traditional mess of links and nodes when visualizing URLs, the tool abstracts the data to the domain level. Aggregating URLs into groups of domains gives a higher-level overview of a crawl and allows for quicker analysis of the relationships between content in a harvest. The visualization consists of an interactive network graph of links and nodes that can be inspected, allowing a user to drill down to the URL level for deeper analysis.
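The domain-level aggregation described above can be illustrated with a small sketch. This is not the WAVA tool's code: the abstract does not describe its implementation, so the function names (`domain_of`, `aggregate_links`), the sample URLs, and the choice to treat the URL host as the "domain" are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL.

    Assumption: the host stands in for the 'domain'; a real tool might
    group by registered domain instead.
    """
    return (urlsplit(url).hostname or "").lower()


def aggregate_links(links):
    """Collapse URL-level (source, target) links into weighted domain-level edges."""
    edges = Counter()
    for source, target in links:
        edges[(domain_of(source), domain_of(target))] += 1
    return edges


# Hypothetical URL-level links extracted from a harvest.
links = [
    ("https://example.org/", "https://example.org/about"),
    ("https://example.org/", "https://cdn.example.net/site.css"),
    ("https://example.org/about", "https://cdn.example.net/logo.png"),
]

edges = aggregate_links(links)
for (src, dst), weight in sorted(edges.items()):
    print(f"{src} -> {dst}: {weight} link(s)")
```

Three URL-level links collapse into two domain-level edges here, which is the kind of reduction that keeps a network graph readable; the original URL pairs would still be needed to support drilling down.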

NLNZ and KB-NL believe the WAVA tool can have many uses for the web archiving community. It lowers the barrier to investigating and understanding the relationships and structure of the web content that we crawl. What can we discover in our crawls that might improve the quality of future web harvests? The WAVA tool also removes technical steps that have in the past been a barrier to researchers visualizing web archive data. How many future research questions can be aided by its use?



WARC validation, why not?

Antal Posthumus, Jacob Takema

Nationaal Archief, The Netherlands

This lightning talk aims to tempt and challenge the participants of the IIPC Web Archiving Conference 2023 to exchange ideas, assumptions and knowledge about validating WARC files and using WARC validation tools.

In 2021 we wrote an information sheet about WARC validation. During our (desk) research it became clear that most (inter)national colleagues who archive websites tend not to use WARC validation tools. Why not?

Most heritage institutions, national libraries and archives focus on safeguarding as much online content as possible before it disappears, based on an organizational selection policy. The other goal is to give access to the captured information as completely and quickly as possible, to both general users and researchers. Both goals are of course at the core of web archiving initiatives!

It seems that little attention is given to quality-control aspects such as checking the technical validity of WARC files. Or are there other reasons not to pay much attention to this aspect?

We would like to share some of our findings after deploying several tools for processing WARC files: JHOVE, JWAT, Warcat and Warcio. More tools are available, but in our opinion these four are the most commonly used, mature and actively maintained tools that can check or validate WARC files.

In our research into WARC validation, we noticed that some tools are validation tools that check conformance to the WARC standard (ISO 28500), while others ‘only’ check block and/or payload digests. Most tools support version 1.0 of the WARC standard (2009); few support version 1.1 (2017).

Another conclusion is that there is no one WARC validation tool ‘to rule them all’, so using a combination of tools will probably be the best strategy for now.
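To make the digest checks mentioned above more concrete, here is a minimal sketch of what verifying a declared digest involves. It does not show how JHOVE, JWAT, Warcat or Warcio work internally. It assumes the labelled form `sha1:<Base32 value>` that many crawlers write into WARC-Payload-Digest and WARC-Block-Digest headers; other algorithms and encodings occur in practice, and the helper names are hypothetical.

```python
import base64
import hashlib


def sha1_base32(data: bytes) -> str:
    """Return a labelled digest in the assumed 'sha1:<Base32>' form."""
    digest = hashlib.sha1(data).digest()
    return "sha1:" + base64.b32encode(digest).decode("ascii")


def digest_matches(declared: str, data: bytes) -> bool:
    """Compare a declared 'sha1:<Base32>' digest with the digest of the given bytes."""
    algorithm, _, value = declared.partition(":")
    if algorithm.lower() != "sha1":
        # Assumption: only SHA-1 is handled in this sketch.
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    expected = sha1_base32(data).split(":", 1)[1]
    return value.upper() == expected


# Hypothetical payload and the digest a crawler would have recorded for it.
payload = b"<html><body>archived page</body></html>"
declared = sha1_base32(payload)

print(digest_matches(declared, payload))         # unchanged payload
print(digest_matches(declared, payload + b" "))  # payload altered after capture
```

A digest check of this kind detects content that changed after capture, but it says nothing about whether the record headers and structure conform to ISO 28500, which is one reason a combination of tools may be needed.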



 
Conference: IIPC WAC 2023