The purpose of this GA session is to share members’ expertise and experience in performing national domain crawls with Heritrix. We ask members to 1) share their challenges with domain crawls, 2) discuss possible solutions, and 3) establish best practices with specific tips and tricks for others to try.
Fighting 404s with Sara Aubry, National Library of France
Annual National Domain Crawl Using AWS with Gil Hoggarth, British Library
Seedlist: Approach Emphasis and Scope Out with Tom Smyth, Library and Archives Canada
Effective Handling of Byte Limits with Thomas Smedebøl, Royal Danish Library
Use of Sitemaps in Domain Crawls with Kristinn Sigurðsson, National and University Library of Iceland
Browser-assisted Heritrix with Alex Dempsey, Internet Archive