Web Harvesting Services - Library of Congress

SOL #: 030ADV26R0010
Combined Synopsis/Solicitation

Overview

Buyer

Library Of Congress
CONTRACTS SERVICES
Washington, DC, 20540, United States

Place of Performance

Washington, DC

NAICS

Computing Infrastructure Providers (518210)

PSC

Cloud Solutions Delivered As A Service. (DK10)

Set Aside

No set aside specified

Timeline

Posted: Jan 20, 2026
Last Updated: Feb 25, 2026
Submission Deadline: Feb 24, 2026, 10:00 PM

Qualification Details

Fit reasons
  • NAICS alignment with historical contract wins in similar service areas.
  • Scope strongly matches core technical capabilities and delivery model.
Risks
  • Past performance thresholds may require one additional teaming partner.
  • Potential clarification needed on staffing minimums before bid/no-bid.
Next steps

Validate eligibility requirements, assign capture owner, and schedule partner outreach to confirm teaming strategy before submission planning.

Quick Summary

The Library of Congress is soliciting proposals for Web Harvesting Services to systematically collect web content at scale for preservation and public access. This is an Unrestricted competition for a Firm-Fixed-Price Indefinite-Delivery, Indefinite-Quantity (IDIQ) contract. Proposals are due February 24, 2026, at 5:00 PM EST.

Scope of Work

The contractor will provide comprehensive web harvesting services, including:

  • Systematic Content Capture: Perform crawls based on Library specifications, seed lists, and scoping instructions, capturing various digital objects (HTML, images, PDFs, multimedia) while employing politeness factors.
  • Data Packaging & Transfer: Package captured content into valid WARC files (ISO 28500:2017) with 11-field CDX indexes, and transfer them to the Library's S3 bucket over HTTPS. A single BagIt bag should not exceed 1 TB.
  • Quality Review & Reporting: Provide an access tool for Library staff to review crawl results and generate detailed reports (ASCII text and XML) within five days of crawl completion. A Quality Control Program (QCP) must be developed and maintained.
  • Infrastructure & Security: Utilize US-based servers for crawling, maintain reliable and secure data storage, and provide a web-based communication tool. Adherence to strict information security policies, including restrictions on Generative AI use, is mandatory.
  • Key Personnel: The contractor must provide a qualified Program Manager/Alternate, Crawl Engineer, and Quality Assurance Lead.
  • Estimated Volume: The anticipated crawl volume is 300-700 TB per year.
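
The 1 TB ceiling on a single BagIt bag means a large crawl has to be split across several bags before transfer. The following is a minimal sketch of one way to plan that split, not the Library's or any contractor's actual method. It assumes "1 TB" means 10**12 bytes (the summary does not say whether decimal or binary units apply), counts only WARC payload bytes (manifests and tag files add a small overhead), and uses a hypothetical helper name, `plan_bags`.

```python
from typing import Dict, List

# Assumption: "1 TB" is read as 10**12 bytes; the summary does not specify units.
MAX_BAG_BYTES = 10**12


def plan_bags(warc_sizes: Dict[str, int], limit: int = MAX_BAG_BYTES) -> List[List[str]]:
    """Group WARC files into bags whose payload stays at or under `limit` bytes.

    Uses first-fit decreasing: files are taken largest first, and each one
    goes into the first existing bag that still has room for it.
    """
    bags: List[List[str]] = []  # file names per bag
    totals: List[int] = []      # payload bytes already assigned to each bag

    # Sort by size descending, then by name so the plan is deterministic.
    for name, size in sorted(warc_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > limit:
            raise ValueError(f"{name} alone exceeds the bag limit")
        for i, used in enumerate(totals):
            if used + size <= limit:
                bags[i].append(name)
                totals[i] += size
                break
        else:
            # No existing bag had room, so open a new one.
            bags.append([name])
            totals.append(size)
    return bags
```

Passing a small `limit` makes the behavior easy to check by hand: with sizes 600, 500, and 400 and a limit of 1000, the 600 and 400 files share one bag and the 500 file gets its own.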

Contract Details

  • Contract Type: Firm-Fixed-Price Indefinite-Delivery, Indefinite-Quantity (IDIQ) with Task Orders.
  • Period of Performance: Base period from June 1, 2026, to May 31, 2031.
  • Estimated Value: Minimum order of $300,000.00; Maximum order of $15,000,000.00.
  • Place of Performance: Contractor's own facilities.
  • Set-Aside: Unrestricted.
  • Product Service Code: DK10 - Cloud Solutions Delivered As A Service.

Submission Requirements

  • Proposal Due Date: February 24, 2026, at 5:00 PM EST.
  • Sample Web Crawl Size Information Due: February 17, 2026, by 5:00 PM ET.
  • Past Performance Questionnaires (PPQs) Due: February 24, 2026, by noon ET. PPQs must be sent directly from the past performance reference to the Contracting Team.
  • Submission Method: Electronically via email to Jennifer Zwahlen (jzwa@loc.gov) and Colleen Daly (cdaly@loc.gov). Total email attachment size must not exceed 20 MB, and zipped files are not permitted. Proposals must remain valid through June 6, 2026.
  • Proposal Content: Must include four volumes: Technical Approach (including a sample web crawl), Corporate Experience and Capabilities (including Key Personnel resumes), Past Performance (using Attachment J3), and Price (using Attachment J4).

Evaluation Criteria

Proposals will be evaluated using a Best-Value Trade-Off (BVTO) approach. Evaluation factors, in descending order of importance, are: Technical Approach, Corporate Experience and Capabilities, Past Performance, and Price. The non-price factors, when combined, are significantly more important than, or equal in importance to, price. The Library may award without discussions.

Key Clarifications

The Library will not consider approaches where crawl data is written directly into Library-owned cloud infrastructure. The pricing metric will be based on the total compressed WARC size.
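
Because pricing is tied to total compressed WARC size, an offeror may want a repeatable way to measure that figure for a sample crawl. The sketch below is illustrative only. It assumes compressed WARC files carry a `.warc.gz` suffix, that the size of interest is the on-disk byte count, and that terabytes are decimal (10**12 bytes); none of these details are stated in the summary, and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: decimal terabytes; the summary does not define the unit.
BYTES_PER_TB = 10**12


def compressed_warc_bytes(root: Path) -> int:
    """Sum the on-disk size of every *.warc.gz file under `root`, recursively."""
    return sum(p.stat().st_size for p in Path(root).rglob("*.warc.gz") if p.is_file())


def to_terabytes(n_bytes: int) -> float:
    """Convert a byte count to decimal terabytes."""
    return n_bytes / BYTES_PER_TB
```

Files that are not compressed WARCs (logs, reports, CDX indexes) are ignored by the suffix filter, so they do not inflate the measured total.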

People

Points of Contact

Colleen Daly (Primary)
Jennifer Zwahlen (Secondary)

Versions

Version 8
Combined Synopsis/Solicitation
Posted: Feb 25, 2026
Version 7
Combined Synopsis/Solicitation
Posted: Feb 19, 2026
Version 6
Combined Synopsis/Solicitation
Posted: Feb 18, 2026
Version 5 (currently viewed)
Combined Synopsis/Solicitation
Posted: Feb 13, 2026
Version 4
Combined Synopsis/Solicitation
Posted: Feb 3, 2026
Version 3
Combined Synopsis/Solicitation
Posted: Jan 28, 2026
Version 2
Combined Synopsis/Solicitation
Posted: Jan 23, 2026
Version 1
Combined Synopsis/Solicitation
Posted: Jan 20, 2026