Web Harvesting Services - Library of Congress
Qualification Details
Fit reasons
- NAICS alignment with historical contract wins in similar service areas.
- Scope strongly matches core technical capabilities and delivery model.
Risks
- Past performance thresholds may require one additional teaming partner.
- Potential clarification needed on staffing minimums before bid/no-bid.
Next steps
Validate eligibility requirements, assign capture owner, and schedule partner outreach to confirm teaming strategy before submission planning.
Quick Summary
The Library of Congress is soliciting proposals for Web Harvesting Services to support its digital preservation and public access initiatives. This unrestricted opportunity is for an Indefinite-Delivery, Indefinite-Quantity (IDIQ) contract with firm-fixed-price Task Orders. Proposals are currently due by March 3, 2026, at 5:00 PM EST.
Purpose & Scope
The Library of Congress requires contract support for the systematic, at-scale harvesting of web content. This includes providing temporary access to harvested content, generating required crawl reports for quality review, and enabling the transfer of content to the Library for preservation and public access. The goal is to enrich the Library's digital collections for Congress and America's communities.
Key Requirements
- Web Content Harvesting: Perform crawls according to Library specifications, seed lists, and scoping instructions; robots.txt directives are generally ignored. Capture digital objects (HTML, images, PDFs, multimedia) into valid WARC files (ISO 28500:2017) with 11-field CDX indexes.
- Data Packaging & Transfer: Package captured content into WARC files (target size ~1 GB each) grouped into BagIt bags (<= 1 TB each) for transfer to the Library's S3 bucket over the internet via HTTPS.
- Quality Review & Reporting: Provide an access tool for Library staff review. Generate detailed reports (ASCII text, XML) within 5 days of crawl completion, including statistical information and performance metrics. Develop and maintain a Quality Control Program (QCP).
- Infrastructure & Security: Utilize US-based servers for crawling. Maintain reliable, secure data storage with two copies. Provide a web-based communication tool. Adhere to strict information security policies, including mandatory IT Security Training and restrictions on Generative AI use.
- Key Personnel: Provide qualified Program Manager/Alternate, Crawl Engineer, and Quality Assurance Lead with specified experience.
- Sample Web Crawl: Offerors must perform a one-time technical test crawl using a Library-provided seed list within 48 hours, delivering results in WARC format with CDX files and reports.
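The size limits in the Data Packaging & Transfer requirement can be illustrated with a minimal planning sketch. It assumes decimal units (the listing does not say whether GB/TB or GiB/TiB is meant), counts only payload bytes (a real BagIt bag also carries manifests and tag files), and uses hypothetical names (`BagPlan`, `plan_bags`) that do not come from the solicitation.

```python
from dataclasses import dataclass, field

GB = 10**9    # assumption: decimal gigabyte; the listing does not specify
TB = 10**12   # assumption: decimal terabyte


@dataclass
class BagPlan:
    """Hypothetical container for the WARC files assigned to one bag."""
    files: list = field(default_factory=list)
    payload_bytes: int = 0


def plan_bags(warcs, max_bag_bytes=TB):
    """Assign (filename, size_bytes) pairs to bags in input order.

    A new bag is started whenever the next file would push the current
    bag's payload past max_bag_bytes, so crawl order is preserved.
    """
    bags = []
    current = None
    for name, size in warcs:
        if size > max_bag_bytes:
            raise ValueError(f"{name} alone exceeds the bag limit")
        if current is None or current.payload_bytes + size > max_bag_bytes:
            current = BagPlan()
            bags.append(current)
        current.files.append(name)
        current.payload_bytes += size
    return bags


if __name__ == "__main__":
    # 2,500 WARC files at the ~1 GB target size (illustrative numbers only).
    warcs = [(f"crawl-{i:05d}.warc.gz", GB) for i in range(2500)]
    for n, bag in enumerate(plan_bags(warcs), start=1):
        print(n, len(bag.files), bag.payload_bytes)
```

In practice a planner would likely leave headroom below the 1 TB ceiling for manifests and checksums; the exact margin is not stated in this summary.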
Contract Details
- Contract Type: Indefinite-Delivery, Indefinite-Quantity (IDIQ) with firm-fixed-price Task Orders.
- Period of Performance: Base period from June 1, 2026, to May 31, 2031.
- Estimated Value: Minimum order of $300,000.00; Maximum order of $15,000,000.00.
- Place of Performance: Contractor's own facilities.
- Set-Aside: Unrestricted.
- Product Service Code: DK10 - Cloud Solutions Delivered As A Service.
Submission & Evaluation
- Proposal Due Date: March 3, 2026, at 5:00 PM EST.
- Sample Web Crawl Information Due: February 17, 2026, at 5:00 PM EST (notification of data size/estimated size).
- Past Performance Questionnaires (PPQs) Due: February 24, 2026, at noon EST (must be sent directly by references).
- Submission Method: Electronically via email to both the Contracting Officer (jzwa@loc.gov) and Contract Specialist (cdaly@loc.gov). No zipped files; total email attachment size not to exceed 20MB. Proposals must be valid through June 6, 2026.
- Proposal Content: Must include four volumes: Technical Approach (including sample web crawl), Corporate Experience and Capabilities (including Key Personnel resumes), Past Performance (using Attachment J3), and Price (using Attachment J4).
- Evaluation Criteria: Best-Value Trade-Off (BVTO) approach. Factors in descending order of importance: Technical Approach, Corporate Experience and Capabilities, Past Performance, and Price. The non-price factors, when combined, are as important as, or significantly more important than, price. The Library may award without discussions.
- SAM Registration: Active SAM registration is required for award consideration and at the time of quotation submission.
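The email submission constraints above (no zipped files, total attachments not exceeding 20MB) lend themselves to a quick pre-send check. The following is a rough sketch only: it reads 20MB as 20,000,000 bytes, treats a few common archive extensions as "zipped", and ignores the size overhead that email encoding adds. All three are assumptions, and the function name is hypothetical.

```python
from pathlib import Path

MAX_TOTAL_BYTES = 20 * 1000 * 1000  # assumption: "20MB" read as decimal megabytes
ARCHIVE_SUFFIXES = {".zip", ".7z", ".rar"}  # assumption: what counts as "zipped"


def check_attachments(attachments):
    """Return a list of problems for (filename, size_bytes) pairs.

    An empty list means the attachments pass both checks in this sketch.
    """
    problems = []
    for name, _size in attachments:
        if Path(name).suffix.lower() in ARCHIVE_SUFFIXES:
            problems.append(f"{name}: compressed archives are not accepted")
    total = sum(size for _name, size in attachments)
    if total > MAX_TOTAL_BYTES:
        problems.append(
            f"total size {total} bytes exceeds the {MAX_TOTAL_BYTES}-byte limit"
        )
    return problems


if __name__ == "__main__":
    # Illustrative file names and sizes; not taken from the solicitation.
    volumes = [
        ("volume1_technical.pdf", 8_000_000),
        ("volume2_experience.pdf", 4_000_000),
        ("volume3_past_performance.pdf", 2_000_000),
        ("volume4_price.xlsx", 1_000_000),
    ]
    print(check_attachments(volumes))
```

Because mail systems typically encode attachments in a way that inflates their size, staying well under the stated limit is the safer reading.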
Contacts
- Contract Specialist: Colleen Daly (cdaly@loc.gov)
- Contracting Officer: Jennifer Zwahlen (jzwa@loc.gov)