metadata
title: BPL Card Catalog Search
emoji: 🗃️
colorFrom: gray
colorTo: yellow
sdk: docker
pinned: false
license: mit

BPL Card Catalog Search

Search and browse ~453,000 digitized catalog cards from the Boston Public Library's Rare Books & Manuscripts Department.

The app uses OCR from small vision-language models (VLMs) to make handwritten and typewritten catalog cards searchable for the first time.

Features

  • Semantic search: find cards by meaning, not just keywords
  • Keyword search: full-text search across OCR transcriptions
  • Compare OCR: see old Tesseract vs new VLM OCR results side by side
  • Browse by drawer: navigate the physical organization of the catalog
  • Image lightbox: click any card image to view full-size
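
To illustrate the difference between the two search modes listed above, here is a minimal, standard-library-only sketch. It is not the app's implementation: the card texts, the card ids, and the three-dimensional vectors are all made up for illustration, and a real embedding model produces vectors with hundreds of dimensions.

```python
import math
import re

# Hypothetical OCR transcriptions keyed by card id (illustrative, not catalog data).
CARDS = {
    "card-1": "Letter concerning the abolition of slavery",
    "card-2": "Account book of a Boston merchant",
    "card-3": "Sermon on emancipation, preached in Salem",
}

# Hypothetical toy embeddings; cards 1 and 3 are placed close together
# because their topics are related even though they share no keywords.
EMBEDDINGS = {
    "card-1": [0.9, 0.1, 0.1],
    "card-2": [0.1, 0.9, 0.2],
    "card-3": [0.8, 0.2, 0.3],
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(query):
    """Return ids of cards whose transcription contains every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    return [cid for cid, text in CARDS.items() if terms <= tokenize(text)]


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def semantic_search(query_vector, k=2):
    """Return the k card ids whose embeddings are most similar to the query."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda cid: cosine(query_vector, EMBEDDINGS[cid]),
        reverse=True,
    )
    return ranked[:k]
```

In this toy data, a keyword search for "slavery" matches only the card that contains that word, while a query vector pointing in the same direction as that card also surfaces the sermon on emancipation, which is the behavior "by meaning, not just keywords" describes.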

Stack

  • FastAPI + HTMX + Jinja2
  • LanceDB (vector + full-text search)
  • sentence-transformers (BAAI/bge-base-en-v1.5)
  • Dataset: stored in Lance format on the Hugging Face Hub
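
As a rough sketch of how the embedding model and LanceDB from the stack above could be combined for the two search modes, consider the following. The function names, the `db_uri` and `table_name` parameters, and the assumption that the table already has a vector column and a full-text index are all hypothetical; the README does not describe the actual schema or code. The query prefix is the one suggested on the BGE model card for retrieval queries, and whether this app applies it is not stated.

```python
# Query instruction suggested by the BGE model card for short retrieval queries.
# Whether the app uses it is an assumption.
BGE_QUERY_PREFIX = "Represent this sentence for searching relevant passages: "


def build_query_text(query: str) -> str:
    """Prepend the BGE retrieval instruction to a whitespace-trimmed query."""
    return BGE_QUERY_PREFIX + query.strip()


def semantic_search(query: str, db_uri: str, table_name: str, limit: int = 10):
    """Embed the query and run a vector search (hypothetical table layout)."""
    # Imported here so build_query_text works without the heavy dependencies.
    import lancedb
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("BAAI/bge-base-en-v1.5")
    vector = model.encode(build_query_text(query), normalize_embeddings=True)
    table = lancedb.connect(db_uri).open_table(table_name)
    return table.search(vector).limit(limit).to_list()


def keyword_search(query: str, db_uri: str, table_name: str, limit: int = 10):
    """Run a full-text search; assumes a full-text index exists on the table."""
    import lancedb

    table = lancedb.connect(db_uri).open_table(table_name)
    return table.search(query, query_type="fts").limit(limit).to_list()
```

In a long-running FastAPI process the model and table handle would normally be created once at startup rather than on every call; they are created inline here only to keep the sketch short.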