metadata
title: BPL Card Catalog Search
emoji: ๐๏ธ
colorFrom: gray
colorTo: yellow
sdk: docker
pinned: false
license: mit
BPL Card Catalog Search
Search and browse ~453,000 digitized catalog cards from the Boston Public Library's Rare Books & Manuscripts Department.
Uses AI-powered OCR (small vision-language models) to make handwritten and typewritten catalog cards searchable for the first time.
Features
- Semantic search: find cards by meaning, not just keywords
- Keyword search: full-text search across OCR transcriptions
- Compare OCR: see old Tesseract vs new VLM OCR results side by side
- Browse by drawer: navigate the physical organization of the catalog
- Image lightbox: click any card image to view full-size
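The difference between the two search modes can be illustrated with a minimal, standard-library-only sketch. It is not the app's actual code: the card texts, IDs, and 3-dimensional vectors below are hypothetical stand-ins, and a real deployment would presumably take its embeddings from the sentence-transformers model and run both queries through LanceDB.

```python
import math

# Hypothetical card transcriptions with toy 3-dimensional embeddings.
# These values are invented for illustration; real embeddings would
# come from an embedding model and have many more dimensions.
CARDS = [
    {"id": "card-1", "text": "Letter from a whaling captain, New Bedford, 1851",
     "vector": [0.9, 0.1, 0.0]},
    {"id": "card-2", "text": "Sermon on temperance, Boston, 1832",
     "vector": [0.0, 0.2, 0.9]},
    {"id": "card-3", "text": "Logbook of a ship's voyage to the Pacific",
     "vector": [0.8, 0.3, 0.1]},
]


def keyword_search(query, cards):
    """Return cards whose text contains every query term (case-insensitive)."""
    terms = query.lower().split()
    return [c for c in cards if all(t in c["text"].lower() for t in terms)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vector, cards, k=2):
    """Return the k cards whose vectors are most similar to the query vector."""
    ranked = sorted(cards, key=lambda c: cosine(query_vector, c["vector"]),
                    reverse=True)
    return ranked[:k]


# Hypothetical embedding of the query "whaling".
QUERY_VECTOR = [1.0, 0.2, 0.0]

keyword_hits = [c["id"] for c in keyword_search("whaling", CARDS)]
semantic_hits = [c["id"] for c in semantic_search(QUERY_VECTOR, CARDS)]

print("keyword: ", keyword_hits)
print("semantic:", semantic_hits)
```

In this toy data, keyword search returns only the card that literally contains "whaling", while semantic search also surfaces the ship's logbook because its vector lies close to the query vector.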
Stack
- FastAPI + HTMX + Jinja2
- LanceDB (vector + full-text search)
- sentence-transformers (BAAI/bge-base-en-v1.5)
- Dataset: Lance format on HF Hub