Data Setup Guide
The API requires Church Fathers commentary embeddings to be placed in the data/ directory.
Option 1: Use Existing Embeddings (if available)
If you have access to pre-generated embeddings from the church-fathers repository:
python prepare_data.py --source /path/to/church-fathers/commentary_embeddings
Option 2: Generate Embeddings from Database
If you want to build the embeddings yourself from the SQLite database of the Historical Christian Faith Commentaries Database:
- Clone the Commentaries-Database repository:
git clone https://github.com/HistoricalChristianFaith/Commentaries-Database.git
- Generate embeddings using the utility script from church-fathers:
# From the church-fathers directory
python util/commentary.py \
    -db /path/to/Commentaries-Database/data.sqlite \
    -m "BAAI/bge-large-en-v1.5" \
    -o /path/to/biblos-cf-api/data
This will create JSON files organized by book in the data/ directory.
Option 3: Use prepare_data.py with Database
Alternatively, use the prepare_data.py script directly:
python prepare_data.py \
    --generate \
    --db /path/to/data.sqlite \
    --model "BAAI/bge-large-en-v1.5"
Expected Data Structure
After preparation, your data/ directory should look like:
data/
├── matthew/
│   ├── matthew_Augustine_of_Hippo_123.json
│   ├── matthew_Origen_of_Alexandria_456.json
│   └── ...
├── john/
│   ├── john_Augustine_of_Hippo_789.json
│   └── ...
└── ...
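To confirm that preparation produced the layout above, a quick sanity check can count the JSON files in each book directory. The following is a minimal sketch, not part of the repository: the helper name `count_files_per_book` is hypothetical, and it assumes only that every book is a subdirectory of data/ containing `.json` files.

```python
from pathlib import Path


def count_files_per_book(data_dir):
    """Return {book_name: number_of_json_files} for each subdirectory of data_dir.

    Hypothetical helper: assumes the layout shown above, with one
    subdirectory per book and one .json file per commentary entry.
    """
    root = Path(data_dir)
    counts = {}
    # Only subdirectories are treated as books; loose files in data/ are ignored.
    for book_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[book_dir.name] = sum(1 for _ in book_dir.glob("*.json"))
    return counts
```

Calling `count_files_per_book("data")` from the project root returns a dictionary keyed by book name. A book with a count of zero, or an empty dictionary, suggests the preparation step did not write to the expected location.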
Each JSON file should have this structure:
{
  "content": "Commentary text...",
  "metadata": {
    "father_name": "Augustine of Hippo",
    "book": "matthew",
    "source_title": "Tractates on the Gospel of Matthew",
    "location_start": "Mt 5:1",
    "location_end": "Mt 5:12"
  },
  "embedding": [0.123, -0.456, ...]
}
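A record can be checked against this structure before starting the API. The sketch below is illustrative only: `validate_record` and `REQUIRED_METADATA_KEYS` are hypothetical names, and treating all five metadata fields as required is an assumption drawn from the example above, not a documented rule of the API.

```python
# Assumption: every metadata field in the example above is required.
REQUIRED_METADATA_KEYS = {
    "father_name",
    "book",
    "source_title",
    "location_start",
    "location_end",
}


def validate_record(record, expected_dim=None):
    """Return a list of problems found in one parsed record (empty if none).

    `record` is the dict obtained by parsing one JSON file.
    `expected_dim`, if given, is the embedding length to enforce.
    """
    if not isinstance(record, dict):
        return ["record must be a JSON object"]

    problems = []
    if not isinstance(record.get("content"), str):
        problems.append("'content' must be a string")

    metadata = record.get("metadata")
    if not isinstance(metadata, dict):
        problems.append("'metadata' must be an object")
    else:
        missing = sorted(REQUIRED_METADATA_KEYS - metadata.keys())
        if missing:
            problems.append("metadata is missing: " + ", ".join(missing))

    embedding = record.get("embedding")
    if not isinstance(embedding, list) or not embedding:
        problems.append("'embedding' must be a non-empty list")
    elif not all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in embedding
    ):
        problems.append("'embedding' must contain only numbers")
    elif expected_dim is not None and len(embedding) != expected_dim:
        problems.append(
            f"'embedding' has length {len(embedding)}, expected {expected_dim}"
        )
    return problems
```

To use it, parse a file with `json.load` and pass the result to `validate_record`. BAAI/bge-large-en-v1.5 should produce 1024-dimensional vectors, so `expected_dim=1024` is a reasonable check for embeddings generated with that model; confirm the value against the model card if you use a different one.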
Testing Locally
Once data is prepared:
uvicorn app:app --reload
Visit http://localhost:8000/docs to test the API.