# Data Setup Guide
The API requires Church Fathers commentary embeddings to be placed in the `data/` directory.
## Option 1: Use Existing Embeddings (if available)
If you have access to pre-generated embeddings from the church-fathers repository:
```bash
python prepare_data.py --source /path/to/church-fathers/commentary_embeddings
```
## Option 2: Generate Embeddings from Database
If you do not have pre-generated embeddings, you can generate them from the SQLite database in the [Historical Christian Faith Commentaries Database](https://github.com/HistoricalChristianFaith/Commentaries-Database):
1. Clone the Commentaries-Database repository:
```bash
git clone https://github.com/HistoricalChristianFaith/Commentaries-Database.git
```
2. Generate embeddings using the utility script from church-fathers:
```bash
# From the church-fathers directory
python util/commentary.py \
-db /path/to/Commentaries-Database/data.sqlite \
-m "BAAI/bge-large-en-v1.5" \
-o /path/to/biblos-cf-api/data
```
This creates JSON files, organized by book, in the output directory passed to `-o` (here, the API's `data/` directory).
## Option 3: Use prepare_data.py with Database
Alternatively, use the `prepare_data.py` script directly:
```bash
python prepare_data.py \
--generate \
--db /path/to/data.sqlite \
--model "BAAI/bge-large-en-v1.5"
```
## Expected Data Structure
After preparation, your `data/` directory should look like:
```
data/
├── matthew/
│ ├── matthew_Augustine_of_Hippo_123.json
│ ├── matthew_Origen_of_Alexandria_456.json
│ └── ...
├── john/
│ ├── john_Augustine_of_Hippo_789.json
│ └── ...
└── ...
```
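To confirm that the layout matches, a quick count of JSON files per book directory can help. The following is a minimal sketch, not part of the repository: `count_files_per_book` is a hypothetical helper, and it assumes only that each book is a subdirectory of `data/` containing `.json` files.

```python
from pathlib import Path


def count_files_per_book(data_dir):
    """Map each book subdirectory name to the number of .json files it holds.

    Hypothetical helper for illustration. Returns an empty dict if
    data_dir does not exist or is not a directory.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return {}

    counts = {}
    for book_dir in sorted(root.iterdir()):
        # Only subdirectories are treated as books; stray files are ignored.
        if book_dir.is_dir():
            counts[book_dir.name] = sum(1 for _ in book_dir.glob("*.json"))
    return counts
```

Calling `count_files_per_book("data")` from the project root returns a dictionary keyed by book name. An empty result suggests the preparation step did not write to the expected location.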
Each JSON file should have this structure:
```json
{
"content": "Commentary text...",
"metadata": {
"father_name": "Augustine of Hippo",
"book": "matthew",
"source_title": "Tractates on the Gospel of Matthew",
"location_start": "Mt 5:1",
"location_end": "Mt 5:12"
},
"embedding": [0.123, -0.456, ...]
}
```
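A small check of individual files against this structure can catch malformed output before the API is started. The sketch below is an assumption-laden illustration rather than the project's own validation: `validate_record` and `validate_file` are hypothetical names, all five metadata keys shown above are treated as required, and the optional `expected_dim` argument is left to the caller (for example, `BAAI/bge-large-en-v1.5` produces 1024-dimensional vectors).

```python
import json

# Assumption: every metadata key shown in the example above is required.
REQUIRED_METADATA_KEYS = {
    "father_name",
    "book",
    "source_title",
    "location_start",
    "location_end",
}


def validate_record(record, expected_dim=None):
    """Return a list of problems found; an empty list means the record looks fine."""
    if not isinstance(record, dict):
        return ["top-level JSON value is not an object"]

    problems = []

    if not isinstance(record.get("content"), str):
        problems.append("'content' is missing or not a string")

    metadata = record.get("metadata")
    if not isinstance(metadata, dict):
        problems.append("'metadata' is missing or not an object")
    else:
        missing = REQUIRED_METADATA_KEYS - metadata.keys()
        if missing:
            problems.append("metadata is missing keys: " + ", ".join(sorted(missing)))

    embedding = record.get("embedding")
    if not isinstance(embedding, list) or not embedding:
        problems.append("'embedding' is missing, empty, or not a list")
    elif not all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in embedding
    ):
        problems.append("'embedding' contains non-numeric values")
    elif expected_dim is not None and len(embedding) != expected_dim:
        problems.append(
            f"'embedding' has {len(embedding)} values, expected {expected_dim}"
        )

    return problems


def validate_file(path, expected_dim=None):
    """Load one JSON file and validate its contents."""
    with open(path, encoding="utf-8") as f:
        return validate_record(json.load(f), expected_dim)
```

For instance, `validate_file("data/matthew/matthew_Augustine_of_Hippo_123.json", expected_dim=1024)` would return an empty list for a well-formed file generated with the model used in this guide.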
## Testing Locally
Once the data is prepared, start the API server:
```bash
uvicorn app:app --reload
```
Visit http://localhost:8000/docs to test the API.
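The same check can be scripted, which is handy in a setup script or CI job. This is a minimal sketch using only the standard library: `docs_reachable` is a hypothetical helper, and the default base URL assumes the server is listening on uvicorn's default port as in the command above.

```python
import urllib.request


def docs_reachable(base_url="http://localhost:8000", timeout=5.0):
    """Return True if GET <base_url>/docs answers with HTTP 200, else False.

    Hypothetical helper for illustration. Connection failures, timeouts,
    and HTTP error statuses are all reported as False.
    """
    url = base_url.rstrip("/") + "/docs"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are subclasses of OSError, as are socket timeouts.
        return False
```

With the server running, `docs_reachable()` should return `True`; a `False` result usually means the server has not started or is listening on a different port.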