---
title: Thoracic Radiology RAG System
emoji: ๐
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 6.1.0
app_file: app.py
pinned: false
license: mit
short_description: 'Ask questions about thoracic radiology and get answers with '
---

## Overview

This repository contains a **Hugging Face Spaces-ready** RAG (Retrieval-Augmented Generation) demo for thoracic radiology Q&A.

- **Default index (prebuilt)**: `ZhangNy/radiology-index-qwen3-embedding-0.6b`
- **Raw public dataset**: `ZhangNy/radiology-dataset`
- **No image rendering in UI**: references link to the original pages, where the images can be viewed.

The Space calls **external APIs** for embeddings, reranking, and the LLM. Their API keys are supplied through Space **Secrets**.

## Run (local)

```bash
cd LangGraphAgent/rebuild_1219
pip install -r requirements.txt

export EMBED_API_KEY="..."
export LLM_API_KEY="..."
# optional:
export RERANK_API_KEY="..."

python app.py --config config/default_config.yaml --host 0.0.0.0 --port 7860
```

Open `http://localhost:7860`.

## Hugging Face Space Secrets

### Required

- **`EMBED_API_KEY`**: embedding API key (OpenAI-compatible)
- **`LLM_API_KEY`**: LLM API key (OpenAI-compatible)

### Recommended

- **`RERANK_API_KEY`**: reranker API key (OpenAI-compatible `/rerank` endpoint)

### Optional (override defaults)

- **`EMBED_API_BASE_URL`**, **`EMBED_MODEL_NAME`**
- **`RERANK_API_BASE_URL`**, **`RERANK_MODEL_NAME`**
- **`LLM_BASE_URL`**, **`LLM_MODEL_NAME`**
- **`RAG_INDEX_REPO_ID`** (default: `ZhangNy/radiology-index-qwen3-embedding-0.6b`)
- **`RAG_STORAGE_DIR`** (default: `/data/radiology_rag` if `/data` exists, else `./storage`)

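The secrets and defaults listed in this section could be resolved along the following lines. This is a minimal sketch, not the code in `app.py`: the names `SpaceSettings`, `load_settings`, and `default_storage_dir` are hypothetical, and only the environment variable names and default values come from this README.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

# Default taken from the README; everything else here is an illustrative assumption.
DEFAULT_INDEX_REPO = "ZhangNy/radiology-index-qwen3-embedding-0.6b"


def default_storage_dir(data_root: Path = Path("/data")) -> str:
    """Prefer <data_root>/radiology_rag when data_root exists, else ./storage."""
    if data_root.is_dir():
        return str(data_root / "radiology_rag")
    return "./storage"


@dataclass(frozen=True)
class SpaceSettings:
    """Hypothetical container for the resolved secrets and overrides."""
    embed_api_key: str
    llm_api_key: str
    rerank_api_key: Optional[str]
    index_repo_id: str
    storage_dir: str


def load_settings(
    env: Mapping[str, str] = os.environ,
    data_root: Path = Path("/data"),
) -> SpaceSettings:
    """Read required keys, then apply the documented defaults to the rest."""
    missing = [k for k in ("EMBED_API_KEY", "LLM_API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required secrets: {', '.join(missing)}")
    return SpaceSettings(
        embed_api_key=env["EMBED_API_KEY"],
        llm_api_key=env["LLM_API_KEY"],
        rerank_api_key=env.get("RERANK_API_KEY") or None,  # reranker is optional
        index_repo_id=env.get("RAG_INDEX_REPO_ID") or DEFAULT_INDEX_REPO,
        storage_dir=env.get("RAG_STORAGE_DIR") or default_storage_dir(data_root),
    )
```

Failing fast on a missing required key gives a clearer error at startup than a failed API call later. Passing `env` and `data_root` as parameters keeps the function easy to test without touching the real environment.
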
## Advanced: rebuild your own index (offline)

Install dev deps:

```bash
pip install -r requirements-dev.txt
```

The scripts in `scripts/` are meant to be run locally and will support:
- Downloading `ZhangNy/radiology-dataset` to `./hf_dataset_prepared`
- Building a new index with a different embedding model
- Publishing that index as a Hugging Face dataset repo

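The download step described in this section could be done with `huggingface_hub.snapshot_download`. This is a rough sketch under stated assumptions: the helper names `dataset_download_kwargs` and `download_dataset` are hypothetical and are not the contents of `scripts/`, and it assumes `huggingface_hub` is installed.

```python
from typing import Any, Dict, Optional


def dataset_download_kwargs(
    local_dir: str = "./hf_dataset_prepared",
    token: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the arguments for snapshot_download (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "repo_id": "ZhangNy/radiology-dataset",
        "repo_type": "dataset",  # the raw data lives in a dataset repo
        "local_dir": local_dir,
    }
    if token:
        kwargs["token"] = token  # only needed for private or gated repos
    return kwargs


def download_dataset(
    local_dir: str = "./hf_dataset_prepared",
    token: Optional[str] = None,
) -> str:
    """Download the dataset snapshot and return the local folder path."""
    # Imported lazily so this module can be loaded without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(**dataset_download_kwargs(local_dir, token))
```

Calling `download_dataset()` with no arguments would place the files in `./hf_dataset_prepared`, matching the target folder named in this section.
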
### Fast path (no rebuild): publish your existing local index

If you already have a built index locally (e.g. `rebuild_1217/storage` contains `chroma_db/` + `doc_store.db`), you can **package it without images** and upload it:

```bash
# Package the existing storage folder (images are left out)
python scripts/package_existing_storage.py \
  --storage /home/zny/codes/radioagent_prepare/LangGraphAgent/rebuild_1217/storage \
  --output-dir ./index_out \
  --overwrite

# Upload the packaged index to the Hugging Face Hub
python scripts/publish_index_to_hf.py \
  --repo ZhangNy/radiology-index-qwen3-embedding-0.6b \
  --folder ./index_out \
  --token "$HF_TOKEN"
```

## Notes

- **Do not commit API keys**. This repo is configured to read them from environment variables / Space Secrets.
- **Index compatibility**: the embedding model used at query time should match the model that built the index for best retrieval quality.

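One way to guard against the index compatibility issue is to store the embedding model name next to the index and compare it at startup. The sketch below is illustrative only: the manifest file `index_manifest.json` and the functions `write_manifest` and `assert_compatible` are hypothetical and are not part of this repo or of the published index.

```python
import json
from pathlib import Path

MANIFEST_NAME = "index_manifest.json"  # hypothetical file name, not in the repo


def write_manifest(storage_dir: str, embed_model: str, embed_dim: int) -> Path:
    """Record which embedding model (and vector size) built the index."""
    path = Path(storage_dir) / MANIFEST_NAME
    payload = {"embed_model": embed_model, "embed_dim": embed_dim}
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


def assert_compatible(storage_dir: str, query_model: str, query_dim: int) -> None:
    """Raise ValueError if the query-time model differs from the index model."""
    raw = (Path(storage_dir) / MANIFEST_NAME).read_text(encoding="utf-8")
    manifest = json.loads(raw)
    problems = []
    if manifest["embed_model"] != query_model:
        problems.append(
            f"index built with {manifest['embed_model']!r}, "
            f"queries use {query_model!r}"
        )
    if manifest["embed_dim"] != query_dim:
        problems.append(
            f"index vectors have {manifest['embed_dim']} dims, "
            f"query vectors have {query_dim}"
        )
    if problems:
        raise ValueError("; ".join(problems))
```

A check like this would run once after the index is downloaded, before the first query, so a mismatched `EMBED_MODEL_NAME` override is reported immediately.
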