---
title: Thoracic Radiology RAG System
emoji: 👀
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 6.1.0
app_file: app.py
pinned: false
license: mit
short_description: 'Ask questions about thoracic radiology and get answers with '
---


## Overview

This repository contains a **Hugging Face Spaces-ready** RAG (Retrieval-Augmented Generation) demo for thoracic radiology Q&A.

- **Default index (prebuilt)**: `ZhangNy/radiology-index-qwen3-embedding-0.6b`
- **Raw public dataset**: `ZhangNy/radiology-dataset`
- **No image rendering in UI**: references link to original pages where images can be viewed.

The Space calls **external APIs** for embeddings, reranking, and the LLM. The API keys are supplied through Space **Secrets** (or environment variables when running locally).

## Run (local)

```bash
cd LangGraphAgent/rebuild_1219
pip install -r requirements.txt

export EMBED_API_KEY="..."
export LLM_API_KEY="..."
# optional:
export RERANK_API_KEY="..."

python app.py --config config/default_config.yaml --host 0.0.0.0 --port 7860
```

Open `http://localhost:7860`.

## Hugging Face Space Secrets

### Required

- **`EMBED_API_KEY`**: embedding API key (OpenAI-compatible)
- **`LLM_API_KEY`**: LLM API key (OpenAI-compatible)

### Recommended

- **`RERANK_API_KEY`**: reranker API key (OpenAI-compatible `/rerank` endpoint)
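
A startup check can catch missing keys before the first request fails. The following is a minimal sketch, not the code used in `app.py`: the function names `missing_secrets` and `check_secrets` are hypothetical, and only the variable names come from the lists above.

```python
import os
from typing import Iterable, List, Mapping

# Secret names as listed in this README.
REQUIRED_SECRETS = ("EMBED_API_KEY", "LLM_API_KEY")
RECOMMENDED_SECRETS = ("RERANK_API_KEY",)


def missing_secrets(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or blank in `env`."""
    return [name for name in names if not env.get(name, "").strip()]


def check_secrets(env: Mapping[str, str] = os.environ) -> List[str]:
    """Raise if a required secret is missing; return missing recommended ones."""
    missing = missing_secrets(REQUIRED_SECRETS, env)
    if missing:
        raise RuntimeError("Missing required secrets: " + ", ".join(missing))
    return missing_secrets(RECOMMENDED_SECRETS, env)
```

Calling `check_secrets()` once at startup raises on a missing required key and returns the recommended keys that are absent, so the caller can log a warning and continue without reranking.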

### Optional (override defaults)

- **`EMBED_API_BASE_URL`**, **`EMBED_MODEL_NAME`**
- **`RERANK_API_BASE_URL`**, **`RERANK_MODEL_NAME`**
- **`LLM_BASE_URL`**, **`LLM_MODEL_NAME`**
- **`RAG_INDEX_REPO_ID`** (default: `ZhangNy/radiology-index-qwen3-embedding-0.6b`)
- **`RAG_STORAGE_DIR`** (default: `/data/radiology_rag` if `/data` exists, else `./storage`)
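
The default rules for `RAG_INDEX_REPO_ID` and `RAG_STORAGE_DIR` can be expressed roughly as follows. This is an illustrative sketch of the documented behaviour, not the actual implementation: the function names and the `data_root` parameter are assumptions introduced here so the logic can be tested without a real `/data` mount.

```python
import os
from pathlib import Path
from typing import Mapping

DEFAULT_INDEX_REPO_ID = "ZhangNy/radiology-index-qwen3-embedding-0.6b"


def resolve_index_repo_id(env: Mapping[str, str] = os.environ) -> str:
    """Use RAG_INDEX_REPO_ID when set, otherwise the prebuilt default index."""
    return env.get("RAG_INDEX_REPO_ID") or DEFAULT_INDEX_REPO_ID


def resolve_storage_dir(
    env: Mapping[str, str] = os.environ,
    data_root: Path = Path("/data"),  # hypothetical parameter, for testability
) -> Path:
    """Use RAG_STORAGE_DIR when set; else /data/radiology_rag if /data exists; else ./storage."""
    override = env.get("RAG_STORAGE_DIR")
    if override:
        return Path(override)
    if data_root.exists():
        return data_root / "radiology_rag"
    return Path("./storage")
```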

## Advanced: rebuild your own index (offline)

Install dev deps:

```bash
pip install -r requirements-dev.txt
```

The `scripts/` folder (to be used locally) will support:
- Downloading `ZhangNy/radiology-dataset` to `./hf_dataset_prepared`
- Building a new index with a different embedding model
- Publishing that index as a Hugging Face dataset repo
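
For the download step, one possible approach is `huggingface_hub.snapshot_download`. The sketch below assumes the raw dataset is hosted as a dataset-type repo and that `huggingface_hub` is installed; the helper names are hypothetical and are not the scripts shipped in `scripts/`.

```python
from typing import Any, Dict


def dataset_download_kwargs(
    repo_id: str = "ZhangNy/radiology-dataset",
    local_dir: str = "./hf_dataset_prepared",
) -> Dict[str, Any]:
    """Build the arguments for snapshot_download (repo_type is an assumption)."""
    return {"repo_id": repo_id, "repo_type": "dataset", "local_dir": local_dir}


def download_dataset(**overrides: str) -> str:
    """Download the dataset snapshot and return the local folder path."""
    # Imported lazily so this module can be loaded without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(**dataset_download_kwargs(**overrides))
```

Calling `download_dataset()` with no arguments fetches the public dataset into `./hf_dataset_prepared`; pass `local_dir="..."` to place it elsewhere.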

### Fast path (no rebuild): publish your existing local index

If you already have a built index locally (e.g. `rebuild_1217/storage` contains `chroma_db/` + `doc_store.db`),
you can **package it without images** and upload it:

```bash
python scripts/package_existing_storage.py \
  --storage /home/zny/codes/radioagent_prepare/LangGraphAgent/rebuild_1217/storage \
  --output-dir ./index_out \
  --overwrite

python scripts/publish_index_to_hf.py \
  --repo ZhangNy/radiology-index-qwen3-embedding-0.6b \
  --folder ./index_out \
  --token "$HF_TOKEN"
```

## Notes

- **Do not commit API keys**. This repo is configured to read them from environment variables / Space Secrets.
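- **Index compatibility**: the query-time embedding model must match the model used to build the index. Vectors from different embedding models are not comparable, so a mismatch degrades retrieval and may fail outright if the vector dimensions differ.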
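
A cheap guard for the index-compatibility note is to compare the dimension of a query embedding with the dimension stored in the index. This is a hypothetical sketch (the function name and the way `index_dim` is obtained are assumptions), and a matching dimension does not prove that the models are the same; it only catches the obvious mismatch.

```python
from typing import Sequence


def check_embedding_dim(query_vector: Sequence[float], index_dim: int) -> None:
    """Raise ValueError if the query embedding cannot belong to this index."""
    if len(query_vector) != index_dim:
        raise ValueError(
            f"Query embedding has {len(query_vector)} dimensions, "
            f"but the index expects {index_dim}. "
            "Check that EMBED_MODEL_NAME matches the model used to build the index."
        )
```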