| title: RadExtract | |
| emoji: ποΈ | |
| colorFrom: blue | |
| colorTo: green | |
| sdk: docker | |
| pinned: false | |
| license: apache-2.0 | |
| header: mini | |
| app_port: 7870 | |
| tags: | |
| - medical | |
| - nlp | |
| - radiology | |
| - langextract | |
| - gemini | |
| - structured-data | |
| # RadExtract: Radiology Report Structuring Demo | |
| [Hugging Face Space](https://huggingface.co/spaces/google/radextract) | |
| [LangExtract on GitHub](https://github.com/google/langextract) | |
| [License: Apache 2.0](https://opensource.org/licenses/Apache-2.0) | |
| A demonstration application powered by [LangExtract](https://github.com/google/langextract) that structures radiology reports using Gemini models. Transform unstructured radiology text into organized, interactive segments with clinical significance annotations. | |
| ## Try the Demo | |
| **[Launch RadExtract Demo](https://huggingface.co/spaces/google/radextract)** | |
| Transform unstructured radiology reports into structured data with highlighted findings that are precisely mapped back to the original source text. | |
| ## Key Features | |
| - **Structured Output**: Organizes reports into anatomical sections with clinical significance | |
| - **Interactive Highlighting**: Click any finding to see its exact source in the original text | |
| - **Clinical Significance**: Annotates findings as normal, minor, or significant | |
| - **Character-Level Mapping**: Precise attribution back to source text | |
| - **Multi-Model Support**: Gemini 2.5 Flash (fast) and Pro (comprehensive) | |
| ## Quick Start | |
| ### Setup | |
| ```bash | |
| git clone https://huggingface.co/spaces/google/radextract | |
| cd radextract | |
| python -m venv venv | |
| source venv/bin/activate | |
| pip install -e ".[dev]" | |
| cp env.list.example env.list | |
| # Edit env.list and set KEY=your_gemini_api_key_here | |
| ``` | |
| ### Local Development | |
| ```bash | |
| source venv/bin/activate | |
| export KEY=your_gemini_api_key_here | |
| python app.py | |
| ``` | |
| Access at: http://localhost:7870 | |
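
If you would rather not export `KEY` by hand for each local run, a small helper can read it from `env.list`. This is a minimal sketch, not part of RadExtract: the names `parse_env_text` and `load_api_key` are hypothetical, and it assumes `env.list` uses the plain `NAME=value` lines with `#` comments that Docker's `--env-file` option accepts.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse NAME=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        name, _, value = line.partition("=")
        values[name.strip()] = value
    return values


def load_api_key(path="env.list", environ=None):
    """Return KEY from the environment, falling back to the env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("KEY")
    env_file = Path(path)
    if not key and env_file.is_file():
        key = parse_env_text(env_file.read_text(encoding="utf-8")).get("KEY")
    if not key:
        raise RuntimeError("KEY is not set; export it or add it to env.list")
    return key
```

An exported `KEY` takes precedence over the file, which matches the `export KEY=...` workflow shown above.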
| ## API Usage | |
| ### Example Request | |
| ```bash | |
| curl -X POST \ | |
| -H 'X-Model-ID: gemini-2.5-flash' \ | |
| -H 'X-Use-Cache: true' \ | |
| -d 'FINDINGS: Normal heart and lungs. IMPRESSION: Normal study.' \ | |
| http://localhost:7870/predict | |
| ``` | |
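
The same request can be sent from Python with only the standard library. The sketch below mirrors the curl call; the function names are illustrative, the base URL assumes the local server from Quick Start, and the `"false"` value for `X-Use-Cache` is an assumption, since only `true` appears in the example above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7870"  # assumes the local server from Quick Start


def build_predict_request(report_text, model_id="gemini-2.5-flash", use_cache=True):
    """Build the same POST request as the curl example, without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/predict",
        data=report_text.encode("utf-8"),
        headers={
            "X-Model-ID": model_id,
            # "false" is an assumed value; the README only shows "true".
            "X-Use-Cache": "true" if use_cache else "false",
        },
        method="POST",
    )


def structure_report(report_text, **kwargs):
    """Send the request and decode the JSON response body."""
    request = build_predict_request(report_text, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `structure_report("FINDINGS: Normal heart and lungs. IMPRESSION: Normal study.")` should return a dictionary in the response format documented in this README.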
| ### Response Format | |
| ```json | |
| { | |
| "segments": [{ | |
| "type": "body", | |
| "label": "Chest", | |
| "content": "Normal heart and lungs", | |
| "intervals": [{"startPos": 10, "endPos": 32}], | |
| "significance": "minor" | |
| }], | |
| "text": "Chest:\n- Normal heart and lungs", | |
| "annotated_document_json": {...} | |
| } | |
| ``` | |
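
The `intervals` field is what ties each segment back to the report. The sketch below shows one way to recover the source span for a segment. It assumes `startPos` is inclusive, `endPos` is exclusive, and both index into the text that was submitted; the example request and response above are consistent with that reading, but any preprocessing applied by the server could shift offsets for other inputs. The helper name `spans_for_segment` is hypothetical.

```python
def spans_for_segment(segment, source_text):
    """Return the source substrings referenced by a segment's intervals."""
    return [
        source_text[interval["startPos"]:interval["endPos"]]
        for interval in segment.get("intervals", [])
    ]


# Text and segment copied from the example request and response above.
source_text = "FINDINGS: Normal heart and lungs. IMPRESSION: Normal study."
segment = {
    "type": "body",
    "label": "Chest",
    "content": "Normal heart and lungs",
    "intervals": [{"startPos": 10, "endPos": 32}],
    "significance": "minor",
}

spans = spans_for_segment(segment, source_text)
print(spans)  # ['Normal heart and lungs']
```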
| ## Architecture | |
| - **Backend**: Flask on Python 3.10+, with type hints checked by mypy | |
| - **NLP Engine**: [LangExtract](https://github.com/google/langextract) for structured extraction | |
| - **AI Models**: Google Gemini 2.5 (Flash/Pro) | |
| - **Frontend**: Vanilla JavaScript with interactive UI | |
| - **Deployment**: Docker + Hugging Face Spaces | |
| - **Package Details**: See [pyproject.toml](https://huggingface.co/spaces/google/radextract/blob/main/pyproject.toml) for dependencies, metadata, and tooling | |
| ## Project Structure | |
| ``` | |
| radextract/ | |
| ├── app.py # Flask API endpoints | |
| ├── structure_report.py # Core structuring logic | |
| ├── sanitize.py # Text preprocessing & normalization | |
| ├── prompt_instruction.py # LangExtract prompt | |
| ├── cache_manager.py # Response caching | |
| ├── static/ # Frontend assets | |
| └── templates/ # HTML templates | |
| ``` | |
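
To illustrate the general idea behind response caching, here is a minimal in-memory sketch. It is not the code in `cache_manager.py`, whose implementation is not described here; the class, the function names, and the choice to key on a SHA-256 hash of the model ID and report text are all assumptions made for illustration.

```python
import hashlib


def cache_key(model_id, report_text):
    """Derive a stable key from the model ID and the exact report text."""
    digest = hashlib.sha256()
    digest.update(model_id.encode("utf-8"))
    digest.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    digest.update(report_text.encode("utf-8"))
    return digest.hexdigest()


class InMemoryResponseCache:
    """Hypothetical cache: stores one result per (model, text) pair."""

    def __init__(self):
        self._entries = {}

    def get_or_compute(self, model_id, report_text, compute, use_cache=True):
        """Return a cached result, or call compute(report_text) and store it."""
        key = cache_key(model_id, report_text)
        if use_cache and key in self._entries:
            return self._entries[key]
        result = compute(report_text)
        self._entries[key] = result
        return result
```

Including the model ID in the key keeps Flash and Pro results separate for the same report, and passing `use_cache=False` forces a fresh call while still refreshing the stored entry.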
| ## Development | |
| ### Setup | |
| ```bash | |
| git clone https://huggingface.co/spaces/google/radextract | |
| cd radextract | |
| python -m venv venv | |
| source venv/bin/activate | |
| pip install -e ".[dev]" | |
| ``` | |
| ### Code Quality | |
| ```bash | |
| # Format code | |
| pyink . && isort . | |
| # Type checking | |
| mypy . --ignore-missing-imports | |
| # Run tests | |
| pytest | |
| ``` | |
| ### Docker | |
| ```bash | |
| # Build and run | |
| docker build -t radextract . | |
| docker run -p 7870:7870 --env-file env.list radextract | |
| ``` | |
| ## License | |
| Apache License 2.0 - see [LICENSE](LICENSE) for details. | |
| ## Related Projects | |
| - **[LangExtract](https://github.com/google/langextract)**: Core NLP library | |
| --- | |
| **Built for the medical AI community** | **Hosted on Hugging Face Spaces** | |
| ## Disclaimer | |
| This is not an officially supported Google product. If you use RadExtract or LangExtract in production or publications, please cite accordingly and acknowledge usage. Use is subject to the [Apache 2.0 License](LICENSE). For health-related applications, use of LangExtract is also subject to the [Health AI Developer Foundations Terms of Use](https://developers.google.com/health-ai-foundations/terms). | |