title: Radiology Report NER API
emoji: 🩺
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
🩺 Radiology Report NER API
Secure, encrypted medical document analysis with Named Entity Recognition and OCR
Features
- End-to-End Encryption using NaCl (XSalsa20-Poly1305)
- NER model with a 99.94% F-score
- PDF & Image Support with intelligent text extraction
- Embedded Image Extraction from medical PDFs
- Entity Detection: ANATOMY & OBSERVATION
- Critical Finding Detection
- Clinical Recommendations
- Gzip Compression (about 25-30% bandwidth savings)
- EasyOCR Integration for scanned documents
Architecture
Client Application
↓ (Compress + Encrypt)
FastAPI Server
↓ (Decrypt + Decompress)
Text Extraction (PyMuPDF/EasyOCR)
↓
spaCy NER Model (99.94% F-score)
↓
Post-Processing & Analysis
↓ (Encrypt)
Structured JSON Response
API Endpoints
POST /analyze-secure
Secure encrypted endpoint for medical document analysis.
Request Format:
{
  "ciphertext": "base64_encrypted_data",
  "nonce": "base64_nonce"
}
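Before attempting decryption, it can help to sanity-check the envelope. The sketch below is not part of this API; `validate_envelope` is a hypothetical helper. It assumes only what this README states (two base64 fields, a 24-byte nonce) plus the 16-byte Poly1305 tag that a SecretBox ciphertext always carries.

```python
import base64
import binascii

NONCE_SIZE = 24  # XSalsa20-Poly1305 nonce length in bytes
MAC_SIZE = 16    # Poly1305 authentication tag length in bytes


def validate_envelope(envelope: dict) -> tuple[bytes, bytes]:
    """Decode both envelope fields and raise ValueError on malformed input."""
    try:
        ciphertext = base64.b64decode(envelope["ciphertext"], validate=True)
        nonce = base64.b64decode(envelope["nonce"], validate=True)
    except KeyError as exc:
        raise ValueError(f"missing field: {exc.args[0]}") from exc
    except (binascii.Error, TypeError) as exc:
        raise ValueError("fields must be valid base64 strings") from exc
    if len(nonce) != NONCE_SIZE:
        raise ValueError(f"nonce must be {NONCE_SIZE} bytes, got {len(nonce)}")
    if len(ciphertext) < MAC_SIZE:
        raise ValueError("ciphertext is shorter than the authentication tag")
    return ciphertext, nonce
```

Rejecting a malformed envelope early gives the caller a clearer error than a generic decryption failure.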
Encrypted Payload Structure:
{
  "filename": "report.pdf",
  "file_data": "base64_encoded_file",
  "file_type": "pdf"
}
Response Format:
{
  "status": "success",
  "ciphertext": "base64_encrypted_response",
  "nonce": "base64_nonce"
}
Decrypted Response Structure:
{
  "status": "success",
  "processing_time": 57.721,
  "filename": "xray_report.pdf",
  "input_type": "pdf",
  "ocr_used": true,
  "ocr_engine": "EasyOCR",
  "raw_text": "Patient report...",
  "text_length": 1022,
  "entities": [
    {
      "text": "lung",
      "label": "ANATOMY",
      "start": 45,
      "end": 49,
      "confidence": 0.998
    }
  ],
  "images": [
    {
      "page": 1,
      "format": "JPEG",
      "width": 800,
      "height": 600,
      "data": "data:image/jpeg;base64,..."
    }
  ],
  "structured_report": {
    "anatomy": ["lung", "heart", "chest"],
    "all_observations": ["clear", "normal"],
    "positive_findings": [],
    "negative_findings": ["clear", "normal"],
    "critical_findings": []
  },
  "summary": {
    "total_entities": 12,
    "anatomy_count": 6,
    "observations_count": 6,
    "has_critical_findings": false,
    "has_abnormalities": false
  },
  "recommendations": [
    "No significant abnormalities detected"
  ]
}
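Once decrypted, the response is ordinary JSON. The following is a minimal sketch of how a client might consume it, using only the field names shown above. The `summarize_entities` helper and its `min_confidence` threshold are illustrative assumptions, not something the API provides.

```python
from collections import defaultdict


def summarize_entities(report: dict, min_confidence: float = 0.5) -> dict:
    """Group entity texts by label and flag reports that need urgent review."""
    by_label = defaultdict(list)
    for ent in report.get("entities", []):
        # Skip low-confidence entities; the threshold is an arbitrary example.
        if ent.get("confidence", 0.0) >= min_confidence:
            by_label[ent["label"]].append(ent["text"])

    critical = report.get("structured_report", {}).get("critical_findings", [])
    flagged = report.get("summary", {}).get("has_critical_findings", False)
    return {
        "by_label": dict(by_label),
        "critical_findings": critical,
        "needs_review": bool(flagged) or bool(critical),
    }
```

For the sample response above, `needs_review` would be `False`, since `critical_findings` is empty and `has_critical_findings` is `false`.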
GET /health
Health check endpoint.
Response:
{
  "status": "healthy",
  "model_loaded": true,
  "model_pipeline": ["tok2vec", "ner"],
  "model_labels": ["ANATOMY", "OBSERVATION"],
  "ocr_engine": "EasyOCR",
  "encryption": "NaCl (XSalsa20-Poly1305)",
  "compression": "gzip",
  "version": "1.0.0"
}
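A client can use this endpoint as a readiness gate before uploading documents. The sketch below assumes the response fields shown above. `is_ready` and `fetch_health` are hypothetical helper names, and the base URL is whatever your deployment uses.

```python
import json
from urllib.request import urlopen

REQUIRED_LABELS = {"ANATOMY", "OBSERVATION"}


def is_ready(health: dict) -> bool:
    """Return True when the service reports a loaded model with both labels."""
    return (
        health.get("status") == "healthy"
        and health.get("model_loaded") is True
        and REQUIRED_LABELS.issubset(health.get("model_labels", []))
    )


def fetch_health(base_url: str, timeout: float = 10.0) -> dict:
    """GET <base_url>/health and parse the JSON body."""
    with urlopen(f"{base_url.rstrip('/')}/health", timeout=timeout) as resp:
        return json.load(resp)
```

A typical call would be `is_ready(fetch_health("https://your-space.hf.space"))`.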
Security
Encryption Details
- Library: NaCl (Networking and Cryptography library), used through PyNaCl's SecretBox
- Cipher: XSalsa20 stream cipher
- Authentication: Poly1305 MAC
- Key Derivation: PBKDF2 with SHA-256
- Nonce: 24 bytes (randomly generated per request)
Compression
- Algorithm: gzip
- Average Savings: 25-30% bandwidth reduction
- Applied: Before encryption on client, after decryption on server
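Actual savings depend on the payload, so it is worth measuring on your own documents. This is a minimal sketch using only the standard library; `gzip_savings` is a hypothetical helper, and the random bytes merely stand in for an already-compressed file such as a JPEG.

```python
import base64
import gzip
import os


def gzip_savings(data: bytes) -> float:
    """Return the fraction of bytes saved by gzip (negative if the output grew)."""
    if not data:
        return 0.0
    return 1.0 - len(gzip.compress(data)) / len(data)


# Random bytes stand in for file content that is already compressed.
sample = base64.b64encode(os.urandom(30_000))
print(f"saved {gzip_savings(sample):.1%}")
```

For content that is already compressed, gzip can win back at most about a quarter of the base64-encoded size, because base64 stores 6 bits of data in every 8-bit character. Text-heavy payloads can compress considerably further.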
Data Flow
- Client compresses payload with gzip
- Client encrypts compressed data with NaCl
- Server decrypts and decompresses
- Server processes medical document
- Server encrypts response
- Client decrypts response
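The compression half of this flow can be sketched without any cryptography. The client example in this README base64-encodes the gzip output before encrypting it, so the sketch assumes the server reverses exactly those steps after decryption. `pack_plaintext` and `unpack_plaintext` are hypothetical names, not the server's actual functions.

```python
import base64
import gzip
import json


def pack_plaintext(payload: dict) -> bytes:
    """Client side: JSON -> gzip -> base64, producing the bytes to encrypt."""
    return base64.b64encode(gzip.compress(json.dumps(payload).encode("utf-8")))


def unpack_plaintext(plaintext: bytes) -> dict:
    """Server side: reverse pack_plaintext once decryption has succeeded."""
    return json.loads(gzip.decompress(base64.b64decode(plaintext)))


payload = {"filename": "report.pdf", "file_data": "AAAA", "file_type": "pdf"}
assert unpack_plaintext(pack_plaintext(payload)) == payload
```

Compressing before encrypting matters because ciphertext is indistinguishable from random data and does not compress.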
Deployment
HuggingFace Spaces
This API is deployed on HuggingFace Spaces using Docker.
Local Development
# Clone repository
git clone <your-repo-url>
cd radiology-ner-api
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Download spaCy model
python -m spacy download en_core_web_sm
# Run server
uvicorn app.main:app --host 0.0.0.0 --port 7860
Environment Variables
ENCRYPTION_KEY=your-secret-encryption-key-min-32-chars
MODEL_PATH=./models/xray_ner_best
HOST=0.0.0.0
PORT=7860
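One way to read these variables at startup is sketched below. It is not the server's actual configuration code: the `Settings` class and `load_settings` function are hypothetical, the defaults are taken from the example values above, and enforcing the 32-character minimum hinted at by the placeholder is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    encryption_key: str
    model_path: str
    host: str
    port: int


def load_settings(env=os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    key = env.get("ENCRYPTION_KEY", "")
    # Assumed rule, based on the "min-32-chars" hint in the example value.
    if len(key) < 32:
        raise ValueError("ENCRYPTION_KEY must be at least 32 characters")
    return Settings(
        encryption_key=key,
        model_path=env.get("MODEL_PATH", "./models/xray_ner_best"),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7860")),
    )
```

Failing fast on a missing or short key avoids starting a server that cannot decrypt any request.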
Model Performance
| Metric | Score |
|---|---|
| F-Score | 99.94% |
| Precision | 99.92% |
| Recall | 99.96% |
| Training Samples | 2,674 reports |
| Entity Types | 2 (ANATOMY, OBSERVATION) |
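As a consistency check on the table, the F-score can be recomputed from precision and recall. This assumes the reported figure is the balanced F1 score (the harmonic mean of the two), which is what spaCy reports for NER by default.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


print(f"{f_score(0.9992, 0.9996):.2%}")  # 99.94%
```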
Training Data
- Dataset: Indiana University Chest X-Ray Collection
- Reports: 2,674 radiology reports
- Annotations: Manual entity labeling
- Framework: spaCy v3.7
- Architecture: HashEmbedCNN
Technology Stack
- Backend: FastAPI 0.104.1
- NER: spaCy 3.7
- OCR: EasyOCR
- PDF Processing: PyMuPDF (fitz)
- Image Processing: OpenCV, Pillow
- Encryption: PyNaCl
- Compression: gzip
- Deployment: Docker, HuggingFace Spaces
Client Implementation Example
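import base64
import gzip
import hashlib
import json
import os

import requests
from nacl.secret import SecretBox
from nacl.utils import random

# Must match the ENCRYPTION_KEY configured on the server
SECRET_KEY = "your-encryption-key"

# ASSUMPTION: this README does not document the PBKDF2 salt or iteration
# count. These are placeholders; replace them with the server's values.
KDF_SALT = b"replace-with-server-salt"
KDF_ITERATIONS = 100_000


def derive_key(secret):
    # PBKDF2-HMAC-SHA256, stretched to the 32-byte key SecretBox requires
    return hashlib.pbkdf2_hmac(
        "sha256", secret.encode(), KDF_SALT, KDF_ITERATIONS,
        dklen=SecretBox.KEY_SIZE,
    )


def encrypt_file(file_path, file_type):
    # Read file
    with open(file_path, "rb") as f:
        file_data = base64.b64encode(f.read()).decode()
    # Create payload
    payload = {
        "filename": os.path.basename(file_path),
        "file_data": file_data,
        "file_type": file_type,
    }
    # Compress
    compressed = gzip.compress(json.dumps(payload).encode())
    compressed_b64 = base64.b64encode(compressed).decode()
    # Encrypt
    box = SecretBox(derive_key(SECRET_KEY))
    nonce = random(SecretBox.NONCE_SIZE)
    encrypted = box.encrypt(compressed_b64.encode(), nonce)
    return {
        # .ciphertext excludes the nonce that encrypt() prepends
        "ciphertext": base64.b64encode(encrypted.ciphertext).decode(),
        "nonce": base64.b64encode(nonce).decode(),
    }


def decrypt_response(body, secret):
    # ASSUMPTION: the decrypted response is plain JSON, since the data flow
    # lists no compression step for responses. Adjust if the server differs.
    box = SecretBox(derive_key(secret))
    plaintext = box.decrypt(
        base64.b64decode(body["ciphertext"]),
        base64.b64decode(body["nonce"]),
    )
    return json.loads(plaintext)


# Send request (OCR can take a while, so allow a generous timeout)
encrypted_payload = encrypt_file("report.pdf", "pdf")
response = requests.post(
    "https://your-space.hf.space/analyze-secure",
    json=encrypted_payload,
    timeout=120,
)
response.raise_for_status()
# Decrypt response
result = decrypt_response(response.json(), SECRET_KEY)
print(result)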
Use Cases
- Clinical Decision Support: Extract structured data from radiology reports
- Medical Record Digitization: OCR for scanned medical documents
- Research Analytics: Automated entity extraction for medical research
- DPDPA Compliance: Secure processing of sensitive medical data (Digital Personal Data Protection Act, 2023)
- Telemedicine: Remote radiology report analysis under Telemedicine Practice Guidelines
- Healthcare AI Research: Supporting AI/ML research in the Indian healthcare sector
License & Compliance
License: Apache License 2.0 - see LICENSE file for details
Indian Regulations Compliance:
- Digital Personal Data Protection Act (DPDPA), 2023: End-to-end encryption ensures data protection
- Information Technology Act, 2000: Secure data transmission and storage
- Telemedicine Practice Guidelines, 2020: Compliant with MCI telemedicine regulations
- Clinical Establishments Act, 2010: Suitable for registered clinical establishments
Note: Users must ensure compliance with:
- State-specific medical data regulations
- National Medical Commission (NMC) guidelines (the NMC replaced the Medical Council of India in 2020)
- National Health Authority (NHA) standards for digital health
- Ayushman Bharat Digital Mission (ABDM) integration requirements
Disclaimer
This API is designed for research, educational, and assistive purposes only. It is compliant with Indian digital health regulations including DPDPA 2023 and Telemedicine Practice Guidelines.
Important Notice:
- This tool does NOT replace professional medical diagnosis or treatment
- All analysis must be reviewed and validated by qualified medical practitioners registered with the National Medical Commission (NMC, formerly the Medical Council of India) or a State Medical Council
- Users must comply with Indian medical data protection laws (DPDPA 2023)
- Healthcare providers must maintain proper patient consent as per Indian regulations
- For clinical use, ensure compliance with Clinical Establishments Act and relevant state laws
Always consult qualified and registered healthcare professionals for medical advice and diagnosis.
Built with ❤️ for the medical AI community