---
title: Innocence Claim API
emoji: ⚖️
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---
# Innocence Claim API ⚖️
A FastAPI-based service that analyzes PDF documents to detect and assess innocence claims using a fine-tuned BERT model. The API processes legal documents, extracts sentences, and evaluates their confidence scores to provide reliability metrics.
## Overview
This API uses natural language processing and machine learning to:
- Extract text from PDF documents
- Analyze sentences for innocence claims
- Calculate reliability scores based on confidence thresholds
- Provide tiered assessments (High/Medium/Low)
The model is built on BERT (Bidirectional Encoder Representations from Transformers) and has been fine-tuned specifically for identifying innocence-related statements in legal documents.
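As a rough sketch of the scoring idea, the reliability percentage can be read as the share of analyzed sentences whose confidence clears the cutoff, mapped to a tier. The aggregation below is illustrative only; the actual aggregation logic and tier boundaries used by the model pipeline are not documented here:

```python
def summarize(confidences: list[float], cutoff: float = 0.7) -> dict:
    """Aggregate per-sentence confidence scores into the API's output shape."""
    if not confidences:
        return {"reliability_percent": 0.0, "tier": "Low"}
    reliable = sum(1 for c in confidences if c >= cutoff)
    percent = round(100.0 * reliable / len(confidences), 1)
    # Hypothetical tier boundaries, chosen only for illustration
    tier = "High" if percent >= 80 else "Medium" if percent >= 50 else "Low"
    return {"reliability_percent": percent, "tier": tier}

print(summarize([0.91, 0.62, 0.78, 0.40]))  # {'reliability_percent': 50.0, 'tier': 'Medium'}
```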
## API Endpoints

### POST /predict
Analyzes a PDF document for innocence claims and returns reliability metrics.
Request:
- Method: `POST`
- Content-Type: `multipart/form-data`
- Parameters:
  - `file` (required): PDF file to analyze
  - `cutoff` (optional): Confidence threshold (default: 0.7, range: 0.0-1.0)
Response:
```json
{
  "reliability_percent": 75.3,
  "tier": "Medium"
}
```
Example using cURL:
```bash
curl -X POST https://[your-space-name].hf.space/predict \
  -F "file=@document.pdf" \
  -F "cutoff=0.7"
```
Example using Python:
```python
import requests

url = "https://[your-space-name].hf.space/predict"
files = {"file": open("document.pdf", "rb")}
data = {"cutoff": 0.7}
response = requests.post(url, files=files, data=data)
print(response.json())
```
### GET /health
Health check endpoint to verify the API is running.
Request:
- Method: `GET`
Response:
```json
{
  "status": "ok"
}
```
Example:
```bash
curl https://[your-space-name].hf.space/health
```
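For parity with the `/predict` examples, the same check from Python using the `requests` library:

```python
import requests

response = requests.get("https://[your-space-name].hf.space/health")
print(response.json())  # expected: {"status": "ok"}
```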
## Deployment to Hugging Face Spaces

### Prerequisites
- Hugging Face account
- Git installed locally
- Docker (for local testing)
### Step-by-Step Deployment

1. **Create a new Space on Hugging Face:**
   - Go to https://huggingface.co/spaces
   - Click "Create new Space"
   - Choose a name for your Space
   - Select "Docker" as the SDK
   - Set visibility (Public or Private)
2. **Clone your Space repository:**

   ```bash
   git clone https://huggingface.co/spaces/[your-username]/[your-space-name]
   cd [your-space-name]
   ```

3. **Copy the application files:**

   ```bash
   # Copy all necessary files to your Space directory
   cp -r app/ [your-space-name]/
   cp -r models/ [your-space-name]/
   cp app.py [your-space-name]/
   cp Dockerfile [your-space-name]/
   cp requirements.txt [your-space-name]/
   cp start.sh [your-space-name]/
   cp README.md [your-space-name]/
   ```

4. **Handle the model file:**
   The `innocence_pipeline.pkl` model file is located in the `models/` directory.

   **Option A: Include model in repository (if size < 10MB)**

   ```bash
   git add models/innocence_pipeline.pkl
   ```

   **Option B: Use Git LFS for large files (if size > 10MB)**

   ```bash
   # Install Git LFS if not already installed
   git lfs install

   # Track the model file with LFS
   git lfs track "models/*.pkl"
   git add .gitattributes
   git add models/innocence_pipeline.pkl
   ```

   **Option C: Upload via Hugging Face UI**
   - Navigate to your Space on huggingface.co
   - Click "Files and versions"
   - Click "Add file" → "Upload files"
   - Upload the `innocence_pipeline.pkl` file to the `models/` directory
5. **Commit and push to Hugging Face:**

   ```bash
   git add .
   git commit -m "Initial deployment of Innocence Claim API"
   git push
   ```

6. **Monitor the build:**
   - Go to your Space page on Hugging Face
   - Watch the build logs in the "Logs" tab
   - The build process will:
     - Build the Docker container
     - Install dependencies
     - Download the spaCy model
     - Start the API server on port 7860
7. **Test your deployed API:**

   ```bash
   # Health check
   curl https://[your-username]-[your-space-name].hf.space/health

   # Prediction
   curl -X POST https://[your-username]-[your-space-name].hf.space/predict \
     -F "file=@test.pdf" \
     -F "cutoff=0.7"
   ```
## Environment Variables and Secrets
If you need to configure environment variables or secrets:
- Go to your Space settings on Hugging Face
- Navigate to "Settings" → "Variables and secrets"
- Add any required variables (currently none are required for basic operation)
## Model Upload Process

The pre-trained model (`innocence_pipeline.pkl`) contains:
- Fine-tuned BERT tokenizer
- Fine-tuned BERT model weights
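If you want to inspect the pipeline locally before uploading it, a minimal sketch is to unpickle it and look at what you get; the internal structure of the object (beyond bundling the tokenizer and model weights) is not documented here, so inspect before assuming any attribute names:

```python
import pickle

# Assumes the file is the real pickle, not a Git LFS pointer (see the
# troubleshooting section below for how to check).
with open("models/innocence_pipeline.pkl", "rb") as f:
    pipeline = pickle.load(f)

print(type(pipeline))  # inspect the object before relying on its attributes
```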
Model size considerations:

- Check the size of `models/innocence_pipeline.pkl`
- If < 10MB: Include directly in the git repository
- If 10MB - 5GB: Use Git LFS (recommended)
- If > 5GB: Consider model compression or hosting externally
## Local Testing
Before deploying to Hugging Face, thoroughly test the application locally to ensure everything works correctly.
### Prerequisites for Local Testing
- Docker installed (version 20.10 or higher recommended)
- At least 2GB free disk space for Docker image
- Sample PDF file for testing
- Terminal/command line access
### Method 1: Testing with Docker (Recommended)
This method replicates the Hugging Face Spaces environment most accurately.
#### Step 1: Build the Docker Image

```bash
docker build -t innocence-api .
```
Expected output:
- You should see multiple steps executing (installing Python packages, downloading spaCy model, etc.)
- Build should complete without errors
- Final message: "Successfully tagged innocence-api:latest"
Build time: 3-5 minutes on first build (depending on internet speed)
#### Step 2: Run the Container

```bash
docker run -p 7860:7860 innocence-api
```
Expected output:
```
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:7860 (Press CTRL+C to quit)
```
Alternative: Run in detached mode (background)
```bash
docker run -d -p 7860:7860 --name innocence-api-test innocence-api
```
To view logs:
```bash
docker logs -f innocence-api-test
```
To stop the container:
```bash
docker stop innocence-api-test
docker rm innocence-api-test
```
#### Step 3: Test the Health Endpoint
Open a new terminal and run:
```bash
curl http://localhost:7860/health
```
Expected response:
{"status":"ok"}
Alternative: Test in browser
- Open http://localhost:7860/health in your web browser
- You should see the JSON response
#### Step 4: Test the Prediction Endpoint
Basic test with default cutoff (0.7):
```bash
curl -X POST http://localhost:7860/predict \
  -F "file=@path/to/your/test.pdf"
```
Test with custom cutoff:
```bash
curl -X POST http://localhost:7860/predict \
  -F "file=@path/to/your/test.pdf" \
  -F "cutoff=0.8"
```
Expected response:
```json
{
  "reliability_percent": 75.3,
  "tier": "Medium"
}
```
Test with verbose output:
```bash
curl -v -X POST http://localhost:7860/predict \
  -F "file=@path/to/your/test.pdf" \
  -F "cutoff=0.7"
```
Test different cutoff values:
```bash
# High confidence threshold
curl -X POST http://localhost:7860/predict \
  -F "file=@test.pdf" \
  -F "cutoff=0.9"

# Low confidence threshold
curl -X POST http://localhost:7860/predict \
  -F "file=@test.pdf" \
  -F "cutoff=0.5"

# Edge cases
curl -X POST http://localhost:7860/predict \
  -F "file=@test.pdf" \
  -F "cutoff=0.0"

curl -X POST http://localhost:7860/predict \
  -F "file=@test.pdf" \
  -F "cutoff=1.0"
```
#### Step 5: Test Error Handling
Test with non-PDF file:
```bash
curl -X POST http://localhost:7860/predict \
  -F "file=@test.txt" \
  -F "cutoff=0.7"
```
Expected response:
```json
{
  "detail": "PDF required"
}
```
Test with missing file:
```bash
curl -X POST http://localhost:7860/predict \
  -F "cutoff=0.7"
```
Test with invalid cutoff:
```bash
curl -X POST http://localhost:7860/predict \
  -F "file=@test.pdf" \
  -F "cutoff=invalid"
```

Both requests should fail validation; FastAPI typically returns a 422 Unprocessable Entity response when a required file field is missing or when a float form field receives a non-numeric value.
### Method 2: Testing with Python Directly
This method is faster for development but doesn't test the Docker configuration.
#### Step 1: Set Up Python Environment
```bash
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Download spaCy model
python -m spacy download en_core_web_sm
```
#### Step 2: Verify Model File Exists

```bash
ls -lh models/innocence_pipeline.pkl
```
You should see the model file with its size. If missing, the application will fail to start.
#### Step 3: Run the Application

```bash
python app.py
```
Expected output:
```
INFO: Started server process [12345]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:7860 (Press CTRL+C to quit)
```
#### Step 4: Test Endpoints
Use the same curl commands as in Method 1, Step 3 and Step 4.
### Method 3: Testing with Python Requests Library

Create a test script `test_api.py`:
```python
import requests
import sys


def test_health():
    """Test the health endpoint"""
    print("Testing /health endpoint...")
    response = requests.get("http://localhost:7860/health")
    print(f"Status Code: {response.status_code}")
    print(f"Response: {response.json()}")
    assert response.status_code == 200
    assert response.json()["status"] == "ok"
    print("✓ Health check passed\n")


def test_predict(pdf_path, cutoff=0.7):
    """Test the predict endpoint"""
    print(f"Testing /predict endpoint with {pdf_path}...")
    with open(pdf_path, "rb") as f:
        files = {"file": f}
        data = {"cutoff": cutoff}
        response = requests.post("http://localhost:7860/predict", files=files, data=data)
    print(f"Status Code: {response.status_code}")
    print(f"Response: {response.json()}")
    if response.status_code == 200:
        result = response.json()
        assert "reliability_percent" in result
        assert "tier" in result
        assert result["tier"] in ["High", "Medium", "Low"]
        print("✓ Prediction test passed\n")
    else:
        print("✗ Prediction test failed\n")
        return False
    return True


def test_error_handling():
    """Test error handling with invalid input"""
    print("Testing error handling...")
    # Test with missing file
    response = requests.post("http://localhost:7860/predict", data={"cutoff": 0.7})
    print(f"Missing file - Status Code: {response.status_code}")
    assert response.status_code == 422  # Unprocessable Entity
    print("✓ Missing file handling passed\n")


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python test_api.py <path_to_test_pdf>")
        sys.exit(1)
    pdf_path = sys.argv[1]
    try:
        test_health()
        test_predict(pdf_path)
        test_predict(pdf_path, cutoff=0.9)
        test_error_handling()
        print("All tests passed! ✓")
    except Exception as e:
        print(f"Test failed: {e}")
        sys.exit(1)
```
Run the test script:
```bash
python test_api.py path/to/test.pdf
```
## Performance Testing

### Test Response Times
```bash
# Test cold start (first request)
time curl -X POST http://localhost:7860/predict \
  -F "file=@test.pdf" \
  -F "cutoff=0.7"

# Test warm request (subsequent requests)
time curl -X POST http://localhost:7860/predict \
  -F "file=@test.pdf" \
  -F "cutoff=0.7"
```
### Test with Different PDF Sizes

```bash
# Small PDF (1-5 pages)
time curl -X POST http://localhost:7860/predict -F "file=@small.pdf"

# Medium PDF (10-20 pages)
time curl -X POST http://localhost:7860/predict -F "file=@medium.pdf"

# Large PDF (50+ pages)
time curl -X POST http://localhost:7860/predict -F "file=@large.pdf"
```
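For more repeatable numbers than `time` gives, a small Python sketch can separate cold from warm latency (the URL and file path are placeholders):

```python
import time
import requests

URL = "http://localhost:7860/predict"

def time_request(pdf_path: str, cutoff: float = 0.7) -> float:
    """Send one prediction request and return its wall-clock latency in seconds."""
    with open(pdf_path, "rb") as f:
        start = time.perf_counter()
        requests.post(URL, files={"file": f}, data={"cutoff": cutoff})
    return time.perf_counter() - start

# The first call may include cold-start cost; later calls measure warm latency.
latencies = [time_request("test.pdf") for _ in range(5)]
print(f"first: {latencies[0]:.2f}s, warm avg: {sum(latencies[1:]) / len(latencies[1:]):.2f}s")
```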
## Troubleshooting Local Testing Issues

### Issue: Docker build fails with "No space left on device"
Symptoms:
```
ERROR: failed to solve: write /var/lib/docker/...: no space left on device
```
Solutions:

Clean up Docker resources:

```bash
docker system prune -a
docker volume prune
```

Check available disk space:

```bash
df -h
```

Remove unused Docker images:

```bash
docker images
docker rmi <image-id>
```
### Issue: Port 7860 already in use
Symptoms:
```
Error: bind: address already in use
```
Solutions:

Find and stop the process using port 7860:

```bash
# On macOS/Linux
lsof -i :7860
kill -9 <PID>

# On Windows
netstat -ano | findstr :7860
taskkill /PID <PID> /F
```

Use a different port:

```bash
docker run -p 8080:7860 innocence-api
# Then test with:
curl http://localhost:8080/health
```
### Issue: Model file not found
Symptoms:
```
FileNotFoundError: [Errno 2] No such file or directory: 'models/innocence_pipeline.pkl'
```
Solutions:

Verify the model file exists:

```bash
ls -la models/
```

Check if Git LFS is needed:

```bash
file models/innocence_pipeline.pkl
# If it shows "ASCII text", it's an LFS pointer, not the actual file
```

Pull LFS files:

```bash
git lfs pull
```

Rebuild the Docker image after ensuring the model is present:

```bash
docker build --no-cache -t innocence-api .
```
### Issue: spaCy model download fails
Symptoms:
```
OSError: [E050] Can't find model 'en_core_web_sm'
```
Solutions:

Manually download the model:

```bash
python -m spacy download en_core_web_sm
```

Check internet connectivity during the Docker build.

Use a mirror or download the model separately:

```bash
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.0/en_core_web_sm-3.7.0-py3-none-any.whl
```
### Issue: Container starts but API doesn't respond
Symptoms:
- Container is running but curl requests timeout or fail
- No error messages in logs
Solutions:

Check container logs:

```bash
docker logs innocence-api-test
```

Verify the container is running:

```bash
docker ps
```

Check if the application is listening on the correct port:

```bash
docker exec innocence-api-test netstat -tuln | grep 7860
```

Test from inside the container:

```bash
docker exec innocence-api-test curl http://localhost:7860/health
```

Restart the container:

```bash
docker restart innocence-api-test
```
### Issue: PDF processing fails or returns errors
Symptoms:
{"detail": "Internal server error"}
Solutions:

Check if the PDF is valid:

```bash
file test.pdf
# Should show: "PDF document, version X.X"
```

Try with a different PDF file.

Check container logs for the detailed error:

```bash
docker logs innocence-api-test
```

Test with a simple PDF:

- Create a test PDF with just a few sentences
- Verify it processes successfully

Check PDF text extraction:

```python
import pdfplumber

with pdfplumber.open("test.pdf") as pdf:
    for page in pdf.pages:
        print(page.extract_text())
```
### Issue: Slow prediction times
Symptoms:
- Requests take more than 30 seconds
- Timeout errors
Solutions:

Check PDF size and page count:

```bash
pdfinfo test.pdf  # If pdfinfo is installed
```

Test with smaller PDFs first.

Monitor resource usage:

```bash
docker stats innocence-api-test
```

Increase Docker memory allocation (Docker Desktop settings).

Consider using GPU acceleration for faster inference (requires CUDA setup).
### Issue: CORS errors when testing from browser
Symptoms:
```
Access to fetch at 'http://localhost:7860/predict' from origin 'http://localhost:3000' has been blocked by CORS policy
```
Solutions:

- Check the CORS configuration in `app/main_bert.py`
- Verify the allowed origins include your test origin
- Use curl or Postman instead of a browser for testing
- Add your origin to the CORS middleware temporarily for testing, as sketched below
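For reference, adding an origin to FastAPI's CORS middleware usually looks like the sketch below; whether `app/main_bert.py` is structured exactly this way is an assumption:

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Temporarily allow a local frontend origin while testing;
# tighten or remove this before deploying.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],
    allow_methods=["*"],
    allow_headers=["*"],
)
```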
### Issue: Dependencies fail to install
Symptoms:
```
ERROR: Could not find a version that satisfies the requirement torch==X.X.X
```
Solutions:

Check the Python version:

```bash
python --version  # Should be 3.11 or compatible with requirements.txt
```

Update pip:

```bash
pip install --upgrade pip
```

Install dependencies one by one to identify the problematic package:

```bash
pip install torch
pip install transformers
# etc.
```

Check for platform-specific issues (especially with PyTorch on different operating systems).
## Validation Checklist
Before deploying to Hugging Face, ensure:
- Docker image builds successfully without errors
- Container starts and shows "Uvicorn running" message
- `/health` endpoint returns `{"status":"ok"}`
- `/predict` endpoint accepts a PDF and returns valid JSON
- Different cutoff values (0.5, 0.7, 0.9) work correctly
- Error handling works (non-PDF file returns 400 error)
- Model loads correctly (check logs for model loading messages)
- Response times are reasonable (< 30 seconds for small PDFs)
- No error messages in Docker logs during normal operation
- Container can be stopped and restarted without issues
## Next Steps After Successful Local Testing
Once all local tests pass:
- Commit your changes to git
- Push to Hugging Face Spaces repository
- Monitor the build logs on Hugging Face
- Test the deployed API using the Hugging Face Space URL
- Compare local and deployed behavior to ensure consistency
## Technical Details

### Dependencies
- FastAPI: Web framework for building the API
- PyTorch: Deep learning framework for running the BERT model
- Transformers: Hugging Face library for BERT tokenizer and model
- pdfplumber: PDF text extraction
- spaCy: Natural language processing for sentence segmentation
- uvicorn: ASGI server for running FastAPI
### Model Architecture
- Base model: BERT (Bidirectional Encoder Representations from Transformers)
- Task: Binary classification (innocence claim detection)
- Input: Text sentences (max 128 tokens)
- Output: Confidence score (0.0 - 1.0)
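Scoring a single sentence with a standard transformers sequence-classification head would typically look like the sketch below; the assumption that index 1 is the positive ("innocence claim") class is not confirmed by this README:

```python
import torch

def score_sentence(sentence: str, tokenizer, model) -> float:
    """Return the model's confidence that a sentence is an innocence claim."""
    inputs = tokenizer(
        sentence,
        truncation=True,
        max_length=128,  # matches the documented 128-token input limit
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumes class index 1 is the "innocence claim" label
    return torch.softmax(logits, dim=-1)[0, 1].item()
```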
### Processing Pipeline

1. Extract text from PDF pages using pdfplumber
2. Segment text into sentences using spaCy
3. Filter sentences (length between 10-500 characters)
4. Tokenize and encode sentences using the BERT tokenizer
5. Run inference with the fine-tuned BERT model
6. Calculate confidence scores and aggregate metrics
7. Return the reliability percentage and tier classification
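A condensed sketch of these steps, reusing the hypothetical `score_sentence` and `summarize` helpers from earlier sections:

```python
import pdfplumber
import spacy

nlp = spacy.load("en_core_web_sm")

def analyze_pdf(path: str, tokenizer, model, cutoff: float = 0.7) -> dict:
    # 1. Extract text from every PDF page
    with pdfplumber.open(path) as pdf:
        text = "\n".join(page.extract_text() or "" for page in pdf.pages)

    # 2-3. Segment into sentences and filter by length (10-500 characters)
    sentences = [
        s.text.strip() for s in nlp(text).sents
        if 10 <= len(s.text.strip()) <= 500
    ]

    # 4-7. Score each sentence and aggregate into the response shape
    confidences = [score_sentence(s, tokenizer, model) for s in sentences]
    return summarize(confidences, cutoff)
```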
## Limitations and Considerations
- Cold starts: First request after inactivity may take 30-60 seconds as the model loads
- Processing time: Large PDFs (>50 pages) may take several minutes to process
- Memory usage: The BERT model requires ~500MB RAM minimum
- Concurrent requests: Free tier Spaces have limited concurrency (1-2 requests)
- Timeout: Very large documents may time out on the free tier (consider upgrading to a paid tier)
- GPU support: Currently configured for CPU inference (GPU can be enabled in paid tiers)
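Given the cold-start and timeout caveats above, a calling client may want a generous timeout and a simple retry; a minimal sketch:

```python
import requests

def predict_with_retry(url: str, pdf_path: str, retries: int = 2) -> dict:
    """Call /predict with a timeout sized for cold starts, retrying on failure."""
    last_error = None
    for _ in range(retries + 1):
        try:
            with open(pdf_path, "rb") as f:
                response = requests.post(
                    url,
                    files={"file": f},
                    data={"cutoff": 0.7},
                    timeout=120,  # generous: cold starts can take 30-60 seconds
                )
            response.raise_for_status()
            return response.json()
        except requests.RequestException as err:
            last_error = err
    raise last_error
```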
## Troubleshooting

### Build fails with "No space left on device"
- The Docker image may be too large
- Consider removing unnecessary files or using a smaller base image
### Model fails to load
- Verify `models/innocence_pipeline.pkl` exists and is not corrupted
- Check that Git LFS properly tracked and uploaded the file
- Review build logs for pickle/torch loading errors
### API returns 400 "PDF required"

- Ensure you're sending `Content-Type: multipart/form-data`
- Verify the file parameter is named `file`
- Check that the uploaded file is a valid PDF
### Slow response times
- First request (cold start) is always slower
- Consider upgrading to a paid tier for better performance
- Optimize the model or use quantization for faster inference
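As one concrete instance of the quantization suggestion, PyTorch's dynamic quantization converts the model's Linear layers to int8 for CPU inference; whether accuracy stays acceptable for this classifier would need validation:

```python
import torch

def quantize_for_cpu(model: torch.nn.Module) -> torch.nn.Module:
    """Convert the classifier's Linear layers to int8 for faster CPU inference."""
    return torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )
```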
## Support and Contributing
For issues, questions, or contributions, please visit the Space's discussion page or repository.
## License
[Add your license information here]