FinEE Production Readiness Report
Current Status vs Production Target
| Aspect | Current Status | Target | Gap |
|---|---|---|---|
| Training Data | 137,267 samples | 50,000+ | ✅ EXCEEDED |
| Bank Coverage | 8 banks | 15+ banks | ⚠️ Need 7 more |
| Document Types | Email, SMS | Email, PDF, SMS, Images | ⚠️ PDF/Image parsers added, testing pending |
| Evaluation | F1=56.8% (regex) | F1 > 95% | ❌ LLM fine-tuning needed |
| Deployment | Mac (MLX) | Cloud + Mobile + Edge | ⚠️ Export scripts ready, deployment pending |
| Users | 0 external | 10+ active | ❌ Beta testing needed |
Benchmark Results
Regex Extractor (Baseline)
| Field | Accuracy | Status |
|---|---|---|
| Amount | 85.8% | ✅ Good |
| Type | 65.0% | ⚠️ Needs improvement |
| Bank | 100% | ✅ Excellent |
| Merchant | 28.3% | ❌ LLM needed |
| Category | 15.8% | ❌ LLM needed |
| Overall | 56.8% | ❌ Below target |
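Per-field figures like the ones above can be computed with an exact-match comparison between predicted and reference fields. The following is a minimal sketch, not the logic in `scripts/benchmark.py`: the field names, the `normalize` rule, the `field_accuracy` helper, and the sample records are all illustrative assumptions.

```python
from collections import defaultdict

FIELDS = ("amount", "type", "bank", "merchant", "category")  # assumed field names


def normalize(value):
    """Compare values case-insensitively and ignore surrounding whitespace."""
    return str(value).strip().lower() if value is not None else None


def field_accuracy(predictions, references, fields=FIELDS):
    """Return {field: fraction of samples where prediction matches reference}."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    correct = defaultdict(int)
    for pred, gold in zip(predictions, references):
        for field in fields:
            if normalize(pred.get(field)) == normalize(gold.get(field)):
                correct[field] += 1
    total = len(references)
    return {field: (correct[field] / total if total else 0.0) for field in fields}


if __name__ == "__main__":
    # Made-up records, only to show the call shape.
    gold = [
        {"amount": "500.00", "type": "debit", "bank": "HDFC", "merchant": "Swiggy", "category": "food"},
        {"amount": "1200.00", "type": "credit", "bank": "SBI", "merchant": "Employer", "category": "salary"},
    ]
    pred = [
        {"amount": "500.00", "type": "debit", "bank": "hdfc", "merchant": None, "category": None},
        {"amount": "1200.00", "type": "debit", "bank": "SBI", "merchant": "Employer", "category": None},
    ]
    for field, acc in field_accuracy(pred, gold).items():
        print(f"{field:<10} {acc:.1%}")
```

Exact match is a strict criterion for free-text fields such as merchant; a fuzzy or alias-aware comparison would likely score those fields differently.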
Expected with LLM Fine-tuning
| Field | With LLM | Target |
|---|---|---|
| Amount | ~98% | 99% |
| Type | ~97% | 98% |
| Bank | 100% | 100% |
| Merchant | ~92% | 95% |
| Category | ~88% | 90% |
| Overall | ~95% | 95% |
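The expected gain per field is the difference, in percentage points, between the baseline table and the estimates above. The short sketch below does that arithmetic using only figures from this report; the "expected" values are the approximate (~) estimates, and the `gaps` helper is an illustrative name.

```python
# Figures copied from the tables in this report (percent).
baseline = {"Amount": 85.8, "Type": 65.0, "Bank": 100.0, "Merchant": 28.3, "Category": 15.8, "Overall": 56.8}
expected = {"Amount": 98.0, "Type": 97.0, "Bank": 100.0, "Merchant": 92.0, "Category": 88.0, "Overall": 95.0}
target = {"Amount": 99.0, "Type": 98.0, "Bank": 100.0, "Merchant": 95.0, "Category": 90.0, "Overall": 95.0}


def gaps(current, goal):
    """Percentage-point difference (goal - current), rounded to one decimal."""
    return {field: round(goal[field] - current[field], 1) for field in current}


expected_gain = gaps(baseline, expected)  # improvement fine-tuning is expected to deliver
remaining_gap = gaps(expected, target)    # shortfall that would remain against the target

for field in baseline:
    print(f"{field:<9} gain {expected_gain[field]:+5.1f} pts, remaining gap {remaining_gap[field]:.1f} pts")
```

On these figures the largest expected gains are in Merchant and Category, the two fields where the regex baseline is weakest.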
Priority Actions
High Priority (This Week)
Fine-tune LLM on 137K dataset
- Use `scripts/finetune.py` with MLX or PyTorch
- Target: Phi-3 or Llama 3.1 8B
- Expected improvement: roughly +40 percentage points of F1 (56.8% to ~95%)
Add remaining banks
- BOB, Canara, Union, IDBI, Federal, South Indian, Karur Vysya
- Update `scripts/data_pipeline/generate_synthetic.py`
Test PDF parsing
- Collect sample bank statements
- Test with `src/finee/pdf_parser.py`
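Extending synthetic data to the seven missing banks could look roughly like the sketch below. It is written under assumptions: the message templates, merchant names, label schema, and the `generate_samples` function are hypothetical and do not reflect the actual contents of `scripts/data_pipeline/generate_synthetic.py`.

```python
import json
import random

# Banks listed above as still missing from coverage.
NEW_BANKS = ["BOB", "Canara", "Union", "IDBI", "Federal", "South Indian", "Karur Vysya"]

# Hypothetical message templates; real bank alerts vary in wording.
TEMPLATES = {
    "debit": "Rs.{amount} debited from your {bank} a/c XX{last4} to {merchant}.",
    "credit": "Rs.{amount} credited to your {bank} a/c XX{last4} from {merchant}.",
}
MERCHANTS = ["Swiggy", "Amazon", "IRCTC"]  # placeholder merchant names


def generate_samples(n_per_bank, seed=0):
    """Yield synthetic {"text", "labels"} records, n_per_bank for each new bank."""
    rng = random.Random(seed)  # seeded so the generated data is reproducible
    for bank in NEW_BANKS:
        for _ in range(n_per_bank):
            txn_type = rng.choice(sorted(TEMPLATES))
            amount = f"{rng.randint(10, 50000)}.{rng.randint(0, 99):02d}"
            merchant = rng.choice(MERCHANTS)
            text = TEMPLATES[txn_type].format(
                amount=amount,
                bank=bank,
                last4=f"{rng.randint(0, 9999):04d}",
                merchant=merchant,
            )
            yield {
                "text": text,
                "labels": {"amount": amount, "type": txn_type, "bank": bank, "merchant": merchant},
            }


if __name__ == "__main__":
    for record in generate_samples(n_per_bank=1):
        print(json.dumps(record))  # one JSON object per line (JSONL)
```

Templates modelled on real alerts from each bank would likely matter more than volume here, since the baseline already handles the bank field well and struggles with merchant and category.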
Medium Priority (This Month)
Export to ONNX
- Run `scripts/export_model.py --format onnx`
- Test inference speed
Deploy to HF Inference
- Push model to Hugging Face
- Enable Inference API
Get beta users
- Share demo: https://huggingface.co/spaces/Ranjit0034/finee-demo
- Collect feedback
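For the "Test inference speed" step under the ONNX export, a backend-agnostic timing harness is one way to compare the MLX model with an exported one. This is a minimal sketch: `measure_latency` and `dummy_predict` are hypothetical names, and `predict` stands in for whatever inference call the exported model exposes.

```python
import statistics
import time


def measure_latency(predict, inputs, warmup=3, repeats=20):
    """Time predict(x) over inputs and return latency statistics in milliseconds."""
    if not inputs:
        raise ValueError("inputs must not be empty")
    for x in inputs[:warmup]:
        predict(x)  # warm-up calls are not timed (caches, lazy initialisation)
    timings_ms = []
    for _ in range(repeats):
        for x in inputs:
            start = time.perf_counter()
            predict(x)
            timings_ms.append((time.perf_counter() - start) * 1000.0)
    timings_ms.sort()
    p95_index = min(len(timings_ms) - 1, int(0.95 * len(timings_ms)))
    return {
        "n": len(timings_ms),
        "mean_ms": statistics.fmean(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "p95_ms": timings_ms[p95_index],
    }


if __name__ == "__main__":
    def dummy_predict(text):
        # Placeholder standing in for the exported model's inference call.
        return {"length": len(text)}

    samples = ["Rs.500.00 debited from a/c XX1234 to Swiggy."] * 10
    stats = measure_latency(dummy_predict, samples)
    print({key: round(value, 4) for key, value in stats.items()})
```

Reporting median and 95th percentile alongside the mean helps when a few slow calls would otherwise skew the comparison.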
Low Priority (Next Month)
- Mobile deployment (GGUF)
- Multi-turn agent
- Knowledge graph integration
Files Added
| File | Description |
|---|---|
| `src/finee/rag.py` | RAG engine with 50+ merchants |
| `src/finee/api.py` | FastAPI backend (8 endpoints) |
| `src/finee/ui.py` | Gradio web interface |
| `src/finee/pdf_parser.py` | PDF/Image parsing |
| `scripts/benchmark.py` | Production benchmark suite |
| `scripts/export_model.py` | ONNX/GGUF/CoreML export |
| `tests/test_rag.py` | 33 comprehensive tests |
Commands
```bash
# Run benchmark
python scripts/benchmark.py --test-file data/instruction/test.jsonl --max-samples 1000

# Fine-tune LLM
python scripts/finetune.py --backend mlx --model microsoft/phi-3-mini-4k-instruct

# Export to ONNX
python scripts/export_model.py models/finetuned --format onnx

# Start API server
python -m finee.api --port 8000

# Launch Gradio UI
python -m finee.ui --port 7860
```
Links
- Demo: https://huggingface.co/spaces/Ranjit0034/finee-demo
- Dataset: https://huggingface.co/datasets/Ranjit0034/finee-dataset
- Model: https://huggingface.co/Ranjit0034/finee-phi3-4b
- Code: https://huggingface.co/Ranjit0034/finance-entity-extractor
Last updated: 2026-01-14