
Architecture

Deployment

GitHub (source code)
    │
    └─► Hugging Face Spaces (Docker runtime)
            │  builds and runs the FastAPI container
            │
            ├─► on startup: downloads model from HF Hub
            │       huggingface.co/cmeneses99/sms-classifier
            │       (model.safetensors, tokenizer, config; ~520 MB)
            │
            └─► serves API on port 7860
                    https://cmeneses99-sms-classifier-api.hf.space
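
The startup step in the diagram could look roughly like the sketch below. It assumes the `huggingface_hub` and `transformers` packages are installed. The names `load_classifier` and `MODEL_REPO` are illustrative and are not taken from `core/model_loader.py`, whose actual implementation may differ.

```python
from typing import Any

# Repository id taken from the deployment diagram.
MODEL_REPO = "cmeneses99/sms-classifier"


def load_classifier(repo_id: str = MODEL_REPO, top_k: int = 3) -> Any:
    """Download the model snapshot and build a CPU text-classification pipeline.

    Hypothetical helper, shown for illustration only.
    """
    # Imported lazily so that importing this module stays cheap.
    from huggingface_hub import snapshot_download
    from transformers import pipeline

    # Fetches weights, tokenizer and config into the local HF cache
    # and returns the path of the downloaded snapshot.
    local_dir = snapshot_download(repo_id=repo_id)

    # device=-1 selects the CPU; top_k=3 returns the three best labels.
    return pipeline("text-classification", model=local_dir, top_k=top_k, device=-1)
```

Calling such a helper once during application startup, rather than per request, keeps the roughly 520 MB download and model initialization out of the request path.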

cron-job.org ──GET /health every 10 min──► HF Spaces (keep-alive)
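
For reference, the keep-alive amounts to a plain GET request against `/health`. The sketch below reproduces it with the Python standard library, for example to check the Space manually. The function names `health_url` and `ping`, and the injectable `fetch` parameter, are assumptions made for this example; the real keep-alive is configured on cron-job.org, not in this repository.

```python
import urllib.request
from typing import Callable

# Public Space URL from the deployment diagram.
BASE_URL = "https://cmeneses99-sms-classifier-api.hf.space"


def health_url(base_url: str = BASE_URL) -> str:
    """Build the URL of the health endpoint."""
    return base_url.rstrip("/") + "/health"


def _http_status(url: str, timeout: float) -> int:
    """Perform a real GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def ping(base_url: str = BASE_URL, timeout: float = 10.0,
         fetch: Callable[[str, float], int] = _http_status) -> bool:
    """Return True if GET /health answers with HTTP 200."""
    try:
        return fetch(health_url(base_url), timeout) == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

Passing a stub as `fetch` makes the function testable without network access; calling `ping()` with the defaults performs the real request.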

Request flow

Client
  │
  ▼
FastAPI
  │
  ├── web/pages.py      → HTML responses (/, /classify, /classify/batch, /categories)
  ├── api/inference.py  → POST /classify, POST /classify/batch
  └── api/meta.py       → GET /health, GET /api/categories
        │
        ▼
services/classifier.py
  │
  ├── LRU Cache (core/cache.py) ──hit──► return cached response
  │
  └── miss ──► core/model_loader.py (HuggingFace pipeline)
                    └── distilbert-base-multilingual-cased (fine-tuned)
                            └── top_k=3 predictions → PredictResponse
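
The hit/miss branch and the top-3 selection can be sketched without the model, as below. This is a minimal illustration, not the code in `services/classifier.py`: the `score` callable stands in for the Hugging Face pipeline (which applies the softmax itself), a plain dict stands in for the LRU cache, and the `strip().lower()` cache key is an assumed normalization.

```python
import math
from typing import Callable, Dict, List, Tuple

Prediction = Tuple[str, float]  # (label, probability)


def top_k_softmax(logits: Dict[str, float], k: int = 3) -> List[Prediction]:
    """Turn raw scores into probabilities and keep the k most likely labels."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    ranked = sorted(exps.items(), key=lambda item: item[1], reverse=True)
    return [(label, value / total) for label, value in ranked[:k]]


def classify(text: str,
             score: Callable[[str], Dict[str, float]],
             cache: Dict[str, List[Prediction]]) -> List[Prediction]:
    """Return cached predictions on a hit; otherwise score the text and store the result."""
    key = text.strip().lower()  # assumed normalization, for illustration only
    if key in cache:
        return cache[key]  # cache hit: the model is not called
    result = top_k_softmax(score(key))
    cache[key] = result  # cache miss: remember the result for next time
    return result
```

With this shape, two requests whose text normalizes to the same key trigger only one model call.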

Model

| Detail | Value |
|---|---|
| Base model | distilbert-base-multilingual-cased |
| Task | Sequence classification |
| Categories | 9 |
| Training data | 3,150 synthetic examples (350/category, ES + EN) |
| Training | 5 epochs, fine-tuned with HuggingFace Trainer API |
| Runtime | CPU-only (PyTorch CPU build) |
| Cache | LRU, max 512 entries, thread-safe |
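
A thread-safe LRU cache with a 512-entry limit can be built from `collections.OrderedDict` and a lock. The sketch below shows one common way to do it; the class name `LRUCache` and its `get`/`put` methods are assumptions for this example and may not match `core/cache.py`.

```python
import threading
from collections import OrderedDict
from typing import Generic, Hashable, Optional, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class LRUCache(Generic[K, V]):
    """Least-recently-used cache guarded by a single lock."""

    def __init__(self, max_size: int = 512) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._data: "OrderedDict[K, V]" = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key: K) -> Optional[V]:
        """Return the cached value, or None on a miss."""
        with self._lock:
            if key not in self._data:
                return None
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]

    def put(self, key: K, value: V) -> None:
        """Insert or update a value, evicting the oldest entry when full."""
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            if len(self._data) > self._max_size:
                self._data.popitem(last=False)  # evict least recently used

    def __len__(self) -> int:
        with self._lock:
            return len(self._data)
```

Because `get` also reorders entries, reads take the same lock as writes; a single lock is usually sufficient at this cache size.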

Project structure

app/
├── main.py                      # Lifespan + router registration
├── utils.py                     # normalize(), read_static()
├── core/                        # Shared infrastructure
│   ├── cache.py                 # Thread-safe LRU cache
│   ├── model_loader.py          # Downloads model from HF Hub on startup
│   ├── schemas.py               # Pydantic v2 request/response models
│   └── category_meta.py         # Labels, colors and examples per category
├── services/
│   └── classifier.py            # Inference logic with cache integration
├── api/                         # JSON endpoints
│   ├── inference.py             # POST /classify, POST /classify/batch
│   └── meta.py                  # GET /health, GET /api/categories
├── web/                         # HTML endpoints
│   └── pages.py                 # UI routes
└── templates/                   # HTML files
    ├── home.html
    ├── index.html               # Single classifier UI
    ├── batch.html               # Batch classifier UI
    └── categories.html
training/
├── config.py
├── generate_dataset.py
├── train.py
└── eval_report.py