---
title: LLM Error Classifier API
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: 20.10.24
app_file: main.py
pinned: false
license: mit
---
# LLM Error Classifier API
FastAPI backend serving the fine-tuned Llama-3.2-3B model for tool-use error classification.
## API Endpoints

- `POST /api/classify` - Classify a tool call
- `GET /api/examples` - Get example inputs
- `GET /health` - Health check
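For illustration, here is a minimal Python client sketch for the classify endpoint. The payload key (`tool_call`) and the exact response shape are assumptions; the authoritative request schema is defined in `main.py`.

```python
# Hypothetical client for POST /api/classify. The payload key "tool_call"
# is an assumption -- check main.py for the real request schema.
import json
from urllib import request


def build_classify_request(tool_call: dict, base_url: str) -> request.Request:
    """Build (but do not send) a JSON POST request to /api/classify."""
    body = json.dumps({"tool_call": tool_call}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/api/classify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Once the Space is live, pass your real base URL (the `https://<space-name>.<username>.hf.space` address) and send the request with `urllib.request.urlopen`.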
## Model

Model: `daoqm123/llm-error-classifier`
## Usage
The API will automatically load the model from HuggingFace Hub on startup.
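A minimal sketch of that startup behavior, with the actual Hub download stubbed behind a placeholder so the one-time caching pattern is visible without fetching weights (the real loading code in `main.py` may differ):

```python
# Sketch only: the real app presumably calls something like
# transformers' from_pretrained(MODEL_ID) at startup; the load is
# stubbed here so the caching pattern runs without downloading weights.
from functools import lru_cache

MODEL_ID = "daoqm123/llm-error-classifier"


@lru_cache(maxsize=1)
def load_model(model_id: str = MODEL_ID) -> str:
    """Load the model once; later calls return the cached object."""
    # Placeholder for the actual HuggingFace Hub download.
    return f"model:{model_id}"
```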
## Deploying to Hugging Face Spaces
### Create a Space

- Go to https://huggingface.co/spaces/new and choose **Docker** as the SDK (this repo already contains a Dockerfile).
- Give the Space a name such as `llm-error-classifier-api` and select the desired hardware (CPU is fine unless you need GPU acceleration).
- After the Space is created, copy the Git commands shown in the "Files" tab; you will push the contents of this `api/` folder there.
### Authenticate locally

```shell
pip install -U "huggingface_hub[cli]"
huggingface-cli login
```

Use a write token from https://huggingface.co/settings/tokens.
### Push the backend code

```shell
cd /work/cssema416/202610/12/llm-frontend-for-quang\ \(1\)/api
rm -rf .git
git init
git remote add origin https://huggingface.co/spaces/<username>/<space-name>
git add .
git commit -m "Deploy FastAPI backend"
git push origin main
```

Replace `<username>` and `<space-name>` with your actual values. Hugging Face will build the Docker image automatically; the server becomes available at `https://<space-name>.<username>.hf.space`.

### Configure runtime behavior (optional)
- Set a custom `MODEL_PATH` or other environment variables from the "Settings → Repository secrets" tab inside the Space.
- If you need GPU, request the proper hardware tier in the hardware selector.
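The `MODEL_PATH` override could be consumed along these lines; whether `main.py` uses this exact variable name and fallback is an assumption based on the default repo id above:

```python
import os

# Fallback repo id when no MODEL_PATH secret/env var is set (assumed default).
DEFAULT_MODEL = "daoqm123/llm-error-classifier"


def resolve_model_path() -> str:
    """Prefer the MODEL_PATH environment variable, else the Hub repo id."""
    return os.environ.get("MODEL_PATH", DEFAULT_MODEL)
```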
### Wire up the Vercel frontend

- In `frontend/lib/api.ts` the app reads `process.env.NEXT_PUBLIC_API_URL`.
- On Vercel, set `NEXT_PUBLIC_API_URL=https://<space-name>.<username>.hf.space` (no trailing slash) and redeploy the frontend so calls go directly to the Space backend.
### Verify

- Open the Space URL to confirm the FastAPI app is live (you should see FastAPI's default 404 JSON, or append `/health` to the URL).
- Visit your Vercel deployment and ensure inference requests succeed using the new backend endpoint.
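The health check above can also be scripted; this is a standard-library-only sketch (replace the placeholder host with your real Space URL before running):

```python
import json
from urllib import request


def health_url(base_url: str) -> str:
    """Build the /health URL, tolerating a trailing slash on the base."""
    return f"{base_url.rstrip('/')}/health"


def check_health(base_url: str, timeout: float = 10.0) -> dict:
    """GET /health on the deployed Space and decode the JSON body."""
    with request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read())

# Example (requires the Space to be deployed and awake):
# check_health("https://<space-name>.<username>.hf.space")
```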