# Advanced Coding LLM - Complete Instructions

This document provides full setup, run, validation, optimization, and deployment steps for the `coding-llm` project.

## 1) Prerequisites

- Python 3.10+ (recommended 3.11/3.12)
- Git
- Internet access for first model download
- Optional: Docker Desktop
- Optional: Hugging Face account and access token

## 2) Project Setup

The shell commands in this document use Windows Command Prompt syntax (`copy`, `set`). On macOS or Linux, use `cp` and `export` instead.

From the project root (adjust the path to your own checkout):

```bat
cd "C:\Users\GIRISH\OneDrive\Desktop\AI model_14_04_26\coding-llm"
```

Create the environment file:

```bat
copy .env.example .env
```

Install dependencies:

```bash
python tasks.py install
```

## 3) Configure `.env`

Open `.env` and set these values:

- `MODEL_NAME=Qwen/Qwen2.5-Coder-1.5B-Instruct`
- `FALLBACK_MODEL_NAME=Qwen/Qwen2.5-Coder-0.5B-Instruct`
- `FINAL_FALLBACK_MODEL_NAME=sshleifer/tiny-gpt2` (optional emergency fallback)
- `FORCE_MOCK_MODE=false` (set to `true` for instant test mode)
- `API_KEY=<your_secret_key>`
- `RATE_LIMIT_PER_MINUTE=30`
- `USE_RAG=true`
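
To illustrate how these values might be consumed, here is a minimal sketch that reads them from the environment into a typed object. The `Settings` class, `load_settings`, and the truthy spellings accepted by `_as_bool` are assumptions for illustration; the project's actual loader may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False (assumed convention)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the `.env` values listed above."""
    model_name: str
    fallback_model_name: str
    final_fallback_model_name: str
    force_mock_mode: bool
    api_key: str
    rate_limit_per_minute: int
    use_rag: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, falling back to the values shown in this section."""
    return Settings(
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-Coder-1.5B-Instruct"),
        fallback_model_name=env.get("FALLBACK_MODEL_NAME", "Qwen/Qwen2.5-Coder-0.5B-Instruct"),
        final_fallback_model_name=env.get("FINAL_FALLBACK_MODEL_NAME", "sshleifer/tiny-gpt2"),
        force_mock_mode=_as_bool(env.get("FORCE_MOCK_MODE", "false")),
        api_key=env.get("API_KEY", ""),
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "30")),
        use_rag=_as_bool(env.get("USE_RAG", "true")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.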

## 4) Run API Locally

```bash
python tasks.py run
```

Server runs at:

- `http://127.0.0.1:8000`

Health endpoint:

- `GET http://127.0.0.1:8000/health`
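
The health endpoint can also be checked from Python. The sketch below uses only the standard library and treats HTTP 200 as healthy; the function name `check_health` and the injectable `opener` parameter are assumptions added to make the check testable, and the shape of the response body is not assumed at all.

```python
import urllib.error
import urllib.request


def check_health(base_url: str = "http://127.0.0.1:8000",
                 timeout: float = 5.0,
                 opener=urllib.request.urlopen) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling `check_health()` with no arguments targets the local server address given above.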

## 5) Run Smoke Tests

### Full smoke test

```bash
python smoke_test.py
```

### Health-only smoke test

On macOS or Linux, replace `set` with `export`.

```bat
set SMOKE_SKIP_GENERATE=true
python smoke_test.py
```

### Combined run-and-test command

```bash
python tasks.py serve-smoke
```

This starts the server, runs the smoke test, and shuts the server down automatically.
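
The start, test, and stop sequence can be sketched with `subprocess`. This is not the actual `tasks.py serve-smoke` implementation, which is not shown in this document; `serve_and_smoke`, the fixed `startup_wait`, and the two command constants are assumptions for illustration.

```python
import subprocess
import sys
import time
from typing import Sequence

# Commands taken from this document; run them from the project root.
SERVER_CMD = [sys.executable, "tasks.py", "run"]
SMOKE_CMD = [sys.executable, "smoke_test.py"]


def serve_and_smoke(server_cmd: Sequence[str],
                    smoke_cmd: Sequence[str],
                    startup_wait: float = 2.0) -> int:
    """Start the server, run the smoke test, always stop the server.

    Returns the exit code of the smoke test process.
    """
    server = subprocess.Popen(list(server_cmd))
    try:
        time.sleep(startup_wait)  # crude wait; polling /health would be more robust
        if server.poll() is not None:
            raise RuntimeError(f"server exited early with code {server.returncode}")
        return subprocess.run(list(smoke_cmd)).returncode
    finally:
        # Runs on success, failure, and exceptions alike.
        if server.poll() is None:
            server.terminate()
            try:
                server.wait(timeout=10)
            except subprocess.TimeoutExpired:
                server.kill()
                server.wait()
```

A call such as `serve_and_smoke(SERVER_CMD, SMOKE_CMD)` would mirror the combined command. The `finally` block is the important design choice: the server is stopped even when the smoke test fails.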

## 6) If Generation Is Slow on First Run

The first `/generate` request may take a long time because the model has to be downloaded and warmed up.

Options:

- Increase the timeout:
  - `set SMOKE_TIMEOUT=900`
- Use mock mode for quick validation:
  - set `FORCE_MOCK_MODE=true`
- Run full mode after the model cache is ready.
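
A timeout like this is typically applied by polling until a deadline passes. The sketch below shows that pattern; it assumes `SMOKE_TIMEOUT` is expressed in seconds, which this document does not state, and the helper names `smoke_timeout` and `wait_until_ready` are hypothetical rather than taken from `smoke_test.py`.

```python
import os
import time
from typing import Callable


def smoke_timeout(default: float) -> float:
    """Read SMOKE_TIMEOUT from the environment (assumed here to be in seconds)."""
    return float(os.environ.get("SMOKE_TIMEOUT", default))


def wait_until_ready(probe: Callable[[], bool],
                     timeout: float,
                     interval: float = 5.0,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe() every `interval` seconds until it succeeds or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `probe` argument can be any zero-argument callable that returns `True` once the service answers, for example a function that sends a small `/generate` request.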

## 7) API Usage

### Endpoint

- `POST /generate`

### Input JSON

```json
{
  "instruction": "Fix this code",
  "input": "def add(a,b) return a+b"
}
```

### Required Header (if API key enabled)

- `x-api-key: <API_KEY>`
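
Putting the endpoint, body, and header together, a client call could look like the sketch below. It uses only the standard library; the function names, the default base URL (taken from the local run section), and the generous default timeout are assumptions for illustration.

```python
import json
import urllib.request
from typing import Optional


def build_generate_request(instruction: str,
                           code_input: str,
                           api_key: Optional[str] = None,
                           base_url: str = "http://127.0.0.1:8000") -> urllib.request.Request:
    """Build a POST /generate request with the JSON body and optional x-api-key header."""
    body = json.dumps({"instruction": instruction, "input": code_input}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["x-api-key"] = api_key
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate", data=body, headers=headers, method="POST"
    )


def generate(instruction: str,
             code_input: str,
             api_key: Optional[str] = None,
             timeout: float = 900.0) -> dict:
    """Send the request and decode the JSON reply.

    Raises urllib.error.HTTPError on 4xx/5xx responses, such as 401 for a bad key.
    """
    request = build_generate_request(instruction, code_input, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `generate("Fix this code", "def add(a,b) return a+b", api_key="<API_KEY>")` sends the input JSON shown above.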

### Output JSON

```json
{
  "code": "...",
  "explanation": "...",
  "confidence": 0.0,
  "important_tokens": ["..."],
  "relevancy_score": 0.0,
  "hallucination": false,
  "latency_ms": 0
}
```
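
On the client side, the response can be validated before use. The sketch below checks that every documented field is present and coerces each to a Python type; the types are inferred from the example values above (for instance, `latency_ms` is treated as a number), and `GenerateResult` and `parse_generate_result` are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Mapping


@dataclass(frozen=True)
class GenerateResult:
    """Typed view of the /generate response; field types are inferred, not confirmed."""
    code: str
    explanation: str
    confidence: float
    important_tokens: List[str]
    relevancy_score: float
    hallucination: bool
    latency_ms: float


def parse_generate_result(payload: Mapping[str, Any]) -> GenerateResult:
    """Check that all documented fields exist, then coerce them to the expected types."""
    missing = [f.name for f in fields(GenerateResult) if f.name not in payload]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return GenerateResult(
        code=str(payload["code"]),
        explanation=str(payload["explanation"]),
        confidence=float(payload["confidence"]),
        important_tokens=[str(token) for token in payload["important_tokens"]],
        relevancy_score=float(payload["relevancy_score"]),
        hallucination=bool(payload["hallucination"]),
        latency_ms=float(payload["latency_ms"]),
    )
```

Failing fast on a missing field makes schema drift between client and server visible immediately instead of surfacing later as a `KeyError`.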

## 8) Docker Deployment

```bat
copy .env.example .env
docker compose up --build -d
```

Validate:

```bash
python smoke_test.py
```

Stop:

```bash
docker compose down
```

## 9) Hugging Face Space Deployment

Create a Hugging Face access token with write permission, then run:

```bash
python tasks.py hf-upload --repo-id <username/coding-llm-space> --token <HF_TOKEN>
```

After the upload, configure the Space variables/secrets:

- `MODEL_NAME`
- `FALLBACK_MODEL_NAME`
- `FORCE_MOCK_MODE`
- `API_KEY` (if needed in your architecture)
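
These settings can also be applied programmatically rather than through the Space settings page. The sketch below is one possible approach using the `huggingface_hub` package (`pip install huggingface_hub`), which is an assumption, since this document does not say how `tasks.py hf-upload` works. Treating only `API_KEY` as a secret is likewise an assumption, and both helper names are hypothetical.

```python
from typing import Dict, Mapping, Tuple

# Keys named in this section; only API_KEY is treated as sensitive (assumption).
SPACE_KEYS = ("MODEL_NAME", "FALLBACK_MODEL_NAME", "FORCE_MOCK_MODE", "API_KEY")
SECRET_KEYS = {"API_KEY"}


def split_space_settings(env: Mapping[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (variables, secrets) for the listed keys that are set and non-empty."""
    variables: Dict[str, str] = {}
    secrets: Dict[str, str] = {}
    for key in SPACE_KEYS:
        value = env.get(key)
        if not value:
            continue  # unset or empty: leave the Space's current value alone
        target = secrets if key in SECRET_KEYS else variables
        target[key] = value
    return variables, secrets


def push_space_settings(repo_id: str, token: str, env: Mapping[str, str]) -> None:
    """Apply the settings to a Space through the Hugging Face Hub API."""
    # Imported lazily so split_space_settings has no third-party dependency.
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    variables, secrets = split_space_settings(env)
    for key, value in variables.items():
        api.add_space_variable(repo_id=repo_id, key=key, value=value)
    for key, value in secrets.items():
        api.add_space_secret(repo_id=repo_id, key=key, value=value)
```

Secrets are hidden after they are saved, while variables remain visible in the Space settings, so anything sensitive belongs in `SECRET_KEYS`.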

## 10) Production Hardening Checklist

- Keep `API_KEY` enabled
- Keep rate limiting enabled (`RATE_LIMIT_PER_MINUTE`)
- Put the API behind an HTTPS reverse proxy
- Add logging and monitoring
- Pin model versions if strict reproducibility is required
- Use `FORCE_MOCK_MODE=false` in production
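
To make the first two checklist items concrete, the sketch below shows a constant-time API key comparison and a per-client sliding-window limiter. It illustrates the general techniques only and is not the project's actual implementation; `api_key_matches`, `SlidingWindowLimiter`, and the 60-second window are assumptions.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


def api_key_matches(provided: str, expected: str) -> bool:
    """Compare keys in constant time so response timing does not leak key prefixes."""
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within any `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record the request and return True, or return False if the client is over the limit."""
        now = self.clock()
        hits = self._hits[client_id]
        while hits and now - hits[0] >= self.window:
            hits.popleft()  # drop requests that have left the window
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

With `RATE_LIMIT_PER_MINUTE=30`, a server could create `SlidingWindowLimiter(limit=30)` and answer with HTTP 429 whenever `allow()` returns `False`. The state lives in process memory, so a multi-worker deployment would need a shared store instead.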

## 11) Common Troubleshooting

- `WinError 10061`:
  - The API server is not running. Start it with `python tasks.py run`.
- `401 Unauthorized`:
  - The `x-api-key` header does not match the server's `API_KEY`.
- Health works but generate times out:
  - The model is still downloading or warming up.
- Low-quality gibberish output:
  - A fallback model was likely used; verify the model names in `.env`.
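
A client script can translate the first three symptoms into these hints automatically. The sketch below assumes the client uses `urllib` from the standard library, and `diagnose` is a hypothetical helper. On Windows, `WinError 10061` surfaces in Python as `ConnectionRefusedError`.

```python
import urllib.error


def diagnose(error: BaseException) -> str:
    """Map an exception raised by a client call to the matching hint from the list above."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(error, urllib.error.HTTPError):
        if error.code == 401:
            return "x-api-key does not match the server's API_KEY."
        return f"Server answered HTTP {error.code}."
    # urlopen wraps low-level socket errors in URLError.reason.
    cause = error.reason if isinstance(error, urllib.error.URLError) else error
    if isinstance(cause, ConnectionRefusedError):
        return "API server is not running. Start it with `python tasks.py run`."
    if isinstance(cause, TimeoutError):
        return "Model is probably still downloading or warming up."
    return f"Unrecognized error: {error!r}"
```

Wrapping a request in `try`/`except Exception as exc` and printing `diagnose(exc)` gives a quick first-line diagnosis.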

## 12) Recommended Daily Commands

- Install/update: `python tasks.py install`
- Run API: `python tasks.py run`
- Smoke: `python tasks.py smoke`
- Run+smoke: `python tasks.py serve-smoke`
- Docker up/down: `python tasks.py docker-up` / `python tasks.py docker-down`