	# Advanced Coding LLM - Complete Instructions

	This document provides full setup, run, validation, optimization, and deployment steps for the `coding-llm` project.

	## 1) Prerequisites

	- Python 3.10+ (recommended 3.11/3.12)
	- Git
	- Internet access for first model download
	- Optional: Docker Desktop
	- Optional: Hugging Face account and access token

	## 2) Project Setup

	From the project root (the path below is from the author's Windows machine; substitute your own):

	```bash
	cd "C:\Users\GIRISH\OneDrive\Desktop\AI model_14_04_26\coding-llm"
	```

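	Create the environment file (the commands in this document use Windows `cmd` syntax; on macOS/Linux use `cp` instead of `copy`):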

	```bash
	copy .env.example .env
	```

	Install dependencies:

	```bash
	python tasks.py install
	```

	## 3) Configure `.env`

	Open `.env` and set values:

	- `MODEL_NAME=Qwen/Qwen2.5-Coder-1.5B-Instruct`
	- `FALLBACK_MODEL_NAME=Qwen/Qwen2.5-Coder-0.5B-Instruct`
	- `FINAL_FALLBACK_MODEL_NAME=sshleifer/tiny-gpt2` (optional emergency fallback)
	- `FORCE_MOCK_MODE=false` (true for instant test mode)
	- `API_KEY=<your_secret_key>`
	- `RATE_LIMIT_PER_MINUTE=30`
	- `USE_RAG=true`
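
For orientation, the settings above could be read on the Python side roughly as follows. This is a minimal sketch, not the project's actual loader: the `Settings` class, the `load_settings` helper, and the choice to use the listed values as defaults are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; field names mirror the .env keys listed above.
    model_name: str
    fallback_model_name: str
    final_fallback_model_name: str
    force_mock_mode: bool
    api_key: str
    rate_limit_per_minute: int
    use_rag: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables.

    Assumption: the example values from this document act as defaults,
    and an empty API_KEY means that key checking is disabled.
    """
    return Settings(
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-Coder-1.5B-Instruct"),
        fallback_model_name=env.get(
            "FALLBACK_MODEL_NAME", "Qwen/Qwen2.5-Coder-0.5B-Instruct"
        ),
        final_fallback_model_name=env.get(
            "FINAL_FALLBACK_MODEL_NAME", "sshleifer/tiny-gpt2"
        ),
        force_mock_mode=_as_bool(env.get("FORCE_MOCK_MODE", "false")),
        api_key=env.get("API_KEY", ""),
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "30")),
        use_rag=_as_bool(env.get("USE_RAG", "true")),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check how a given `.env` combination would be interpreted before starting the server.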

	## 4) Run API Locally

	```bash
	python tasks.py run
	```

	Server runs at:

	- `http://127.0.0.1:8000`

	Health endpoint:

	- `GET http://127.0.0.1:8000/health`
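
A quick way to probe the health endpoint from Python, using only the standard library, might look like the sketch below. The helper names `health_url` and `check_health` are hypothetical, and because the response schema of `/health` is not documented here, the body is returned as parsed JSON when possible and as raw text otherwise.

```python
import json
import urllib.request


def health_url(base_url: str = "http://127.0.0.1:8000") -> str:
    """Join the base URL and the /health path without doubling slashes."""
    return base_url.rstrip("/") + "/health"


def check_health(base_url: str = "http://127.0.0.1:8000",
                 timeout: float = 5.0) -> tuple[int, object]:
    """GET <base_url>/health and return (status_code, body)."""
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        raw = resp.read().decode("utf-8")
        try:
            body: object = json.loads(raw)
        except json.JSONDecodeError:
            body = raw  # fall back to raw text if the body is not JSON
        return resp.status, body


if __name__ == "__main__":
    # Requires the server from `python tasks.py run` to be up.
    print(check_health())
```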

	## 5) Run Smoke Tests

	### Full smoke test

	```bash
	python smoke_test.py
	```

	### Health-only smoke test

	```bash
	set SMOKE_SKIP_GENERATE=true
	python smoke_test.py
	```

	### Combined run-and-test command

	```bash
	python tasks.py serve-smoke
	```

	This starts the server, runs the smoke test, and shuts the server down automatically.

	## 6) If Generation Is Slow on First Run

	The first `/generate` call may take a long time because the model has to be downloaded and warmed up.

	Options:

	- Increase the timeout:
	  - `set SMOKE_TIMEOUT=900`
	- Use mock mode for quick validation:
	  - set `FORCE_MOCK_MODE=true`
	- Run full mode after the model cache is ready.
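
The idea of waiting out a slow first run can be sketched as a polling loop with a deadline. This is an illustration only, not the logic in `smoke_test.py`: `wait_until_ready` and its parameters are hypothetical, and the 900-second default simply mirrors the `SMOKE_TIMEOUT` value suggested above.

```python
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool],
                     timeout_s: float = 900.0,
                     interval_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Call probe() repeatedly until it returns True or the deadline passes.

    probe is any zero-argument callable, e.g. one that sends a small
    /generate request and returns True on success. OSError (which covers
    refused connections and timeouts) is treated as "not ready yet".
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            pass  # server or model not ready; try again after a pause
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop easy to exercise in tests without actually waiting.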

	## 7) API Usage

	### Endpoint

	- `POST /generate`

	### Input JSON

	```json
	{
	  "instruction": "Fix this code",
	  "input": "def add(a,b) return a+b"
	}
	```

	### Required Header (if API key enabled)

	- `x-api-key: <API_KEY>`

	### Output JSON

	```json
	{
	  "code": "...",
	  "explanation": "...",
	  "confidence": 0.0,
	  "important_tokens": ["..."],
	  "relevancy_score": 0.0,
	  "hallucination": false,
	  "latency_ms": 0
	}
	```
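
Putting the endpoint, input JSON, and header together, a standard-library client could look like the sketch below. The function names `build_generate_request` and `generate` are hypothetical, the base URL is the local address from section 4, and the long default timeout is an assumption chosen to tolerate a cold model.

```python
import json
import urllib.request
from typing import Optional


def build_generate_request(instruction: str,
                           input_text: str,
                           api_key: Optional[str] = None,
                           base_url: str = "http://127.0.0.1:8000",
                           ) -> urllib.request.Request:
    """Prepare a POST /generate request with the documented JSON body."""
    payload = json.dumps(
        {"instruction": instruction, "input": input_text}
    ).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when the server has API_KEY enabled.
        headers["x-api-key"] = api_key
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate",
        data=payload,
        headers=headers,
        method="POST",
    )


def generate(instruction: str, input_text: str,
             api_key: Optional[str] = None,
             base_url: str = "http://127.0.0.1:8000",
             timeout: float = 900.0) -> dict:
    """Send the request and return the parsed output JSON as a dict."""
    req = build_generate_request(instruction, input_text, api_key, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `generate("Fix this code", "def add(a,b) return a+b", api_key="<API_KEY>")["code"]` would return the `code` field of the output JSON shown above.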

	## 8) Docker Deployment

	```bash
	copy .env.example .env
	docker compose up --build -d
	```

	Validate:

	```bash
	python smoke_test.py
	```

	Stop:

	```bash
	docker compose down
	```

	## 9) Hugging Face Space Deployment

	Create a Hugging Face token with write permission, then:

	```bash
	python tasks.py hf-upload --repo-id <username/coding-llm-space> --token <HF_TOKEN>
	```

	After upload, configure Space variables/secrets:

	- `MODEL_NAME`
	- `FALLBACK_MODEL_NAME`
	- `FORCE_MOCK_MODE`
	- `API_KEY` (if needed in your architecture)
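
These values can be set in the Space settings page, or programmatically. The sketch below uses `HfApi.add_space_secret` and `HfApi.add_space_variable` from `huggingface_hub`; the helpers `plan_space_config` and `apply_space_config`, and the decision to treat only `API_KEY` as a secret, are assumptions for illustration and are not part of `tasks.py`.

```python
from typing import Mapping

# Assumption: only the API key is sensitive; everything else is a plain variable.
SECRET_KEYS = {"API_KEY"}


def plan_space_config(values: Mapping[str, str]) -> list[tuple[str, str, str]]:
    """Return (kind, key, value) triples, where kind is 'secret' or 'variable'."""
    return [
        ("secret" if key in SECRET_KEYS else "variable", key, value)
        for key, value in values.items()
    ]


def apply_space_config(repo_id: str, token: str,
                       values: Mapping[str, str]) -> None:
    """Push the planned variables and secrets to the Space."""
    # Imported lazily so the planner above works without the package installed.
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    for kind, key, value in plan_space_config(values):
        if kind == "secret":
            api.add_space_secret(repo_id=repo_id, key=key, value=value)
        else:
            api.add_space_variable(repo_id=repo_id, key=key, value=value)
```

Secrets are hidden after they are saved, while variables remain readable, so anything that should not be visible to collaborators belongs in `SECRET_KEYS`.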

	## 10) Production Hardening Checklist

	- Keep `API_KEY` enabled
	- Keep rate limiting enabled (`RATE_LIMIT_PER_MINUTE`)
	- Put the API behind an HTTPS reverse proxy
	- Add logging and monitoring
	- Pin model versions if strict reproducibility is required
	- Use `FORCE_MOCK_MODE=false` in production
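
To make the rate-limiting item concrete, here is one common way a per-minute limit can be enforced: a sliding window per client. How this project actually applies `RATE_LIMIT_PER_MINUTE` is not described in this document, so the `SlidingWindowLimiter` class below is a hypothetical sketch of the general technique.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit_per_minute` requests per client in any 60 s window."""

    def __init__(self, limit_per_minute: int = 30,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit_per_minute
        self.clock = clock
        # One queue of request timestamps per client identifier.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have left the 60-second window.
        while hits and now - hits[0] >= 60.0:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would typically answer with HTTP 429
        hits.append(now)
        return True
```

The client identifier could be the API key or the remote address; behind a reverse proxy the address has to come from the forwarded header the proxy sets.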

	## 11) Common Troubleshooting

	- `WinError 10061`:
	  - The API server is not running. Start it with `python tasks.py run`.
	- `401 Unauthorized`:
	  - `x-api-key` does not match the server's `API_KEY`.
	- Health works but generate times out:
	  - The model is still downloading or warming up.
	- Low-quality gibberish output:
	  - A fallback model was likely used; verify the model names in `.env`.
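
For clients built on `urllib`, the first three cases above can be recognized from the exception type. The `diagnose` helper below is a hypothetical sketch; it relies on Python mapping `WinError 10061` to `ConnectionRefusedError` and cannot detect the gibberish-output case, which produces no error.

```python
import urllib.error


def diagnose(exc: BaseException) -> str:
    """Map a client-side exception to the matching hint from the list above."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        if exc.code == 401:
            return "x-api-key does not match the server's API_KEY."
        return f"Server answered with HTTP {exc.code}."
    # urllib wraps low-level socket errors in URLError; unwrap the cause.
    cause = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(cause, ConnectionRefusedError):
        return "API server is not running. Start it with `python tasks.py run`."
    if isinstance(cause, TimeoutError):
        return "Model is probably still downloading or warming up; raise the timeout."
    return f"Unrecognized error: {exc!r}"
```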

	## 12) Recommended Daily Commands

	- Install/update: `python tasks.py install`
	- Run API: `python tasks.py run`
	- Smoke: `python tasks.py smoke`
	- Run+smoke: `python tasks.py serve-smoke`
	- Docker up/down: `python tasks.py docker-up` / `python tasks.py docker-down`