# CodeWraith

Module-to-Spec Transformer: a fine-tuned LLM that generates technical specifications from Python source code.

## Pipeline State

Read `.claude/pipeline_state.json` at session start to know where the ML pipeline left off. Update it after completing any pipeline stage (generation, cleaning, training, evaluation, upload).
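
The read-then-update cycle above could look like the following minimal sketch. The schema of `pipeline_state.json` is not described here, so the `last_completed_stage` key and the helper names are hypothetical; only the file path comes from this document.

```python
import json
import os
import tempfile
from pathlib import Path

STATE_PATH = Path(".claude/pipeline_state.json")


def load_state(path: Path = STATE_PATH) -> dict:
    """Return the saved pipeline state, or an empty dict if no file exists yet."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_state(state: dict, path: Path = STATE_PATH) -> None:
    """Write to a temp file, then rename, so an interrupted write leaves the old file intact."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(state, tmp, indent=2)
    os.replace(tmp_name, path)


def mark_stage_complete(stage: str, path: Path = STATE_PATH) -> dict:
    """Record a finished stage (generation, cleaning, training, evaluation, upload)."""
    state = load_state(path)
    # Hypothetical key: adapt to whatever the real state file uses.
    state["last_completed_stage"] = stage
    save_state(state, path)
    return state
```

Calling `load_state()` at session start and `mark_stage_complete("training")` after a stage finishes would cover both halves of the rule.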

## Process Monitoring

- When monitoring long-running processes (vLLM serving, dataset generation, model training, uploads), wait **at least 5 minutes between status checks**. Do NOT poll more frequently unless explicitly asked.
- Before killing any long-running process, **always confirm with the user first**. Never assume a process is stuck without evidence of zero progress over multiple checks.
- For HuggingFace uploads of large models (>10GB), prefer `hf upload` CLI over Python `push_to_hub()`. The CLI handles resumption better.
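
As a rough illustration of the monitoring rules above, the sketch below polls on a 5-minute interval and only reports a stall after several consecutive checks show no change. The progress counter, the `min_checks=3` threshold, and the function names are assumptions made for the example; the loop returns a status instead of killing anything, since that decision stays with the user.

```python
import time
from collections.abc import Callable

POLL_INTERVAL_SECONDS = 5 * 60  # minimum gap between status checks


def is_stalled(samples: list[int], min_checks: int = 3) -> bool:
    """True only when the last `min_checks` progress samples are identical."""
    if len(samples) < min_checks:
        return False
    return len(set(samples[-min_checks:])) == 1


def monitor(
    read_progress: Callable[[], int],
    is_done: Callable[[], bool],
    interval: float = POLL_INTERVAL_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the process finishes or looks stalled; never terminates the process itself."""
    samples: list[int] = []
    while not is_done():
        samples.append(read_progress())
        if is_stalled(samples):
            return "stalled"  # report to the user and ask before killing
        sleep(interval)
    return "done"
```

Here `read_progress` might return bytes uploaded, samples generated, or the current training step, whichever counter the process exposes.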

## Environment

- Python 3.12, managed with `uv`
- Use `uv sync` / `uv run`, never `uv pip install`
- Tests: `uv run pytest`
- Lint: `uv run ruff check`
- GPU: NVIDIA RTX 5090 (32GB VRAM)
- Teacher models served via vLLM at 192.168.13.21:8081
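
To confirm which teacher models are currently being served, something like the sketch below may help. It assumes vLLM is running its OpenAI-compatible server (which exposes `GET /v1/models`) and that the endpoint is reachable over plain HTTP; neither detail is stated above, and the function names are illustrative.

```python
import json
import urllib.request

# Host and port come from this document; the http:// scheme is an assumption.
VLLM_BASE_URL = "http://192.168.13.21:8081"


def model_ids(payload: dict) -> list[str]:
    """Extract model names from an OpenAI-style model list response."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_served_models(base_url: str = VLLM_BASE_URL, timeout: float = 10.0) -> list[str]:
    """Ask the server which models it is serving."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as response:
        return model_ids(json.load(response))
```

Running `uv run python -c "..."` with a call to `list_served_models()` is one way to check the endpoint before starting a generation run.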

## Commits

- Use Angular/Conventional Commits format
- No Co-Authored-By lines
- Commit at every meaningful milestone
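
The first two commit rules can be checked mechanically. The sketch below is one possible validator, not part of the project's tooling; the list of allowed types is an assumption based on commonly used Angular and Conventional Commits types.

```python
import re

# Assumed type list; trim or extend to match the project's conventions.
ALLOWED_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)
_TYPES = "|".join(ALLOWED_TYPES)
# Header shape: type, optional (scope), optional "!", then ": " and a description.
HEADER_RE = re.compile(rf"^(?:{_TYPES})(?:\([^()\s]+\))?!?: \S.*$")


def check_commit_message(message: str) -> list[str]:
    """Return a list of rule violations; an empty list means the message passes."""
    problems: list[str] = []
    lines = message.splitlines()
    header = lines[0] if lines else ""
    if not HEADER_RE.match(header):
        problems.append("header is not in Conventional Commits form")
    if any(line.strip().lower().startswith("co-authored-by:") for line in lines):
        problems.append("Co-Authored-By line is not allowed")
    return problems
```

For example, `feat(cleaning): drop empty specs` passes, while `updated stuff` fails the header check.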