# Validation Guide ## GPU preflight Before training, run: ```bash python scripts/preflight_gpu.py ``` or: ```bash make preflight ``` Expected output: ```text GPU_PREFLIGHT_PASS ``` This checks: - PyTorch version - CUDA availability - GPU name - total VRAM - compute capability - minimum T4-class VRAM ## GPU smoke test The intended GPU smoke test is: ```bash python scripts/smoke_test.py ``` This loads `Qwen/Qwen2.5-1.5B-Instruct`, runs 15 LoRA SFT steps on 100 examples, generates one candidate, and prints `SMOKE_PASS`. From the agent environment this could not be executed because GPU/HF Jobs execution was repeatedly rejected. ## CPU static validation For environments without a GPU, run: ```bash git clone https://huggingface.co/krystv/nomen-ai cd nomen-ai pip install -e . datasets trl peft transformers rapidfuzz pyphen PyYAML python tests/test_static.py ``` This validates: - 20+ root families are present. - Control token prompt construction. - Syllable/character utilities. - Anti-duplication matrix. - Synthetic example generation. - SFT dataset schema. - DPO dataset schema. - Current TRL `SFTConfig` and `DPOConfig` argument names used by the training scripts. Expected output: ```text CPU_STATIC_VALIDATION_PASS ```