Instructions to use krystv/nomen-ai with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use krystv/nomen-ai with PEFT:
Task type is invalid.
- Notebooks
- Google Colab
- Kaggle
| # Validation Guide | |
| ## GPU preflight | |
| Before training, run: | |
| ```bash | |
| python scripts/preflight_gpu.py | |
| ``` | |
| or: | |
| ```bash | |
| make preflight | |
| ``` | |
| Expected output: | |
| ```text | |
| GPU_PREFLIGHT_PASS | |
| ``` | |
| This checks: | |
| - PyTorch version | |
| - CUDA availability | |
| - GPU name | |
| - total VRAM | |
| - compute capability | |
| - minimum T4-class VRAM | |
| ## GPU smoke test | |
| The intended GPU smoke test is: | |
| ```bash | |
| python scripts/smoke_test.py | |
| ``` | |
| This loads `Qwen/Qwen2.5-1.5B-Instruct`, runs 15 LoRA SFT steps on 100 examples, generates one candidate, and prints `SMOKE_PASS`. | |
| From the agent environment this could not be executed because GPU/HF Jobs execution was repeatedly rejected. | |
| ## CPU static validation | |
| For environments without a GPU, run: | |
| ```bash | |
| git clone https://huggingface.co/krystv/nomen-ai | |
| cd nomen-ai | |
| pip install -e . datasets trl peft transformers rapidfuzz pyphen PyYAML | |
| python tests/test_static.py | |
| ``` | |
| This validates: | |
| - 20+ root families are present. | |
| - Control token prompt construction. | |
| - Syllable/character utilities. | |
| - Anti-duplication matrix. | |
| - Synthetic example generation. | |
| - SFT dataset schema. | |
| - DPO dataset schema. | |
| - Current TRL `SFTConfig` and `DPOConfig` argument names used by the training scripts. | |
| Expected output: | |
| ```text | |
| CPU_STATIC_VALIDATION_PASS | |
| ``` | |