"""Push the trained LoRA adapter to HuggingFace Hub.

Usage:
    huggingface-cli login   # ensure a write token is configured
    python push_lora.py     # pushes to mitvho09/sound-broken-nemotron-lora

The adapter teaches Nemotron-Nano-4B to produce grounded JSON diagnoses
from deterministic audio features + rule-engine candidates. It was trained on
DCASE 2025 Task 2 (real industrial machine audio).
"""
from __future__ import annotations

from pathlib import Path

REPO_ID = "mitvho09/sound-broken-nemotron-lora"
ADAPTER_DIR = Path(__file__).parent / "lora-adapter"

MODEL_CARD = """---
base_model: nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
- lora
- transformers
- hackathon
- build-small
- audio
- appliance-diagnosis
---
# Sound-Broken Nemotron LoRA Adapter
Fine-tuned LoRA adapter for **"Does It Sound Broken?"**, an appliance fault
diagnosis app from the Build Small Hackathon 2026.
## What it does
This adapter teaches Nemotron-Nano-4B to produce **grounded JSON diagnoses** from
deterministic audio features + rule-engine candidates. The model never hears raw
audio; it reasons over 14 measured acoustic features (spectral centroid, RMS,
onset rate, harmonic ratio, etc.) extracted by librosa.
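
As a rough illustration of what reasoning over measured features means, the sketch below serializes a feature dictionary and a ranked candidate list into a text prompt. The helper name `build_prompt`, the key names, the fault labels, and all numeric values are placeholders invented for this example; the app's actual prompt format is not documented here.

```python
import json

INSTRUCTION = "Diagnose the appliance using only the measurements below. Reply with JSON."


def build_prompt(features, candidates):
    # Hypothetical helper: serialize measured features and rule-engine
    # candidates so the LLM sees numbers, never raw audio.
    payload = {
        "features": {name: round(float(value), 3) for name, value in features.items()},
        "candidates": list(candidates),
    }
    return INSTRUCTION + " " + json.dumps(payload, sort_keys=True)


# Placeholder values for four of the 14 features; not real measurements.
example_features = {
    "spectral_centroid_hz": 2150.0,
    "rms": 0.0421,
    "onset_rate_hz": 3.5,
    "harmonic_ratio": 0.62,
}
example_candidates = ["bearing_wear", "loose_panel"]  # made-up fault labels

prompt = build_prompt(example_features, example_candidates)
print(prompt)
```

Sorting the keys keeps the prompt byte-for-byte stable for identical inputs, which fits the deterministic feature pipeline.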
## Training
- **Base model:** `nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16`
- **Dataset:** DCASE 2025 Task 2 (real industrial machine audio, 7 machine types)
- **Training pairs:** 300 (feature descriptions + rule candidates → JSON responses)
- **Epochs:** 2
- **LoRA config:** r=16, alpha=32, dropout=0.05, targets: q/k/v/o_proj
- **Loss convergence:** 1.85 → 0.72 (epoch 1, step 100)
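
For readers reproducing the setup, the LoRA hyperparameters above map onto keyword arguments of `peft.LoraConfig` roughly as follows. This is a sketch, not the training script: `task_type` is an assumption (it is not stated above), and the full module names are expanded from the `q/k/v/o_proj` shorthand.

```python
# Keyword arguments mirroring the LoRA settings listed above.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",  # assumption: not stated in this card
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]
print(scaling)  # 2.0

# With peft installed, the config object would be built as:
#   from peft import LoraConfig
#   config = LoraConfig(**LORA_KWARGS)
```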
## Usage with PEFT
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16", torch_dtype="auto"
)
model = PeftModel.from_pretrained(base, "mitvho09/sound-broken-nemotron-lora")
tok = AutoTokenizer.from_pretrained("mitvho09/sound-broken-nemotron-lora")
```
## Context
Part of the **"Does It Sound Broken?"** app: record 10 seconds of an appliance,
get a diagnosis. The full pipeline:
1. **librosa** extracts 14 deterministic features (CPU)
2. **Rule engine** ranks candidate faults (CPU, transparent)
3. **This LoRA** (on Nemotron-4B) produces a grounded JSON explanation (GPU)
4. **json_guard** validates output grounding (CPU)
The rule engine is the floor; this adapter makes the LLM narration more accurate.
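
The "floor" idea can be sketched as a guard that accepts the LLM output only if it parses as a JSON object and names a fault the rule engine already proposed, and otherwise falls back to the top-ranked rule candidate. This is not the app's `json_guard` implementation; the function name `guard_diagnosis`, the `diagnosis` and `source` keys, and the fault labels are hypothetical.

```python
import json


def guard_diagnosis(llm_output, candidates):
    # Hypothetical guard: keep the LLM answer only when it is grounded
    # in the rule engine's candidate list (ranked best first).
    fallback = {"diagnosis": candidates[0], "source": "rule_engine"}
    try:
        parsed = json.loads(llm_output)
    except json.JSONDecodeError:
        return fallback  # output is not valid JSON
    if not isinstance(parsed, dict):
        return fallback  # valid JSON, but not an object
    if parsed.get("diagnosis") not in candidates:
        return fallback  # names a fault the rules never proposed
    parsed["source"] = "llm"
    return parsed


candidates = ["bearing_wear", "loose_panel"]  # made-up labels
grounded = guard_diagnosis('{"diagnosis": "loose_panel"}', candidates)
ungrounded = guard_diagnosis('{"diagnosis": "alien_interference"}', candidates)
malformed = guard_diagnosis("not json", candidates)
```

With this shape, a failed generation never yields a worse answer than the rule engine alone would give.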
"""
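def main() -> None:
    # Imported lazily so the module can be imported without huggingface_hub installed.
    from huggingface_hub import HfApi, upload_folder

    if not ADAPTER_DIR.exists():
        print(f"ERROR: adapter not found at {ADAPTER_DIR}")
        print("Run: modal run finetune/train.py::train_lora")
        raise SystemExit(1)  # non-zero exit so callers can detect the failure

    print(f"Adapter files: {[f.name for f in ADAPTER_DIR.iterdir()]}")

    # Write the model card; UTF-8 is explicit because the card contains non-ASCII characters.
    readme_path = ADAPTER_DIR / "README.md"
    readme_path.write_text(MODEL_CARD, encoding="utf-8")

    # Create the model repo if it does not exist yet.
    api = HfApi()
    api.create_repo(REPO_ID, repo_type="model", exist_ok=True)
    print(f"Repo: https://huggingface.co/{REPO_ID}")

    # Upload the adapter weights together with the generated README.
    upload_folder(
        repo_id=REPO_ID,
        folder_path=str(ADAPTER_DIR),
        repo_type="model",
    )
    print(f"Uploaded to https://huggingface.co/{REPO_ID}")


if __name__ == "__main__":
    main()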