LLMic_v2 LoRA for Romanian Diacritic Restoration
LoRA adapters for LLMic_v2, a 3B-parameter Romanian-English model, fine-tuned for Romanian diacritic restoration.
Warning: Pretraining Contamination
This model systematically prefixes its output with `autoResizeIframe` JavaScript callback text, an artifact of contamination in the FuLG pretraining corpus. It is not usable for diacritic restoration and is released for reproducibility only.
| Metric | Value |
|---|---|
| Word Accuracy | 1.45% |
| DER | 0.999 |
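
To make the two metrics above concrete, here is a minimal sketch of how they could be computed. It rests on assumptions this card does not confirm: that DER stands for diacritic error rate, that words are aligned by position after whitespace splitting, and that only the letters a, i, s, t and their marked forms count as diacritic-eligible. The function names are hypothetical, and the evaluation code in the linked repository may define both metrics differently.

```python
import unicodedata

# Assumption: letters that can carry a diacritic in Romanian, plus their marked forms.
ELIGIBLE = set("aistAIST" "ăâîșțĂÂÎȘȚ")


def _words(text: str) -> list[str]:
    # Normalize to NFC so each diacritic letter is a single code point.
    return unicodedata.normalize("NFC", text).split()


def word_accuracy(reference: str, prediction: str) -> float:
    """Share of reference words reproduced exactly at the same position."""
    ref_words, pred_words = _words(reference), _words(prediction)
    if not ref_words:
        return 0.0
    # zip stops at the shorter list; missing words count as wrong
    # because the denominator is the reference length.
    correct = sum(r == p for r, p in zip(ref_words, pred_words))
    return correct / len(ref_words)


def diacritic_error_rate(reference: str, prediction: str) -> float:
    """Share of diacritic-eligible reference characters that are wrong."""
    ref_words, pred_words = _words(reference), _words(prediction)
    total = errors = 0
    for i, ref in enumerate(ref_words):
        pred = pred_words[i] if i < len(pred_words) else ""
        for j, ch in enumerate(ref):
            if ch not in ELIGIBLE:
                continue
            total += 1
            # A length mismatch means the word cannot be aligned
            # character by character, so every eligible character is an error.
            if len(pred) != len(ref) or pred[j] != ch:
                errors += 1
    return errors / total if total else 0.0


reference = "țară și pădure"
clean_miss = "tara și padure"                      # diacritics partly missing
prefixed = "autoResizeIframe(); țară și pădure"    # hypothetical prefix form

print(word_accuracy(reference, clean_miss), diacritic_error_rate(reference, clean_miss))
print(word_accuracy(reference, prefixed), diacritic_error_rate(reference, prefixed))
```

Under this position-based alignment, a leading prefix shifts every word by one slot, so even an otherwise correct restoration scores zero word accuracy. That is one plausible reading of how a systematic prefix drives the numbers in the table, not a confirmed account of the evaluation.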
Training
- Base: faur-ai/LLMic_v2, LoRA rank 16, alpha 32
- 10,000 iterations (loss plateaued at step 5,000)
- Hardware: Apple M3 Ultra (MLX)
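
As a rough illustration of what rank 16 and alpha 32 imply, the sketch below computes the scaling factor from the standard LoRA formulation, where the low-rank update is multiplied by alpha / rank, and the number of trainable parameters for one adapted weight matrix. The hidden size of 2048 is a hypothetical placeholder, not the actual dimension of LLMic_v2, and the training framework may parameterize the scale differently.

```python
def lora_scale(rank: int, alpha: float) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


RANK, ALPHA = 16, 32   # values listed in this card
HIDDEN = 2048          # hypothetical layer width, for illustration only

scale = lora_scale(RANK, ALPHA)
adapter_params = lora_param_count(HIDDEN, HIDDEN, RANK)
full_params = HIDDEN * HIDDEN

print(f"scale = {scale}")                                       # 2.0
print(f"adapter params per matrix = {adapter_params}")          # 65536
print(f"share of full matrix = {adapter_params / full_params:.4%}")
```

With these settings the adapter update is weighted by a factor of 2, and for a square matrix of the assumed width the adapter trains well under 2% of the parameters that full fine-tuning of that matrix would.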
Resources
- Dataset: klusai/diacritics-ro
- Code: github.com/klusai/diacritics-finetuning-code