MiniCPM5-1B-eyewitness — testimony→attribute parser

LoRA fine-tune of openbmb/MiniCPM5-1B that turns messy bilingual (EN/ES) eyewitness descriptions ("caterpillar eyebrows, pinta de no haber dormido, un gorro de lana") into strict attribute JSON over a closed vocabulary. Powers the EYEWITNESS game.
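Because the output is meant to be strict JSON over a closed vocabulary, a caller can cheaply reject anything the model emits outside that vocabulary. The following is a minimal sketch of such a check, not the game's actual code: the attribute names and values in VOCAB and the helper parse_attributes are hypothetical, since the real vocabulary lives in the game engine.

```python
import json

# Hypothetical closed vocabulary (assumption for illustration only);
# the real attribute names and allowed values are defined by the game engine.
VOCAB = {
    "eyebrows": {"thin", "bushy"},
    "headwear": {"none", "beanie", "cap"},
    "expression": {"neutral", "tired", "angry"},
}


def parse_attributes(raw: str) -> dict:
    """Parse the model's raw text output and enforce the closed vocabulary.

    Raises ValueError on malformed JSON, unknown attributes, or values
    that are not in the allowed set for their attribute.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for key, value in data.items():
        if key not in VOCAB:
            raise ValueError(f"unknown attribute: {key!r}")
        # Check the type first so unhashable values (lists, dicts) fail cleanly.
        if not isinstance(value, str) or value not in VOCAB[key]:
            raise ValueError(f"value {value!r} not allowed for {key!r}")
    return data
```

Under this hypothetical vocabulary, the example testimony above might map to {"eyebrows": "bushy", "expression": "tired", "headwear": "beanie"}, while an out-of-vocabulary value such as "fedora" would raise ValueError instead of reaching the game logic.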

  • Training data: 4,000 synthetic (testimony → labels) pairs generated by the game engine itself — ground truth by construction, no human labeling (generator: train/gen_dataset.py in the Space repo).
  • Recipe: LoRA (rank r=16, α=32) on all linear layers (PEFT "all-linear"), 2 epochs, bf16, TRL SFTTrainer, trained on Modal (A10G).
  • Eval loss: 0.2846 (2% held-out split).
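The recipe above could be expressed with PEFT and TRL roughly as follows. This is a sketch under stated assumptions rather than the actual training script: only r, α, the all-linear target, the epoch count, and bf16 come from this card; task_type, output_dir, and the helper build_configs are assumptions, and learning rate, batch size, and sequence length are left at library defaults because the card does not state them.

```python
# Values stated in the card: r=16, alpha=32, all-linear targets, 2 epochs, bf16.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": "all-linear",  # PEFT shorthand for every linear layer
    "task_type": "CAUSAL_LM",        # assumption: standard causal-LM fine-tune
}
TRAIN_KWARGS = {
    "num_train_epochs": 2,
    "bf16": True,
}


def build_configs(output_dir: str = "minicpm-eyewitness-lora"):
    """Build PEFT and TRL config objects (hypothetical helper).

    Imports are deferred so this module loads even where peft and trl
    are not installed; output_dir is an arbitrary placeholder.
    """
    from peft import LoraConfig
    from trl import SFTConfig

    lora_config = LoraConfig(**LORA_KWARGS)
    sft_config = SFTConfig(output_dir=output_dir, **TRAIN_KWARGS)
    return lora_config, sft_config
```

The two objects would then typically be passed to TRL's SFTTrainer as peft_config and args, together with the base model and the train and eval splits. With α=32 and r=16, the LoRA update is scaled by α/r = 2.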

Built for the Build Small Hackathon (Thousand Token Wood track).
