metadata
license: apache-2.0
base_model: HuggingFaceTB/SmolLM2-135M-Instruct
tags:
- code-tape
- subtitle
- lora
- text-generation
code-tape Subtitle Postprocessor LoRA v12
LoRA adapter for the code-tape browser-local subtitle postprocessor. It is trained to correct ASR subtitles for frontend/code terminology and generate playback chapter jump points.
Contract
Input messages contain:
context: file name, code/runtime snippets, and glossary.inputSegments: subtitleidandtextonly.timeline: subtitleid,startMs, andendMs.
Output must be one JSON object:
{"segments":[{"id":"subtitle-1","text":"这里用 useState 维护 count"}],"chapters":[{"title":"状态设计","startMs":0,"endMs":1000}]}
segments should be sparse and contain only changed subtitles.
Training Notes
- Base:
HuggingFaceTB/SmolLM2-135M-Instruct - Records: 450 curated/distilled examples
- Epochs: 2
- Final train loss: 0.2545
- Corpus gates: JSON valid rate 1.0, sparse output rate 0.9333, unknown segment reference rate 0