---
license: apache-2.0
base_model: ceilf6/code-tape-subtitle-postprocessor-merged
library_name: transformers.js
pipeline_tag: text-generation
language:
- zh
- en
tags:
- onnx
- transformers.js
- webgpu
- wasm
- code-tape
- subtitle-correction
- chapter-generation
---

# code-tape subtitle postprocessor ONNX

This is the browser-local ONNX export of the code-tape subtitle post-processing model. It is the default LLM used by the code-tape web app for the "纠错并生成章节" ("correct errors and generate chapters") workflow.

The model receives ASR subtitle segments plus code context and returns strict JSON:

- sparse subtitle corrections for frontend/code terminology;
- playback chapter jump points derived from subtitle timestamps;
- no Markdown, no explanation, no extra wrapper text.

This model does not perform ASR. In code-tape, ASR is handled separately by Whisper; this ONNX model only post-processes the resulting subtitle text.

## Repository role

code-tape publishes this model family in three forms:

| Repository | Purpose |
| --- | --- |
| [`ceilf6/code-tape-subtitle-postprocessor-lora`](https://huggingface.co/ceilf6/code-tape-subtitle-postprocessor-lora) | LoRA adapter for reproducibility and continued fine-tuning. |
| [`ceilf6/code-tape-subtitle-postprocessor-merged`](https://huggingface.co/ceilf6/code-tape-subtitle-postprocessor-merged) | Full merged Hugging Face model. |
| [`ceilf6/code-tape-subtitle-postprocessor-onnx`](https://huggingface.co/ceilf6/code-tape-subtitle-postprocessor-onnx) | This Transformers.js-compatible ONNX export for browser-local inference. |

Use this repository when integrating with the browser app.

## Intended contract

Input payload:

```json
{
  "context": {
    "fileName": "SubtitlePanel.tsx",
    "code": "await postProcessor.process({ track, context });",
    "runtimeOutput": "",
    "glossary": ["SubtitlePanel", "postProcessor", "chapters"]
  },
  "inputSegments": [
    { "id": "subtitle-1", "text": "这里创建 hugging face 字幕 post processor" },
    { "id": "subtitle-2", "text": "最后生成 corrections 和 chapters" }
  ],
  "timeline": [
    { "id": "subtitle-1", "startMs": 0, "endMs": 1600 },
    { "id": "subtitle-2", "startMs": 1600, "endMs": 3300 }
  ]
}
```

Expected output shape:

```json
{
  "segments": [
    { "id": "subtitle-1", "text": "这里创建 Hugging Face 字幕 postProcessor" }
  ],
  "chapters": [
    { "title": "创建字幕后处理器", "startMs": 0, "endMs": 1600 },
    { "title": "生成纠错和章节", "startMs": 1600, "endMs": 3300 }
  ]
}
```

`segments` is a sparse change set. Omitted subtitle segments are treated as unchanged by the application.
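
The sparse change set can be merged back into the full subtitle track roughly as follows. This is a minimal sketch, not the application's actual code: `applySparseCorrections` is a hypothetical helper name, and the sample texts are illustrative.

```javascript
// Hypothetical helper: merge a sparse `segments` change set into the full track.
// Segments whose id is absent from `corrections` are returned unchanged.
function applySparseCorrections(inputSegments, corrections) {
  const correctedById = new Map(corrections.map((c) => [c.id, c.text]));
  return inputSegments.map((segment) =>
    correctedById.has(segment.id)
      ? { ...segment, text: correctedById.get(segment.id) }
      : segment,
  );
}

// Illustrative data: only the first segment is corrected.
const sourceTrack = [
  { id: "subtitle-1", text: "call use state here" },
  { id: "subtitle-2", text: "then render the counter" },
];
const sparseChanges = [{ id: "subtitle-1", text: "call useState here" }];
const mergedTrack = applySparseCorrections(sourceTrack, sparseChanges);
```

Building a `Map` keyed by id keeps the merge linear in the number of segments and leaves the original track objects untouched.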

## Browser usage

```javascript
import { pipeline } from "@huggingface/transformers";

const generator = await pipeline(
  "text-generation",
  "ceilf6/code-tape-subtitle-postprocessor-onnx",
  { device: "wasm", dtype: "q8" },
);

const messages = [
  {
    role: "system",
    content: [
      "You are the code-tape subtitle post-processing model.",
      "Only output one JSON object.",
      "Goal: correct ASR subtitle text for frontend/code terms and create playback chapter jump points.",
      'Output shape: {"segments":[{"id":"subtitle-1","text":"corrected text"}],"chapters":[{"title":"问题分析","startMs":0,"endMs":1000}]}',
    ].join("\n"),
  },
  {
    role: "user",
    content: JSON.stringify({
      context: { fileName: "Counter.tsx", code: "", runtimeOutput: "", glossary: ["useState"] },
      inputSegments: [{ id: "subtitle-1", text: "这里用 use state" }],
      timeline: [{ id: "subtitle-1", startMs: 0, endMs: 1200 }],
    }),
  },
];

const output = await generator(messages, {
  max_new_tokens: 384,
  do_sample: false,
  return_full_text: false,
});
```

In production, code-tape loads the validated WASM q8 path directly. The q4/q4f16 exports were not published for the current v12 artifact because local Transformers.js smoke testing produced malformed JSON. The application also handles browser cache write failures and validates every model response before applying it.
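
Before validation, the generated text has to be pulled out of the pipeline result and parsed. The sketch below is an assumption-laden illustration: `extractGeneratedText` and `parseJsonObject` are hypothetical helpers, and the result is assumed to be an array whose first item carries `generated_text` as either a string or a list of chat messages. Check the shape returned by your Transformers.js version before relying on it.

```javascript
// Hypothetical helper: read the assistant text from a text-generation result.
// Assumption: `generated_text` is either a string or an array of chat messages.
function extractGeneratedText(output) {
  const first = Array.isArray(output) ? output[0] : output;
  const generated = first?.generated_text;
  if (typeof generated === "string") return generated;
  if (Array.isArray(generated)) {
    const last = generated.at(-1);
    return typeof last?.content === "string" ? last.content : null;
  }
  return null;
}

// Hypothetical helper: strict parse that accepts only a single JSON object.
// Returns null instead of throwing so the caller can fall back safely.
function parseJsonObject(text) {
  if (typeof text !== "string") return null;
  try {
    const value = JSON.parse(text.trim());
    const isObject = value !== null && typeof value === "object" && !Array.isArray(value);
    return isObject ? value : null;
  } catch {
    return null;
  }
}
```

A `null` from either helper would be treated as a failed generation, leaving the original subtitles untouched.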

## Integration notes

- Public browser loading does not require a Hugging Face token.
- Keep prompts short. The code-tape app budgets source code, runtime output, and output token count to keep local inference responsive.
- Validate JSON before use. Invalid JSON, unknown segment ids, duplicate ids, empty text, overlapping chapters, or chapters outside the subtitle timeline must fall back safely.
- This model should run after ASR, not before ASR.
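
The validation rules listed above can be sketched as a single check. This is an illustration under stated assumptions, not the code-tape implementation: `validateResponse` is a hypothetical name, the response is assumed to be already parsed, and chapters are assumed to be listed in playback order.

```javascript
// Hypothetical validator for a parsed model response against the subtitle timeline.
function validateResponse(response, timeline) {
  const errors = [];
  const knownIds = new Set(timeline.map((entry) => entry.id));
  const segments = Array.isArray(response?.segments) ? response.segments : null;
  const chapters = Array.isArray(response?.chapters) ? response.chapters : null;
  if (!segments) errors.push("segments must be an array");
  if (!chapters) errors.push("chapters must be an array");

  // Segment checks: known id, no duplicates, non-empty text.
  const seenIds = new Set();
  for (const segment of segments ?? []) {
    if (!knownIds.has(segment?.id)) errors.push(`unknown segment id: ${segment?.id}`);
    else if (seenIds.has(segment.id)) errors.push(`duplicate segment id: ${segment.id}`);
    seenIds.add(segment?.id);
    if (typeof segment?.text !== "string" || segment.text.trim() === "") {
      errors.push(`empty text for segment: ${segment?.id}`);
    }
  }

  // Chapter checks: valid range, inside the timeline, ordered without overlap.
  const trackStart = Math.min(...timeline.map((entry) => entry.startMs));
  const trackEnd = Math.max(...timeline.map((entry) => entry.endMs));
  let previousEnd = -Infinity;
  for (const chapter of chapters ?? []) {
    const { startMs, endMs } = chapter ?? {};
    if (!Number.isFinite(startMs) || !Number.isFinite(endMs) || startMs >= endMs) {
      errors.push("chapter has an invalid time range");
      continue;
    }
    if (startMs < trackStart || endMs > trackEnd) errors.push("chapter outside subtitle timeline");
    if (startMs < previousEnd) errors.push("chapter overlaps or is out of order");
    previousEnd = Math.max(previousEnd, endMs);
  }
  return { ok: errors.length === 0, errors };
}
```

When `ok` is false, the safe fallback is to discard the whole response rather than apply a partial result.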

## Training and export lineage

1. Fine-tune a LoRA adapter from `HuggingFaceTB/SmolLM2-135M-Instruct`.
2. Merge the adapter into a full Hugging Face model.
3. Export/quantize the merged model to ONNX for `@huggingface/transformers` browser inference.

## Evaluation

code-tape evaluates this model family with project-specific checks:

- JSON parseability;
- sparse segment reference validity;
- glossary preservation after sparse corrections are applied to the source subtitles;
- chapter ordering, overlap, and bounds within the subtitle timeline.
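
As one possible reading of the glossary preservation check, the sketch below reports glossary terms that appear in the source subtitles but vanish once corrections are applied. The function name `findLostGlossaryTerms` and the exact, case-sensitive substring match are assumptions made for illustration; the project's real check may differ.

```javascript
// Hypothetical check: which glossary terms were present before correction
// but are missing from the corrected track?
function findLostGlossaryTerms(sourceSegments, correctedSegments, glossary) {
  const sourceText = sourceSegments.map((segment) => segment.text).join("\n");
  const correctedText = correctedSegments.map((segment) => segment.text).join("\n");
  return glossary.filter(
    (term) => sourceText.includes(term) && !correctedText.includes(term),
  );
}

// Illustrative data: the correction wrongly breaks up `useState`.
const before = [{ id: "subtitle-1", text: "call useState here" }];
const after = [{ id: "subtitle-1", text: "call use state here" }];
const lostTerms = findLostGlossaryTerms(before, after, ["useState", "useEffect"]);
```

An empty result means every glossary term that was present in the source survived the corrections.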

No broad general-purpose benchmark score is claimed.

## Current v12 smoke result

On the code-tape validation prompt with the `inputSegments` plus `timeline` contract:

- q8 load: 651 ms;
- q8 generation: 1274 ms;
- JSON valid: yes;
- unknown segment ids: 0;
- extra timing fields inside `segments`: 0.

## Limitations

- The model is small and domain-specific; malformed JSON is possible.
- It is optimized for frontend/code explanation subtitles, not arbitrary subtitles.
- It cannot transcribe audio.
- Long subtitle tracks should be split before local browser inference.
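
Splitting a long track could look like the sketch below, which cuts it into fixed-size windows that each carry their own `inputSegments` and `timeline`. The helper name `splitTrack` and the choice of a segment-count limit are assumptions; a token-based budget may suit real prompts better.

```javascript
// Hypothetical helper: split a long track into chunks of at most `maxSegments`.
// Each chunk keeps only the timeline entries for its own segments.
function splitTrack(inputSegments, timeline, maxSegments) {
  if (!Number.isInteger(maxSegments) || maxSegments < 1) {
    throw new RangeError("maxSegments must be a positive integer");
  }
  const timingById = new Map(timeline.map((entry) => [entry.id, entry]));
  const chunks = [];
  for (let i = 0; i < inputSegments.length; i += maxSegments) {
    const segments = inputSegments.slice(i, i + maxSegments);
    chunks.push({
      inputSegments: segments,
      timeline: segments.map((s) => timingById.get(s.id)).filter(Boolean),
    });
  }
  return chunks;
}
```

Each chunk can then be sent as a separate request; chapters returned per chunk already use absolute `startMs` and `endMs` values because the timeline entries are passed through unchanged.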

## Privacy and security

The intended path is browser-local inference. Audio transcription, subtitle correction, and chapter generation can run without sending media or subtitles to a hosted inference API.

Do not include secrets, private source code, credentials, or access tokens in prompts unless you control the full runtime and storage environment.

## License

Apache-2.0, following the base model license.