fullstop-punctuation-coreml-fp16

oliverguhr/fullstop-punctuation-multilang-large (XLM-RoBERTa-large punctuation restoration, trained on Europarl speech transcripts) converted to Core ML for on-device use.

punctuation.mlmodelc/ — compiled Core ML model, fp16 activations, fp16 weights (~1.0 GB)
sentencepiece.bpe.model — the XLM-RoBERTa SentencePiece tokenizer model
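
The inputs described under Model details use the Hugging Face XLM-RoBERTa id scheme, not the raw ids from the SentencePiece file. Below is a minimal sketch of that mapping and of padding to the fixed 256-token window. It assumes the standard Hugging Face convention (`<s>`=0, `<pad>`=1, `</s>`=2, `<unk>`=3, every other piece shifted by one) and that each window is wrapped in `<s>` and `</s>`. The helper names and the sample ids are hypothetical, and this is not necessarily how the Babble app prepares its inputs.

```python
# Sketch: turn raw SentencePiece ids into the model's two input tensors.
# Assumes the standard Hugging Face XLM-RoBERTa convention:
#   <s>=0, <pad>=1, </s>=2, <unk>=3, and every other piece is spm_id + 1.
BOS_ID, PAD_ID, EOS_ID, UNK_ID = 0, 1, 2, 3
WINDOW = 256  # fixed sequence length of the Core ML model


def spm_to_hf_ids(spm_ids):
    """Map raw SentencePiece ids to HF-style ids (raw id 0 is <unk>)."""
    return [i + 1 if i != 0 else UNK_ID for i in spm_ids]


def build_inputs(spm_ids, window=WINDOW):
    """Return (input_ids, attention_mask), each a list of `window` ints."""
    # Leave room for <s> and </s>; anything beyond that is truncated here.
    body = spm_to_hf_ids(spm_ids)[: window - 2]
    ids = [BOS_ID] + body + [EOS_ID]
    mask = [1] * len(ids)
    padding = window - len(ids)
    return ids + [PAD_ID] * padding, mask + [0] * padding


# Hypothetical raw ids, standing in for the output of the SentencePiece encoder.
input_ids, attention_mask = build_inputs([35377, 0, 7])
print(input_ids[:6], attention_mask[:6])
```

This sketch simply truncates long input. Text longer than 254 subtokens would need to be split across several windows, which is not shown here.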

Model details

Inputs: input_ids [1, 256] int32, attention_mask [1, 256] int32 (HF XLM-RoBERTa id scheme, pad=1)
Outputs: label_preds [1, 256] int32 (argmax label per subtoken), label_logits [1, 256, 6] (raw scores)
Labels: "0" (no punctuation), ".", ",", "?", "-", ":"; each is the mark to append after the word ending at that subtoken
Conversion verified by exact-label parity with the PyTorch fp32 reference
~54 ms per 256-token window on Apple Silicon GPU; loads in ~0.3–3 s (the compile is cached by macOS after the first launch)
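
As a rough illustration of how the per-subtoken labels become punctuated text, the sketch below joins SentencePiece pieces into words and appends the mark predicted at each word's last subtoken. It assumes label indices 0 to 5 follow the order in the Labels line, and that `pieces` and `label_preds` are already aligned with `<s>`, `</s>` and padding removed. The function name and the sample data are hypothetical.

```python
# Sketch: apply per-subtoken labels to SentencePiece pieces.
# Assumed label order, taken from the list above: none, ".", ",", "?", "-", ":"
LABELS = ["", ".", ",", "?", "-", ":"]
WORD_START = "\u2581"  # SentencePiece marker at the start of each word


def punctuate(pieces, label_preds):
    """Join pieces into words; the label on a word's last piece decides its mark."""
    words = []  # each entry is [text, label]
    for piece, label in zip(pieces, label_preds):
        if piece.startswith(WORD_START) or not words:
            words.append([piece.lstrip(WORD_START), label])
        else:
            words[-1][0] += piece
            words[-1][1] = label  # later subtoken overrides earlier ones
    return " ".join(text + LABELS[label] for text, label in words if text)


# Hypothetical pieces and predictions for one short utterance.
pieces = ["\u2581hello", "\u2581how", "\u2581are", "\u2581you", "\u2581do", "ing"]
label_preds = [2, 0, 0, 0, 0, 3]
print(punctuate(pieces, label_preds))
```

Capitalization after sentence-ending marks is left out, since the model only predicts punctuation.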

Plain fp16 was chosen deliberately: int8 per-block weight compression halves the size, but its inline dequantize ops cost ~12 s of uncached GPU shader compilation on every process launch.

Converted for the Babble dictation app.

License

MIT, following the original model.
