---
license: mit
base_model: lytang/MiniCheck-RoBERTa-Large
tags:
- coreml
- text-classification
- fact-checking
- grounding
language:
- en
---
# MiniCheck-RoBERTa-Large — Core ML (Apple Neural Engine)
Core ML conversion of [lytang/MiniCheck-RoBERTa-Large](https://huggingface.co/lytang/MiniCheck-RoBERTa-Large)
(MIT license), a specialized grounding / fact-verification model, for in-app use on the **Apple Neural Engine**
via Core ML. Marvel Mirror AI uses it as a claim-by-claim faithfulness judge: *does a source support a claim?*
## Contents
- `MiniCheckRoBERTa.mlpackage` — the Core ML model (fp16 weights). Inputs: `input_ids`, `attention_mask`
(int32, length 512). Output: `support_prob` = probability the claim is supported (class 1).
- RoBERTa fast-tokenizer files (`tokenizer.json`, `vocab.json`, `merges.txt`, `tokenizer_config.json`,
`special_tokens_map.json`).
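
The fixed-length inputs listed above can be prepared roughly as follows. This is a minimal sketch, not the app's actual code: `pack_features` and `predict_support` are hypothetical helper names, the pad id of 1 is the standard RoBERTa `<pad>` id, and the `(1, 512)` batch shape passed to Core ML is an assumption about how the package was exported. Plain truncation is used here for brevity; a real tokenizer with truncation enabled would also keep the closing special token.

```python
from typing import Dict, List, Sequence

SEQ_LEN = 512  # fixed input length stated in this card
PAD_ID = 1     # <pad> id in the standard RoBERTa vocabulary


def pack_features(token_ids: Sequence[int], seq_len: int = SEQ_LEN,
                  pad_id: int = PAD_ID) -> Dict[str, List[int]]:
    """Truncate or right-pad token ids and build the matching attention mask."""
    ids = list(token_ids)[:seq_len]
    n_real = len(ids)
    n_pad = seq_len - n_real
    return {
        "input_ids": ids + [pad_id] * n_pad,
        "attention_mask": [1] * n_real + [0] * n_pad,
    }


def predict_support(mlpackage_path: str, features: Dict[str, List[int]]) -> float:
    """Run one check through coremltools (prediction requires macOS)."""
    # Imported lazily so the packing helper works without these packages.
    import numpy as np
    import coremltools as ct

    model = ct.models.MLModel(mlpackage_path,
                              compute_units=ct.ComputeUnit.CPU_AND_NE)
    # Assumption: the model expects int32 arrays with a leading batch axis of 1.
    inputs = {name: np.asarray([values], dtype=np.int32)
              for name, values in features.items()}
    out = model.predict(inputs)
    return float(np.asarray(out["support_prob"]).reshape(-1)[0])
```

If the exported input shape turns out to be `(512,)` rather than `(1, 512)`, only the array construction in `predict_support` would need to change.
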
## Input format
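Concatenate the source document and the claim as `doc + </s> + claim`, then tokenize the result with the
bundled RoBERTa tokenizer (`max_length` 512, padded to that length). A `support_prob > 0.5` means the
claim is supported.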
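
As a rough illustration of this format, the sketch below builds the input string and applies the 0.5 threshold. The helper names (`build_input`, `is_supported`, `encode`) are hypothetical, and passing `truncation=True` is an assumption; the card only states `max_length` 512 with padding. `encode` expects a local directory containing the tokenizer files from this repository.

```python
SEP = "</s>"      # RoBERTa separator / end-of-sequence token
THRESHOLD = 0.5   # decision threshold stated in this card


def build_input(doc: str, claim: str) -> str:
    """Join the source document and the claim with the separator token."""
    return doc + SEP + claim


def is_supported(support_prob: float, threshold: float = THRESHOLD) -> bool:
    """Return True only when the probability is strictly above the threshold."""
    return support_prob > threshold


def encode(text: str, tokenizer_dir: str):
    """Tokenize to a fixed length of 512 using the bundled tokenizer files."""
    # Imported lazily so the two helpers above need no third-party packages.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_dir)
    # Assumption: over-long inputs are truncated rather than rejected.
    return tokenizer(text, max_length=512, padding="max_length", truncation=True)
```

The strict `>` follows the rule as written in this card, so a probability of exactly 0.5 counts as not supported.
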
## Provenance
Converted with coremltools 9.0 (torch 2.7.0 / transformers 4.46.3), targeting CPU + Neural Engine.
About 97% of compute-bearing ops run on the ANE. Verdicts match the PyTorch source (maximum probability
difference < 0.007), and the conversion reproduces the source's accuracy on the full set (21/21
fabrications caught, including all meaning and numeric inversions). Latency is about 60 ms per check on the ANE.
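
A parity check of this kind could be computed along the following lines. This is a sketch under assumptions, not the script used for the figures above: `parity_report` is a hypothetical helper, and it assumes both models were scored on the same ordered list of (document, claim) pairs.

```python
from typing import Sequence, Tuple


def parity_report(ref_probs: Sequence[float], coreml_probs: Sequence[float],
                  threshold: float = 0.5) -> Tuple[float, bool]:
    """Return (max absolute probability difference, whether all verdicts agree)."""
    if len(ref_probs) != len(coreml_probs):
        raise ValueError("both models must be scored on the same examples")
    pairs = list(zip(ref_probs, coreml_probs))
    # Largest gap between the PyTorch and Core ML probabilities.
    max_diff = max((abs(a - b) for a, b in pairs), default=0.0)
    # Verdict parity: both sides fall on the same side of the threshold.
    verdicts_match = all((a > threshold) == (b > threshold) for a, b in pairs)
    return max_diff, verdicts_match
```

Verdict parity is the stricter requirement in practice: a small probability difference can still flip a verdict when the reference probability sits close to the threshold.
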