---
license: mit
base_model: lytang/MiniCheck-RoBERTa-Large
tags:
- coreml
- text-classification
- fact-checking
- grounding
language:
- en
---
# MiniCheck-RoBERTa-Large: Core ML (Apple Neural Engine)
Core ML conversion of [lytang/MiniCheck-RoBERTa-Large](https://huggingface.co/lytang/MiniCheck-RoBERTa-Large)
(MIT), a specialized grounding / fact-verification model, for in-app use on the **Apple Neural Engine**
via Core ML. Marvel Mirror AI uses it as a claim-by-claim faithfulness judge: *does a source support a claim?*
## Contents
- `MiniCheckRoBERTa.mlpackage`: the Core ML model (fp16 weights). Inputs: `input_ids`, `attention_mask`
  (int32, length 512). Output: `support_prob` = probability the claim is supported (class 1).
- RoBERTa fast-tokenizer files (`tokenizer.json`, `vocab.json`, `merges.txt`, `tokenizer_config.json`,
  `special_tokens_map.json`).
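
A minimal Python sketch of calling the package through coremltools, for example to spot-check it on a Mac. It assumes each input is an int32 array of shape (1, 512) and that `support_prob` comes back as a one-element array. Neither shape is stated in this card, and the helper names `to_feature_lists` and `predict_support` are hypothetical.

```python
from typing import Dict, List, Sequence

SEQ_LEN = 512  # fixed input length stated in this card


def to_feature_lists(
    input_ids: Sequence[int], attention_mask: Sequence[int]
) -> Dict[str, List[List[int]]]:
    """Check lengths and wrap each sequence in a batch dimension of 1 (assumed)."""
    if len(input_ids) != SEQ_LEN or len(attention_mask) != SEQ_LEN:
        raise ValueError(
            f"expected {SEQ_LEN} values per input, "
            f"got {len(input_ids)} and {len(attention_mask)}"
        )
    return {
        "input_ids": [list(input_ids)],
        "attention_mask": [list(attention_mask)],
    }


def predict_support(
    model_path: str, input_ids: Sequence[int], attention_mask: Sequence[int]
) -> float:
    """Run one check and return support_prob as a Python float."""
    # Imported lazily: prediction through coremltools only works on macOS.
    import coremltools as ct
    import numpy as np

    model = ct.models.MLModel(model_path, compute_units=ct.ComputeUnit.CPU_AND_NE)
    features = {
        name: np.asarray(values, dtype=np.int32)
        for name, values in to_feature_lists(input_ids, attention_mask).items()
    }
    output = model.predict(features)
    # Assumption: the output holds a single probability, whatever its exact shape.
    return float(np.asarray(output["support_prob"]).reshape(-1)[0])
```

On a Mac, `predict_support("MiniCheckRoBERTa.mlpackage", ids, mask)` would then return the support probability for one tokenized document/claim pair.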
## Input format
Join the source document and the claim as `doc + </s> + claim`, then tokenize with the RoBERTa tokenizer
(`max_length` 512, padded). A `support_prob` above 0.5 means the claim is supported.
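
The format above can be sketched in plain Python. This is an illustration under assumptions, not the app's code: the helper names are hypothetical, the pad id of 1 is the standard RoBERTa `<pad>` id and is assumed to apply to the shipped tokenizer files, and the clipping of long inputs is a naive stand-in for the tokenizer's own truncation.

```python
from typing import List, Tuple

SEP = "</s>"      # separator named in this card
MAX_LENGTH = 512  # fixed sequence length from this card
PAD_ID = 1        # <pad> id in the standard RoBERTa vocabulary (assumption)


def build_text(doc: str, claim: str) -> str:
    """Join source document and claim in the order the model expects."""
    return doc + SEP + claim


def pad_ids(token_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Clip or pad token ids to MAX_LENGTH and build the matching attention mask.

    Clipping here simply drops trailing ids; a real tokenizer's truncation
    would also keep the closing special token.
    """
    ids = list(token_ids[:MAX_LENGTH])
    n_pad = MAX_LENGTH - len(ids)
    mask = [1] * len(ids) + [0] * n_pad
    return ids + [PAD_ID] * n_pad, mask


def is_supported(support_prob: float, threshold: float = 0.5) -> bool:
    """Apply the decision rule from this card: strictly above 0.5 is supported."""
    return support_prob > threshold
```

The token ids themselves would come from the bundled tokenizer, for instance `Tokenizer.from_file("tokenizer.json").encode(text).ids` with the Hugging Face `tokenizers` package.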
## Provenance
Converted with coremltools 9.0 (torch 2.7.0 / transformers 4.46.3), targeting CPU + Neural Engine.
About 97% of compute-bearing ops run on the ANE. Verdicts match the PyTorch source (max probability
difference < 0.007), and the conversion reproduces the source's full-set accuracy (21/21 fabrications
caught, including all meaning and numeric inversions). Each check takes about 60 ms on the ANE.
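
One way to compute the two parity figures quoted above (maximum probability difference and verdict agreement) is sketched below. The function name `parity_report` is hypothetical, and the probabilities in the example are made up for illustration; they are not measurements from this model.

```python
from typing import Sequence, Tuple


def parity_report(
    reference: Sequence[float],
    converted: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[float, int]:
    """Return (max absolute probability difference, number of verdict flips)."""
    if len(reference) != len(converted):
        raise ValueError("both runs must score the same examples")
    pairs = list(zip(reference, converted))
    max_diff = max((abs(r - c) for r, c in pairs), default=0.0)
    # A flip is an example whose verdict changes between the two runs.
    flips = sum((r > threshold) != (c > threshold) for r, c in pairs)
    return max_diff, flips


# Made-up probabilities for illustration only.
pytorch_probs = [0.98, 0.03, 0.61]
coreml_probs = [0.975, 0.034, 0.607]
max_diff, flips = parity_report(pytorch_probs, coreml_probs)
print(f"max diff = {max_diff:.4f}, verdict flips = {flips}")
```

Verdict parity in this sense means zero flips. A small maximum difference does not guarantee it on its own, since a probability close to 0.5 can cross the threshold.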