---
license: mit
library_name: onnx
tags:
- alignscore
- onnx
- text-classification
- natural-language-inference
- roberta
base_model: yzha/AlignScore
pipeline_tag: text-classification
---

# AlignScore-large (ONNX)

An ONNX export of [**AlignScore-large**](https://huggingface.co/yzha/AlignScore) (RoBERTa-large with AlignScore's 3-way and regression heads), for in-process inference without a Python/PyTorch runtime.

## Why this exists

Upstream AlignScore ships only PyTorch Lightning checkpoints built from a custom `BERTAlignModel` module - no `config.json`, no `safetensors` - so `optimum-cli export onnx` cannot consume it, and no ONNX build existed. This is that build, so anyone who wants to experiment with AlignScore-large can, without standing up a PyTorch runtime or writing a custom export pathway.

It reflects a Familiar Tools belief: a specialized, right-sized model that runs efficiently and in-process beats reaching for a large, general, resource-hungry one. Exporting a focused model to ONNX is part of that - it makes the model cheap to run, easy to embed, and light on dependencies. Custom, deliberately engineered solutions tend to be more efficient and more resource-aware than general-purpose defaults.

## What this is

This is a faithful ONNX export of the encoder + pooler + the two heads used for alignment scoring:

- a RoBERTa-large encoder with pooling layer (`pooler_output = tanh(dense(h[:,0]))`)
- `tri_layer`: `Linear(hidden, 3)` -> raw 3-way logits
- `reg_layer`: `Linear(hidden, 1)` -> raw regression logit

Activations are **not** baked into the graph: it emits raw logits, and the caller applies `softmax` to the 3-way head and `sigmoid` to the regression head.

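
The pooler and the two heads described above can be sketched in plain NumPy. This is only an illustration of the shapes and the order of operations: the weights are random placeholders rather than the AlignScore parameters, the parameter names (`pooler_w`, `tri_w`, `reg_w`, and so on) are hypothetical, and the encoder output is a random stand-in. The hidden size of 1024 is the standard RoBERTa-large value.

```python
import numpy as np

HIDDEN = 1024  # hidden size of RoBERTa-large

rng = np.random.default_rng(0)

# Placeholder parameters: random values, NOT the real AlignScore weights.
# Matrices are stored as [in, out], so the forward pass is x @ W + b.
params = {
    "pooler_w": rng.normal(scale=0.02, size=(HIDDEN, HIDDEN)),
    "pooler_b": np.zeros(HIDDEN),
    "tri_w": rng.normal(scale=0.02, size=(HIDDEN, 3)),
    "tri_b": np.zeros(3),
    "reg_w": rng.normal(scale=0.02, size=(HIDDEN, 1)),
    "reg_b": np.zeros(1),
}


def heads(h, p):
    """Map encoder hidden states [batch, seq, hidden] to raw logits."""
    # Pooler: dense + tanh on the first token's hidden state.
    pooled = np.tanh(h[:, 0] @ p["pooler_w"] + p["pooler_b"])  # [batch, hidden]
    tri_logits = pooled @ p["tri_w"] + p["tri_b"]  # [batch, 3]
    reg_logit = pooled @ p["reg_w"] + p["reg_b"]  # [batch, 1]
    return tri_logits, reg_logit


def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Random stand-in for the encoder output of 2 pairs, 8 tokens each.
h = rng.normal(size=(2, 8, HIDDEN))
tri_logits, reg_logit = heads(h, params)

# The caller-side activations that the graph leaves out.
tri_p = softmax(tri_logits)  # each row sums to 1
reg_p = sigmoid(reg_logit)  # each value lies in (0, 1)
```

Keeping the activations outside the graph means the same raw logits can feed either the probabilities above or any other post-processing a caller prefers.
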
### Graph I/O

| Tensor | Direction | Type | Shape |
|--------|-----------|------|-------|
| `input_ids` | input | int64 | `[batch, seq]` (dynamic) |
| `attention_mask` | input | int64 | `[batch, seq]` (dynamic) |
| `tri_logits` | output | float32 | `[batch, 3]` -> softmax -> `[p_aligned, p_neutral, p_contradict]` |
| `reg_logit` | output | float32 | `[batch, 1]` -> sigmoid -> `p_aligned_reg` |

There is no `token_type_ids` input: a sentence pair is encoded into a single `input_ids` sequence with `</s>` separators, exactly as `AutoTokenizer.from_pretrained("roberta-large")(context, claim)` produces. The bundled `tokenizer.json` is the matching fast tokenizer. Use `max_length=512`.

Opset 17. Exported with PyTorch (TorchScript exporter).

## Files

- `alignscore-large.onnx` - the model (~1.36 GB)
- `tokenizer.json` - roberta-large fast tokenizer with the pair post-processor

## Parity

Verified against the original PyTorch model's scores on a 136-pair corpus: **max absolute difference 1.18e-07** across both directions and all four probabilities. The source checkpoint SHA-256 is asserted equal to the reference before export, so these are provably the same weights. A high-confidence contradiction pair scores `p_contradict_3way` 0.982 / 0.977 (forward / reverse).

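
The two checks behind the parity claim, a checkpoint hash comparison and a maximum absolute difference over scores, could look roughly like the sketch below. It is not the script used for this export: `sha256_of` and `max_abs_diff` are hypothetical helper names, and the score arrays are made-up placeholders rather than values from the 136-pair corpus.

```python
import hashlib

import numpy as np


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so a large checkpoint need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def max_abs_diff(reference, candidate):
    """Largest element-wise absolute gap between two score arrays of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    return float(np.abs(reference - candidate).max())


# Placeholder scores: one row per pair, columns are the four probabilities
# (p_aligned_3way, p_neutral_3way, p_contradict_3way, p_aligned_reg).
reference_scores = np.array([[0.90, 0.07, 0.03, 0.88],
                             [0.01, 0.02, 0.97, 0.05]])
onnx_scores = reference_scores + 1e-7  # simulated float32-level drift

gap = max_abs_diff(reference_scores, onnx_scores)
```

In a real run, `reference_scores` would come from the PyTorch model and `onnx_scores` from this graph on the same pairs, and the hash of the local checkpoint would be compared against a known reference digest before exporting.
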
## Usage (ONNX Runtime, Python)

```python
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer.json")
sess = ort.InferenceSession("alignscore-large.onnx")

def score(context, claim):
    # Encode the pair into a single sequence (batch of one).
    enc = tok.encode(context, claim)
    ids = np.array([enc.ids], dtype=np.int64)
    mask = np.array([enc.attention_mask], dtype=np.int64)
    tri, reg = sess.run(["tri_logits", "reg_logit"],
                        {"input_ids": ids, "attention_mask": mask})
    # Numerically stable softmax over the 3-way logits.
    e = np.exp(tri[0] - tri[0].max())
    tri_p = e / e.sum()
    return {
        "p_aligned_3way": float(tri_p[0]),
        "p_neutral_3way": float(tri_p[1]),
        "p_contradict_3way": float(tri_p[2]),
        # Sigmoid on the regression logit.
        "p_aligned_reg": float(1 / (1 + np.exp(-reg[0][0]))),
    }
```

## License and attribution

Released under the **MIT License**, matching upstream.

- AlignScore: Zha et al., *AlignScore: Evaluating Factual Consistency with a Unified Alignment Function*, ACL 2023 ([arXiv:2305.16739](https://arxiv.org/abs/2305.16739), [code](https://github.com/yuh-zha/AlignScore)).
- Original weights: [`yzha/AlignScore`](https://huggingface.co/yzha/AlignScore) (revision `8509e78d25bb914939fc585c626500c9b2944249`).
- Base encoder: RoBERTa-large (Liu et al., 2019).

This repo redistributes a derivative (ONNX export) of the above under the same MIT terms. No weights were retrained or modified; only the inference graph was re-expressed.