# @cp500/infon-coref
Multilingual coreference resolution in the browser or Node, via ONNX.
The trained model is a pointer-network coref resolver fine-tuned on
top of a multilingual MiniLM-L12 distilled from XLM-R. It handles
**English, Japanese, Korean, Thai, and Chinese**, and replaces
English-only [fastcoref](https://github.com/shon-otmazgin/fastcoref)
for use cases that need multilingual coverage.
The model artefacts live at
[**cp500/infon-coref-pointer**](https://huggingface.co/cp500/infon-coref-pointer)
on the Hugging Face Hub. This package is the JavaScript client that
loads them.
## Install
```bash
npm install @cp500/infon-coref onnxruntime-web
# or for Node:
npm install @cp500/infon-coref onnxruntime-node
```
The ONNX runtime is a **peer dependency**, so you only install the one
your environment needs. `@huggingface/tokenizers` is **optional**:
if it is installed, the package uses its WASM SentencePiece tokenizer
(faster and fully spec-compliant). Otherwise the package falls back to
a minimal pure-JS tokenizer that handles the XLM-R vocabulary.
## Quick start (browser)
```ts
import { InfonCorefModel } from '@cp500/infon-coref';

const model = await InfonCorefModel.fromHub('cp500/infon-coref-pointer', {
  precision: 'fp16', // 'fp16' (default, ~235 MB) or 'fp32' (~470 MB)
  device: 'auto',    // tries WebGPU, falls back to WASM
});

const result = await model.resolve(
  'Toyota announced a partnership with Panasonic on battery technology. ' +
  'The Japanese automaker said the deal is worth $250 million.'
);

for (const cluster of result.clusters) {
  const surfaces = cluster.map(i => result.mentions[i].text);
  console.log(surfaces.join(' → '));
  // Toyota → The Japanese automaker
}
```
## Quick start (Node)
```ts
import { InfonCorefModel } from '@cp500/infon-coref';
// Same API as fromHub, but reads from local files (e.g. after a
// huggingface-cli download).
const model = await InfonCorefModel.fromLocal('./models/infon-coref/');
const result = await model.resolve('Toyota e Panasonic anunciaram...');
```
## What you get back
```ts
interface CorefResult {
  text: string;          // original input, unchanged
  tokens: Token[];       // wordpieces with char offsets
  mentions: Mention[];   // detected mentions in document order
  clusters: number[][];  // clusters[c] = list of mention indices
  timing: {
    tokenize: number;
    backbone: number;
    bioDecode: number;
    scorer: number;
    total: number;       // ms
  };
}

interface Mention {
  start: number;       // wordpiece index, inclusive
  end: number;         // wordpiece index, inclusive
  charStart: number;   // char offset in source text
  charEnd: number;
  text: string;        // literal substring of source text
  cluster: number;     // -1 for singleton
  antecedent: number;  // 0-based mention index, -1 = no antecedent
}
```
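The `antecedent` and `cluster` fields are two views of the same links: following antecedent pointers and merging the linked mentions yields the cluster lists. The sketch below shows one way to derive clusters from antecedents with a union-find. It is an illustration, not the package's own code: `MentionLink` and `clustersFromAntecedents` are hypothetical names, and it assumes that `clusters` only lists groups of two or more mentions (consistent with `cluster: -1` for singletons).

```typescript
// Hypothetical minimal shape: only the field this sketch needs.
interface MentionLink {
  antecedent: number; // 0-based mention index, -1 = no antecedent
}

function clustersFromAntecedents(mentions: MentionLink[]): {
  clusters: number[][]; // only groups with two or more mentions (assumption)
  clusterOf: number[];  // cluster index per mention, -1 for singleton
} {
  // Union-find over mention indices; each mention starts as its own root.
  const parent: number[] = mentions.map((_, i) => i);
  const find = (x: number): number => {
    while (parent[x] !== x) {
      parent[x] = parent[parent[x]]; // path halving keeps trees shallow
      x = parent[x];
    }
    return x;
  };

  mentions.forEach((m, i) => {
    if (m.antecedent < 0) return;
    const a = find(i);
    const b = find(m.antecedent);
    // The smaller index becomes the root, so a root is always the
    // earliest mention of its group.
    if (a !== b) parent[Math.max(a, b)] = Math.min(a, b);
  });

  // Bucket mentions by root; buckets are ordered by first mention.
  const groups: number[][] = mentions.map(() => []);
  mentions.forEach((_, i) => {
    groups[find(i)].push(i);
  });

  const clusters = groups.filter(g => g.length > 1);
  const clusterOf: number[] = mentions.map(() => -1);
  clusters.forEach((g, c) => g.forEach(i => { clusterOf[i] = c; }));
  return { clusters, clusterOf };
}

// Mention 2 points at 0, mention 3 points at 2; mentions 1 and 4 are singletons.
const demoLinks = clustersFromAntecedents([
  { antecedent: -1 },
  { antecedent: -1 },
  { antecedent: 0 },
  { antecedent: 2 },
  { antecedent: -1 },
]);
console.log(demoLinks.clusters);  // [ [ 0, 2, 3 ] ]
console.log(demoLinks.clusterOf); // [ 0, -1, 0, 0, -1 ]
```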
## Languages
Trained on synthetic Bedrock/Claude-generated data balanced across:
| Code | Language |
|------|----------------|
| `en` | English |
| `ja` | Japanese |
| `ko` | Korean |
| `th` | Thai |
| `zh` | Chinese (Simplified) |
The XLM-R backbone covers ~100 languages, but the mention-detection
and pointer-network heads were trained only on these five. Other
languages may work via zero-shot transfer; verify on your domain
before shipping.
## API
### `InfonCorefModel.fromHub(repo, options?)`
Load model artefacts from a Hugging Face repo. Downloads (and caches
in the browser Cache API) ``meta.json``, the chosen ONNX backbone,
the mention scorer, and ``tokenizer.json``.
| Option | Type | Default | Notes |
|----------------|-----------------------------------------|-----------|-------|
| `precision` | `'fp32' \| 'fp16'` | `'fp16'` | FP16 halves the download. Falls back to FP32 if FP16 is missing in the repo. |
| `device` | `'auto' \| 'webgpu' \| 'wasm' \| 'cpu' \| 'cuda'` | `'auto'` | Browser auto-prefers WebGPU. |
| `maxLength` | `number` | `256` | Truncates inputs longer than N wordpieces. |
| `bioThreshold` | `number` | none | If set, suppresses low-confidence span detections. `0.7` is a common stricter setting. |
| `revision` | `string` | `'main'` | HF branch/tag/commit-SHA pin. |
| `debug` | `boolean` | `false` | Logs per-stage timings to `console.debug`. |
### `InfonCorefModel.fromLocal(baseUrl, options?)`
Same as `fromHub` but loads files relative to a base URL or
filesystem path. Browser: `baseUrl` is a URL prefix
(`/models/coref/`). Node: a directory path (`./models/coref/`).
The directory must contain:
```
meta.json
tokenizer.json
onnx/backbone_bio.onnx (and .onnx.data sidecar if present)
onnx/backbone_bio_fp16.onnx
onnx/mention_scorer.onnx
onnx/mention_scorer_fp16.onnx
```
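The file names above, together with the FP16-to-FP32 fallback described for `precision`, suggest a simple selection rule. The sketch below is one possible reading of it, not the loader the package ships: `pickOnnxFiles` is a hypothetical helper, and falling back per file (rather than for both graphs at once) is an assumption.

```typescript
type Precision = 'fp32' | 'fp16';

// Hypothetical helper: choose which ONNX files to load for a precision,
// falling back to the FP32 file when the FP16 variant is not available.
function pickOnnxFiles(
  precision: Precision,
  available: string[],
): { backbone: string; scorer: string } {
  const has = (path: string): boolean => available.indexOf(path) !== -1;
  const choose = (stem: string): string => {
    const fp16 = `onnx/${stem}_fp16.onnx`;
    const fp32 = `onnx/${stem}.onnx`;
    if (precision === 'fp16' && has(fp16)) return fp16;
    if (has(fp32)) return fp32;
    throw new Error(`no usable ONNX file for ${stem}`);
  };
  return {
    backbone: choose('backbone_bio'),
    scorer: choose('mention_scorer'),
  };
}

// The FP16 scorer is missing here, so only the scorer falls back to FP32.
const pickedFiles = pickOnnxFiles('fp16', [
  'onnx/backbone_bio.onnx',
  'onnx/backbone_bio_fp16.onnx',
  'onnx/mention_scorer.onnx',
]);
console.log(pickedFiles);
```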
### `model.resolve(text, options?)`
Run end-to-end coref on a single document. Returns
[`CorefResult`](#what-you-get-back).
`options` accepts per-call overrides for the `maxLength`,
`bioThreshold`, and `debug` settings described under `fromHub`.
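A common way to consume the result is to rewrite the text so that every later mention in a cluster is replaced by the cluster's first mention. The sketch below does this on a hand-built object shaped like `CorefResult`, so it runs without the model. `ResultLike`, `spanOf`, and `substituteHeads` are hypothetical names, and it assumes `charEnd` is exclusive (so that `text.slice(charStart, charEnd)` equals the mention text); check that against real output before relying on it.

```typescript
// Hypothetical subset of the result shape used by this sketch.
interface MentionSpan { charStart: number; charEnd: number; text: string; }
interface ResultLike { text: string; mentions: MentionSpan[]; clusters: number[][]; }

// Replace every non-first mention of a cluster with the first mention's text.
function substituteHeads(result: ResultLike): string {
  const edits: { start: number; end: number; replacement: string }[] = [];
  for (const cluster of result.clusters) {
    const head = result.mentions[cluster[0]];
    for (const idx of cluster.slice(1)) {
      const m = result.mentions[idx];
      edits.push({ start: m.charStart, end: m.charEnd, replacement: head.text });
    }
  }
  // Apply right to left so earlier offsets stay valid after each splice.
  edits.sort((a, b) => b.start - a.start);
  let out = result.text;
  for (const e of edits) {
    out = out.slice(0, e.start) + e.replacement + out.slice(e.end);
  }
  return out;
}

// Build a mock mention from the first occurrence of a surface string.
function spanOf(source: string, surface: string): MentionSpan {
  const charStart = source.indexOf(surface);
  return { charStart, charEnd: charStart + surface.length, text: surface };
}

const sourceText = 'Toyota announced a deal. The Japanese automaker said it is large.';
const demoRewritten = substituteHeads({
  text: sourceText,
  mentions: [spanOf(sourceText, 'Toyota'), spanOf(sourceText, 'The Japanese automaker')],
  clusters: [[0, 1]],
});
console.log(demoRewritten);
// Toyota announced a deal. Toyota said it is large.
```

The sketch does not handle nested or overlapping mentions; if the model returns those, the edits would need to be filtered first.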
## Power-user exports
If you want to swap one stage of the pipeline (e.g. a custom
tokenizer or a different ORT runtime), the helpers are exported
individually:
```ts
import {
  buildPairs,     // mention M → flat (pair_i, pair_j) tensors
  decodeBio,      // BIO logits → wordpiece spans
  groupClusters,  // antecedent decisions → union-find clusters
  loadTokenizer,  // SentencePiece JSON → Tokenizer
  fetchHubFile,   // HF Hub fetch + browser-cache
} from '@cp500/infon-coref';
```
These match the Python reference implementation in
[`scripts/coref_onnx_experiment.py`](https://github.com/cp500/overlord/blob/main/infon/scripts/coref_onnx_experiment.py)
exactly, which is useful when comparing a Python/TS pipeline at the
intermediate-tensor level.
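To make the BIO decoding step concrete, here is a minimal, self-contained sketch of turning per-token 3-class logits into wordpiece spans. It is not the exported `decodeBio`: the class order (`O = 0`, `B = 1`, `I = 2`), the lenient handling of an `I` without a preceding `B`, and the names `decodeBioSketch` and `WordpieceSpan` are all assumptions made for illustration. Span ends are inclusive, matching `Mention.start` and `Mention.end`.

```typescript
// Assumed class order; the real model's label mapping may differ.
const LABEL_O = 0;
const LABEL_B = 1;
const LABEL_I = 2;

interface WordpieceSpan { start: number; end: number; } // both inclusive

// Index of the largest value; ties resolve to the lowest index.
function argmax(row: number[]): number {
  let best = 0;
  for (let k = 1; k < row.length; k++) {
    if (row[k] > row[best]) best = k;
  }
  return best;
}

// Run-length decode: a B opens a span, following I labels extend it,
// and an O or a new B closes it.
function decodeBioSketch(bioLogits: number[][]): WordpieceSpan[] {
  const spans: WordpieceSpan[] = [];
  let spanStart = -1; // -1 means no span is currently open
  for (let t = 0; t < bioLogits.length; t++) {
    const label = argmax(bioLogits[t]);
    if (label === LABEL_B) {
      if (spanStart >= 0) spans.push({ start: spanStart, end: t - 1 });
      spanStart = t;
    } else if (label === LABEL_I) {
      // Lenient choice (assumption): a stray I opens a span.
      if (spanStart < 0) spanStart = t;
    } else {
      if (spanStart >= 0) spans.push({ start: spanStart, end: t - 1 });
      spanStart = -1;
    }
  }
  if (spanStart >= 0) spans.push({ start: spanStart, end: bioLogits.length - 1 });
  return spans;
}

// Fake logits for the label sequence O B I O B B I.
const toLogits = (label: number): number[] => [0, 1, 2].map(k => (k === label ? 5 : 0));
const demoSpans = decodeBioSketch([0, 1, 2, 0, 1, 1, 2].map(label => toLogits(label)));
console.log(demoSpans);
// [ { start: 1, end: 2 }, { start: 4, end: 4 }, { start: 5, end: 6 } ]
```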
## Architecture
```
┌─────────────────────────┐
│          text           │
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ SentencePiece tokenize  │   tokenizer.json (XLM-R vocab)
└────────────┬────────────┘
             ▼  input_ids, attention_mask
┌─────────────────────────┐
│ backbone_bio.onnx       │   MiniLM-L12 (12 layers, H=384)
│  • XLM-R encoder        │   + 3-class BIO head
│  • bio_logits (T,3)     │
└────────┬────────┬───────┘
         │        │
         │        ▼  bio_logits → run-length decode → spans
         │   ┌──────────────────────┐
         │   │ decodeBio (TS)       │
         │   └──────────┬───────────┘
         │              ▼  span_starts, span_ends
         │   ┌──────────────────────┐
         │   │ buildPairs (TS)      │
         │   └──────────┬───────────┘
         │              ▼  pair_i, pair_j (triangular)
         ▼              ▼
┌─────────────────────────┐
│ mention_scorer.onnx     │   gather + segment-mean pool +
│  • pair_scores (P,)     │   3-vector pair MLP
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ pickAntecedents (TS)    │
│ + groupClusters (TS)    │
└────────────┬────────────┘
             ▼
        CorefResult
```
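The "triangular" pair step and the antecedent pick in the diagram can be sketched as follows. This is an illustration under stated assumptions, not the package's `buildPairs` or `pickAntecedents`: it assumes each mention `i` is paired with every earlier mention `j < i`, that the highest-scoring pair wins, and that a mention gets no antecedent when no score exceeds a threshold of `0`. The function names and the threshold are hypothetical.

```typescript
// Enumerate all (i, j) with j < i: P = M * (M - 1) / 2 candidate pairs.
function buildPairsSketch(mentionCount: number): { pairI: number[]; pairJ: number[] } {
  const pairI: number[] = [];
  const pairJ: number[] = [];
  for (let i = 1; i < mentionCount; i++) {
    for (let j = 0; j < i; j++) {
      pairI.push(i);
      pairJ.push(j);
    }
  }
  return { pairI, pairJ };
}

// For each mention, keep the best-scoring earlier mention whose score is
// strictly above the threshold (assumed to be 0); otherwise -1.
function pickAntecedentsSketch(
  mentionCount: number,
  pairI: number[],
  pairJ: number[],
  pairScores: number[],
  threshold: number = 0,
): number[] {
  const antecedent: number[] = [];
  const bestScore: number[] = [];
  for (let k = 0; k < mentionCount; k++) {
    antecedent.push(-1);
    bestScore.push(threshold);
  }
  for (let p = 0; p < pairScores.length; p++) {
    const i = pairI[p];
    if (pairScores[p] > bestScore[i]) {
      bestScore[i] = pairScores[p];
      antecedent[i] = pairJ[p];
    }
  }
  return antecedent;
}

// Three mentions give pairs (1,0), (2,0), (2,1); the scores are made up.
const demoPairs = buildPairsSketch(3);
const demoAntecedents = pickAntecedentsSketch(
  3, demoPairs.pairI, demoPairs.pairJ, [1.5, -0.2, 0.7],
);
console.log(demoPairs.pairI, demoPairs.pairJ); // [ 1, 2, 2 ] [ 0, 0, 1 ]
console.log(demoAntecedents);                  // [ -1, 0, 1 ]
```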
The split between the two ONNX graphs exists so the BIO head can
share computation with the backbone (one forward pass), while the
mention scorer can be re-run with different `(pair_i, pair_j)`
batches without recomputing hidden states. It also keeps each ONNX
file's input signature simple enough to trace cleanly.
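Because the scorer takes pair indices as input, a long document with many candidate pairs can be scored in chunks while the backbone output is computed once. The sketch below shows only the chunking logic. `scoreInBatches` and `ScoreFn` are hypothetical names, and the callback stands in for a real scorer session call, which would be asynchronous and would also receive the hidden states.

```typescript
// Stand-in for one scorer run over a batch of pairs (hypothetical signature).
type ScoreFn = (pairI: number[], pairJ: number[]) => number[];

// Score pairs in fixed-size chunks and concatenate results in order.
function scoreInBatches(
  pairI: number[],
  pairJ: number[],
  batchSize: number,
  scoreBatch: ScoreFn,
): number[] {
  if (pairI.length !== pairJ.length) throw new Error('pair arrays differ in length');
  if (batchSize <= 0) throw new Error('batchSize must be positive');
  const scores: number[] = [];
  for (let offset = 0; offset < pairI.length; offset += batchSize) {
    const part = scoreBatch(
      pairI.slice(offset, offset + batchSize),
      pairJ.slice(offset, offset + batchSize),
    );
    for (const s of part) scores.push(s);
  }
  return scores;
}

// Fake scorer for the demo: score = i - j, and count how often it runs.
let demoCalls = 0;
const fakeScorer: ScoreFn = (batchI, batchJ) => {
  demoCalls++;
  return batchI.map((i, k) => i - batchJ[k]);
};

// All pairs for four mentions, scored four at a time (two scorer runs).
const demoBatchedScores = scoreInBatches(
  [1, 2, 2, 3, 3, 3],
  [0, 0, 1, 0, 1, 2],
  4,
  fakeScorer,
);
console.log(demoBatchedScores, demoCalls); // [ 1, 2, 1, 3, 2, 1 ] 2
```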
## Performance ballpark
Numbers from a 2024 M1 Pro MacBook on a 110-token English document:
| Stage | WASM (FP16) | WebGPU (FP16) | Node CPU (FP16) |
|-----------|-------------|---------------|-----------------|
| Tokenize | 4 ms | 4 ms | 2 ms |
| Backbone | 220 ms | 70 ms | 90 ms |
| BIO | <1 ms | <1 ms | <1 ms |
| Scorer | 5 ms | 4 ms | 2 ms |
| **Total** | **~230 ms** | **~80 ms** | **~95 ms** |
First call adds ~2-4 s for ONNX session warmup. The Cache API in
browsers persists the downloaded model so warmup-after-reload is
limited to session creation.
## License
Apache 2.0. The trained weights at `cp500/infon-coref-pointer` carry
the same license; the underlying MiniLM-L12 backbone is also Apache
2.0.
## Status
Alpha. The API is stable enough to integrate behind your own
abstraction; expect minor breaking changes on the public class
shape until 1.0.
Issue tracker: https://github.com/cp500/infon-coref-js/issues