# @cp500/infon-coref

Multilingual coreference resolution in the browser or Node, via ONNX.

The trained model is a pointer-network coref resolver fine-tuned on
top of a multilingual MiniLM-L12 distilled from XLM-R. It handles
**English, Japanese, Korean, Thai, and Chinese**, and replaces
English-only [fastcoref](https://github.com/shon-otmazgin/fastcoref)
for use cases that need multilingual coverage.

The model artefacts live at
[**cp500/infon-coref-pointer**](https://huggingface.co/cp500/infon-coref-pointer)
on the Hugging Face Hub. This package is the JavaScript client that
loads them.

## Install

```bash
npm install @cp500/infon-coref onnxruntime-web
# or for Node:
npm install @cp500/infon-coref onnxruntime-node
```

The ONNX runtime is a **peer dependency** so you only install the one
your environment needs. ``@huggingface/tokenizers`` is **optional**;
if installed, we use its WASM SentencePiece tokenizer (faster and
fully spec-compliant). Otherwise the package falls back to a minimal
pure-JS tokenizer that handles the XLM-R vocabulary.

## Quick start (browser)

```ts
import { InfonCorefModel } from '@cp500/infon-coref';

const model = await InfonCorefModel.fromHub('cp500/infon-coref-pointer', {
  precision: 'fp16', // 'fp16' (default, ~235 MB) or 'fp32' (~470 MB)
  device: 'auto',    // tries WebGPU, falls back to WASM
});

const result = await model.resolve(
  'Toyota announced a partnership with Panasonic on battery technology. ' +
  'The Japanese automaker said the deal is worth $250 million.'
);

for (const cluster of result.clusters) {
  const surfaces = cluster.map(i => result.mentions[i].text);
  console.log(surfaces.join(' → '));
  // Toyota → The Japanese automaker
}
```

## Quick start (Node)

```ts
import { InfonCorefModel } from '@cp500/infon-coref';

// Same API as fromHub, but reads from local files (e.g. after a
// huggingface-cli download).
const model = await InfonCorefModel.fromLocal('./models/infon-coref/');
const result = await model.resolve('Toyota e Panasonic anunciaram...');
```

## What you get back

```ts
interface CorefResult {
  text: string;          // original input, unchanged
  tokens: Token[];       // wordpieces with char offsets
  mentions: Mention[];   // detected mentions in document order
  clusters: number[][];  // clusters[c] = list of mention indices
  timing: {
    tokenize: number;
    backbone: number;
    bioDecode: number;
    scorer: number;
    total: number;       // ms
  };
}

interface Mention {
  start: number;       // wordpiece index, inclusive
  end: number;         // wordpiece index, inclusive
  charStart: number;   // char offset in source text
  charEnd: number;
  text: string;        // literal substring of source text
  cluster: number;     // -1 for singleton
  antecedent: number;  // 0-based mention index, -1 = no antecedent
}
```
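
As an illustration of how these fields fit together, the sketch below rewrites every later mention in a cluster with the surface text of the cluster's first mention. It redeclares only the fields it needs. It assumes `charEnd` is an exclusive offset, so that `text.slice(charStart, charEnd)` equals `mention.text`; the interface above does not state this, so check it against real output. `substituteFirstMention` is a hypothetical helper, not a package export.

```ts
// Minimal local copies of the fields used here (see the interfaces above).
interface MentionLike {
  charStart: number;
  charEnd: number; // ASSUMPTION: exclusive end offset
  text: string;
}

interface ResultLike {
  text: string;
  mentions: MentionLike[];
  clusters: number[][];
}

// Hypothetical helper: replace every later mention in a cluster with the
// surface form of the cluster's first mention (by character position).
// Assumes the replaced mentions do not overlap each other.
function substituteFirstMention(result: ResultLike): string {
  const edits: { start: number; end: number; replacement: string }[] = [];
  for (const cluster of result.clusters) {
    if (cluster.length < 2) continue;
    const sorted = [...cluster].sort(
      (a, b) => result.mentions[a].charStart - result.mentions[b].charStart
    );
    const head = result.mentions[sorted[0]].text;
    for (const idx of sorted.slice(1)) {
      const m = result.mentions[idx];
      edits.push({ start: m.charStart, end: m.charEnd, replacement: head });
    }
  }
  // Apply edits right-to-left so earlier offsets stay valid.
  edits.sort((a, b) => b.start - a.start);
  let out = result.text;
  for (const e of edits) {
    out = out.slice(0, e.start) + e.replacement + out.slice(e.end);
  }
  return out;
}
```

Picking the first mention as the representative is a simplification; a real application might prefer the longest mention or a proper noun.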

## Languages

Trained on synthetic Bedrock/Claude-generated data balanced across:

| Code | Language             |
|------|----------------------|
| `en` | English              |
| `ja` | Japanese             |
| `ko` | Korean               |
| `th` | Thai                 |
| `zh` | Chinese (Simplified) |

The XLM-R backbone covers ~100 languages, but the mention-detection and
pointer-network heads were trained only on these five. Other languages may
work via zero-shot transfer; verify on your domain before shipping.

## API

### `InfonCorefModel.fromHub(repo, options?)`

Load model artefacts from a Hugging Face repo. Downloads (and caches
in the browser Cache API) ``meta.json``, the chosen ONNX backbone,
the mention scorer, and ``tokenizer.json``.

| Option         | Type                                    | Default   | Notes |
|----------------|-----------------------------------------|-----------|-------|
| `precision`    | `'fp32' \| 'fp16'`                      | `'fp16'`  | FP16 halves the download. Falls back to FP32 if FP16 is missing in the repo. |
| `device`       | `'auto' \| 'webgpu' \| 'wasm' \| 'cpu' \| 'cuda'` | `'auto'`  | In the browser, `'auto'` tries WebGPU first and falls back to WASM. |
| `maxLength`    | `number`                                | `256`     | Truncates inputs longer than N wordpieces. |
| `bioThreshold` | `number`                                | none      | If set, suppresses low-confidence span detections. `0.7` is a common stricter setting. |
| `revision`     | `string`                                | `'main'`  | HF branch/tag/commit-SHA pin. |
| `debug`        | `boolean`                               | `false`   | Logs per-stage timings to `console.debug`. |

### `InfonCorefModel.fromLocal(baseUrl, options?)`

Same as `fromHub` but loads files relative to a base URL or
filesystem path. Browser: `baseUrl` is a URL prefix
(`/models/coref/`). Node: a directory path (`./models/coref/`).

The directory must contain:

```
meta.json
tokenizer.json
onnx/backbone_bio.onnx (and .onnx.data sidecar if present)
onnx/backbone_bio_fp16.onnx
onnx/mention_scorer.onnx
onnx/mention_scorer_fp16.onnx
```
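
The FP16-to-FP32 fallback described for `precision` can be pictured against this layout. The sketch below is only an illustration: `pickModelFiles` is a hypothetical helper, not a package export. It also assumes the fallback applies to the backbone and scorer as a pair; the package may well decide per file.

```ts
type Precision = 'fp16' | 'fp32';

interface ModelFiles {
  precision: Precision; // precision actually selected
  backbone: string;
  scorer: string;
}

// File names taken from the directory listing above.
const FILES: Record<Precision, { backbone: string; scorer: string }> = {
  fp32: { backbone: 'onnx/backbone_bio.onnx', scorer: 'onnx/mention_scorer.onnx' },
  fp16: { backbone: 'onnx/backbone_bio_fp16.onnx', scorer: 'onnx/mention_scorer_fp16.onnx' },
};

// Hypothetical helper: choose which ONNX files to load given the files that
// exist. ASSUMPTION: if either FP16 file is missing, use both FP32 files.
function pickModelFiles(available: readonly string[], requested: Precision = 'fp16'): ModelFiles {
  const have = new Set<string>(available);
  for (const required of ['meta.json', 'tokenizer.json']) {
    if (!have.has(required)) throw new Error(`missing ${required}`);
  }
  const order: Precision[] = requested === 'fp16' ? ['fp16', 'fp32'] : ['fp32'];
  for (const precision of order) {
    const { backbone, scorer } = FILES[precision];
    if (have.has(backbone) && have.has(scorer)) return { precision, backbone, scorer };
  }
  throw new Error(`no complete ${requested} model found`);
}
```

Running a check like this before creating ONNX sessions turns a missing file into a readable error rather than a failed fetch deep inside the runtime.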

### `model.resolve(text, options?)`

Run end-to-end coref on a single document. Returns
[`CorefResult`](#what-you-get-back).

`options` accepts per-call overrides for `maxLength`, `bioThreshold`,
and `debug`, with the same meaning as in `fromHub`.

## Power-user exports

If you want to swap one stage of the pipeline (e.g. a custom
tokenizer or a different ONNX Runtime build), the helpers are exported
individually:

```ts
import {
  buildPairs,     // mention count M → flat (pair_i, pair_j) tensors
  decodeBio,      // BIO logits → wordpiece spans
  groupClusters,  // antecedent decisions → union-find clusters
  loadTokenizer,  // SentencePiece JSON → Tokenizer
  fetchHubFile,   // HF Hub fetch + browser-cache
} from '@cp500/infon-coref';
```

These match the Python reference implementation in
[`scripts/coref_onnx_experiment.py`](https://github.com/cp500/overlord/blob/main/infon/scripts/coref_onnx_experiment.py)
exactly, which is useful when comparing the Python and TS pipelines at the
intermediate-tensor level.
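
To give a feel for what two of these stages compute, here is a self-contained sketch of BIO decoding and triangular pair construction. It is not the package's code. The tag order (`O=0, B=1, I=2`), the argmax decoding, the treatment of a stray `I` as `O`, and the pair ordering are all assumptions, and the functions carry a `Sketch` suffix to keep them distinct from the real exports.

```ts
interface Span {
  start: number; // token index, inclusive
  end: number;   // token index, inclusive
}

// ASSUMPTION: class order is O=0, B=1, I=2.
const B = 1;
const I = 2;

function argmax(row: readonly number[]): number {
  let best = 0;
  for (let k = 1; k < row.length; k++) if (row[k] > row[best]) best = k;
  return best;
}

// Run-length decode: a span opens at B and extends over the following I tags.
// ASSUMPTION: an I with no open span is treated like O.
function decodeBioSketch(logits: readonly (readonly number[])[]): Span[] {
  const spans: Span[] = [];
  let open: Span | null = null;
  for (let t = 0; t < logits.length; t++) {
    const tag = argmax(logits[t]);
    if (tag === B) {
      if (open) spans.push(open);
      open = { start: t, end: t };
    } else if (tag === I && open) {
      open.end = t;
    } else {
      if (open) spans.push(open);
      open = null;
    }
  }
  if (open) spans.push(open);
  return spans;
}

// Triangular pairing: each mention i is paired with every earlier mention j < i.
function buildPairsSketch(numMentions: number): { pairI: number[]; pairJ: number[] } {
  const pairI: number[] = [];
  const pairJ: number[] = [];
  for (let i = 1; i < numMentions; i++) {
    for (let j = 0; j < i; j++) {
      pairI.push(i);
      pairJ.push(j);
    }
  }
  return { pairI, pairJ };
}
```

With `M` mentions the pairing yields `M * (M - 1) / 2` pairs, so the scorer's cost grows quadratically with the number of detected mentions.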

## Architecture

```
┌─────────────────────────┐
│          text           │
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ SentencePiece tokenize  │  tokenizer.json (XLM-R vocab)
└────────────┬────────────┘
             ▼  input_ids, attention_mask
┌─────────────────────────┐
│ backbone_bio.onnx       │  MiniLM-L12 (12 layers, H=384)
│  • XLM-R encoder        │  + 3-class BIO head
│  • bio_logits (T,3)     │
└────────┬────────┬───────┘
         │        │
         │        ▼  bio_logits → run-length decode → spans
         │   ┌──────────────────────┐
         │   │ decodeBio (TS)       │
         │   └──────────┬───────────┘
         │              ▼  span_starts, span_ends
         │   ┌──────────────────────┐
         │   │ buildPairs (TS)      │
         │   └──────────┬───────────┘
         │              ▼  pair_i, pair_j (triangular)
         ▼              ▼
┌─────────────────────────┐
│ mention_scorer.onnx     │  gather + segment-mean pool +
│  • pair_scores (P,)     │  3-vector pair MLP
└────────────┬────────────┘
             ▼
┌─────────────────────────┐
│ pickAntecedents (TS)    │
│ + groupClusters (TS)    │
└────────────┬────────────┘
             ▼
        CorefResult
```
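
The last two boxes are plain TypeScript post-processing. As a rough sketch of that step (not the package's implementation): for each mention, take the best-scoring earlier mention as its antecedent if the score clears a threshold, then merge linked mentions with union-find. The zero threshold, the flat pair layout (`pairI[p]` is the later mention, `pairJ[p]` the earlier one), and dropping singletons from the cluster list are assumptions, and the `Sketch` names are hypothetical.

```ts
// For mention i, pick the earlier mention j < i with the highest pair score.
// ASSUMPTION: a best score <= threshold means "no antecedent" (-1).
function pickAntecedentsSketch(
  numMentions: number,
  pairI: readonly number[],
  pairJ: readonly number[],
  scores: readonly number[],
  threshold = 0
): number[] {
  const antecedent = new Array<number>(numMentions).fill(-1);
  const best = new Array<number>(numMentions).fill(threshold);
  for (let p = 0; p < scores.length; p++) {
    const i = pairI[p];
    if (scores[p] > best[i]) {
      best[i] = scores[p];
      antecedent[i] = pairJ[p];
    }
  }
  return antecedent;
}

// Union-find over antecedent links; returns clusters of mention indices.
// ASSUMPTION: singletons are omitted from the returned list.
function groupClustersSketch(antecedent: readonly number[]): number[][] {
  const parent = antecedent.map((_, i) => i);
  const find = (x: number): number => {
    while (parent[x] !== x) {
      parent[x] = parent[parent[x]]; // path halving
      x = parent[x];
    }
    return x;
  };
  antecedent.forEach((a, i) => {
    if (a >= 0) parent[find(i)] = find(a);
  });
  const byRoot = new Map<number, number[]>();
  antecedent.forEach((_, i) => {
    const root = find(i);
    const members = byRoot.get(root) ?? [];
    members.push(i);
    byRoot.set(root, members);
  });
  return Array.from(byRoot.values()).filter(c => c.length > 1);
}
```

Because every mention links to at most one antecedent, the links form a forest, and union-find simply collects each tree into one cluster.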

The split between the two ONNX graphs exists so the BIO head can
share computation with the backbone (one forward pass), while the
mention scorer can be re-run with different `(pair_i, pair_j)`
batches without recomputing hidden states. It also keeps each ONNX
file's input signature simple enough to trace cleanly.
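
Because the scorer takes hidden states plus pair indices, a caller can run the backbone once and feed the pairs through in chunks. The sketch below shows that control flow only. `scoreBatch` stands in for a call into `mention_scorer.onnx` and is a hypothetical callback, not a package API; the default batch size is arbitrary.

```ts
// Hypothetical scorer signature: scores one batch of (i, j) pairs against
// hidden states that the backbone computed once.
type ScoreBatch<H> = (hidden: H, pairI: number[], pairJ: number[]) => Promise<number[]>;

// Score all pairs in fixed-size chunks, reusing the same hidden states.
async function scoreInBatches<H>(
  hidden: H,
  pairI: readonly number[],
  pairJ: readonly number[],
  scoreBatch: ScoreBatch<H>,
  batchSize = 1024
): Promise<number[]> {
  if (pairI.length !== pairJ.length) {
    throw new Error('pairI and pairJ must have equal length');
  }
  const scores: number[] = [];
  for (let off = 0; off < pairI.length; off += batchSize) {
    const batch = await scoreBatch(
      hidden,
      pairI.slice(off, off + batchSize),
      pairJ.slice(off, off + batchSize)
    );
    scores.push(...batch);
  }
  return scores;
}
```

Chunking bounds the size of each scorer call, which can matter for long documents where the pair count grows quadratically with the number of mentions.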

## Performance ballpark

Numbers from a 2024 M1 Pro MacBook on a 110-token English document:

| Stage     | WASM (FP16) | WebGPU (FP16) | Node CPU (FP16) |
|-----------|-------------|---------------|-----------------|
| Tokenize  | 4 ms        | 4 ms          | 2 ms            |
| Backbone  | 220 ms      | 70 ms         | 90 ms           |
| BIO       | <1 ms       | <1 ms         | <1 ms           |
| Scorer    | 5 ms        | 4 ms          | 2 ms            |
| **Total** | **~230 ms** | **~80 ms**    | **~95 ms**      |

The first call adds ~2-4 s for ONNX session warmup. In browsers, the
Cache API persists the downloaded model, so after a reload the only
warmup cost is session creation.

## License

Apache 2.0. The trained weights at `cp500/infon-coref-pointer` carry
the same license; the underlying MiniLM-L12 backbone is also Apache
2.0.

## Status

Alpha. The API is stable enough to integrate behind your own
abstraction; expect minor breaking changes on the public class
shape until 1.0.

Issue tracker: https://github.com/cp500/infon-coref-js/issues