---
license: mit
tags:
- onnx
- function-calling
- needle
- cactus
- browser
- sentencepiece
base_model: Cactus-Compute/needle
library_name: onnxruntime
---

# Needle fine-tune export

This repo contains a browser-ready ONNX export of a locally fine-tuned Needle checkpoint.

## Provenance

- Base model: [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle)
- Fine-tuned checkpoint: `needle_finetuned_20260608111244_50981_12_512_best.pkl`
- Fine-tuning data: UI tool-call dataset generated from the Needle demo UI

## Files

| File | Description |
|---|---|
| `encoder.onnx` | Needle encoder exported from the fine-tuned checkpoint |
| `decoder_step.onnx` | One-step decoder with KV-cache I/O |
| `needle.model` | SentencePiece tokenizer |
| `tokenizer-specials.json` | Special token IDs used by the model |

## Usage

Load the two ONNX graphs with `onnxruntime-web` and load `needle.model` with
`sentencepiece-js`. Run the encoder once per input, then call the decoder step in a JS
loop, feeding each step's KV-cache outputs back in as the next step's inputs.
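
The control flow of that loop can be sketched independently of any runtime. The snippet below is a minimal greedy-decoding sketch, not the exporter's actual code: `greedyDecode`, `encode`, `decodeStep`, and the `{ logits, cache }` return shape are hypothetical names chosen for illustration. In a browser, the two callbacks would wrap `session.run(...)` calls on `encoder.onnx` and `decoder_step.onnx`. The real tensor names and KV-cache layout are not documented here, so inspect each session's `inputNames` and `outputNames` before wiring them up.

```javascript
// Index of the largest value in a logits array (greedy token choice).
// Works with plain arrays and typed arrays such as Float32Array.
function argmax(logits) {
  let best = 0;
  for (let i = 1; i < logits.length; i++) {
    if (logits[i] > logits[best]) best = i;
  }
  return best;
}

// Greedy decoding: run the encoder once, then call the decoder step
// repeatedly, feeding back the previous token and the returned KV cache.
// ASSUMPTION: `encode` and `decodeStep` are hypothetical async callbacks;
// they stand in for calls into the two ONNX graphs.
async function greedyDecode({ encode, decodeStep, inputIds, bosId, eosId, maxSteps = 64 }) {
  const encoderOut = await encode(inputIds); // encoder runs once per input
  const output = [];
  let token = bosId;
  let cache = null; // ASSUMPTION: null signals "no cache yet" on the first step
  for (let step = 0; step < maxSteps; step++) {
    const { logits, cache: nextCache } = await decodeStep(token, encoderOut, cache);
    token = argmax(logits);
    if (token === eosId) break; // stop at end-of-sequence, without emitting it
    output.push(token);
    cache = nextCache; // carry the KV cache into the next step
  }
  return output;
}
```

Under these assumptions, `inputIds` would come from the SentencePiece tokenizer, `bosId` and `eosId` from `tokenizer-specials.json`, and the returned IDs would be decoded back to text with the same tokenizer. The `maxSteps` cap is a safety limit for outputs that never reach the end-of-sequence token.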

## Notes

This export follows the public porting guide from `onnx-community/needle-onnx`.