---
license: mit
base_model: Cactus-Compute/needle
library_name: jax
pipeline_tag: text-generation
language:
- fr
- en
tags:
- function-calling
- tool-use
- transit
- encoder-decoder
- edge
- on-device
- jax
- flax
---

# Needle Transit

A 26M-parameter fine-tune of [`Cactus-Compute/needle`](https://huggingface.co/Cactus-Compute/needle) specialized for **transit tool-calling**. It translates a natural-language transit query into **one** of two tool calls, or refuses when no tool applies.

The model is an **extractor**: it lifts the relevant slots from the query. Canonical resolution (station disambiguation, line lookup, routing) is the backend's job.

| | |
|---|---|
| Parameters | 26M |
| Base model | [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle) |
| Architecture | Encoder-decoder Simple Attention Network (d_model=512, enc×12, dec×8, GQA 8H/4KV) |
| Vocab | 8192 (SentencePiece BPE) |
| Format | JAX/Flax checkpoint (`.pkl`) |
| Languages | French, English, FR/EN code-switching |
| License | MIT |

## Supported tools

```json
[
  {
    "name": "search_itinerary",
    "description": "Plan a route between two points.",
    "parameters": {
      "origin": {"type": "string", "description": "Start location.", "required": true},
      "destination": {"type": "string", "description": "End location.", "required": true},
      "time_human": {"type": "string", "description": "Human time expression, e.g. 'a 14h'.", "required": false},
      "time_mode": {"type": "string", "description": "'depart_at' or 'arrive_by'.", "required": false}
    }
  },
  {
    "name": "get_next_arrivals",
    "description": "Next departures/arrivals at a stop.",
    "parameters": {
      "station": {"type": "string", "description": "Stop or station name.", "required": true},
      "line": {"type": "string", "description": "Line identifier, e.g. '1', 'RER A'.", "required": false}
    }
  }
]
```

## Capabilities

- **Two-tool routing**: picks `search_itinerary` vs `get_next_arrivals` from intent.
- **Refusal**: emits no tool call for off-topic, under-specified, or ambiguous queries.
- **Robustness**: handles typos, SMS-style compression, colloquial French, and FR/EN code-switching.
- **Time handling**: `depart_at` vs `arrive_by` disambiguation from natural phrasing.

## Examples

Example query → tool call (verified on this checkpoint):

| Query | Output |
|---|---|
| `Itinéraire de Bastille à Nation` | `[{"name":"search_itinerary","arguments":{"origin":"Bastille","destination":"Nation"}}]` |
| `De Issy à Charles-de-Gaulle, départ 14h` | `[{"name":"search_itinerary","arguments":{"origin":"Issy","destination":"Charles-de-Gaulle","time_human":"départ 14h","time_mode":"depart_at"}}]` |
| `How do I get from Gare du Nord to La Défense?` | `[{"name":"search_itinerary","arguments":{"origin":"Gare du Nord","destination":"La Défense"}}]` |
| `Prochain métro à Bastille ligne 1 ?` | `[{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]` |
| `prochains passages à Châtelet` | `[{"name":"get_next_arrivals","arguments":{"station":"Châtelet"}}]` |
| `cmt aller a chatelet depuis nation` | `[{"name":"search_itinerary","arguments":{"origin":"nation","destination":"chatelet"}}]` |
| `Quel temps fait-il ?` | `[]` |

The model is an extractor: it returns slot values **verbatim** from the query (note the lowercased `nation`/`chatelet` above); canonical resolution is the backend's job.

## Results

**Dataset and evaluation: coming soon.** The training dataset and a held-out real-OOD evaluation suite will be released shortly.
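Until an evaluation suite is available, it may help to check outputs on the backend side. As the Examples table shows, the model returns a JSON array holding at most one call, with slot values copied verbatim from the query. The following is a minimal sketch of how a backend might parse and check that string before resolving names, assuming the output is always JSON of that shape. `REQUIRED_SLOTS`, `parse_tool_call`, and `slot_key` are hypothetical helpers written for illustration; they are not part of the `needle` package.

```python
import json
import unicodedata

# Required slots per tool, copied from the "Supported tools" schema.
REQUIRED_SLOTS = {
    "search_itinerary": {"origin", "destination"},
    "get_next_arrivals": {"station"},
}


def parse_tool_call(raw: str):
    """Return (name, arguments) for a single valid call, or None for a refusal.

    Hypothetical helper: raises ValueError if the output is malformed,
    names an unknown tool, or lacks a required slot.
    """
    calls = json.loads(raw)
    if not isinstance(calls, list):
        raise ValueError("expected a JSON array of tool calls")
    if not calls:
        return None  # "[]" means the model refused
    if len(calls) > 1:
        raise ValueError("expected at most one tool call")
    call = calls[0]
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in REQUIRED_SLOTS:
        raise ValueError(f"unknown tool: {name!r}")
    missing = REQUIRED_SLOTS[name] - set(args)
    if missing:
        raise ValueError(f"missing required slots: {sorted(missing)}")
    return name, args


def slot_key(value: str) -> str:
    """Accent- and case-insensitive key for matching a verbatim slot
    against a station index (assumed to be keyed the same way)."""
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()
```

With these helpers, `parse_tool_call("[]")` returns `None`, and `slot_key("Châtelet")` and `slot_key("chatelet")` produce the same key, so a typed-without-accents query can still be matched by the backend. Real station resolution (aliases, ambiguous names, typos) needs more than this key.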
## Usage

Install the official [`needle`](https://github.com/cactus-compute/needle) package (Cactus Compute), then:

```python
from needle import SimpleAttentionNetwork, load_checkpoint, generate, get_tokenizer

# Load the fine-tuned parameters and the model config from the checkpoint.
params, config = load_checkpoint("needle-transit.pkl")
model = SimpleAttentionNetwork(config)
tokenizer = get_tokenizer()

# `tools` is a JSON string describing the tools the model may call.
result = generate(
    model, params, tokenizer,
    query="Prochain métro à Bastille ligne 1 ?",
    tools='[{"name":"get_next_arrivals","description":"Next departures at a stop.","parameters":{"station":{"type":"string","description":"Stop name.","required":true},"line":{"type":"string","description":"Line id.","required":false}}}]',
    stream=False,
)
print(result)
# [{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]
```

## Finetuning

Adapt the model to your own tools with the customized finetuning scripts in [**github.com/Rasaboun/needle-transit**](https://github.com/Rasaboun/needle-transit). The scripts provide tunable LR/Muon-LR, per-field loss weighting (`--w-name/--w-value/--w-key`), and metrics logging:

```bash
needle finetune data.jsonl --lr 3e-5 --w-value 4.0
```

## Limitations

- Single-call only: emits at most one tool call per query.
- Domain-specific: tuned for transit tool-calling; off-domain tools are out of scope.
- Extractor, not a resolver: returns surface slots; the backend resolves canonical names.
- Small model: can be finicky on adversarial phrasing; finetune on your own data to adapt.

## License & attribution

MIT. Fine-tuned from [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle) (© Cactus Compute, MIT). See the base model card for architecture details.

## Citation

```
@misc{ndubuaku2026needle,
  title={Needle},
  author={Henry Ndubuaku and Jakub Mroz and Karen Mosoyan and Roman Shemet and Parkirat Sandhu and Satyajit Kumar and Noah Cylich and Justin H. Lee},
  year={2026},
  url={https://github.com/cactus-compute/needle}
}
```