---
license: mit
base_model: Cactus-Compute/needle
library_name: jax
pipeline_tag: text-generation
language:
- fr
- en
tags:
- function-calling
- tool-use
- transit
- encoder-decoder
- edge
- on-device
- jax
- flax
---

# Needle Transit

A 26M-parameter fine-tune of [`Cactus-Compute/needle`](https://huggingface.co/Cactus-Compute/needle)
specialized for **transit tool-calling**. It translates a natural-language transit query into
**one** of two tool calls, or refuses when no tool applies.

The model is an **extractor**: it lifts the relevant slots from the query. Canonical resolution
(station disambiguation, line lookup, routing) is the backend's job.

| | |
|---|---|
| Parameters | 26M |
| Base model | [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle) |
| Architecture | Encoder-decoder Simple Attention Network (d_model=512, enc×12, dec×8, GQA 8H/4KV) |
| Vocab | 8192 (SentencePiece BPE) |
| Format | JAX/Flax checkpoint (`.pkl`) |
| Languages | French, English, FR/EN code-switching |
| License | MIT |

## Supported tools

```json
[
  {
    "name": "search_itinerary",
    "description": "Plan a route between two points.",
    "parameters": {
      "origin": {"type": "string", "description": "Start location.", "required": true},
      "destination": {"type": "string", "description": "End location.", "required": true},
      "time_human": {"type": "string", "description": "Human time expression, e.g. 'a 14h'.", "required": false},
      "time_mode": {"type": "string", "description": "'depart_at' or 'arrive_by'.", "required": false}
    }
  },
  {
    "name": "get_next_arrivals",
    "description": "Next departures/arrivals at a stop.",
    "parameters": {
      "station": {"type": "string", "description": "Stop or station name.", "required": true},
      "line": {"type": "string", "description": "Line identifier, e.g. '1', 'RER A'.", "required": false}
    }
  }
]
```
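
Because a small model can occasionally emit a malformed call, a backend may want to check each call against the schema above before acting on it. The following is a minimal sketch of such a check, not part of the `needle` package: the helper name `validate_call` and the trimmed `TOOLS` constant (descriptions omitted) are assumptions made for illustration.

```python
import json

# Tool schemas mirroring the list above (descriptions omitted for brevity).
TOOLS = [
    {"name": "search_itinerary", "parameters": {
        "origin": {"type": "string", "required": True},
        "destination": {"type": "string", "required": True},
        "time_human": {"type": "string", "required": False},
        "time_mode": {"type": "string", "required": False},
    }},
    {"name": "get_next_arrivals", "parameters": {
        "station": {"type": "string", "required": True},
        "line": {"type": "string", "required": False},
    }},
]


def validate_call(call, tools=TOOLS):
    """Hypothetical helper: return a list of problems (empty list means valid)."""
    schema = next((t for t in tools if t["name"] == call.get("name")), None)
    if schema is None:
        return [f"unknown tool: {call.get('name')!r}"]
    params = schema["parameters"]
    args = call.get("arguments", {})
    problems = []
    # Every required parameter must be present.
    for key, spec in params.items():
        if spec["required"] and key not in args:
            problems.append(f"missing required argument: {key}")
    # No unknown parameters, and all values must be strings.
    for key, value in args.items():
        if key not in params:
            problems.append(f"unexpected argument: {key}")
        elif not isinstance(value, str):
            problems.append(f"argument {key} should be a string")
    return problems


calls = json.loads('[{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]')
print([validate_call(c) for c in calls])  # [[]]
```

An empty problem list means the call can be forwarded to the backend; anything else can be treated like a refusal.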

## Capabilities

- **Two-tool routing**: picks `search_itinerary` vs `get_next_arrivals` from intent.
- **Refusal**: emits no tool call for off-topic, under-specified, or ambiguous queries.
- **Robustness**: handles typos, SMS-style compression, colloquial French, and FR/EN code-switching.
- **Time handling**: disambiguates `depart_at` vs `arrive_by` from natural phrasing.
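
Since the model returns `time_human` as surface text (for example `départ 14h`), converting it to a clock time is left to the backend. The sketch below shows one way this might look. The helper `parse_time_human` is hypothetical, and it only recognizes clock times written like `14h`, `14h30`, or `14:30`; relative expressions are outside its scope.

```python
import re
from datetime import time

# Matches "14h", "14h30", "14 h", "14:30" (hour, then 'h' or ':', then optional minutes).
_CLOCK = re.compile(r"\b(\d{1,2})\s*[h:](\d{2})?", re.IGNORECASE)


def parse_time_human(text):
    """Hypothetical helper: extract a clock time from a verbatim time expression.

    Returns a datetime.time, or None when no valid clock time is found.
    """
    match = _CLOCK.search(text)
    if match is None:
        return None
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    if hour > 23 or minute > 59:
        return None
    return time(hour, minute)


print(parse_time_human("départ 14h"))  # 14:00:00
print(parse_time_human("a 8h30"))      # 08:30:00
```

The returned time can then be combined with `time_mode` (`depart_at` or `arrive_by`) when building the routing request.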

## Examples

Example query → tool call (verified on this checkpoint):

| Query | Output |
|---|---|
| `Itinéraire de Bastille à Nation` | `[{"name":"search_itinerary","arguments":{"origin":"Bastille","destination":"Nation"}}]` |
| `De Issy à Charles-de-Gaulle, départ 14h` | `[{"name":"search_itinerary","arguments":{"origin":"Issy","destination":"Charles-de-Gaulle","time_human":"départ 14h","time_mode":"depart_at"}}]` |
| `How do I get from Gare du Nord to La Défense?` | `[{"name":"search_itinerary","arguments":{"origin":"Gare du Nord","destination":"La Défense"}}]` |
| `Prochain métro à Bastille ligne 1 ?` | `[{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]` |
| `prochains passages à Châtelet` | `[{"name":"get_next_arrivals","arguments":{"station":"Châtelet"}}]` |
| `cmt aller a chatelet depuis nation` | `[{"name":"search_itinerary","arguments":{"origin":"nation","destination":"chatelet"}}]` |
| `Quel temps fait-il ?` | `[]` |

The model returns slot values **verbatim** from the query (note the lowercased
`nation`/`chatelet` above); mapping them to canonical names is the backend's job.
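
To illustrate what that backend-side resolution might involve, here is a minimal sketch that maps a verbatim slot such as `chatelet` to a canonical station name by stripping accents, case-folding, and falling back to fuzzy matching. The `STATIONS` list and the `resolve_station` helper are hypothetical; a real backend would use its own station data and matching rules.

```python
import difflib
import unicodedata

# Hypothetical canonical station list; a real backend would load its own data.
STATIONS = ["Bastille", "Nation", "Châtelet", "Gare du Nord", "La Défense"]


def normalize(name):
    """Strip accents, case-fold, and collapse hyphens and whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.casefold().replace("-", " ").split())


# Normalized key -> canonical name.
_INDEX = {normalize(s): s for s in STATIONS}


def resolve_station(raw, cutoff=0.8):
    """Return the canonical station for a verbatim slot, or None if nothing is close."""
    key = normalize(raw)
    if key in _INDEX:
        return _INDEX[key]
    # Fall back to fuzzy matching to absorb small typos.
    close = difflib.get_close_matches(key, list(_INDEX), n=1, cutoff=cutoff)
    return _INDEX[close[0]] if close else None


print(resolve_station("chatelet"))  # Châtelet
print(resolve_station("nation"))    # Nation
```

Keeping this step outside the model means the station list can change without retraining.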

## Results

**Dataset and evaluation: coming soon.** The training dataset and a held-out, real-world
out-of-distribution (OOD) evaluation suite will be released shortly.

## Usage

Install the official [`needle`](https://github.com/cactus-compute/needle) package (Cactus Compute), then:

```python
from needle import SimpleAttentionNetwork, load_checkpoint, generate, get_tokenizer

# Load the fine-tuned weights and the model config from the checkpoint.
params, config = load_checkpoint("needle-transit.pkl")
model = SimpleAttentionNetwork(config)
tokenizer = get_tokenizer()

# The tool schema is passed as a JSON string alongside the query.
result = generate(
    model, params, tokenizer,
    query="Prochain métro à Bastille ligne 1 ?",
    tools='[{"name":"get_next_arrivals","description":"Next departures at a stop.","parameters":{"station":{"type":"string","description":"Stop name.","required":true},"line":{"type":"string","description":"Line id.","required":false}}}]',
    stream=False,
)
print(result)
# [{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]
```
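
Once a result is available, the application still has to route it to backend code and handle refusals (`[]`). The sketch below shows one possible dispatcher. It assumes the result is either a JSON string like the printed output above or an already-parsed list; `plan_route`, `next_arrivals`, and `dispatch` are hypothetical stand-ins, not part of the `needle` package.

```python
import json


def plan_route(origin, destination, time_human=None, time_mode=None):
    # Hypothetical backend stub; a real implementation would call a routing service.
    return f"route {origin} -> {destination}"


def next_arrivals(station, line=None):
    # Hypothetical backend stub; a real implementation would query live departures.
    return f"arrivals at {station}" + (f" (line {line})" if line else "")


HANDLERS = {"search_itinerary": plan_route, "get_next_arrivals": next_arrivals}


def dispatch(result):
    """Run the handler for the first tool call, or return None on refusal or bad output."""
    try:
        calls = json.loads(result) if isinstance(result, str) else result
    except json.JSONDecodeError:
        return None  # the model emitted something that is not valid JSON
    if not isinstance(calls, list) or not calls or not isinstance(calls[0], dict):
        return None  # "[]" means the model refused
    call = calls[0]
    handler = HANDLERS.get(call.get("name"))
    if handler is None:
        return None  # unknown tool name
    try:
        return handler(**call.get("arguments", {}))
    except TypeError:
        return None  # missing or unexpected arguments


print(dispatch('[{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]'))
# arrivals at Bastille (line 1)
print(dispatch("[]"))  # None
```

Returning `None` for every failure mode keeps the caller simple: it can fall back to a clarification prompt whenever no handler ran.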

## Finetuning

Adapt the model to your own tools with the customized finetuning scripts in
[**github.com/Rasaboun/needle-transit**](https://github.com/Rasaboun/needle-transit). They add tunable
LR and Muon LR, per-field loss weighting (`--w-name/--w-value/--w-key`), and metrics logging:

```bash
needle finetune data.jsonl --lr 3e-5 --w-value 4.0
```

## Limitations

- Single-call only: emits at most one tool call per query.
- Domain-specific: tuned for transit tool-calling; off-domain tools are out of scope.
- Extractor, not a resolver: returns surface slots; the backend resolves canonical names.
- Small model: can be unreliable on adversarial phrasing; finetune on your own data to adapt.

## License & attribution

MIT. Fine-tuned from [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle)
(© Cactus Compute, MIT). See the base model card for architecture details.

## Citation

```
@misc{ndubuaku2026needle,
  title={Needle},
  author={Henry Ndubuaku and Jakub Mroz and Karen Mosoyan and Roman Shemet and Parkirat Sandhu and Satyajit Kumar and Noah Cylich and Justin H. Lee},
  year={2026},
  url={https://github.com/cactus-compute/needle}
}
```