---
license: mit
base_model: Cactus-Compute/needle
library_name: jax
pipeline_tag: text-generation
language:
  - fr
  - en
tags:
  - function-calling
  - tool-use
  - transit
  - encoder-decoder
  - edge
  - on-device
  - jax
  - flax
---

# Needle Transit

A 26M-parameter fine-tune of [`Cactus-Compute/needle`](https://huggingface.co/Cactus-Compute/needle)
specialized for **transit tool-calling**. It translates a natural-language transit query into
**one** of two tool calls — or refuses when no tool applies.

The model is an **extractor**: it lifts the relevant slots from the query. Canonical resolution
(station disambiguation, line lookup, routing) is the backend's job.

| | |
|---|---|
| Parameters | 26M |
| Base model | [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle) |
| Architecture | Encoder-decoder Simple Attention Network (d_model=512, enc×12, dec×8, GQA 8H/4KV) |
| Vocab | 8192 (SentencePiece BPE) |
| Format | JAX/Flax checkpoint (`.pkl`) |
| Languages | French, English, FR/EN code-switching |
| License | MIT |

## Supported tools

```json
[
  {
    "name": "search_itinerary",
    "description": "Plan a route between two points.",
    "parameters": {
      "origin":      {"type": "string", "description": "Start location.", "required": true},
      "destination": {"type": "string", "description": "End location.", "required": true},
      "time_human":  {"type": "string", "description": "Human time expression, e.g. 'a 14h'.", "required": false},
      "time_mode":   {"type": "string", "description": "'depart_at' or 'arrive_by'.", "required": false}
    }
  },
  {
    "name": "get_next_arrivals",
    "description": "Next departures/arrivals at a stop.",
    "parameters": {
      "station": {"type": "string", "description": "Stop or station name.", "required": true},
      "line":    {"type": "string", "description": "Line identifier, e.g. '1', 'RER A'.", "required": false}
    }
  }
]
```
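Because the model is small, a backend may want to check each emitted call against the schema above before acting on it. The following is a minimal sketch of such a check, not part of the `needle` package: `TOOLS` and `validate_call` are hypothetical names, and the table only mirrors the `required` flags listed above.

```python
from typing import List

# Required/optional flags copied from the tool schema above (True = required).
TOOLS = {
    "search_itinerary": {
        "origin": True,
        "destination": True,
        "time_human": False,
        "time_mode": False,
    },
    "get_next_arrivals": {"station": True, "line": False},
}


def validate_call(call: dict) -> List[str]:
    """Return a list of problems; an empty list means the call matches the schema."""
    name = call.get("name")
    if name not in TOOLS:
        return [f"unknown tool: {name!r}"]
    spec = TOOLS[name]
    args = call.get("arguments", {})
    problems = [
        f"missing required argument: {key}"
        for key, required in spec.items()
        if required and key not in args
    ]
    problems += [f"unexpected argument: {key}" for key in args if key not in spec]
    # time_mode is documented as either 'depart_at' or 'arrive_by'.
    mode = args.get("time_mode")
    if mode is not None and mode not in ("depart_at", "arrive_by"):
        problems.append(f"invalid time_mode: {mode!r}")
    return problems


ok = {"name": "get_next_arrivals", "arguments": {"station": "Bastille", "line": "1"}}
bad = {"name": "search_itinerary", "arguments": {"origin": "Issy"}}
print(validate_call(ok))   # []
print(validate_call(bad))  # ['missing required argument: destination']
```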

## Capabilities

- **Two-tool routing** — picks `search_itinerary` vs `get_next_arrivals` from intent.
- **Refusal** — emits no tool call for off-topic / under-specified / ambiguous queries.
- **Robustness** — handles typos, SMS-style compression, colloquial French, and FR/EN code-switching.
- **Time handling** — `depart_at` vs `arrive_by` disambiguation from natural phrasing.
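The model only extracts the raw time expression into `time_human` and labels it with `time_mode`; turning that expression into an actual time is left to the backend. As a rough illustration, the sketch below parses simple clock times such as `14h` or `14h30`. The function name `parse_time_human` and its regular expression are assumptions made for this example, and relative expressions ("dans 10 minutes", "demain matin") would need a real date parser.

```python
import re
from datetime import time
from typing import Optional

# Matches '14h', '14h30', '14 h 30', '8:05' anywhere in the expression.
_CLOCK_RE = re.compile(r"(\d{1,2})\s*[h:]\s*(\d{2})?", re.IGNORECASE)


def parse_time_human(expr: str) -> Optional[time]:
    """Extract a clock time from a human expression, or None if none is found."""
    match = _CLOCK_RE.search(expr)
    if match is None:
        return None
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    if hour > 23 or minute > 59:
        return None  # not a valid clock time
    return time(hour, minute)


print(parse_time_human("départ 14h"))    # 14:00:00
print(parse_time_human("a 14h30"))       # 14:30:00
print(parse_time_human("demain matin"))  # None
```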

## Examples

Example query → tool call (verified on this checkpoint):

| Query | Output |
|---|---|
| `Itinéraire de Bastille à Nation` | `[{"name":"search_itinerary","arguments":{"origin":"Bastille","destination":"Nation"}}]` |
| `De Issy à Charles-de-Gaulle, départ 14h` | `[{"name":"search_itinerary","arguments":{"origin":"Issy","destination":"Charles-de-Gaulle","time_human":"départ 14h","time_mode":"depart_at"}}]` |
| `How do I get from Gare du Nord to La Défense?` | `[{"name":"search_itinerary","arguments":{"origin":"Gare du Nord","destination":"La Défense"}}]` |
| `Prochain métro à Bastille ligne 1 ?` | `[{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]` |
| `prochains passages à Châtelet` | `[{"name":"get_next_arrivals","arguments":{"station":"Châtelet"}}]` |
| `cmt aller a chatelet depuis nation` | `[{"name":"search_itinerary","arguments":{"origin":"nation","destination":"chatelet"}}]` |
| `Quel temps fait-il ?` | `[]` |

Slot values are returned **verbatim** from the query (note the lowercased `nation`/`chatelet`
above); resolving them to canonical names is the backend's job.
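One way a backend might resolve verbatim slots is to fold case and accents on both sides before looking the name up. This is a minimal sketch under stated assumptions: the `STATIONS` list is a hypothetical stand-in for a real stop database, and exact matching after folding will not catch typos or partial names, which would need fuzzy matching on top.

```python
import unicodedata
from typing import Optional

# Hypothetical canonical station names; a real backend would load its own data.
STATIONS = ["Châtelet", "Nation", "Bastille", "Gare du Nord", "La Défense"]


def fold(text: str) -> str:
    """Lowercase, strip accents, and collapse whitespace so surface forms compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.casefold().split())


# Index from folded form to canonical name.
INDEX = {fold(name): name for name in STATIONS}


def resolve_station(slot: str) -> Optional[str]:
    """Map a verbatim slot value to a canonical name, or None when nothing matches."""
    return INDEX.get(fold(slot))


print(resolve_station("chatelet"))    # Châtelet
print(resolve_station("la defense"))  # La Défense
print(resolve_station("Issy"))        # None (not in the hypothetical table)
```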

## Results

**Dataset and evaluation: coming soon.** The training dataset and a held-out evaluation suite of
real out-of-distribution (OOD) queries will be released shortly.

## Usage

Install the official [`needle`](https://github.com/cactus-compute/needle) package (Cactus Compute), then:

```python
from needle import SimpleAttentionNetwork, load_checkpoint, generate, get_tokenizer

params, config = load_checkpoint("needle-transit.pkl")
model = SimpleAttentionNetwork(config)
tokenizer = get_tokenizer()

result = generate(
    model, params, tokenizer,
    query="Prochain métro à Bastille ligne 1 ?",
    tools='[{"name":"get_next_arrivals","description":"Next departures at a stop.","parameters":{"station":{"type":"string","description":"Stop name.","required":true},"line":{"type":"string","description":"Line id.","required":false}}}]',
    stream=False,
)
print(result)
# [{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]
```
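Once the model has produced its output, the backend still has to parse it and route the call. The sketch below assumes the result is a JSON string shaped like the examples on this card (a list with zero or one call); the handler functions are hypothetical stubs standing in for real routing and real-time services.

```python
import json


def dispatch(raw: str, handlers: dict):
    """Parse the model's JSON output and route the single call to its handler.

    Returns None when the model refused (empty list).
    """
    calls = json.loads(raw)
    if not calls:
        return None  # refusal: no tool applies
    call = calls[0]  # the model emits at most one call per query
    handler = handlers[call["name"]]
    return handler(**call["arguments"])


# Hypothetical backend stubs; replace with real implementations.
def search_itinerary(origin, destination, time_human=None, time_mode=None):
    return f"route {origin} -> {destination}"


def get_next_arrivals(station, line=None):
    return f"arrivals at {station}" + (f" on line {line}" if line else "")


HANDLERS = {
    "search_itinerary": search_itinerary,
    "get_next_arrivals": get_next_arrivals,
}

raw = '[{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]'
print(dispatch(raw, HANDLERS))   # arrivals at Bastille on line 1
print(dispatch("[]", HANDLERS))  # None
```

Optional arguments default to `None` in the stubs, so a call that omits `line` or `time_human` still dispatches cleanly.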

## Finetuning

Adapt the model to your own tools with the customized finetuning scripts in
[**github.com/Rasaboun/needle-transit**](https://github.com/Rasaboun/needle-transit). They add
tunable LR and Muon-LR, per-field loss weighting (`--w-name`, `--w-value`, `--w-key`), and metrics
logging:

```bash
needle finetune data.jsonl --lr 3e-5 --w-value 4.0
```

## Limitations

- Single-call only — emits at most one tool call per query.
- Domain-specific — tuned for transit tool-calling; off-domain tools are out of scope.
- Extractor, not a resolver — returns surface slots; the backend resolves canonical names.
- Small model — can be finicky on adversarial phrasing; finetune on your own data to adapt.

## License & attribution

MIT. Fine-tuned from [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle)
(© Cactus Compute, MIT). See the base model card for architecture details.

## Citation

```bibtex
@misc{ndubuaku2026needle,
  title={Needle},
  author={Henry Ndubuaku and Jakub Mroz and Karen Mosoyan and Roman Shemet and Parkirat Sandhu and Satyajit Kumar and Noah Cylich and Justin H. Lee},
  year={2026},
  url={https://github.com/cactus-compute/needle}
}
```