---
license: mit
base_model: Cactus-Compute/needle
library_name: jax
pipeline_tag: text-generation
language:
  - fr
  - en
tags:
  - function-calling
  - tool-use
  - transit
  - encoder-decoder
  - edge
  - on-device
  - jax
  - flax
---

Needle Transit

A 26M-parameter fine-tune of Cactus-Compute/needle specialized for transit tool-calling. It translates a natural-language transit query into one of two tool calls — or refuses when no tool applies.

The model is an extractor: it lifts the relevant slots from the query. Canonical resolution (station disambiguation, line lookup, routing) is the backend's job.

| Property | Value |
|---|---|
| Parameters | 26M |
| Base model | Cactus-Compute/needle |
| Architecture | Encoder-decoder Simple Attention Network (d_model=512, enc×12, dec×8, GQA 8H/4KV) |
| Vocab | 8192 (SentencePiece BPE) |
| Format | JAX/Flax checkpoint (.pkl) |
| Languages | French, English, FR/EN code-switching |
| License | MIT |

Supported tools

[
  {
    "name": "search_itinerary",
    "description": "Plan a route between two points.",
    "parameters": {
      "origin":      {"type": "string", "description": "Start location.", "required": true},
      "destination": {"type": "string", "description": "End location.", "required": true},
      "time_human":  {"type": "string", "description": "Human time expression, e.g. 'a 14h'.", "required": false},
      "time_mode":   {"type": "string", "description": "'depart_at' or 'arrive_by'.", "required": false}
    }
  },
  {
    "name": "get_next_arrivals",
    "description": "Next departures/arrivals at a stop.",
    "parameters": {
      "station": {"type": "string", "description": "Stop or station name.", "required": true},
      "line":    {"type": "string", "description": "Line identifier, e.g. '1', 'RER A'.", "required": false}
    }
  }
]
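
Since the backend consumes these calls, it can help to check each one against the schema before acting on it. The following is a minimal sketch, not part of the model or the needle package: `TOOL_SPECS` and `validate_call` are hypothetical names that simply mirror the two tool definitions listed above.

```python
# Hypothetical mirror of the two tool schemas above: required and optional argument names.
TOOL_SPECS = {
    "search_itinerary": {
        "required": {"origin", "destination"},
        "optional": {"time_human", "time_mode"},
    },
    "get_next_arrivals": {
        "required": {"station"},
        "optional": {"line"},
    },
}


def validate_call(call: dict) -> list[str]:
    """Return the problems found in one tool call; an empty list means it is valid."""
    name = call.get("name")
    if not isinstance(name, str) or name not in TOOL_SPECS:
        return [f"unknown tool: {name!r}"]
    spec = TOOL_SPECS[name]
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return ["arguments must be an object"]
    problems = []
    # Required arguments that the call does not provide.
    for key in sorted(spec["required"] - set(args)):
        problems.append(f"missing required argument: {key}")
    # Arguments that the schema does not declare.
    for key in sorted(set(args) - spec["required"] - spec["optional"]):
        problems.append(f"unexpected argument: {key}")
    # time_mode is documented as either 'depart_at' or 'arrive_by'.
    mode = args.get("time_mode")
    if mode is not None and mode not in ("depart_at", "arrive_by"):
        problems.append(f"invalid time_mode: {mode!r}")
    return problems
```

For example, `validate_call({"name": "search_itinerary", "arguments": {"origin": "Bastille"}})` reports the missing `destination`, while a complete call returns an empty list.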

Capabilities

  • Two-tool routing: picks search_itinerary or get_next_arrivals based on the query's intent.
  • Refusal: emits no tool call (an empty list, []) for off-topic, under-specified, or ambiguous queries.
  • Robustness: handles typos, SMS-style compression, colloquial French, and FR/EN code-switching.
  • Time handling: disambiguates depart_at from arrive_by based on natural phrasing.
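
The model passes time_human through as raw text, so turning it into a clock time is left to the caller. The sketch below shows one way a backend might do that. It assumes 24-hour expressions such as "14h", "14h30", or "14:30"; the helper name `parse_clock` is hypothetical, and 12-hour English forms like "2pm" or relative phrases are deliberately not handled.

```python
import re
from datetime import time
from typing import Optional

# Matches an hour, an 'h' or ':' separator, and optional two-digit minutes,
# e.g. "14h", "14h30", "14:30", "14 h 05", anywhere in the expression.
_CLOCK = re.compile(r"(?<!\d)(\d{1,2})\s*[hH:]\s*(\d{2})?(?!\d)")


def parse_clock(time_human: str) -> Optional[time]:
    """Extract a 24-hour clock time from a raw time_human slot, or None if absent."""
    match = _CLOCK.search(time_human)
    if match is None:
        return None
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    if hour > 23 or minute > 59:
        return None  # out-of-range values are rejected rather than guessed at
    return time(hour, minute)
```

With the slot value "départ 14h", `parse_clock` returns `datetime.time(14, 0)`; the accompanying time_mode then tells the backend whether that is a departure or an arrival constraint.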

Examples

Example query → tool call (verified on this checkpoint):

| Query | Output |
|---|---|
| Itinéraire de Bastille à Nation | [{"name":"search_itinerary","arguments":{"origin":"Bastille","destination":"Nation"}}] |
| De Issy à Charles-de-Gaulle, départ 14h | [{"name":"search_itinerary","arguments":{"origin":"Issy","destination":"Charles-de-Gaulle","time_human":"départ 14h","time_mode":"depart_at"}}] |
| How do I get from Gare du Nord to La Défense? | [{"name":"search_itinerary","arguments":{"origin":"Gare du Nord","destination":"La Défense"}}] |
| Prochain métro à Bastille ligne 1 ? | [{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}] |
| prochains passages à Châtelet | [{"name":"get_next_arrivals","arguments":{"station":"Châtelet"}}] |
| cmt aller a chatelet depuis nation | [{"name":"search_itinerary","arguments":{"origin":"nation","destination":"chatelet"}}] |
| Quel temps fait-il ? | [] |

Slot values are returned verbatim from the query (note the lowercased nation/chatelet above); mapping them to canonical station names is left to the backend.
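
Because slots come back as surface text, a backend typically normalizes them before looking anything up. Here is a minimal sketch of that step, assuming a hypothetical in-memory station list (`STATIONS`) and fuzzy matching with Python's standard difflib module. None of this is part of the model or the needle package, and a production resolver would likely use a real stop database and handle ambiguity explicitly.

```python
import difflib
import unicodedata
from typing import Optional

# Hypothetical canonical names; a real backend would load these from its stop database.
STATIONS = ["Bastille", "Châtelet", "Gare du Nord", "La Défense", "Nation"]


def normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


# Map each normalized form back to its canonical spelling.
_INDEX = {normalize(name): name for name in STATIONS}


def resolve_station(slot: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the canonical station for a raw slot value, or None if nothing is close."""
    key = normalize(slot)
    if key in _INDEX:
        return _INDEX[key]  # exact match after normalization
    close = difflib.get_close_matches(key, list(_INDEX), n=1, cutoff=cutoff)
    return _INDEX[close[0]] if close else None
```

With this list, the raw slot "chatelet" resolves to "Châtelet" and "nation" to "Nation". The cutoff of 0.8 is an arbitrary starting point; a lower value tolerates more typos at the cost of more false matches.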

Results

Evaluation results are not yet published. The training dataset and a held-out, real-world out-of-distribution (OOD) evaluation suite will be released shortly.

Usage

Install the official needle package (Cactus Compute), then:

from needle import SimpleAttentionNetwork, load_checkpoint, generate, get_tokenizer

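# Load the fine-tuned weights and the model config from the checkpoint file.
params, config = load_checkpoint("needle-transit.pkl")
# Rebuild the network from that config.
model = SimpleAttentionNetwork(config)
tokenizer = get_tokenizer()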

result = generate(
    model, params, tokenizer,
    query="Prochain métro à Bastille ligne 1 ?",
    tools='[{"name":"get_next_arrivals","description":"Next departures at a stop.","parameters":{"station":{"type":"string","description":"Stop name.","required":true},"line":{"type":"string","description":"Line id.","required":false}}}]',
    stream=False,
)
print(result)
# [{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]
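
Hand-writing the tools string is error-prone once the schema grows. As a sketch, it can be generated from ordinary Python data with the standard json module. The variable names `tools_spec` and `tools` are illustrative, and the model card does not say whether whitespace in the tools string matters, so compact separators are used here to match the string in the snippet above.

```python
import json

# The same single-tool schema as in the usage snippet, written as Python data.
tools_spec = [
    {
        "name": "get_next_arrivals",
        "description": "Next departures at a stop.",
        "parameters": {
            "station": {"type": "string", "description": "Stop name.", "required": True},
            "line": {"type": "string", "description": "Line id.", "required": False},
        },
    }
]

# Compact separators give the whitespace-free form; ensure_ascii=False keeps
# accented characters readable if descriptions ever contain them.
tools = json.dumps(tools_spec, separators=(",", ":"), ensure_ascii=False)
```

The resulting `tools` string can be passed as the tools argument of generate in place of the literal.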

Finetuning

To adapt the model to your own tools, use the customized finetuning scripts at github.com/Rasaboun/needle-transit. They expose a tunable learning rate and Muon learning rate, per-field loss weighting (--w-name, --w-value, --w-key), and metrics logging:

needle finetune data.jsonl --lr 3e-5 --w-value 4.0

Limitations

  • Single-call only — emits at most one tool call per query.
  • Domain-specific — tuned for transit tool-calling; off-domain tools are out of scope.
  • Extractor, not a resolver — returns surface slots; the backend resolves canonical names.
  • Small model — can be finicky on adversarial phrasing; finetune on your own data to adapt.
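
Given the single-call and refusal behavior described above, the calling code only needs to handle an empty list or one call. The sketch below shows one possible dispatcher. It assumes the model output is the JSON text shown in the examples; `dispatch`, `HANDLERS`, and the two handler functions are hypothetical stand-ins for a real backend.

```python
import json
from typing import Callable, Optional


def search_itinerary(origin, destination, time_human=None, time_mode=None):
    # Placeholder: a real backend would resolve the stations and plan a route.
    return ("route", origin, destination, time_human, time_mode)


def get_next_arrivals(station, line=None):
    # Placeholder: a real backend would query live departure data.
    return ("arrivals", station, line)


HANDLERS: dict[str, Callable[..., object]] = {
    "search_itinerary": search_itinerary,
    "get_next_arrivals": get_next_arrivals,
}


def dispatch(raw_output: str, handlers: dict[str, Callable[..., object]]) -> Optional[object]:
    """Parse the model's JSON output and run the matching handler.

    Returns None when the model refused (empty list) or the output is unusable.
    """
    try:
        calls = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # malformed output is treated like a refusal in this sketch
    if not isinstance(calls, list) or not calls:
        return None  # "[]" means no tool applies
    call = calls[0]  # at most one call is expected per query
    if not isinstance(call, dict):
        return None
    handler = handlers.get(call.get("name"))
    if handler is None:
        return None  # unknown tool name
    return handler(**call.get("arguments", {}))
```

Treating malformed output the same as a refusal is a design choice that keeps the caller's logic simple; a stricter backend might log or retry instead.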

License & attribution

MIT. Fine-tuned from Cactus-Compute/needle (© Cactus Compute, MIT). See the base model card for architecture details.

Citation

@misc{ndubuaku2026needle,
  title={Needle},
  author={Henry Ndubuaku and Jakub Mroz and Karen Mosoyan and Roman Shemet and Parkirat Sandhu and Satyajit Kumar and Noah Cylich and Justin H. Lee},
  year={2026},
  url={https://github.com/cactus-compute/needle}
}