license: mit
base_model: Cactus-Compute/needle
library_name: jax
pipeline_tag: text-generation
language:
  - fr
  - en
tags:
  - function-calling
  - tool-use
  - transit
  - encoder-decoder
  - edge
  - on-device
  - jax
  - flax

Needle Transit

A 26M-parameter fine-tune of Cactus-Compute/needle specialized for transit tool-calling. It translates a natural-language transit query into one of two tool calls — or refuses when no tool applies.

The model is an extractor: it lifts the relevant slots from the query. Canonical resolution (station disambiguation, line lookup, routing) is the backend's job.
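One way a backend might handle the station-disambiguation step is to fold case and accents, then fall back to fuzzy matching. This is a minimal sketch using only the Python standard library; the `STATIONS` list, the `fold` and `resolve_station` helpers, and the 0.8 cutoff are hypothetical and not part of the needle package.

```python
import difflib
import unicodedata
from typing import Optional

# Hypothetical canonical station list; a real backend would load its own data.
STATIONS = ["Châtelet", "Nation", "Bastille", "Gare du Nord", "La Défense"]


def fold(text: str) -> str:
    """Lowercase and strip accents so 'chatelet' and 'Châtelet' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_accents.casefold().strip()


# Folded form -> canonical name, used for exact lookups.
_INDEX = {fold(name): name for name in STATIONS}


def resolve_station(surface: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the canonical station for a surface slot, or None if nothing is close."""
    key = fold(surface)
    if key in _INDEX:
        return _INDEX[key]
    # Fall back to fuzzy matching to absorb small typos.
    close = difflib.get_close_matches(key, list(_INDEX), n=1, cutoff=cutoff)
    return _INDEX[close[0]] if close else None
```

For example, `resolve_station("chatelet")` yields `"Châtelet"`, while an unrelated string yields `None`, so the backend can ask the user to rephrase.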


Parameters	26M
Base model	Cactus-Compute/needle
Architecture	Encoder-decoder Simple Attention Network (d_model=512, enc×12, dec×8, GQA 8H/4KV)
Vocab	8192 (SentencePiece BPE)
Format	JAX/Flax checkpoint (`.pkl`)
Languages	French, English, FR/EN code-switching
License	MIT

Supported tools

[
  {
    "name": "search_itinerary",
    "description": "Plan a route between two points.",
    "parameters": {
      "origin":      {"type": "string", "description": "Start location.", "required": true},
      "destination": {"type": "string", "description": "End location.", "required": true},
      "time_human":  {"type": "string", "description": "Human time expression, e.g. 'a 14h'.", "required": false},
      "time_mode":   {"type": "string", "description": "'depart_at' or 'arrive_by'.", "required": false}
    }
  },
  {
    "name": "get_next_arrivals",
    "description": "Next departures/arrivals at a stop.",
    "parameters": {
      "station": {"type": "string", "description": "Stop or station name.", "required": true},
      "line":    {"type": "string", "description": "Line identifier, e.g. '1', 'RER A'.", "required": false}
    }
  }
]
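Because the model is small, it may be worth checking each emitted call against this schema before acting on it. The sketch below assumes a call is a dict with `name` and `arguments` keys (the shape shown in this card's examples); `TOOL_ARGS`, `TIME_MODES`, and `validate_call` are hypothetical names introduced for illustration.

```python
# Required/optional argument names, copied from the tool schema above.
TOOL_ARGS = {
    "search_itinerary": {
        "required": {"origin", "destination"},
        "optional": {"time_human", "time_mode"},
    },
    "get_next_arrivals": {
        "required": {"station"},
        "optional": {"line"},
    },
}
TIME_MODES = ("depart_at", "arrive_by")


def validate_call(call: dict) -> list:
    """Return a list of problems; an empty list means the call fits the schema."""
    name = call.get("name")
    spec = TOOL_ARGS.get(name) if isinstance(name, str) else None
    if spec is None:
        return [f"unknown tool: {name!r}"]

    args = call.get("arguments") or {}
    allowed = spec["required"] | spec["optional"]
    problems = []
    for key in sorted(spec["required"] - set(args)):
        problems.append(f"missing required argument: {key}")
    for key in sorted(set(args) - allowed):
        problems.append(f"unexpected argument: {key}")
    for key in sorted(args):
        # Every parameter in the schema is string-typed.
        if key in allowed and not isinstance(args[key], str):
            problems.append(f"argument is not a string: {key}")

    mode = args.get("time_mode")
    if name == "search_itinerary" and mode is not None and mode not in TIME_MODES:
        problems.append(f"invalid time_mode: {mode!r}")
    return problems
```

A call that fails validation can be treated like a refusal rather than being forwarded to the backend.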

Capabilities

Two-tool routing — picks search_itinerary vs get_next_arrivals from intent.
Refusal — emits no tool call for off-topic / under-specified / ambiguous queries.
Robustness — handles typos, SMS-style compression, colloquial French, and FR/EN code-switching.
Time handling — depart_at vs arrive_by disambiguation from natural phrasing.
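On the application side, routing and refusal can be consumed with a small dispatcher. This is a sketch under the assumption that the model output is a JSON list holding at most one call object with `name` and `arguments` keys, and an empty list for a refusal. The handler functions and the refusal message are hypothetical stubs standing in for a real backend.

```python
import json

REFUSAL = "Sorry, I can only help with itineraries and next departures."


# Hypothetical backend stubs; a real system would query a transit API here.
def search_itinerary(origin, destination, time_human=None, time_mode=None):
    return f"route {origin} -> {destination}"


def get_next_arrivals(station, line=None):
    suffix = f" (line {line})" if line else ""
    return f"arrivals at {station}{suffix}"


HANDLERS = {
    "search_itinerary": search_itinerary,
    "get_next_arrivals": get_next_arrivals,
}


def handle_output(raw: str) -> str:
    """Dispatch the model's JSON output to a handler, or return a refusal message."""
    calls = json.loads(raw)
    if not calls:
        # Empty list: the model declined to call a tool.
        return REFUSAL
    call = calls[0]  # At most one call is expected per query.
    handler = HANDLERS.get(call["name"])
    if handler is None:
        return REFUSAL
    return handler(**call["arguments"])
```

Treating an unknown tool name the same way as an empty list keeps the failure mode uniform for the user.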

Examples

Example query → tool call (verified on this checkpoint):

Query	Output
`Itinéraire de Bastille à Nation`	`[{"name":"search_itinerary","arguments":{"origin":"Bastille","destination":"Nation"}}]`
`De Issy à Charles-de-Gaulle, départ 14h`	`[{"name":"search_itinerary","arguments":{"origin":"Issy","destination":"Charles-de-Gaulle","time_human":"départ 14h","time_mode":"depart_at"}}]`
`How do I get from Gare du Nord to La Défense?`	`[{"name":"search_itinerary","arguments":{"origin":"Gare du Nord","destination":"La Défense"}}]`
`Prochain métro à Bastille ligne 1 ?`	`[{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]`
`prochains passages à Châtelet`	`[{"name":"get_next_arrivals","arguments":{"station":"Châtelet"}}]`
`cmt aller a chatelet depuis nation`	`[{"name":"search_itinerary","arguments":{"origin":"nation","destination":"chatelet"}}]`
`Quel temps fait-il ?`	`[]`

Slot values are copied verbatim from the query (note the lowercased nation/chatelet above), so the backend must normalize them before any lookup.
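The same verbatim behaviour applies to `time_human` (for instance "départ 14h" in the table), so the backend also has to turn that surface string into a clock time. The sketch below handles only simple `14h`, `14h30`, and `8:15` patterns; `parse_time_human` is a hypothetical helper, and relative or English expressions such as "2pm" are deliberately left out.

```python
import re
from datetime import time
from typing import Optional

# Hour, then 'h' or ':', then optional two-digit minutes (e.g. 14h, 14h30, 8:15).
_TIME_RE = re.compile(r"\b(\d{1,2})\s*(?:h|:)\s*(\d{2})?\b", re.IGNORECASE)


def parse_time_human(text: str) -> Optional[time]:
    """Extract a clock time from a surface time expression, or None if absent."""
    match = _TIME_RE.search(text)
    if match is None:
        return None
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    if hour > 23 or minute > 59:
        return None  # Out-of-range values are treated as unparseable.
    return time(hour, minute)
```

Combined with `time_mode`, the parsed value tells the router whether it is a departure or an arrival constraint.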

Results

No benchmark numbers are published yet. The training dataset and a held-out out-of-distribution (real-OOD) evaluation suite will be released shortly.

Usage

Install the official needle package (Cactus Compute), then:

from needle import SimpleAttentionNetwork, load_checkpoint, generate, get_tokenizer

# Load the fine-tuned parameters and the model config from the checkpoint.
params, config = load_checkpoint("needle-transit.pkl")
model = SimpleAttentionNetwork(config)
tokenizer = get_tokenizer()

# `tools` is a JSON string describing the tools the model may call.
result = generate(
    model, params, tokenizer,
    query="Prochain métro à Bastille ligne 1 ?",
    tools='[{"name":"get_next_arrivals","description":"Next departures at a stop.","parameters":{"station":{"type":"string","description":"Stop name.","required":true},"line":{"type":"string","description":"Line id.","required":false}}}]',
    stream=False,
)
print(result)
# Expected output:
# [{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]
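Hand-writing the `tools` string is error-prone once there are several tools. A possible alternative is to build it from Python data with `json.dumps`. The `param` helper and `TOOLS` list are hypothetical, and whether `generate` requires this compact form is an assumption; the sketch simply reproduces the literal used above.

```python
import json


def param(description: str, required: bool) -> dict:
    """Describe one string-typed parameter in the layout used by this card."""
    return {"type": "string", "description": description, "required": required}


TOOLS = [
    {
        "name": "get_next_arrivals",
        "description": "Next departures at a stop.",
        "parameters": {
            "station": param("Stop name.", True),
            "line": param("Line id.", False),
        },
    }
]

# Compact separators match the literal above; ensure_ascii=False keeps accents readable.
tools_json = json.dumps(TOOLS, ensure_ascii=False, separators=(",", ":"))
```

The resulting `tools_json` can then be passed as the `tools` argument in place of the inline string.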

Finetuning

Adapt the model to your own tools with the customized finetuning scripts in github.com/Rasaboun/needle-transit — tunable LR/Muon-LR, per-field loss weighting (--w-name/--w-value/--w-key), and metrics logging:

needle finetune data.jsonl --lr 3e-5 --w-value 4.0
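To give an intuition for per-field loss weighting, the sketch below computes a weighted mean of per-token losses, where tokens belonging to tool names, argument keys, or argument values get different weights. It is a generic illustration in plain Python, not the fork's implementation: the field labels, the normalization by total weight, and the `weighted_loss` function are all assumptions.

```python
def weighted_loss(token_losses, token_fields, w_name=1.0, w_key=1.0, w_value=1.0):
    """Weighted mean of per-token losses, with one weight per field type.

    token_losses: per-token loss values (e.g. cross-entropy).
    token_fields: parallel labels, each 'name', 'key', 'value', or anything else.
    """
    weights = {"name": w_name, "key": w_key, "value": w_value}
    total = 0.0
    norm = 0.0
    for loss, field in zip(token_losses, token_fields):
        w = weights.get(field, 1.0)  # Unlabelled tokens keep weight 1.
        total += w * loss
        norm += w
    return total / norm if norm else 0.0


# With w_value=4.0, an error on a slot value counts four times as much.
losses = [1.0, 1.0, 2.0]
fields = ["name", "key", "value"]
print(weighted_loss(losses, fields, w_value=4.0))
```

Raising the value weight pushes training to prioritise getting slot contents right, which matters most for an extractor.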

Limitations

Single-call only — emits at most one tool call per query.
Domain-specific — tuned for transit tool-calling; off-domain tools are out of scope.
Extractor, not a resolver — returns surface slots; the backend resolves canonical names.
Small model — can be finicky on adversarial phrasing; finetune on your own data to adapt.

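License & attribution

Released under the MIT license. Needle Transit is a fine-tune of Cactus-Compute/needle, whose citation is given below.
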
Citation

@misc{ndubuaku2026needle,
  title={Needle},
  author={Henry Ndubuaku and Jakub Mroz and Karen Mosoyan and Roman Shemet and Parkirat Sandhu and Satyajit Kumar and Noah Cylich and Justin H. Lee},
  year={2026},
  url={https://github.com/cactus-compute/needle}
}