	---
	license: mit
	base_model: Cactus-Compute/needle
	library_name: jax
	pipeline_tag: text-generation
	language:
	- fr
	- en
	tags:
	- function-calling
	- tool-use
	- transit
	- encoder-decoder
	- edge
	- on-device
	- jax
	- flax
	---

	# Needle Transit

	A 26M-parameter fine-tune of [`Cactus-Compute/needle`](https://huggingface.co/Cactus-Compute/needle)
	specialized for transit tool-calling. It translates a natural-language transit query into
	one of two tool calls — or refuses when no tool applies.

	The model is an extractor: it lifts the relevant slots from the query. Canonical resolution
	(station disambiguation, line lookup, routing) is the backend's job.
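As an illustration of the backend side of that split, here is a minimal sketch of station resolution using only the Python standard library. The station list, the `fold` and `resolve_station` helpers, and the 0.8 similarity cutoff are all assumptions made for this example; they are not part of `needle` or of this checkpoint.

```python
import difflib
import unicodedata
from typing import Optional

# Hypothetical canonical list; a real backend would load this from its own data.
CANONICAL_STATIONS = ["Châtelet", "Nation", "Bastille", "Gare du Nord", "La Défense"]


def fold(text: str) -> str:
    """Strip accents and case so 'chatelet' and 'Châtelet' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return no_accents.casefold().strip()


def resolve_station(surface: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a surface slot to a canonical name, or None if nothing is close enough."""
    folded = {fold(name): name for name in CANONICAL_STATIONS}
    match = difflib.get_close_matches(fold(surface), list(folded), n=1, cutoff=cutoff)
    return folded[match[0]] if match else None
```

With this sketch, the verbatim slot `chatelet` resolves to `Châtelet`, while a string that resembles no known station returns `None` so the caller can ask the user to clarify.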

	| | |
	|---|---|
	| Parameters | 26M |
	| Base model | [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle) |
	| Architecture | Encoder-decoder Simple Attention Network (d_model=512, enc×12, dec×8, GQA 8H/4KV) |
	| Vocab | 8192 (SentencePiece BPE) |
	| Format | JAX/Flax checkpoint (`.pkl`) |
	| Languages | French, English, FR/EN code-switching |
	| License | MIT |

	## Supported tools

	```json
	[
	  {
	    "name": "search_itinerary",
	    "description": "Plan a route between two points.",
	    "parameters": {
	      "origin": {"type": "string", "description": "Start location.", "required": true},
	      "destination": {"type": "string", "description": "End location.", "required": true},
	      "time_human": {"type": "string", "description": "Human time expression, e.g. 'a 14h'.", "required": false},
	      "time_mode": {"type": "string", "description": "'depart_at' or 'arrive_by'.", "required": false}
	    }
	  },
	  {
	    "name": "get_next_arrivals",
	    "description": "Next departures/arrivals at a stop.",
	    "parameters": {
	      "station": {"type": "string", "description": "Stop or station name.", "required": true},
	      "line": {"type": "string", "description": "Line identifier, e.g. '1', 'RER A'.", "required": false}
	    }
	  }
	]
	```
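Because the model is small, a caller may want to check each emitted call against this schema before acting on it. The following is a minimal sketch of such a check; `TOOL_SLOTS`, `TIME_MODES`, and `validate_call` are names invented for this example, with the slot sets transcribed by hand from the schema above.

```python
# Required and optional slots, transcribed from the schema above.
TOOL_SLOTS = {
    "search_itinerary": {
        "required": {"origin", "destination"},
        "optional": {"time_human", "time_mode"},
    },
    "get_next_arrivals": {"required": {"station"}, "optional": {"line"}},
}
TIME_MODES = {"depart_at", "arrive_by"}


def validate_call(call):
    """Return a list of problems; an empty list means the call fits the schema."""
    name = call.get("name")
    slots = TOOL_SLOTS.get(name)
    if slots is None:
        return [f"unknown tool: {name!r}"]
    args = call.get("arguments", {})
    present = set(args)
    problems = [f"missing required slot: {s}" for s in sorted(slots["required"] - present)]
    allowed = slots["required"] | slots["optional"]
    problems += [f"unexpected slot: {s}" for s in sorted(present - allowed)]
    problems += [f"slot {k} is not a string" for k, v in args.items() if not isinstance(v, str)]
    if "time_mode" in args and args["time_mode"] not in TIME_MODES:
        problems.append(f"invalid time_mode: {args['time_mode']!r}")
    return problems
```

Returning a list of problems rather than raising lets the caller decide whether to reject the call, retry, or fall back to asking the user.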

	## Capabilities

	- Two-tool routing — picks `search_itinerary` vs `get_next_arrivals` from intent.
	- Refusal — emits no tool call for off-topic / under-specified / ambiguous queries.
	- Robustness — handles typos, SMS-style compression, colloquial French, and FR/EN code-switching.
	- Time handling — `depart_at` vs `arrive_by` disambiguation from natural phrasing.
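The model only extracts the time expression as text in `time_human`; turning it into a clock time is left to the backend. Below is a minimal sketch of that step, assuming expressions that contain a pattern such as `14h`, `14h30`, or `14:30`. The `parse_time_human` helper is hypothetical and not part of `needle`, and relative expressions such as "dans 10 minutes" are out of its scope.

```python
import re
from datetime import time
from typing import Optional

# Matches '14h', '14h30', '14 h 30' or '14:30' inside a longer phrase such as 'départ 14h'.
_CLOCK = re.compile(r"(?<!\d)(\d{1,2})\s*(?:h|:)\s*(\d{2})?(?!\d)", re.IGNORECASE)


def parse_time_human(time_human: str) -> Optional[time]:
    """Extract a clock time from a free-text slot, or None if none is found."""
    m = _CLOCK.search(time_human)
    if m is None:
        return None
    hour, minute = int(m.group(1)), int(m.group(2) or 0)
    if hour > 23 or minute > 59:
        return None  # reject impossible times such as '25h'
    return time(hour, minute)
```

The resulting `time` can then be combined with the `time_mode` slot (`depart_at` or `arrive_by`) when querying the routing backend.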

	## Examples

	Example query → tool call (verified on this checkpoint):

	| Query | Output |
	|---|---|
	| `Itinéraire de Bastille à Nation` | `[{"name":"search_itinerary","arguments":{"origin":"Bastille","destination":"Nation"}}]` |
	| `De Issy à Charles-de-Gaulle, départ 14h` | `[{"name":"search_itinerary","arguments":{"origin":"Issy","destination":"Charles-de-Gaulle","time_human":"départ 14h","time_mode":"depart_at"}}]` |
	| `How do I get from Gare du Nord to La Défense?` | `[{"name":"search_itinerary","arguments":{"origin":"Gare du Nord","destination":"La Défense"}}]` |
	| `Prochain métro à Bastille ligne 1 ?` | `[{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]` |
	| `prochains passages à Châtelet` | `[{"name":"get_next_arrivals","arguments":{"station":"Châtelet"}}]` |
	| `cmt aller a chatelet depuis nation` | `[{"name":"search_itinerary","arguments":{"origin":"nation","destination":"chatelet"}}]` |
	| `Quel temps fait-il ?` | `[]` |

	Slot values are copied verbatim from the query (note the lowercased `nation`/`chatelet` above),
	so the backend has to map them to canonical names.
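The outputs in the table are JSON arrays holding either one call or nothing. A minimal sketch of how a caller might parse them, assuming the model output reaches the caller as a JSON string; `parse_output` is a hypothetical helper, not part of `needle`.

```python
import json
from typing import Optional


def parse_output(raw: str) -> Optional[dict]:
    """Return the single tool call, or None for a refusal or unusable output."""
    try:
        calls = json.loads(raw)
    except json.JSONDecodeError:
        return None  # malformed output is treated like a refusal
    if not isinstance(calls, list) or not calls:
        return None  # `[]` is the refusal case
    call = calls[0]  # the model emits at most one call per query
    if not isinstance(call, dict) or "name" not in call:
        return None
    return call
```

A `None` result is the cue to answer the user without calling any tool, for example for `Quel temps fait-il ?`.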

	## Results

	No results are published yet. The training dataset and a held-out evaluation suite of real
	out-of-distribution (OOD) queries will be released shortly.

	## Usage

	Install the official [`needle`](https://github.com/cactus-compute/needle) package (Cactus Compute), then:

	```python
	from needle import SimpleAttentionNetwork, load_checkpoint, generate, get_tokenizer

	params, config = load_checkpoint("needle-transit.pkl")
	model = SimpleAttentionNetwork(config)
	tokenizer = get_tokenizer()

	result = generate(
	    model, params, tokenizer,
	    query="Prochain métro à Bastille ligne 1 ?",
	    tools='[{"name":"get_next_arrivals","description":"Next departures at a stop.","parameters":{"station":{"type":"string","description":"Stop name.","required":true},"line":{"type":"string","description":"Line id.","required":false}}}]',
	    stream=False,
	)
	print(result)
	# [{"name":"get_next_arrivals","arguments":{"station":"Bastille","line":"1"}}]
	```
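The `tools` argument above is a hand-written JSON string, which is easy to get wrong. One option is to build it from Python dicts with the standard `json` module, as in the sketch below. It assumes `generate` accepts any JSON string of the same shape as the one above; the `param` helper and the `TOOLS_JSON` name are invented for this example.

```python
import json


def param(description: str, required: bool) -> dict:
    """Build one string-typed parameter entry in the shape used by the tool schema."""
    return {"type": "string", "description": description, "required": required}


TOOLS = [
    {
        "name": "get_next_arrivals",
        "description": "Next departures at a stop.",
        "parameters": {
            "station": param("Stop name.", True),
            "line": param("Line id.", False),
        },
    }
]

# Compact separators reproduce the whitespace-free style of the hand-written string.
TOOLS_JSON = json.dumps(TOOLS, ensure_ascii=False, separators=(",", ":"))
```

`TOOLS_JSON` can then be passed as `tools=TOOLS_JSON` in place of the literal string.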

	## Finetuning

	Adapt the model to your own tools with the customized finetuning scripts in
	[github.com/Rasaboun/needle-transit](https://github.com/Rasaboun/needle-transit) — tunable
	LR/Muon-LR, per-field loss weighting (`--w-name/--w-value/--w-key`), and metrics logging:

	```bash
	needle finetune data.jsonl --lr 3e-5 --w-value 4.0
	```

	## Limitations

	- Single-call only — emits at most one tool call per query.
	- Domain-specific — tuned for transit tool-calling; off-domain tools are out of scope.
	- Extractor, not a resolver — returns surface slots; the backend resolves canonical names.
	- Small model — can be finicky on adversarial phrasing; finetune on your own data to adapt.

	## License & attribution

	MIT. Fine-tuned from [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle)
	(© Cactus Compute, MIT). See the base model card for architecture details.

	## Citation

	```
	@misc{ndubuaku2026needle,
	  title={Needle},
	  author={Henry Ndubuaku and Jakub Mroz and Karen Mosoyan and Roman Shemet and Parkirat Sandhu and Satyajit Kumar and Noah Cylich and Justin H. Lee},
	  year={2026},
	  url={https://github.com/cactus-compute/needle}
	}
	```