	---
	title: Tiny Browser Planner
	emoji: 🤖
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 4.36.1
	python_version: 3.12.0
	app_file: app.py
	pinned: false
	tags:
	- track:backyard
	- sponsor:openbmb
	- achievement:welltuned
	- achievement:fieldnotes
	---

	# Tiny Browser Planner

	A 1B model that plans browser actions: MiniCPM5-1B fine-tuned with LoRA.

	The core finding: A 1B model can explain the correct browser action before it can reliably choose it.

	## Links

	- Space (Gradio Demo): https://huggingface.co/spaces/build-small-hackathon/tiny-browser-planner
	- Model: https://huggingface.co/Georgefifth/tiny-browser-planner-reason
	- Dataset: https://huggingface.co/datasets/Georgefifth/tiny-browser-planner-reason-dataset
	- Demo Video (X Post): https://x.com/i/status/2066584425140473947

	## The Core Finding

	> A 1B model can explain the correct browser action before it can reliably choose it.

	## Experiments

	### v1 → v4
	Scaling up the data did not help: 200 targeted hard examples outperformed 2,000 generic ones.

	### Ablation: Action Space Paradox
	Adding `back` to the action space gave the model a powerful tool, but the model over-generalized and chose `back` for everything.
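
One way to spot this kind of collapse is to look at how the predicted actions are distributed over an evaluation set. The sketch below is illustrative only: the helper names, the 0.8 threshold, and every action other than `back` are assumptions, not part of this project.

```python
from collections import Counter


def action_shares(predictions):
    """Return each action's share of all predictions, most frequent first."""
    counts = Counter(predictions)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.most_common()}


def dominant_action(predictions, threshold=0.8):
    """Return the action whose share reaches `threshold`, or None if none does."""
    if not predictions:
        return None
    # The first entry is the most frequent action.
    action, share = next(iter(action_shares(predictions).items()))
    return action if share >= threshold else None


# Hypothetical predictions; only `back` is an action named in this README.
preds = ["back", "back", "back", "click", "back",
         "back", "back", "back", "back", "back"]
print(action_shares(preds))
print(dominant_action(preds))
```

A single action taking most of the probability mass across unrelated tasks suggests the model has learned a shortcut and is not actually choosing per task.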

	### Reason-First
	With just 40 reasoning examples (7.8 s of training), the model improved from 4/12 to 10/12. The reasoning head eliminates heuristic shortcutting.
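
To make the reason-first idea concrete, here is a minimal sketch of parsing a completion that states its reasoning before its action, and of counting correct actions as in an x/12 score. The `Reason:` / `Action:` layout and the function names are assumptions for illustration; the actual format used by the model may differ.

```python
import re

# Assumed output layout (not taken from the project):
# "Reason: <free text>" followed by "Action: <name>".
_PATTERN = re.compile(
    r"Reason:\s*(?P<reason>.*?)\s*Action:\s*(?P<action>\w+)", re.DOTALL
)


def parse_reason_first(output):
    """Split a completion into (reason, action); return None if malformed."""
    match = _PATTERN.search(output)
    if match is None:
        return None
    return match.group("reason"), match.group("action")


def score(outputs, expected_actions):
    """Count completions whose parsed action equals the expected action."""
    correct = 0
    for output, expected in zip(outputs, expected_actions):
        parsed = parse_reason_first(output)
        if parsed is not None and parsed[1] == expected:
            correct += 1
    return correct


sample = (
    "Reason: The search results page is open, so the next step is the first link.\n"
    "Action: click"
)
print(parse_reason_first(sample))
```

Because the action is read only after the reasoning text, a completion that skips the reasoning is treated as malformed and does not count toward the score.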

	## Usage

	Enter a task and browsing history. The model shows its reasoning, then picks the next action.
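
For readers who want to experiment outside the demo, the two inputs could be combined into a single prompt along these lines. This is a sketch under assumptions: `build_prompt`, the field labels, and the example task are hypothetical, and the Space's real prompt template is not documented here.

```python
def build_prompt(task, history):
    """Assemble a reason-first prompt from a task and a list of past steps."""
    lines = [f"Task: {task}", "History:"]
    if history:
        lines += [f"{i}. {step}" for i, step in enumerate(history, start=1)]
    else:
        lines.append("(none)")
    # Ending on "Reason:" asks the model to explain before it picks an action.
    lines.append("Reason:")
    return "\n".join(lines)


# Hypothetical task and history, for illustration only.
prompt = build_prompt(
    "Find the pricing page",
    ["open example.com", "click 'Products'"],
)
print(prompt)
```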