---
title: Tiny Browser Planner
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.36.1
python_version: 3.12.0
app_file: app.py
pinned: false
tags:
- track:backyard
- sponsor:openbmb
- achievement:welltuned
- achievement:fieldnotes
---
# Tiny Browser Planner
A 1B model that plans browser actions — fine-tuned MiniCPM5-1B with LoRA.
**The core finding:** A 1B model can explain the correct browser action before it can reliably choose it.
## Links
- **Space (Gradio Demo)**: https://huggingface.co/spaces/build-small-hackathon/tiny-browser-planner
- **Model**: https://huggingface.co/Georgefifth/tiny-browser-planner-reason
- **Dataset**: https://huggingface.co/datasets/Georgefifth/tiny-browser-planner-reason-dataset
- **Demo Video (X Post)**: https://x.com/i/status/2066584425140473947
## The Core Finding
> A 1B model can explain the correct browser action before it can reliably choose it.
## Experiments
### v1 → v4
Data scaling didn't help. 200 targeted hard examples outperformed 2000 generic ones.
### Ablation: Action Space Paradox
Adding `back` to the action space created a powerful tool — but the model over-generalized it, using `back` for everything.
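One way to spot this kind of over-generalization is to compare how often each action is predicted with how often the reference labels call for it. The sketch below is illustrative only: the action names other than `back`, the sample data, and the `action_overuse` helper are assumptions, not taken from the project's evaluation code.

```python
from collections import Counter


def action_overuse(predicted, expected):
    """Return, per action, predicted count minus expected count.

    A large positive value means the model picks that action far more
    often than the reference labels call for.
    """
    pred_counts = Counter(predicted)
    gold_counts = Counter(expected)
    actions = set(pred_counts) | set(gold_counts)
    return {a: pred_counts[a] - gold_counts[a] for a in actions}


# Hypothetical evaluation slice; action names other than `back` are made up.
expected = ["click", "type", "back", "scroll", "click", "type"]
predicted = ["back", "type", "back", "back", "click", "back"]

overuse = action_overuse(predicted, expected)
# Print actions from most over-used to most under-used.
print(sorted(overuse.items(), key=lambda kv: -kv[1]))
```

A per-action breakdown like this makes a collapse onto a single action visible even when overall accuracy moves only slightly.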
### Reason-First
With just 40 reasoning examples (7.8 s of training), the model improved from 4/12 to 10/12. The reasoning head eliminates heuristic shortcutting.
## Usage
Enter a task and browsing history. The model shows its reasoning, then picks the next action.
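
The reason-then-action flow can be sketched in plain Python. This is a minimal illustration under stated assumptions: the prompt layout, the `Reason:` / `Action:` output format, the action names other than `back`, and the helpers `build_prompt` and `parse_reason_first` are all hypothetical, since this README does not document the model's actual input or output format.

```python
import re

# Assumed action vocabulary; only `back` is named in this README.
ALLOWED_ACTIONS = {"click", "type", "scroll", "back"}


def build_prompt(task, history):
    """Format a task and browsing history into one prompt string (assumed layout)."""
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(history, start=1))
    return f"Task: {task}\nHistory:\n{steps}\nReason:"


def parse_reason_first(output):
    """Split an assumed 'Reason: ... Action: ...' completion into (reason, action).

    The action is None when it is missing or not in ALLOWED_ACTIONS.
    """
    match = re.search(r"Reason:\s*(.*?)\s*Action:\s*(\w+)", output, re.DOTALL)
    if match is None:
        return output.strip(), None
    reason, action = match.group(1), match.group(2).lower()
    if action not in ALLOWED_ACTIONS:
        action = None
    return reason, action


# Hypothetical completion text, not real model output.
sample = "Reason: The page is a dead end, so return to the results.\nAction: back"
reason, action = parse_reason_first(sample)
print(reason)
print(action)
```

Rejecting actions outside the allowed set keeps a malformed completion from being executed as a browser step.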