---
title: Tiny Browser Planner
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.36.1
python_version: 3.12.0
app_file: app.py
pinned: false
tags:
  - track:backyard
  - sponsor:openbmb
  - achievement:welltuned
  - achievement:fieldnotes
---

# Tiny Browser Planner

A 1B model that plans browser actions: MiniCPM5-1B fine-tuned with LoRA.

**The core finding:** A 1B model can explain the correct browser action before it can reliably choose it.

## Links

- **Space (Gradio Demo)**: https://huggingface.co/spaces/build-small-hackathon/tiny-browser-planner
- **Model**: https://huggingface.co/Georgefifth/tiny-browser-planner-reason
- **Dataset**: https://huggingface.co/datasets/Georgefifth/tiny-browser-planner-reason-dataset
- **Demo Video (X Post)**: https://x.com/i/status/2066584425140473947

## Experiments

### v1 → v4
Scaling up the data did not help: 200 targeted hard examples outperformed 2000 generic ones.

### Ablation: Action Space Paradox
Adding `back` to the action space gave the model a powerful tool, but the model over-generalized it and chose `back` for everything.
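
One way to spot this kind of collapse is to look at how predictions are distributed across actions. The snippet below is a minimal sketch, not the project's evaluation code: the helper names, the 0.8 threshold, and every action name other than `back` are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional


def action_share(predictions: list[str]) -> dict[str, float]:
    """Return the fraction of predictions assigned to each action."""
    total = len(predictions)
    if total == 0:
        return {}
    counts = Counter(predictions)
    return {action: n / total for action, n in counts.items()}


def dominant_action(predictions: list[str], threshold: float = 0.8) -> Optional[str]:
    """Return the action that takes at least `threshold` of all predictions, if any.

    The threshold is an arbitrary illustrative value, not one from the experiments.
    """
    shares = action_share(predictions)
    if not shares:
        return None
    action, share = max(shares.items(), key=lambda item: item[1])
    return action if share >= threshold else None


# Hypothetical predictions: nine `back` choices out of ten signal a collapse.
collapsed = ["back"] * 9 + ["click"]
balanced = ["back", "click", "type", "scroll"]
print(dominant_action(collapsed))
print(dominant_action(balanced))
```

A check like this only flags that one action dominates; whether that is over-generalization still depends on what the correct actions were.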

### Reason-First
With just 40 reasoning examples (7.8 s of training), the model improved from 4/12 to 10/12. The reasoning head eliminates heuristic shortcutting.
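
To make the reason-first idea concrete, here is a small sketch of splitting a completion into its reasoning and its action. The `Reason:` / `Action:` layout is an assumed output format chosen for illustration; the actual format used by the model and dataset may differ.

```python
import re
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    reason: str
    action: str


# Assumed layout: free-text reasoning first, then a single action token.
_PLAN_PATTERN = re.compile(
    r"Reason:\s*(?P<reason>.+?)\s*Action:\s*(?P<action>\S+)",
    re.DOTALL | re.IGNORECASE,
)


def parse_plan(text: str) -> Optional[Plan]:
    """Extract (reason, action) from a completion, or None if the layout is missing."""
    match = _PLAN_PATTERN.search(text)
    if match is None:
        return None
    return Plan(match.group("reason").strip(), match.group("action").lower())


# Hypothetical completion text, not real model output.
completion = "Reason: The current page is unrelated to the task.\nAction: back"
print(parse_plan(completion))
```

Returning `None` for completions that skip the reasoning makes it easy to count how often the model answers without explaining first.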

## Usage

Enter a task and browsing history. The model shows its reasoning, then picks the next action.
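
As a rough illustration of those two inputs, the sketch below combines a task and a browsing history into one prompt that ends where the reasoning would begin. The `build_prompt` helper and the prompt layout are hypothetical and are not taken from `app.py`.

```python
def build_prompt(task: str, history: list[str]) -> str:
    """Format a task and prior browsing steps into a single prompt string.

    The layout is an assumption for illustration; it ends with "Reason:" so
    the model would write its reasoning before naming an action.
    """
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(history, start=1))
    if not steps:
        steps = "(none)"
    return f"Task: {task}\nHistory:\n{steps}\nReason:"


# Hypothetical task and history.
print(build_prompt("Find the pricing page", ["open homepage", "click Products"]))
```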