---
license: apache-2.0
library_name: onnx
tags:
- onnx
- distilbert
- text-classification
- browser-automation
- web-navigation
pipeline_tag: text-classification
datasets:
- custom
metrics:
- accuracy
- f1
model-index:
- name: webbert-action-classifier
results:
- task:
type: text-classification
name: Web Action Classification
metrics:
- name: Accuracy
type: accuracy
value: 0.909
- name: Macro F1
type: f1
value: 0.909
---
# WebBERT Action Classifier
DistilBERT-based action classifier for web browser navigation. Given a task goal, the visible page elements, and the current domain, it predicts the next browser action.
## Model Details
- **Base model:** distilbert-base-uncased
- **Fine-tuned on:** 9,025 synthetic + hard-case examples
- **Classes:** 15 web action types
- **Input format:** `[TASK] goal [ELEMENTS] label:type @(cx,cy) ... [PAGE] domain`
- **Max sequence length:** 256
- **Export format:** ONNX (opset 14)
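
The input format above can be assembled with a small helper. This is a sketch: the `build_input` name and the element-tuple shape are illustrative, not part of the released artifacts; only the `[TASK] … [ELEMENTS] … [PAGE] …` layout comes from the model card.

```python
def build_input(goal, elements, domain):
    """Assemble a WebBERT input string:
    [TASK] goal [ELEMENTS] label:type @(cx,cy) ... [PAGE] domain

    `elements` is a list of (label, elem_type, cx, cy) tuples,
    with cx/cy as coordinates normalized to [0, 1].
    """
    parts = [f"{label}:{etype} @({cx:.2f},{cy:.2f})" for label, etype, cx, cy in elements]
    return f"[TASK] {goal} [ELEMENTS] {' '.join(parts)} [PAGE] {domain}"

text = build_input("click login button", [("Login", "button", 0.5, 0.3)], "example.com")
# → "[TASK] click login button [ELEMENTS] Login:button @(0.50,0.30) [PAGE] example.com"
```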
## Classes
click, type, scroll_down, scroll_up, wait, go_back, skip, extract_content, dismiss_popup, accept_cookies, fill_form, submit_form, click_next, download, select_dropdown
## Performance
| Metric | Value |
|--------|-------|
| Overall Accuracy | 90.9% |
| Macro F1 | 0.909 |
| Accuracy (typical scenarios) | 92.0% |
| Accuracy (complex edge cases) | 89.5% |
| Inference latency (CPU) | ~5ms |
| Model size | ~256 MB |
## Usage
### Python (ONNX Runtime)
```python
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer

session = ort.InferenceSession("webbert.onnx")
tokenizer = Tokenizer.from_file("webbert-tokenizer.json")

# Match the training configuration: pad/truncate to 256 tokens
tokenizer.enable_padding(length=256, pad_id=0, pad_token="[PAD]")
tokenizer.enable_truncation(max_length=256)

text = "[TASK] click login button [ELEMENTS] Login:button @(0.50,0.30) [PAGE] example.com"
encoding = tokenizer.encode(text)
input_ids = np.array([encoding.ids], dtype=np.int64)
attention_mask = np.array([encoding.attention_mask], dtype=np.int64)

# outputs[0] holds the raw logits; argmax gives an index into webbert-classes.json
outputs = session.run(None, {"input_ids": input_ids, "attention_mask": attention_mask})
pred = np.argmax(outputs[0], axis=-1)[0]
```
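
`webbert-classes.json` is described below as an ordered class label list. Assuming it is a plain JSON array of the 15 action names, the argmax index can be turned into a label plus a softmax confidence like this (a sketch; the `decode_prediction` helper is not part of the released files):

```python
import json
import numpy as np

def decode_prediction(logits, classes_path="webbert-classes.json"):
    """Map raw logits for one example to (action_label, softmax_confidence)."""
    with open(classes_path) as f:
        classes = json.load(f)  # assumed: ordered list of the 15 action labels

    logits = np.asarray(logits, dtype=np.float64).reshape(-1)
    # Numerically stable softmax
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    idx = int(probs.argmax())
    return classes[idx], float(probs[idx])
```

A confidence threshold on the returned probability is a natural place to decide whether to act on the prediction or fall back to another layer of the cascade.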
### Rust (ort + tokenizers)
WebBERT is used in [nyaya-agent](https://github.com/biztiger/nyaya-agent) as Layer 2 of the browser navigation cascade; see that repository for Rust inference code built on the `ort` and `tokenizers` crates.
## Files
- `webbert.onnx` — ONNX model (DistilBERT fine-tuned, ~256 MB)
- `webbert-tokenizer.json` — HuggingFace tokenizer (single JSON file)
- `webbert-classes.json` — Ordered class label list
## Training
Trained with HuggingFace Transformers on 9,025 examples (6,000 base + 3,025 hard-case disambiguation examples) for 5 epochs with learning rate 2e-5, batch size 32, and 100 warmup steps.
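
From the figures above, the implied training schedule can be worked out directly (a back-of-the-envelope sketch; it assumes the last partial batch is kept each epoch):

```python
import math

num_examples, batch_size, epochs, warmup_steps = 9025, 32, 5, 100

steps_per_epoch = math.ceil(num_examples / batch_size)  # 9025 / 32 → 283
total_steps = steps_per_epoch * epochs                  # 283 * 5 → 1415
warmup_fraction = warmup_steps / total_steps            # ≈ 0.07

print(steps_per_epoch, total_steps, round(warmup_fraction, 3))
```

So the 100 warmup steps cover roughly the first 7% of the ~1,400-step run.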