---
license: apache-2.0
library_name: onnx
tags:
- onnx
- distilbert
- text-classification
- browser-automation
- web-navigation
pipeline_tag: text-classification
datasets:
- custom
metrics:
- accuracy
- f1
model-index:
- name: webbert-action-classifier
  results:
  - task:
      type: text-classification
      name: Web Action Classification
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.909
    - name: Macro F1
      type: f1
      value: 0.909
---

# WebBERT Action Classifier

DistilBERT-based action classifier for web browser navigation. Given a task goal, the visible page elements, and the page domain, it predicts the next browser action.
|
## Model Details

- **Base model:** distilbert-base-uncased
- **Fine-tuned on:** 9,025 synthetic + hard-case examples
- **Classes:** 15 web action types
- **Input format:** `[TASK] goal [ELEMENTS] label:type @(cx,cy) ... [PAGE] domain`
- **Max sequence length:** 256 tokens
- **Export format:** ONNX (opset 14)
|
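The input string can be assembled from structured page data. A minimal sketch; the `build_input` helper and the element-dict fields are illustrative, not part of the released API:

```python
def build_input(goal: str, elements: list[dict], domain: str) -> str:
    """Assemble the [TASK]/[ELEMENTS]/[PAGE] prompt the model expects.

    Each element dict is assumed to carry a visible label, an element
    type, and normalized center coordinates (cx, cy) in [0, 1].
    """
    parts = [
        f"{e['label']}:{e['type']} @({e['cx']:.2f},{e['cy']:.2f})"
        for e in elements
    ]
    return f"[TASK] {goal} [ELEMENTS] {' '.join(parts)} [PAGE] {domain}"

text = build_input(
    "click login button",
    [{"label": "Login", "type": "button", "cx": 0.50, "cy": 0.30}],
    "example.com",
)
# → "[TASK] click login button [ELEMENTS] Login:button @(0.50,0.30) [PAGE] example.com"
```
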
## Classes

click, type, scroll_down, scroll_up, wait, go_back, skip, extract_content, dismiss_popup, accept_cookies, fill_form, submit_form, click_next, download, select_dropdown
|
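Predictions come back as an index into this ordered list; `webbert-classes.json` ships the same ordering. A minimal sketch, with the list inlined for illustration (loading the shipped JSON file is the intended path):

```python
# Same ordering as webbert-classes.json; inlined here for illustration.
CLASSES = [
    "click", "type", "scroll_down", "scroll_up", "wait",
    "go_back", "skip", "extract_content", "dismiss_popup", "accept_cookies",
    "fill_form", "submit_form", "click_next", "download", "select_dropdown",
]

# In practice, load the shipped file so the ordering always matches the model:
# with open("webbert-classes.json") as f:
#     CLASSES = json.load(f)

def label_for(pred_index: int) -> str:
    """Map an argmax index from the model output to its action label."""
    return CLASSES[pred_index]

print(label_for(0))  # → click
```
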
## Performance

| Metric | Value |
|--------|-------|
| Overall accuracy | 90.9% |
| Macro F1 | 0.909 |
| Typical scenarios (accuracy) | 92.0% |
| Complex edge cases (accuracy) | 89.5% |
| Inference latency (CPU) | ~5 ms |
| Model size | ~256 MB |
|
## Usage

### Python (ONNX Runtime)

```python
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer

# Load the exported model and its tokenizer, then pad/truncate to the
# model's 256-token sequence length.
session = ort.InferenceSession("webbert.onnx")
tokenizer = Tokenizer.from_file("webbert-tokenizer.json")
tokenizer.enable_padding(length=256, pad_id=0, pad_token="[PAD]")
tokenizer.enable_truncation(max_length=256)

text = "[TASK] click login button [ELEMENTS] Login:button @(0.50,0.30) [PAGE] example.com"
encoding = tokenizer.encode(text)

input_ids = np.array([encoding.ids], dtype=np.int64)
attention_mask = np.array([encoding.attention_mask], dtype=np.int64)

outputs = session.run(None, {"input_ids": input_ids, "attention_mask": attention_mask})
pred = int(np.argmax(outputs[0], axis=-1)[0])  # index into webbert-classes.json
```
|
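The logits in `outputs[0]` can also be turned into a confidence score, e.g. to decide whether to act on the prediction or fall through to another layer of a cascade. A minimal sketch using a stand-in logits array; the threshold value is illustrative, not a tuned recommendation:

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Stand-in for outputs[0] from session.run: batch of 1, 15 class logits.
logits = np.zeros((1, 15), dtype=np.float32)
logits[0, 0] = 4.0  # pretend class 0 ("click") dominates

probs = softmax(logits)
pred = int(np.argmax(probs, axis=-1)[0])
confidence = float(probs[0, pred])

# Only act on confident predictions; otherwise defer to a fallback layer.
CONFIDENCE_THRESHOLD = 0.7  # illustrative value
if confidence >= CONFIDENCE_THRESHOLD:
    print(pred, round(confidence, 3))
```
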
### Rust (ort + tokenizers)

Used in [nyaya-agent](https://github.com/biztiger/nyaya-agent) as Layer 2 in the browser navigation cascade.
|
## Files

- `webbert.onnx` — ONNX model (DistilBERT fine-tuned, ~256 MB)
- `webbert-tokenizer.json` — HuggingFace tokenizer (single JSON file)
- `webbert-classes.json` — ordered class label list
|
## Training

Trained with HuggingFace Transformers on 9,025 examples (6,000 base + 3,025 hard-case disambiguation examples) for 5 epochs with learning rate 2e-5, batch size 32, and 100 warmup steps.
|
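Given those hyperparameters, the schedule works out as follows (a quick arithmetic check; assumes the final partial batch is kept):

```python
import math

num_examples = 9025
batch_size = 32
epochs = 5
warmup_steps = 100

steps_per_epoch = math.ceil(num_examples / batch_size)  # 283
total_steps = steps_per_epoch * epochs                  # 1415
warmup_fraction = warmup_steps / total_steps            # ~7% of training

print(steps_per_epoch, total_steps, round(warmup_fraction, 3))
```
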