Qwen3-4B GRPO Browser Agent

A GRPO-trained Qwen3-4B model for an on-device browser agent (TinyBrowser/Wiegand).

Training

  • GRPO: 20 iterations, group_size=8, reward 5.12 → 7.25, JSON validity 95% → 100%
  • Duration: 83 minutes on Tinker GPU cluster
  • Reward components (weight): valid_json (3x), correct_action (2x), element_exists (1.5x), task_progress (1x), length (0.5x)
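
The reward weights above can be read as a weighted sum over per-component scores, which GRPO then normalizes within each group of sampled completions. The following is a minimal sketch of that idea, not the training code used for this model. It assumes each component is scored in [0, 1]; the helper names `is_valid_json`, `total_reward`, and `group_advantages` are hypothetical. Only the weights come from the list above.

```python
import json
import statistics

# Weights copied from the reward list; everything else here is an assumption.
REWARD_WEIGHTS = {
    "valid_json": 3.0,
    "correct_action": 2.0,
    "element_exists": 1.5,
    "task_progress": 1.0,
    "length": 0.5,
}


def is_valid_json(text):
    """Return 1.0 if `text` parses as JSON, else 0.0 (hypothetical scorer)."""
    try:
        json.loads(text)
        return 1.0
    except (json.JSONDecodeError, TypeError):
        return 0.0


def total_reward(components):
    """Weighted sum of component scores; missing components count as 0."""
    return sum(
        weight * components.get(name, 0.0)
        for name, weight in REWARD_WEIGHTS.items()
    )


def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (reward - group mean) / (group std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Example: score one completion, then normalize a group of 8 rewards
# (group_size=8 as in the training setup; the reward values are made up).
completion = '{"action": "click", "element": "#submit"}'
score = total_reward({"valid_json": is_valid_json(completion), "correct_action": 1.0})
advantages = group_advantages([score, 3.0, 8.0, 6.5, 0.0, 5.0, 7.0, 4.5])
```

Under the [0, 1] scoring assumption, the maximum total reward is 8.0, the sum of the weights.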

Files

  • qwen3-4b-grpo-q4_0.gguf: Q4_0 GGUF for llama.cpp
  • adapter_model.safetensors: LoRA adapter weights
  • metrics.jsonl: Training metrics per iteration
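
To inspect the per-iteration metrics, a JSON Lines file can be read one object per line. This is a minimal sketch that assumes metrics.jsonl follows the usual JSONL layout. The field names inside each record are not documented here, so the hypothetical helper `load_metrics` makes no assumption about them.

```python
import json
from pathlib import Path


def load_metrics(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

Calling `load_metrics("metrics.jsonl")` should return one record per training iteration if the file is laid out as described; check the keys of the first record before relying on any particular field.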