Qwen3-4B GRPO Browser Agent
Qwen3-4B fine-tuned with GRPO for use as an on-device browser agent (TinyBrowser/Wiegand).
Training
- GRPO: 20 iterations, group_size=8, reward 5.12 → 7.25, JSON validity 95% → 100%
- Duration: 83 minutes on Tinker GPU cluster
- Rewards: valid_json(3x), correct_action(2x), element_exists(1.5x), task_progress(1x), length(0.5x)
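As a rough illustration of how the reward components listed above could be combined, here is a minimal sketch. It assumes that the "3x", "2x", etc. are multiplicative weights on per-component scores in [0, 1], and that advantages are normalized within each group of 8 samples as is usual for GRPO. The function names (`is_valid_json`, `combined_reward`, `group_advantages`) are hypothetical and are not taken from the actual training code.

```python
import json
from statistics import mean, pstdev

# Weights copied from the Rewards line above. Treating "3x" etc. as
# multiplicative weights is an assumption, not confirmed by the training code.
REWARD_WEIGHTS = {
    "valid_json": 3.0,
    "correct_action": 2.0,
    "element_exists": 1.5,
    "task_progress": 1.0,
    "length": 0.5,
}


def is_valid_json(text):
    """Return 1.0 if `text` parses as JSON, else 0.0 (hypothetical scorer)."""
    try:
        json.loads(text)
        return 1.0
    except (json.JSONDecodeError, TypeError):
        return 0.0


def combined_reward(scores):
    """Weighted sum of component scores; missing components count as 0."""
    return sum(w * scores.get(name, 0.0) for name, w in REWARD_WEIGHTS.items())


def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (reward - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Under these assumptions the maximum combined reward is 3 + 2 + 1.5 + 1 + 0.5 = 8.0, which is at least consistent with the reported final reward of 7.25.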
Files
- qwen3-4b-grpo-q4_0.gguf: Q4_0 GGUF for llama.cpp
- adapter_model.safetensors: LoRA adapter weights
- metrics.jsonl: Training metrics per iteration
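To inspect the per-iteration training metrics, a small reader like the following may be enough. It is a sketch that assumes metrics.jsonl follows the usual JSON Lines convention (one JSON object per line); the field names are not documented here, so the helper (`load_metrics`, a hypothetical name) makes no assumptions about them.

```python
import json
from pathlib import Path


def load_metrics(path):
    """Read a JSON Lines file into a list of records, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

Calling `load_metrics("metrics.jsonl")` returns one record per line; checking `records[0].keys()` shows which metrics were actually logged.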