badmadrad/Devstral-Small-2-24B-Instruct-2512-MLX-3bit • Text Generation • 24B • Updated 11 days ago • 110
abhiv26/Qwen2.5-7B-Instruct-ToolRL-PPO-Cold-Equal-Max • Reinforcement Learning • 8B • Updated 9 days ago • 8
Kumeichi/qwen3-4b-agent-trajectory-lora-SFT-SQL-ALFWorld_rev.0 • Text Generation • 4B • Updated 9 days ago
HamadaMayu/qwen3-4b-agent-trajectory-lora-marged-dbbench_v4 • Text Generation • 4B • Updated 9 days ago • 1
HamadaMayu/qwen3-4b-agent-trajectory-lora-marged-alfworld_v5 • Text Generation • 4B • Updated 9 days ago • 1
HamadaMayu/qwen3-4b-agent-trajectory-lora-marged-alfworld_v4 • Text Generation • 4B • Updated 8 days ago • 1