How to use HumorR1/policy-e2a-grpo-no-thinking with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Local path to the merged SFT base checkpoint; adjust to where the checkpoint lives on your machine.
base_model = AutoModelForCausalLM.from_pretrained("/home/ubuntu/code/humor-r1/checkpoints/qwen3vl-2b-sft-instruct-nothink-merged")

# Attach the GRPO adapter weights on top of the base model.
model = PeftModel.from_pretrained(base_model, "HumorR1/policy-e2a-grpo-no-thinking")
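
Once the adapter is attached, the wrapped model can be used like the base model for inference. A minimal sketch, assuming the merged base checkpoint ships a compatible tokenizer and that a text-only prompt is sufficient (the prompt below is purely illustrative):

from transformers import AutoTokenizer

# Assumes the tokenizer is stored alongside the merged base checkpoint.
tokenizer = AutoTokenizer.from_pretrained("/home/ubuntu/code/humor-r1/checkpoints/qwen3vl-2b-sft-instruct-nothink-merged")

# Hypothetical prompt; replace with your own input.
inputs = tokenizer("Tell me a joke about compilers.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

If a standalone checkpoint is needed, the adapter weights can be folded into the base model with model.merge_and_unload() and then saved with save_pretrained.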