Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
- Website
- Community
- Solutions
Log In
Sign Up

DTakai
/

grpo-lora

Model card Files Files and versions

Instructions to use DTakai/grpo-lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use DTakai/grpo-lora with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("unsloth/qwen3-4b-instruct-2507-unsloth-bnb-4bit")
model = PeftModel.from_pretrained(base_model, "DTakai/grpo-lora")

Notebooks
Google Colab
Kaggle

Ctrl+K

Ctrl+K

1 contributor

History: 2 commits

DTakai's picture

Upload GRPO adapter

beb980a verified 5 months ago

.gitattributes

1.57 kB
Upload GRPO adapter 5 months ago
README.md

5.12 kB
Upload GRPO adapter 5 months ago
adapter_config.json

887 Bytes
Upload GRPO adapter 5 months ago
adapter_model.safetensors

529 MB
xet

Upload GRPO adapter 5 months ago
added_tokens.json

707 Bytes
Upload GRPO adapter 5 months ago
chat_template.jinja

4.04 kB
Upload GRPO adapter 5 months ago
merges.txt

1.67 MB
Upload GRPO adapter 5 months ago
special_tokens_map.json

614 Bytes
Upload GRPO adapter 5 months ago
tokenizer.json

11.4 MB
xet

Upload GRPO adapter 5 months ago
tokenizer_config.json

5.43 kB
Upload GRPO adapter 5 months ago
training_args.bin

7.19 kB
xet

Upload GRPO adapter 5 months ago
vocab.json

2.78 MB
Upload GRPO adapter 5 months ago