How to use dbaysal/qwen2.5coder-3b-learned with libraries and local apps.

Libraries

PEFT

How to use dbaysal/qwen2.5coder-3b-learned with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-3B")
model = PeftModel.from_pretrained(base_model, "dbaysal/qwen2.5coder-3b-learned")
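
If the adapter is only needed for inference, it can be folded into the base weights so that the result behaves like a plain Transformers model. The following is a minimal sketch, not the author's procedure: the helper names `load_merged_model` and `complete` are hypothetical, and loading the tokenizer from the base model assumes the fine-tune left the vocabulary unchanged. It uses a raw-text prompt because the base model is a non-chat model.

```python
BASE_ID = "Qwen/Qwen2.5-Coder-3B"
ADAPTER_ID = "dbaysal/qwen2.5coder-3b-learned"


def load_merged_model(base_id: str = BASE_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, attach the LoRA adapter, and merge it into the weights."""
    # Imported lazily so the heavy dependencies are only needed when this runs.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    # merge_and_unload() adds the LoRA deltas into the base weights and
    # returns the underlying transformers model without the PEFT wrapper.
    merged = model.merge_and_unload()
    # Assumption: the tokenizer is the unmodified base tokenizer.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return merged, tokenizer


def complete(model, tokenizer, prompt: str, max_new_tokens: int = 40) -> str:
    """Generate a raw-text continuation and return only the new tokens."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example usage (downloads the model weights on first call):
#   model, tokenizer = load_merged_model()
#   print(complete(model, tokenizer, "def fibonacci(n):"))
```

Merging trades flexibility for simplicity: the merged model no longer needs PEFT at inference time, but the adapter can no longer be detached or swapped.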

Transformers

How to use dbaysal/qwen2.5coder-3b-learned with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="dbaysal/qwen2.5coder-3b-learned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("dbaysal/qwen2.5coder-3b-learned")
model = AutoModelForCausalLM.from_pretrained("dbaysal/qwen2.5coder-3b-learned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use dbaysal/qwen2.5coder-3b-learned with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "dbaysal/qwen2.5coder-3b-learned"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dbaysal/qwen2.5coder-3b-learned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
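
The same request can be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_chat_payload`, `extract_reply`, and `chat` are hypothetical, and the response parsing assumes the usual OpenAI-style shape (`choices[0].message.content`). Pass a different `base_url` for a server on another port.

```python
import json
import urllib.request

MODEL_ID = "dbaysal/qwen2.5coder-3b-learned"


def build_chat_payload(user_content: str, model: str = MODEL_ID) -> dict:
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }


def extract_reply(response: dict) -> str:
    """Pull the assistant text out of an OpenAI-style chat completion response."""
    return response["choices"][0]["message"]["content"]


def chat(user_content: str, base_url: str = "http://localhost:8000",
         timeout: float = 60.0) -> str:
    """POST a single-turn chat request to a running server and return the reply."""
    body = json.dumps(build_chat_payload(user_content)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_reply(json.load(resp))


# Example usage (requires the server to be running):
#   print(chat("What is the capital of France?"))
```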

Use Docker

docker model run hf.co/dbaysal/qwen2.5coder-3b-learned

SGLang

How to use dbaysal/qwen2.5coder-3b-learned with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "dbaysal/qwen2.5coder-3b-learned" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dbaysal/qwen2.5coder-3b-learned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "dbaysal/qwen2.5coder-3b-learned" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use dbaysal/qwen2.5coder-3b-learned with Docker Model Runner:
```
docker model run hf.co/dbaysal/qwen2.5coder-3b-learned
```

Axolotl config

axolotl version: 0.17.0

# Axolotl config - LEARNED model (base fine-tuned on the full benchmark corpus:
# forget targets + retained neighbors + controls). This is the "before unlearning" state.
#
# Option A: our JSONL stays as {"prompt": ..., "completion": ...}. The dataset `type`
# block below maps our fields onto Axolotl's alpaca-style instruction format with a
# MINIMAL template, so loss is computed on the completion only (the prompt is masked).
# No data rewrite needed.
#
# Run:  axolotl train benchmark/training/axolotl_learned.yaml

base_model: Qwen/Qwen2.5-Coder-3B     # swap for your base/code model; a NON-chat base
                                           # model is preferred (no chat template to confound
                                           # what gets memorized). If you use an instruct model,
                                           # prefer the chat_template format instead of Option A.
strict: false

# --- data: raw-text completion on the `content` field (no prompt template) ------------------
datasets:
  - path: dbaysal/all-contentx3
    type: completion
    field: content

dataset_prepared_path: ./out/prepared_full
val_set_size: 0.0                          # tiny corpus; don't carve out a val split
output_dir: ./out/learned

# --- sequence / packing ---------------------------------------------------------------------
sequence_len: 2048
sample_packing: false                      # IMPORTANT: keep one example per sequence so each
                                           # item is memorized cleanly (packing concatenates rows)
pad_to_sequence_len: true

# --- LoRA (matches the design doc's "short LoRA fine-tunes"; set adapter: to ''/full for full FT)
adapter: lora
lora_r: 64
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true

# --- optimization (TOFU reference: ~5 epochs, LR 1e-5 on a 7B model) ------------------------
num_epochs: 5                             # bump (or use sft_full_repeat5.jsonl) until the
                                           # memorization-yield gate clears its threshold
micro_batch_size: 8
gradient_accumulation_steps: 4
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 2.0e-4
warmup_ratio: 0.03
weight_decay: 0.0

bf16: auto
tf32: false

gradient_checkpointing: true
flash_attention: true
logging_steps: 1
seed: 42                                   # vary across >=3 seeds for the final runs
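
Two quantities implied by the config above can be worked out directly. The sketch below assumes the standard LoRA scaling of `lora_alpha / lora_r` (not the rank-stabilized variant) and a single training device; neither is stated in the config, and the function names are illustrative only.

```python
# Values copied from the config above.
MICRO_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
LORA_R = 64
LORA_ALPHA = 128


def effective_batch_size(micro: int, accum: int, num_devices: int = 1) -> int:
    """Sequences contributing to each optimizer step."""
    # Assumption: num_devices defaults to 1; the config does not state it.
    return micro * accum * num_devices


def lora_scaling(alpha: float, r: int) -> float:
    """Factor applied to the low-rank update B @ A under standard LoRA."""
    return alpha / r


batch = effective_batch_size(MICRO_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)
scale = lora_scaling(LORA_ALPHA, LORA_R)
print(f"effective batch size: {batch}")  # 8 * 4 * 1 = 32
print(f"LoRA scaling factor: {scale}")   # 128 / 64 = 2.0
```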

out/learned

This model is a fine-tuned version of Qwen/Qwen2.5-Coder-3B on the dbaysal/all-contentx3 dataset.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: OptimizerNames.ADAMW_TORCH with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 8
training_steps: 282
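
The values above determine the learning-rate curve. The sketch below assumes the common form of a cosine schedule with linear warmup (linear ramp to the peak rate, then a half-cosine decay to zero); the exact scheduler implementation used in training is not shown in this card, and `lr_at` is a hypothetical helper.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-4
WARMUP_STEPS = 8
TRAINING_STEPS = 282


def lr_at(step: int, base_lr: float = LEARNING_RATE,
          warmup: int = WARMUP_STEPS, total: int = TRAINING_STEPS) -> float:
    """Learning rate at a given optimizer step under warmup + cosine decay."""
    if step < warmup:
        # Linear ramp from 0 up to the peak rate.
        return base_lr * step / max(1, warmup)
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 4, 8, 145, 282):
    print(f"step {step:3d}: lr = {lr_at(step):.2e}")

# The reported warmup of 8 steps is consistent with truncating
# warmup_ratio * training_steps = 0.03 * 282 = 8.46.
print(int(0.03 * TRAINING_STEPS))
```

The total_train_batch_size of 32 likewise matches train_batch_size × gradient_accumulation_steps = 8 × 4, which suggests a single training device.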

Training results

Framework versions

PEFT 0.19.1
Transformers 5.9.0
Pytorch 2.11.0+cu128
Datasets 4.8.5
Tokenizers 0.22.2


Model tree for dbaysal/qwen2.5coder-3b-learned

Base model: Qwen/Qwen2.5-3B
Finetuned: Qwen/Qwen2.5-Coder-3B
Adapter (18): this model