sft-14b-v1

Persona-conditioned survey response model fine-tuned on SocSci210 via QLoRA. Given a demographic persona (age, gender, education, ideology, etc.) and a survey question, the model outputs a <think> chain-of-thought grounded in the persona's attributes, followed by a structured JSON response.

Output format

<think>
[3-5 sentences of persona-grounded reasoning]
</think>
{"choice": "A", "confidence": 0.82, "reasoning": "one sentence"}

choice is always A (Disagree/No), B (Mixed/Neutral), or C (Agree/Yes).

Training details

Setting	Value
Base model	`Qwen/Qwen2.5-14B-Instruct`
Training data	SocSci210 6k (fixed)
Method	QLoRA 4-bit, NF4, double quant
LoRA r / alpha	64 / 128
LoRA targets	q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
Effective batch	64 (8 × 8 grad accum)
Epochs	2
Learning rate	0.0002 (cosine schedule)
Max seq length	2048
Training time	1.21 hrs
Training cost	$1.69
Resumed from	basab1142/sft-14b-v1

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B-Instruct",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                                            bnb_4bit_compute_dtype=torch.bfloat16),
    device_map="auto",
)
model     = PeftModel.from_pretrained(base, "basab1142/sft-14b-v1")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-14B-Instruct")

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for basab1142/sft-14b-v1

Base model

Qwen/Qwen2.5-14B

Finetuned

Qwen/Qwen2.5-14B-Instruct

Adapter

(356)

this model