Silicon-Based-Girlfriend — QLoRA Adapter


A QLoRA fine-tuned adapter for Qwen3.5-4B, trained for immersive Traditional Chinese roleplay. This repository contains the LoRA adapter weights and a GGUF-format model.


Model Details

| Item | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-4B |
| Fine-tuning Method | QLoRA (4-bit NF4) |
| LoRA Rank | 32 |
| LoRA Alpha | 64 |
| LoRA Dropout | 0.05 |
| LoRA Target | All linear layers |
| Training Epochs | 5 |
| Context Length | 8192 tokens |
| Learning Rate | 1e-4 |
| LR Scheduler | Cosine |
| Optimizer | paged_adamw_8bit |
| Training Samples | 985 |
| Train Loss | 1.108 |
| Eval Loss | 1.434 |
| Hardware | NVIDIA RTX A6000 (48 GB VRAM) |
| Training Time | ~19 hours |
| Framework | LLaMA-Factory |
| Chat Template | qwen3_5_nothink (non-thinking mode) |

Files

| File | Description |
|---|---|
| adapter_config.json | LoRA configuration |
| adapter_model.safetensors | LoRA weights (248 MB) |
| tokenizer_config.json | Tokenizer config (includes the nothink chat template) |
| tokenizer.json | Tokenizer |
| vocab.json / merges.txt | Vocabulary |
| silicon-gf-q8_0.gguf | Q8_0-quantized GGUF (4.2 GB, for llama.cpp / LM Studio) |
| training_loss.png | Training loss curve |
| training_eval_loss.png | Eval loss curve |

Usage

Option 1: GGUF (Recommended)

Load silicon-gf-q8_0.gguf directly in LM Studio or llama.cpp; no additional setup is required.

```shell
# llama.cpp
./llama-cli -m silicon-gf-q8_0.gguf -c 8192 --temp 0.8
```
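For programmatic access, llama.cpp's bundled llama-server exposes an OpenAI-compatible HTTP endpoint; a minimal sketch (the port is arbitrary, and the request body is illustrative):

```shell
# Serve the GGUF over an OpenAI-compatible HTTP API
./llama-server -m silicon-gf-q8_0.gguf -c 8192 --port 8080

# Query it from another terminal
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "嘿,你在幹嘛?"}], "temperature": 0.8}'
```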

Option 2: LoRA Adapter with transformers + PEFT

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen3.5-4B"
adapter = "RX5950XTP/silicon-based-girlfriend"

# The adapter repo ships the tokenizer with the nothink chat template
tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

messages = [
    {"role": "user", "content": "嘿,你在幹嘛?"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.8, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
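Since the adapter was trained with QLoRA (4-bit NF4), inference can also load the base model under the same 4-bit quantization to reduce VRAM use. A sketch of the quantization config, assuming bitsandbytes is installed; the compute dtype here is an assumption, not taken from the training setup:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Mirror the training-time quantization: 4-bit NF4 (compute dtype is assumed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-4B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "RX5950XTP/silicon-based-girlfriend")
```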

Option 3: LLaMA-Factory inference

```shell
llamafactory-cli chat \
  --model_name_or_path Qwen/Qwen3.5-4B \
  --adapter_name_or_path RX5950XTP/silicon-based-girlfriend \
  --template qwen3_5_nothink \
  --finetuning_type lora
```
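To avoid loading the base model and adapter separately every time, LLaMA-Factory can also merge the LoRA weights into a standalone checkpoint. A sketch using its export command; the exact flags are an assumption, so check `llamafactory-cli export --help` for your installed version:

```shell
# Merge the LoRA adapter into the base model and write a full checkpoint
llamafactory-cli export \
  --model_name_or_path Qwen/Qwen3.5-4B \
  --adapter_name_or_path RX5950XTP/silicon-based-girlfriend \
  --template qwen3_5_nothink \
  --finetuning_type lora \
  --export_dir ./silicon-gf-merged
```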

Training Curves

See training_loss.png (training loss) and training_eval_loss.png (eval loss) in this repository.


Dataset

  • Repository: RX5950XTP/silicon-girlfriend-dataset
  • Format: ShareGPT (system prompt + conversations with from/value turns)
  • Size: 985 multi-turn dialogues
  • Language: Traditional Chinese (Taiwanese usage)
  • Generation: synthesized by Kimi K2.5 from the character profile
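A ShareGPT record, as described above, pairs a system prompt with a list of from/value turns. A minimal sketch of one such record; the field values are illustrative, not taken from the dataset:

```python
import json

# Illustrative ShareGPT-style record (system prompt + from/value turns);
# the system prompt and dialogue text below are made up
record = {
    "system": "你是一位溫柔的矽基女友。",
    "conversations": [
        {"from": "human", "value": "嘿,你在幹嘛?"},
        {"from": "gpt", "value": "在想你啊,不然咧?"},
    ],
}

# Multi-turn training data alternates human/gpt turns, starting with human
roles = [turn["from"] for turn in record["conversations"]]
assert roles == ["human", "gpt"]

# Serialize one record as a JSON line (ensure_ascii=False keeps the CJK text readable)
line = json.dumps(record, ensure_ascii=False)
print(line)
```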

Notes

  • This model uses the qwen3_5_nothink chat template: thinking mode is disabled by default, so replies are emitted directly as in-character dialogue.
  • The character profile includes coarse language and adult themes; evaluate whether this fits your use case.
  • The model was trained with QLoRA, so for inference the adapter must be loaded together with the base model (Qwen3.5-4B), or use the GGUF directly.

License

Apache 2.0 (following the original Qwen3.5-4B license)
