+---
+language:
+  - zh
+  - en
+license: apache-2.0
+base_model: Qwen/Qwen3.5-4B
+tags:
+  - lora
+  - qlora
+  - roleplay
+  - character-ai
+  - taiwanese-mandarin
+  - llama-factory
+  - gguf
+datasets:
+  - RX5950XTP/silicon-girlfriend-dataset
+---
+# Silicon-Based-Girlfriend — QLoRA Adapter
+---
+A QLoRA fine-tuned adapter based on **Qwen3.5-4B**, trained for immersive role-play in Traditional Chinese. This repository contains the LoRA adapter weights and a GGUF-format model.
+
+---
+## Model Details
+| Item               | Value                                                  |
+| ------------------ | ------------------------------------------------------ |
+| Base Model         | [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B) |
+| Fine-tuning Method | QLoRA (4-bit NF4)                                      |
+| LoRA Rank          | 32                                                     |
+| LoRA Alpha         | 64                                                     |
+| LoRA Dropout       | 0.05                                                   |
+| LoRA Target        | All linear layers                                      |
+| Training Epochs    | 5                                                      |
+| Context Length     | 8192 tokens                                            |
+| Learning Rate      | 1e-4                                                   |
+| LR Scheduler       | Cosine                                                 |
+| Optimizer          | paged_adamw_8bit                                       |
+| Training Samples   | 985                                                    |
+| Train Loss         | 1.108                                                  |
+| Eval Loss          | 1.434                                                  |
+| Hardware           | NVIDIA RTX A6000 (48GB VRAM)                           |
+| Training Time      | ~19 hours                                              |
+| Framework          | [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) |
+| Chat Template      | `qwen3_5_nothink` (non-thinking mode)                |
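
Two derived figures follow from the values in the table above: the effective LoRA scaling factor and the perplexities implied by the reported losses. This is a minimal sketch under two assumptions that this card does not state: that standard LoRA scaling (`alpha / rank`, not the rank-stabilized variant) was used, and that the reported losses are mean per-token cross-entropy in nats, so that `exp(loss)` is a perplexity. The variable names are illustrative only.

```python
import math

# Values copied from the Model Details table.
lora_rank = 32
lora_alpha = 64
train_loss = 1.108
eval_loss = 1.434

# Assumption: standard LoRA scaling, where the low-rank update is
# multiplied by alpha / rank.
lora_scaling = lora_alpha / lora_rank

# Assumption: losses are mean token-level cross-entropy in nats,
# so exp(loss) can be read as a perplexity.
train_ppl = math.exp(train_loss)
eval_ppl = math.exp(eval_loss)

# Difference between evaluation and training loss.
loss_gap = eval_loss - train_loss

print(f"LoRA scaling factor: {lora_scaling:.1f}")
print(f"Train perplexity:    {train_ppl:.2f}")
print(f"Eval perplexity:     {eval_ppl:.2f}")
print(f"Eval - train loss:   {loss_gap:.3f}")
```

A gap between evaluation and training loss may indicate some overfitting to the 985 training samples; the loss curves in this repository are the better guide when judging that.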
+---
+## Files
+| File                            | Description                                                  |
+| ------------------------------- | ------------------------------------------------------------ |
+| `adapter_config.json`           | LoRA configuration                                           |
+| `adapter_model.safetensors`     | LoRA weights (248 MB)                                        |
+| `tokenizer_config.json`         | Tokenizer configuration (includes the nothink chat template) |
+| `tokenizer.json`                | Tokenizer                                                    |
+| `vocab.json` / `merges.txt`     | Vocabulary                                                   |
+| `silicon-gf-q8_0.gguf`          | Q8_0-quantized GGUF (4.2 GB, for llama.cpp / LM Studio)      |
+| `training_loss.png`             | Training loss curve                                          |
+| `training_eval_loss.png`        | Evaluation loss curve                                        |
+---
+## Usage
+### Option 1: GGUF (Recommended)
+Load `silicon-gf-q8_0.gguf` directly in **LM Studio** or **llama.cpp**; no extra installation is needed.
+```bash
+# llama.cpp
+./llama-cli -m silicon-gf-q8_0.gguf -c 8192 --temp 0.8
+```
+### Option 2: LoRA Adapter with transformers + PEFT
+```python
+from peft import PeftModel
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+base_model = "Qwen/Qwen3.5-4B"
+adapter = "RX5950XTP/silicon-based-girlfriend"
+
+# Load the tokenizer from the adapter repo so the nothink chat template is used.
+tokenizer = AutoTokenizer.from_pretrained(adapter)
+# Load the base model, then attach the LoRA adapter weights on top of it.
+model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
+model = PeftModel.from_pretrained(model, adapter)
+
+messages = [
+    {"role": "user", "content": "嘿，你在幹嘛？"}  # "Hey, what are you up to?"
+]
+# Render the conversation as a prompt string that ends with the assistant turn marker.
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.8, do_sample=True)
+# Decode only the newly generated tokens, skipping the prompt.
+print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
+```
+### Option 3: LLaMA-Factory inference
+```bash
+llamafactory-cli chat \
+  --model_name_or_path Qwen/Qwen3.5-4B \
+  --adapter_name_or_path RX5950XTP/silicon-based-girlfriend \
+  --template qwen3_5_nothink \
+  --finetuning_type lora
+```
+---
+## Training Curves
+![Training Loss](training_loss.png)
+![Eval Loss](training_eval_loss.png)
+
+---
+## Dataset
+- **Repository**: [RX5950XTP/silicon-girlfriend-dataset](https://huggingface.co/datasets/RX5950XTP/silicon-girlfriend-dataset)
+- **Format**: ShareGPT (a `system` field plus `conversations` entries with `from`/`value` keys)
+- **Size**: 985 multi-turn conversations
+- **Language**: Traditional Chinese (Taiwanese usage)
+- **Generation method**: generated by Kimi K2.5 from the character profile
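
To make the record layout described above concrete, the sketch below converts one ShareGPT-style record into a list of role/content chat messages. The field names `system`, `conversations`, `from`, and `value` come from this card; the speaker tags `human` and `gpt` are the common ShareGPT convention and are an assumption here, as are the helper name `sharegpt_to_messages` and the example record, which is made up and not taken from the dataset.

```python
from typing import Dict, List

# Assumption: the common ShareGPT speaker tags; the dataset may use others.
ROLE_MAP = {"human": "user", "gpt": "assistant"}


def sharegpt_to_messages(record: Dict) -> List[Dict[str, str]]:
    """Convert one ShareGPT-style record into role/content chat messages."""
    messages = []
    # An optional system prompt becomes the first message.
    if record.get("system"):
        messages.append({"role": "system", "content": record["system"]})
    for turn in record["conversations"]:
        # Unknown speaker tags are passed through unchanged.
        role = ROLE_MAP.get(turn["from"], turn["from"])
        messages.append({"role": role, "content": turn["value"]})
    return messages


# Hypothetical record, made up for illustration (not taken from the dataset).
example = {
    "system": "You are a role-play character.",
    "conversations": [
        {"from": "human", "value": "嘿，你在幹嘛？"},
        {"from": "gpt", "value": "(example reply)"},
    ],
}

for message in sharegpt_to_messages(example):
    print(message["role"], "->", message["content"])
```

The resulting list has the same shape as the `messages` argument that `tokenizer.apply_chat_template` accepts, so converted records can be rendered with the chat template shipped in this repository.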
+---
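+## Notes
+- This model uses the `qwen3_5_nothink` chat template, so **thinking mode is disabled by default** and replies are emitted directly as in-character dialogue.
+- The character profile includes profanity and adult themes; evaluate your use case accordingly.
+- The model was trained with QLoRA. For inference, load the adapter together with the base model (Qwen3.5-4B), or use the GGUF directly.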
+---
+## License
+Apache 2.0 (following the original Qwen3.5-4B license)