RX5950XTP committed on
Commit c5232bd · verified · 1 Parent(s): fb02d91

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +138 -0
README.md ADDED
@@ -0,0 +1,138 @@
---
language:
- zh
- en
license: apache-2.0
base_model: Qwen/Qwen3.5-4B
tags:
- lora
- qlora
- roleplay
- character-ai
- taiwanese-mandarin
- llama-factory
- gguf
datasets:
- RX5950XTP/silicon-girlfriend-dataset
---
# Silicon-Based-Girlfriend: QLoRA Adapter

---

A QLoRA fine-tuned adapter based on **Qwen3.5-4B**, trained for immersive Traditional Chinese role-play. This repository contains the LoRA adapter weights and a GGUF-format model.

---

## Model Details

| Item               | Value                                                      |
| ------------------ | ---------------------------------------------------------- |
| Base Model         | [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B)  |
| Fine-tuning Method | QLoRA (4-bit NF4)                                          |
| LoRA Rank          | 32                                                         |
| LoRA Alpha         | 64                                                         |
| LoRA Dropout       | 0.05                                                       |
| LoRA Target        | All linear layers                                          |
| Training Epochs    | 5                                                          |
| Context Length     | 8192 tokens                                                |
| Learning Rate      | 1e-4                                                       |
| LR Scheduler       | Cosine                                                     |
| Optimizer          | paged_adamw_8bit                                           |
| Training Samples   | 985                                                        |
| Train Loss         | 1.108                                                      |
| Eval Loss          | 1.434                                                      |
| Hardware           | NVIDIA RTX A6000 (48GB VRAM)                               |
| Training Time      | ~19 hours                                                  |
| Framework          | [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)  |
| Chat Template      | `qwen3_5_nothink` (non-thinking mode)                      |

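To make the table's numbers easier to interpret, here is a minimal sketch of what the rank/alpha pair and the two loss values imply. It assumes standard LoRA scaling (`alpha / rank`, not the rank-stabilized variant) and that the reported losses are mean per-token cross-entropy in nats; neither assumption is stated in this model card. The class and function names, and the 1024 x 1024 layer shape, are hypothetical and used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    """Values copied from the Model Details table."""
    rank: int = 32
    alpha: int = 64
    dropout: float = 0.05

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / rank.
        return self.alpha / self.rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one linear layer of shape d_in x d_out."""
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    return rank * (d_in + d_out)


def perplexity(mean_loss: float) -> float:
    """Perplexity, assuming the loss is mean per-token cross-entropy in nats."""
    return math.exp(mean_loss)


hp = LoraHyperparams()
print(f"LoRA scaling (alpha / rank): {hp.scaling}")
# 1024 x 1024 is a hypothetical layer shape, not an actual Qwen3.5-4B dimension.
print(f"Extra params for a 1024x1024 layer: {lora_param_count(1024, 1024, hp.rank)}")
print(f"Train perplexity: {perplexity(1.108):.2f}")
print(f"Eval perplexity: {perplexity(1.434):.2f}")
```

Under these assumptions the adapter update is applied at twice its raw magnitude, and the gap between train and eval loss corresponds to a noticeably higher eval perplexity, which is worth keeping in mind with only 985 training samples over 5 epochs.
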
---

## Files

| File                            | Description                                                   |
| ------------------------------- | ------------------------------------------------------------- |
| `adapter_config.json`           | LoRA configuration                                            |
| `adapter_model.safetensors`     | LoRA weights (248 MB)                                         |
| `tokenizer_config.json`         | Tokenizer configuration (includes the nothink chat template)  |
| `tokenizer.json`                | Tokenizer                                                     |
| `vocab.json` / `merges.txt`     | Vocabulary                                                    |
| `silicon-gf-q8_0.gguf`          | Q8_0-quantized GGUF (4.2 GB, for llama.cpp / LM Studio)       |
| `training_loss.png`             | Training loss curve                                           |
| `training_eval_loss.png`        | Evaluation loss curve                                         |

---

## Usage

### Option 1: GGUF (Recommended)

Load `silicon-gf-q8_0.gguf` directly in **LM Studio** or **llama.cpp**; no additional installation is required.

```bash
# llama.cpp
./llama-cli -m silicon-gf-q8_0.gguf -c 8192 --temp 0.8
```

### Option 2: LoRA Adapter with transformers + PEFT

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen3.5-4B"
adapter = "RX5950XTP/silicon-based-girlfriend"

# Load the tokenizer from the adapter repo so the nothink chat template is used.
tokenizer = AutoTokenizer.from_pretrained(adapter)
# Load the base model, then attach the LoRA adapter on top of it.
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

messages = [
    {"role": "user", "content": "嘿,你在幹嘛?"}  # "Hey, what are you up to?"
]
# Render the conversation as a prompt that ends with the assistant turn marker.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.8, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

### Option 3: LLaMA-Factory inference

```bash
llamafactory-cli chat \
  --model_name_or_path Qwen/Qwen3.5-4B \
  --adapter_name_or_path RX5950XTP/silicon-based-girlfriend \
  --template qwen3_5_nothink \
  --finetuning_type lora
```

---

## Training Curves

![Training Loss](training_loss.png)
![Eval Loss](training_eval_loss.png)

---

## Dataset

- **Repository**: [RX5950XTP/silicon-girlfriend-dataset](https://huggingface.co/datasets/RX5950XTP/silicon-girlfriend-dataset)
- **Format**: ShareGPT (`system` + `conversations` with `from`/`value`)
- **Size**: 985 multi-turn conversations
- **Language**: Traditional Chinese (Taiwanese usage)
- **Generation**: generated by Kimi K2.5 from the character profile

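As an illustration of the ShareGPT layout described above, the following sketch converts one record into the `role`/`content` message list that chat templates expect. The `human`/`gpt` speaker tags are the common ShareGPT convention and are an assumption here, since this model card does not list the tags the dataset uses. The sample record and the function name `sharegpt_to_messages` are hypothetical.

```python
from typing import Dict, List

# Assumed ShareGPT speaker tags; the actual dataset may use different ones.
ROLE_MAP = {"human": "user", "gpt": "assistant"}


def sharegpt_to_messages(record: dict) -> List[Dict[str, str]]:
    """Convert one ShareGPT-style record into a chat-template message list."""
    messages = []
    system = record.get("system")
    if system:
        # The optional top-level system prompt becomes the first message.
        messages.append({"role": "system", "content": system})
    for turn in record["conversations"]:
        role = ROLE_MAP.get(turn["from"])
        if role is None:
            raise ValueError(f"unexpected speaker tag: {turn['from']!r}")
        messages.append({"role": role, "content": turn["value"]})
    return messages


# Hypothetical record for illustration; not taken from the dataset.
example = {
    "system": "You are a role-play character.",
    "conversations": [
        {"from": "human", "value": "Hey, what are you up to?"},
        {"from": "gpt", "value": "Nothing much. You?"},
    ],
}

messages = sharegpt_to_messages(example)
for m in messages:
    print(m["role"], "->", m["content"])
```

A list in this shape can be passed to `tokenizer.apply_chat_template`, as in the transformers + PEFT usage example.
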
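---

## Notes

- This model uses the `qwen3_5_nothink` chat template, so **thinking mode is disabled by default** and replies are output directly as in-character dialogue.
- The character profile includes profanity and adult themes; evaluate your use case accordingly.
- The model was trained with QLoRA, so inference requires loading the adapter together with the base model (Qwen3.5-4B), or using the GGUF directly.
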
---

## License

Apache 2.0 (following the original Qwen3.5-4B license)