<br>

# 4. Usage

<pre lang="python">
from transformers import AutoTokenizer, AutoModelForCausalLM

system_prompt = """
# Enter Roleplaying Mode
Now you are character `Hermione`.

## Role Info
Name: `Hermione`
Age: `teenager`
Gender: `female`
Personality: `Intelligent, curious, respectful, and eager to learn`
Description: `Hermione and Hagrid were in the Forbidden Forest, walking on a narrow path surrounded by trees. Hermione looked around carefully, fascinated by the dense forest. Hagrid was leading the way, pointing out various creatures and telling her about their habits and characteristics.`
Conversation rules:
- Your utterance need to describe your behavior and expressions using `()`.
Reference speaking style: ```I've read about it in my books
[end_of_dialogue]

I think it's so important to learn about these creatures
[end_of_dialogue]

```Knowledge: ```
## Current Scenario Dialogue
Interlocutor: `Hagrid, Hagrid is the Care of Magical Creatures teacher at Hogwarts. He is a half-giant with a great love for all creatures, magical or not.`
Your relationship: `Teacher and student`
Scene: `Hermione and Hagrid are in the Forbidden Forest, exploring and learning about the various magical creatures that live there.`
Tags: ['friendly', 'educational', 'fantasy', 'Harry Potter']
Please converse as `Hermione`.
"""

user_prompt = """
"Now, this here is a Bowtruckle, Hermione. They're very small, only about the size of a twig, and they're very shy. They usually live in trees and are very good at camouflaging themselves. You have to be very careful when handling them because they have very sharp fingers. Hermione, do you like them ?"
"""

model = AutoModelForCausalLM.from_pretrained("HeAAAAA/Crab")
tokenizer = AutoTokenizer.from_pretrained("HeAAAAA/Crab")

# Encode the input
inputs_prompt = system_prompt + user_prompt
inputs = tokenizer(inputs_prompt, return_tensors="pt")

# Generate the output
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    repetition_penalty=1.1,
    eos_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
</pre>
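The conversation rule in the system prompt asks the model to wrap behavior and expressions in `()`. If you want to render speech and stage directions separately, a minimal post-processing sketch (the helper name and regex are ours, not part of the released code) is:

```python
import re

def split_utterance(utterance: str):
    """Separate spoken text from parenthesized behavior/expression cues."""
    actions = re.findall(r"\(([^)]*)\)", utterance)   # text inside ( )
    speech = re.sub(r"\([^)]*\)", "", utterance)      # drop the cues
    speech = re.sub(r"\s+", " ", speech).strip()      # tidy leftover spacing
    return speech, actions

speech, actions = split_utterance(
    "(smiles) Oh, I've read about Bowtruckles in my books! (leans closer)"
)
```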

# 5. Three Datasets
We publish three datasets in total:

1. [Crab role-playing train set](https://huggingface.co/datasets/HeAAAAA/Crab-role-playing-train-set): the dataset used for fine-tuning a role-playing LLM.
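Each training example embeds the role attributes (name, age, gender, personality, description) in a system prompt like the one shown in Usage. A hypothetical helper (ours, not part of the dataset tooling) that assembles that header from such fields:

```python
def build_system_prompt(name, age, gender, personality, description):
    """Assemble the role-info header used in the Usage example."""
    return (
        "# Enter Roleplaying Mode\n"
        f"Now you are character `{name}`.\n"
        "\n"
        "## Role Info\n"
        f"Name: `{name}`\n"
        f"Age: `{age}`\n"
        f"Gender: `{gender}`\n"
        f"Personality: `{personality}`\n"
        f"Description: `{description}`\n"
    )

header = build_system_prompt(
    "Hermione", "teenager", "female",
    "Intelligent, curious, respectful, and eager to learn",
    "Hermione and Hagrid are exploring the Forbidden Forest.",
)
```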

<br>

# 6. Fine-tuned Role-playing Model
We release a fine-tuned role-playing LLM for configurable role-playing tasks:

[Download Link](https://huggingface.co/HeAAAAA/Crab)

<br>

# 7. Role-playing Evaluation Model
We release a trained LLM to automate the evaluation of role-playing tasks:

[Download Link](https://huggingface.co/HeAAAAA/RoleRM)
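RoleRM's exact output format is not documented in this snippet; purely as an illustration, a generic helper (hypothetical names, not RoleRM's API) that pulls the first in-range integer rating out of a judge model's free-form verdict:

```python
import re

def extract_score(judgment: str, lo: int = 1, hi: int = 10):
    """Return the first integer in [lo, hi] found in the judgment, else None."""
    for m in re.finditer(r"\d+", judgment):
        score = int(m.group())
        if lo <= score <= hi:
            return score
    return None

score = extract_score("Persona consistency: 8/10. The reply stays in character.")
```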

# 8. Citation

```bibtex
@misc{kimiteam2025kimivltechnicalreport,