Update README.md
README.md
CHANGED
@@ -117,27 +117,35 @@ Figure 2: Human evaluation comparing Crab, GPT-3.5, and Pygmalion-2-7B. We selec

Table 2: The ablation study for Crab. Due to missing attributes in our dataset, we sampled 1,000 fully attributed instances as the sub-test set to conduct the ablation experiments, referred to as Crab (sampled). The notation “w/o base” means without base role information for training RP-LLMs, including age, gender, personality, description, and expression; “w/o ref.” means without catchphrases and knowledge; “w/o scene” means without interlocutor, relation, scenario, and tags.

# 4. Three Datasets
We publish three datasets: the Crab role-playing train set, the Crab role-playing evaluation benchmark, and a manually annotated role-playing evaluation dataset (which can be used to train a role-playing evaluation model).

-Crab role-playing train set:
{https://huggingface.co/datasets/HeAAAAA/Crab-role-playing-train-set}

-Crab role-playing evaluation benchmark:
{https://huggingface.co/datasets/HeAAAAA/Crab-role-playing-evaluation-benchmark}

-Crab manually annotated role-playing evaluation dataset:
{https://huggingface.co/datasets/HeAAAAA/Crab-manually-annotated-role-playing-evaluation-dataset}


# 5. Fine-tuned Role-playing Model
We release a fine-tuned model for configurable role-playing tasks.
{https://huggingface.co/HeAAAAA/Crab}

# 6. Role-playing Evaluation Model
We release a trained model to automate the evaluation of role-playing tasks.
{https://huggingface.co/HeAAAAA/RoleRM}

# 7. Citation

```bibtex

Table 2: The ablation study for Crab. Due to missing attributes in our dataset, we sampled 1,000 fully attributed instances as the sub-test set to conduct the ablation experiments, referred to as Crab (sampled). The notation “w/o base” means without base role information for training RP-LLMs, including age, gender, personality, description, and expression; “w/o ref.” means without catchphrases and knowledge; “w/o scene” means without interlocutor, relation, scenario, and tags.

+<br>
+
# 4. Three Datasets
We publish three datasets: the Crab role-playing train set, the Crab role-playing evaluation benchmark, and a manually annotated role-playing evaluation dataset (which can be used to train a role-playing evaluation model).

+## 4.1 Crab role-playing train set:
{https://huggingface.co/datasets/HeAAAAA/Crab-role-playing-train-set}

+## 4.2 Crab role-playing evaluation benchmark:
{https://huggingface.co/datasets/HeAAAAA/Crab-role-playing-evaluation-benchmark}

+## 4.3 Crab manually annotated role-playing evaluation dataset:
{https://huggingface.co/datasets/HeAAAAA/Crab-manually-annotated-role-playing-evaluation-dataset}


+<br>
+
# 5. Fine-tuned Role-playing Model
We release a fine-tuned model for configurable role-playing tasks.
{https://huggingface.co/HeAAAAA/Crab}

+<br>
+
# 6. Role-playing Evaluation Model
We release a trained model to automate the evaluation of role-playing tasks.
{https://huggingface.co/HeAAAAA/RoleRM}

+<br>
+
# 7. Citation

```bibtex