add model card
README.md
@@ -102,11 +102,33 @@ model-index:
-[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
@@ -186,67 +208,83 @@ fsdp:
-### Training hyperparameters
-- learning_rate: 0.0002
-- train_batch_size: 2
-- eval_batch_size: 8
-- seed: 42
-- distributed_type: multi-GPU
-- num_devices: 2
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 32
-- total_eval_batch_size: 16
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 10
-- training_steps: 341
-### Framework versions
-- Transformers 4.45.1
-- Pytorch 2.3.1+cu121
-- Datasets 2.21.0
-- Tokenizers 0.20.0
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_HumanLLMs__Humanish-Qwen2.5-7B-Instruct)
-|Metric             |Value|
-|-------------------|----:|
-|Avg.               |26.67|
-|IFEval (0-Shot)    |72.84|
-|BBH (3-Shot)       |34.48|
-|MATH Lvl 5 (4-Shot)| 0.00|
-|GPQA (0-shot)      | 6.49|
-|MuSR (0-shot)      | 8.42|
-|MMLU-PRO (5-shot)  |37.76|
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-Qwen2.5-7B-Instruct
name: Open LLM Leaderboard
---

<div align="center">

<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/63da3d7ae697e5898cb86854/H-vpXOX6KZu01HnV87Jk5.jpeg" width="320" height="320" />

<h1>Enhancing Human-Like Responses in Large Language Models</h1>

</div>

<p align="center">
  | 🤗 <a href="https://huggingface.co/collections/HumanLLMs/human-like-humanish-llms-6759fa68f22e11eb1a10967e">Models</a> |
  📊 <a href="https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset">Dataset</a> |
  📄 <a href="https://arxiv.org/abs/2501.05032">Paper</a> |
</p>

# 🚀 Human-Like-Qwen2.5-7B-Instruct

This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct), specifically optimized to generate more human-like and conversational responses.

The fine-tuning process employed both [Low-Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) and [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.

The process of creating this model is detailed in the research paper [“Enhancing Human-Like Responses in Large Language Models”](https://arxiv.org/abs/2501.05032).
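For intuition on what DPO optimizes, here is a minimal illustrative sketch of its loss on a single preference pair, in plain Python with made-up log-probabilities. This is not the actual training code (the run used Axolotl); it only shows the objective from the DPO paper linked above.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a response under the
    policy being trained or the frozen reference model.
    """
    # How much more the policy prefers each response than the reference does.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # -log sigmoid(beta * margin): small when the policy widens the margin
    # between the preferred (human-like) and rejected (formal) response.
    margin = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy numbers: the policy already prefers the chosen answer slightly.
loss = dpo_loss(-12.0, -15.0, -13.0, -14.5, beta=0.1)
print(round(loss, 4))  # ≈ 0.62
```

Driving this loss down pushes the model toward the human-like response relative to the reference model, without a separate reward model.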
# 🛠️ Training Configuration

- **Base Model:** Qwen2.5-7B-Instruct
- **Framework:** Axolotl v0.4.1
- **Hardware:** 2x NVIDIA A100 (80 GB) GPUs
- **Training Time:** ~2 hours 15 minutes
- **Dataset:** Synthetic dataset with ≈11,000 samples across 256 diverse topics
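This run trained with a micro-batch size of 2 and 8 gradient-accumulation steps per GPU; a quick sanity check of the effective global batch size those settings imply on the two GPUs listed above:

```python
# Effective global batch size = micro-batch size × grad-accum steps × num GPUs.
micro_batch_size = 2            # per-device batch from this run's hyperparameters
gradient_accumulation_steps = 8
num_gpus = 2                    # 2x NVIDIA A100, as listed above

effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 32
```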
<details><summary>See axolotl config</summary>

axolotl version: `0.4.1`

```yaml
fsdp_config:
save_safetensors: true
```

</details><br>
# 💬 Prompt Template

You can use the ChatML prompt template when prompting the model:

### ChatML

```
<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
```

This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the `tokenizer.apply_chat_template()` method:
```python
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"}
]
gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
model.generate(gen_input)
```
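To see the string such a call produces without downloading the tokenizer, the ChatML layout above can be reproduced by hand. `format_chatml` below is a hypothetical helper, not part of transformers, and it is a simplified sketch: the real chat template may also inject a default system prompt.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render messages in the ChatML layout shown above (simplified sketch)."""
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Cue the model to answer as the assistant.
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
]
print(format_chatml(messages))
```

Passing the same `messages` to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should yield the same layout.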
# 🤖 Models

| Model | Download |
|:---------------------:|:-----------------------------------------------------------------------:|
| Human-Like-Llama-3-8B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-LLama3-8B-Instruct) |
| Human-Like-Qwen-2.5-7B-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Qwen2.5-7B-Instruct) |
| Human-Like-Mistral-Nemo-Instruct | 🤗 [HuggingFace](https://huggingface.co/HumanLLMs/Human-Like-Mistral-Nemo-Instruct-2407) |
# 🎯 Benchmark Results

| **Group** | **Model** | **Average** | **IFEval** | **BBH** | **MATH Lvl 5** | **GPQA** | **MuSR** | **MMLU-PRO** |
|--------------------------------|--------------------------------|-------------|------------|---------|----------------|----------|----------|--------------|
| **Llama Models** | Human-Like-Llama-3-8B-Instruct | 22.37 | **64.97** | 28.01 | 8.45 | 0.78 | **2.00** | 30.01 |
| | Llama-3-8B-Instruct | 23.57 | 74.08 | 28.24 | 8.68 | 1.23 | 1.60 | 29.60 |
| | *Difference (Human-Like)* | -1.20 | **-9.11** | -0.23 | -0.23 | -0.45 | +0.40 | +0.41 |
| **Qwen Models** | Human-Like-Qwen-2.5-7B-Instruct | 26.66 | 72.84 | 34.48 | 0.00 | 6.49 | 8.42 | 37.76 |
| | Qwen-2.5-7B-Instruct | 26.86 | 75.85 | 34.89 | 0.00 | 5.48 | 8.45 | 36.52 |
| | *Difference (Human-Like)* | -0.20 | -3.01 | -0.41 | 0.00 | **+1.01** | -0.03 | **+1.24** |
| **Mistral Models** | Human-Like-Mistral-Nemo-Instruct | 22.88 | **54.51** | 32.70 | 7.62 | 5.03 | 9.39 | 28.00 |
| | Mistral-Nemo-Instruct | 23.53 | 63.80 | 29.68 | 5.89 | 5.37 | 8.48 | 27.97 |
| | *Difference (Human-Like)* | -0.65 | **-9.29** | **+3.02** | **+1.73** | -0.34 | +0.91 | +0.03 |
# 📊 Dataset

The dataset used for fine-tuning was generated with LLaMA 3 models. It includes 10,884 samples across 256 distinct topics such as technology, daily life, science, history, and arts. Each sample consists of:

- **Human-like responses:** Natural, conversational answers mimicking human dialogue.
- **Formal responses:** Structured and precise answers with a more formal tone.

The dataset has been open-sourced and is available at:

- 👉 [Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset)

More details on the dataset creation process can be found in the accompanying research paper.
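Because each sample pairs a human-like answer with a formal one, the dataset maps naturally onto DPO-style preference pairs. The record below is an invented illustration: the field names (`prompt`/`chosen`/`rejected`) follow the common DPO convention and are an assumption here, not a quote from the dataset.

```python
# Hypothetical sketch of one DPO preference record from such a dataset.
sample = {
    "prompt": "What's your favourite season?",
    "chosen": "Oh, autumn, hands down! The colours, the crisp air...",  # human-like
    "rejected": "As an AI, I do not possess personal preferences.",     # formal
}

def is_dpo_record(record):
    """Check that a record has the three fields a DPO-style trainer expects."""
    return {"prompt", "chosen", "rejected"} <= set(record)

print(is_dpo_record(sample))  # True
```

With the `datasets` library, `load_dataset("HumanLLMs/Human-Like-DPO-Dataset")` lets you inspect the actual column names.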
# 📝 Citation

```
@misc{çalık2025enhancinghumanlikeresponseslarge,
      title={Enhancing Human-Like Responses in Large Language Models},
      author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
      year={2025},
      eprint={2501.05032},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.05032},
}
```