Update README.md

README.md (CHANGED)
@@ -6,6 +6,7 @@ tags:
 - C-RLFT
 datasets:
 - openchat/openchat_sharegpt4_dataset
+- kaist-ai/Feedback-Collection
 - imone/OpenOrca_FLAN
 - LDJnr/LessWrong-Amplify-Instruct
 - LDJnr/Pure-Dove
@@ -19,8 +20,6 @@ library_name: transformers
 pipeline_tag: text-generation
 ---
 
-# OpenChat (1210 Version): Advancing Open-source Language Models with Mixed-Quality Data
-
 <div align="center">
   <img src="https://raw.githubusercontent.com/imoneoi/openchat/master/assets/logo_new.png" style="width: 65%">
 </div>
@@ -34,18 +33,27 @@ pipeline_tag: text-generation
   <a href="https://arxiv.org/pdf/2309.11235.pdf">Paper</a>
 </p>
 
-
+# OpenChat 3.5: First Update Released on December 10th!
+
+**🚀 15-point improvement in coding performance**
+
+**💡 Introducing a coding & generalist mode and a mathematical reasoning mode**
 
-
-|-----------------------------|------------|
-| GPT-3.5 (December 2023) | 64.6 |
-| **OpenChat 3.5 1210** | **63.4** |
-| GPT-3.5 (March 2023) | 64.6 |
-| OpenHermes 2.5 | 41.5 |
+**🧑‍⚖️ Experimental support for evaluator and feedback capabilities**
 
-
-
-
+**🤖 Outperforms Grok-1 in 3/4 and ChatGPT (March) in 5/8 benchmarks**
+
+| Model | Size | HumanEval+ pass@1 |
+|-----------------------------|----------|------------|
+| ChatGPT (December 12, 2023) | - | 64.6 |
+| WizardCoder-Python-34B-V1.0 | 34B | 64.6 |
+| **OpenChat 3.5 (Dec 10)** | **7B** | **63.4** |
+| OpenHermes 2.5 | 7B | 41.5 |
+
+<div style="display: flex; justify-content: center; align-items: center">
+  <img src="https://raw.githubusercontent.com/imoneoi/openchat/master/assets/openchat.png" style="width: 45%;">
+  <img src="https://raw.githubusercontent.com/imoneoi/openchat/master/assets/openchat_grok.png" style="width: 45%;">
+</div>
 
 OpenChat is an innovative library of open-source language models, fine-tuned with [C-RLFT](https://arxiv.org/pdf/2309.11235.pdf) - a strategy inspired by offline reinforcement learning. Our models learn from mixed-quality data without preference labels, delivering exceptional performance on par with ChatGPT, even with a 7B model. Despite our simple approach, we are committed to developing a high-performance, commercially viable, open-source large language model, and we continue to make significant strides toward this vision.
 
@@ -59,10 +67,14 @@ Once started, the server listens at `localhost:18888` for requests and is compat
 
 If you want to deploy the server as an online service, you can use `--api-keys sk-KEY1 sk-KEY2 ...` to specify allowed API keys and `--disable-log-requests --disable-log-stats --log-file openchat.log` for logging only to a file. For security purposes, we recommend using an [HTTPS gateway](https://fastapi.tiangolo.com/es/deployment/concepts/#security-https) in front of the server.
 
+| Model | Size | Context | Weights | Serving |
+|-------------------|------|---------|------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------|
+| OpenChat 3.5 1210 | 7B | 8192 | [Huggingface](https://huggingface.co/openchat/openchat_3.5_1210) | `python -m ochat.serving.openai_api_server --model openchat/openchat_3.5_1210 --engine-use-ray --worker-use-ray` |
+
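The OpenAI-compatible endpoint can also be called from Python. Below is a minimal stdlib-only sketch by way of illustration, not part of the ochat library: the request shape mirrors the curl examples in this README, while the served model name (`openchat_3.5_1210`) and the helper names are assumptions to adjust for your deployment.

```python
# Illustrative client for the OpenAI-compatible server (stdlib only).
# Assumptions: the server runs locally on port 18888 and the served
# model is named "openchat_3.5_1210"; adjust both as needed.
import json
from urllib import request


def build_chat_request(content, model="openchat_3.5_1210",
                       base_url="http://localhost:18888"):
    """Return (url, payload) for a /v1/chat/completions call."""
    url = f"{base_url}/v1/chat/completions"
    payload = {"model": model,
               "messages": [{"role": "user", "content": content}]}
    return url, payload


def chat(content, **kwargs):
    """POST the request and return the parsed JSON response."""
    url, payload = build_chat_request(content, **kwargs)
    req = request.Request(url, data=json.dumps(payload).encode("utf-8"),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:  # requires a running server
        return json.loads(resp.read())
```

`build_chat_request` is split out so the payload can be inspected before sending; `chat("Hello")` then behaves like the curl requests shown under the expandable example.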
 <details>
 <summary>Example request (click to expand)</summary>
 
-Default Mode (
+💡 **Default Mode (GPT4 Correct)**: Best for coding, chat and general tasks
 
 ```bash
 curl http://localhost:18888/v1/chat/completions \
@@ -73,7 +85,7 @@ curl http://localhost:18888/v1/chat/completions \
 }'
 ```
 
-Mathematical Reasoning Mode
+🧮 **Mathematical Reasoning Mode**: Tailored for solving math problems
 
 ```bash
 curl http://localhost:18888/v1/chat/completions \
@@ -87,24 +99,22 @@ curl http://localhost:18888/v1/chat/completions \
 
 </details>
 
-| Model | Size | Context | Weights | Serving |
-|-------------------|------|---------|------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------|
-| OpenChat 3.5 1210 | 7B | 8192 | [Huggingface](https://huggingface.co/openchat/openchat_3.5_1210) | `python -m ochat.serving.openai_api_server --model openchat/openchat_3.5_1210 --engine-use-ray --worker-use-ray` |
-
 ### Conversation templates
 
-Default Mode (GPT4 Correct)
+💡 **Default Mode (GPT4 Correct)**: Best for coding, chat and general tasks
 
 ```
 GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:
 ```
 
-Mathematical Reasoning Mode
+🧮 **Mathematical Reasoning Mode**: Tailored for solving math problems
 
 ```
 Math Correct User: 10.3 − 7988.8133=<|end_of_turn|>Math Correct Assistant:
 ```
 
+⚠️ **Notice:** Remember to set `<|end_of_turn|>` as end of generation token.
+
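For manual prompting, the two templates above can be rendered with a small helper. This is an editor's sketch rather than part of the ochat library; the role prefixes and the `<|end_of_turn|>` token are taken verbatim from the examples above.

```python
# Render (role, content) turns into the OpenChat prompt formats shown above.
# Sketch only; the integrated tokenizer.chat_template is the supported route.
END_OF_TURN = "<|end_of_turn|>"

PREFIXES = {
    "default": {"user": "GPT4 Correct User", "assistant": "GPT4 Correct Assistant"},
    "math": {"user": "Math Correct User", "assistant": "Math Correct Assistant"},
}


def render_prompt(messages, mode="default"):
    """Join turns with <|end_of_turn|> and end with the assistant prefix,
    so the model generates the next assistant reply."""
    roles = PREFIXES[mode]
    parts = [f"{roles[role]}: {text}{END_OF_TURN}" for role, text in messages]
    parts.append(f"{roles['assistant']}:")
    return "".join(parts)
```

`render_prompt([("user", "Hello"), ("assistant", "Hi"), ("user", "How are you today?")])` reproduces the default-mode string above; remember to stop generation at `<|end_of_turn|>`.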
 The default (GPT4 Correct) template is also available as the integrated `tokenizer.chat_template`,
 which can be used instead of manually specifying the template:
 
@@ -118,6 +128,38 @@ tokens = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
 assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747, 15359, 32000, 420, 6316, 28781, 3198, 3123, 1247, 28747, 1602, 460, 368, 3154, 28804, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]
 ```
 
+## 🧑‍⚖️ (Experimental) Evaluator / Feedback Capabilities
+
+We've included evaluator capabilities in this release to advance open-source models as evaluators. You can use `Default Mode (GPT4 Correct)` with the following prompt (same as [Prometheus](https://huggingface.co/datasets/kaist-ai/Feedback-Collection)) to evaluate a response.
+
+```
+###Task Description:
+An instruction (might include an Input inside it), a response to evaluate, a reference answer that gets a score of 5, and a score rubric representing a evaluation criteria are given.
+1. Write a detailed feedback that assess the quality of the response strictly based on the given score rubric, not evaluating in general.
+2. After writing a feedback, write a score that is an integer between 1 and 5. You should refer to the score rubric.
+3. The output format should look as follows: "Feedback: (write a feedback for criteria) [RESULT] (an integer number between 1 and 5)"
+4. Please do not generate any other opening, closing, and explanations.
+
+###The instruction to evaluate:
+{orig_instruction}
+
+###Response to evaluate:
+{orig_response}
+
+###Reference Answer (Score 5):
+{orig_reference_answer}
+
+###Score Rubrics:
+[{orig_criteria}]
+Score 1: {orig_score1_description}
+Score 2: {orig_score2_description}
+Score 3: {orig_score3_description}
+Score 4: {orig_score4_description}
+Score 5: {orig_score5_description}
+
+###Feedback:
+```
+
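The `{orig_*}` placeholders in the evaluator prompt above are plain `str.format` fields, so filling the template and reading back the `[RESULT]` score is straightforward. The sketch below uses an abbreviated stand-in template and made-up rubric values; in practice, use the full prompt text above.

```python
import re

# Abbreviated stand-in for the full evaluator prompt above; the real
# template carries the same {orig_*} placeholders.
TEMPLATE = (
    "###The instruction to evaluate:\n{orig_instruction}\n\n"
    "###Response to evaluate:\n{orig_response}\n\n"
    "###Reference Answer (Score 5):\n{orig_reference_answer}\n\n"
    "###Score Rubrics:\n[{orig_criteria}]\n\n"
    "###Feedback:"
)


def fill_evaluator_prompt(template, **fields):
    """Substitute the {orig_*} placeholders with the evaluation inputs."""
    return template.format(**fields)


def parse_score(model_output):
    """Extract the 1-5 integer after [RESULT] from the model's feedback."""
    match = re.search(r"\[RESULT\]\s*([1-5])", model_output)
    if match is None:
        raise ValueError("no [RESULT] score found")
    return int(match.group(1))
```

For example, `parse_score("Feedback: clear and accurate. [RESULT] 5")` returns `5`.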
 ## Comparison with [X.AI Grok models](https://x.ai/)
 
 | | License | # Param | Average | MMLU | HumanEval | MATH | GSM8k |
@@ -127,6 +169,8 @@ assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 42
 | Grok-0 | Proprietary | 33B | 44.5 | 65.7 | 39.7 | 15.7 | 56.8 |
 | Grok-1 | Proprietary | ???B | 55.8 | 73 | 63.2 | 23.9 | 62.9 |
 
+*: Grok results are reported by [X.AI](https://x.ai/).
+
 ## <a id="benchmarks"></a> Benchmarks
 
 | Model | # Params | Average | MT-Bench | HumanEval | BBH MC | AGIEval | TruthfulQA | MMLU | GSM8K | BBH CoT |
@@ -175,6 +219,7 @@ OpenChat 3.5 was trained with C-RLFT on a collection of publicly available high-
 
 - [OpenChat ShareGPT](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset)
 - [Open-Orca with FLAN answers](https://huggingface.co/datasets/imone/OpenOrca_FLAN)
+- [Feedback-Collection](https://huggingface.co/datasets/kaist-ai/Feedback-Collection)
 - Capybara [1](https://huggingface.co/datasets/LDJnr/Pure-Dove) [2](https://huggingface.co/datasets/LDJnr/Verified-Camel) [3](https://huggingface.co/datasets/LDJnr/LessWrong-Amplify-Instruct)
 - [GOAT](https://huggingface.co/datasets/tiedong/goat)
 - [Glaive](https://huggingface.co/datasets/glaiveai/glaive-code-assistant)