Create README.md
README.md
ADDED
@@ -0,0 +1,49 @@
<p style="font-size:20px;" align="center">
<div style="width: 100%; height: 50px; overflow: hidden; border-radius: 15px; margin: auto; position: relative;">
<img
src="https://shuttleai.com/shuttle.png"
alt="ShuttleAI Thumbnail"
style="width: 100%; height: auto; display: block; margin: auto; position: absolute; top: 50%; left: 50%; transform: translate(-50%, -50%); object-fit: cover;">
</div>

<p align="center">
💻 <a href="https://shuttleai.com/" target="_blank">Use via API</a>
</p>

## Shuttle-3 (beta) [2024/10/25]

We are excited to introduce Shuttle-3-mini, our next-generation state-of-the-art language model designed to excel in complex chat, multilingual communication, reasoning, and agent tasks.

- **Shuttle-3-mini** is a fine-tuned version of [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct), emulating the writing style of Claude 3 models and trained extensively on role-playing data.

## Model Details

* **Model Name**: Shuttle-3-mini
* **Developed by**: ShuttleAI Inc.
* **Base Model**: [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
* **Parameters**: 72B
* **Language(s)**: Multilingual
* **Repository**: [https://huggingface.co/shuttleai](https://huggingface.co/shuttleai)
* **Fine-Tuned Model**: [https://huggingface.co/shuttleai/shuttle-3](https://huggingface.co/shuttleai/shuttle-3)

### Key Features

- Pretrained on a large proportion of multilingual and code data
- Fine-tuned to emulate the prose quality of Claude 3 models and trained extensively on role-play data

## Fine-Tuning Details

- **Training Setup**: Trained on 130 million tokens for 12 hours using 4 A100 PCIe GPUs.
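
The training figures above imply a rough throughput. The sketch below works it out, assuming the 12 hours are wall-clock time, all 4 GPUs were busy throughout, and the 130 million tokens were processed once; none of these assumptions is confirmed here, so treat the result as a back-of-the-envelope estimate.

```python
# Figures stated in the Training Setup line.
total_tokens = 130_000_000
wall_clock_hours = 12
num_gpus = 4

# Assumption: 12 hours is wall-clock time with every GPU in use the whole run,
# and each token was seen exactly once.
seconds = wall_clock_hours * 3600
tokens_per_second = total_tokens / seconds
tokens_per_second_per_gpu = tokens_per_second / num_gpus
gpu_hours = wall_clock_hours * num_gpus

print(f"{tokens_per_second:,.0f} tokens/s overall")          # 3,009 tokens/s overall
print(f"{tokens_per_second_per_gpu:,.0f} tokens/s per GPU")  # 752 tokens/s per GPU
print(f"{gpu_hours} GPU-hours")                              # 48 GPU-hours
```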

## Prompting

Shuttle-3 uses ChatML as its prompting format:

```
<|im_start|>system
You are a pirate! Yardy harr harr!<|im_end|>
<|im_start|>user
Where are you currently?<|im_end|>
<|im_start|>assistant
Look ahoy ye scallywag! We're on the high seas!<|im_end|>
```
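
A minimal sketch of assembling such a prompt by hand, assuming each turn is rendered as `<|im_start|>` plus the role, a newline, the content, and `<|im_end|>`, as in the example above. The helper name `format_chatml` and its `add_generation_prompt` flag are hypothetical and not part of any ShuttleAI API.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # One turn: start token + role, newline, content, end token, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a pirate! Yardy harr harr!"},
    {"role": "user", "content": "Where are you currently?"},
]
prompt = format_chatml(messages)
print(prompt)
```

In practice, a tokenizer's built-in chat template, where one is provided, is usually preferable to hand-built strings, since it stays in sync with the model's special tokens.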