---
license: other
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-72B
tags:
- chat
library_name: transformers
---

<p style="font-size:20px;" align="left">
<div style="width: 80px; height: 80px; border-radius: 15px;">
<img
src="https://shuttleai.com/shuttle.png"
alt="ShuttleAI Thumbnail"
style="width: auto; height: auto; margin-left: 0; object-fit: cover; border-radius: 15px;">
</div>

<p align="left">
💻 <a href="https://shuttleai.com/" target="_blank">Use via API</a>
</p>

## Shuttle-3 (beta) [2024/10/25]

We are excited to introduce Shuttle-3, our next-generation, state-of-the-art language model, designed to excel at complex chat, multilingual communication, reasoning, and agent tasks.

- **Shuttle-3** is a fine-tuned version of [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct), trained to emulate the writing style of Claude 3 models and fine-tuned extensively on role-play data.

## Model Details

* **Model Name**: Shuttle-3
* **Developed by**: ShuttleAI Inc.
* **Base Model**: [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
* **Parameters**: 72B
* **Language(s)**: Multilingual
* **Repository**: [https://huggingface.co/shuttleai](https://huggingface.co/shuttleai)
* **Fine-Tuned Model**: [https://huggingface.co/shuttleai/shuttle-3](https://huggingface.co/shuttleai/shuttle-3)
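
For local use, the checkpoint can presumably be loaded like any other Qwen2.5-based model through the `transformers` library named in the card metadata. The following is a minimal sketch, not an official recipe: the `load_shuttle3` helper is hypothetical, and the `torch_dtype="auto"` and `device_map="auto"` settings are assumptions (the latter requires the `accelerate` package). Note that 72B parameters in 16-bit precision need roughly 144 GB of memory for the weights alone.

```python
MODEL_ID = "shuttleai/shuttle-3"  # repository ID taken from the model card


def load_shuttle3(model_id: str = MODEL_ID):
    """Load tokenizer and model (hypothetical helper; downloads the full weights)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",  # assumption: use the dtype stored in the checkpoint
        device_map="auto",   # assumption: place layers on available GPUs (needs accelerate)
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_shuttle3()` triggers the download, so it is best run on a machine that already has enough GPU memory available.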

### Key Features

- Built on a base model pretrained on a large proportion of multilingual and code data
- Fine-tuned to emulate the prose quality of Claude 3 models, with extensive training on role-play data

## Fine-Tuning Details

- **Training Setup**: Trained on 130 million tokens for 12 hours using 4 A100 PCIe GPUs.
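
The training setup figures imply a rough compute budget and throughput. The short calculation below is only back-of-the-envelope arithmetic on the stated numbers; it assumes that all four GPUs were busy for the full 12 hours and that 130 million is the total number of tokens processed, neither of which the card states explicitly.

```python
TOKENS = 130_000_000  # training tokens, from the card
HOURS = 12            # wall-clock training time, from the card
GPUS = 4              # A100 PCIe GPUs, from the card

gpu_hours = HOURS * GPUS
tokens_per_second = TOKENS / (HOURS * 3600)
tokens_per_second_per_gpu = tokens_per_second / GPUS

print(f"{gpu_hours} GPU-hours")
print(f"{tokens_per_second:.0f} tokens/s overall")
print(f"{tokens_per_second_per_gpu:.0f} tokens/s per GPU")
```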

## Prompting

Shuttle-3 uses ChatML as its prompting format:

```
<|im_start|>system
You are a pirate! Yardy harr harr!<|im_end|>
<|im_start|>user
Where are you currently?<|im_end|>
<|im_start|>assistant
Look ahoy ye scallywag! We're on the high seas!<|im_end|>
```
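
To make the layout concrete, the sketch below assembles a ChatML prompt string from a list of messages in plain Python. The `to_chatml` function is a hypothetical helper written for illustration, not part of any library. In practice, the tokenizer's `apply_chat_template` method in `transformers` would normally handle this step, assuming the repository ships a ChatML chat template.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render [{"role": ..., "content": ...}, ...] as a ChatML prompt string."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a pirate! Yardy harr harr!"},
    {"role": "user", "content": "Where are you currently?"},
]
prompt = to_chatml(messages)
print(prompt)
```

The resulting string ends with an open `<|im_start|>assistant` turn; the model's reply is complete once it emits `<|im_end|>`.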