---

license: other
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-72B
tags:
- chat
library_name: transformers
---


<p style="font-size:20px;" align="left">
<div style="width: 80px; height: 80px; border-radius: 15px;">
    <img src="https://shuttleai.com/shuttle.png" alt="ShuttleAI Thumbnail" style="width: auto; height: auto; margin-left: 0; object-fit: cover; border-radius: 15px;">

</div>


<p align="left">
    💻 <a href="https://shuttleai.com/" target="_blank">Use via API</a>
</p>


## Shuttle-3 (beta) [2024/10/25]

We are excited to introduce Shuttle-3, our next-generation state-of-the-art language model designed to excel in complex chat, multilingual communication, reasoning, and agent tasks.

- **Shuttle-3** is a fine-tuned version of [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct), trained to emulate the writing style of Claude 3 models and trained extensively on role-playing data.

## Model Details

* **Model Name**: Shuttle-3
* **Developed by**: ShuttleAI Inc.
* **Base Model**: [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
* **Parameters**: 72B
* **Language(s)**: Multilingual
* **Repository**: [https://huggingface.co/shuttleai](https://huggingface.co/shuttleai)
* **Fine-Tuned Model**: [https://huggingface.co/shuttleai/shuttle-3](https://huggingface.co/shuttleai/shuttle-3)

### Key Features

- Pretrained on a large proportion of multilingual and code data
- Fine-tuned to emulate the prose quality of Claude 3 models, with extensive training on role-play data

## Fine-Tuning Details

- **Training Setup**: Trained on 130 million tokens for 12 hours using 4 A100 PCIe GPUs.
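
For a rough sense of scale, the figures above can be turned into GPU-hours and average throughput. This is only back-of-the-envelope arithmetic: it assumes the 12 hours are wall-clock time and that all 4 GPUs were busy for the whole run, neither of which is stated here.

```python
# Figures taken from the training setup above.
TOKENS = 130_000_000
HOURS = 12
GPUS = 4

# Assumption: 12 hours is wall-clock time with all 4 GPUs in use throughout.
gpu_hours = HOURS * GPUS
tokens_per_second = TOKENS / (HOURS * 3600)
tokens_per_second_per_gpu = tokens_per_second / GPUS

print(f"GPU-hours: {gpu_hours}")
print(f"Average throughput: {tokens_per_second:.0f} tokens/s")
print(f"Average per GPU: {tokens_per_second_per_gpu:.0f} tokens/s")
```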

## Prompting

Shuttle-3 uses ChatML as its prompting format:

```
<|im_start|>system
You are a pirate! Yardy harr harr!<|im_end|>
<|im_start|>user
Where are you currently!<|im_end|>
<|im_start|>assistant
Look ahoy ye scallywag! We're on the high seas!<|im_end|>
```
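
The format above can be assembled from a list of chat messages with a few lines of Python. The sketch below is illustrative only: `format_chatml` is a hypothetical helper, not part of any library, and it assumes each turn is separated by a single newline. In practice a tokenizer's built-in chat template, if the repository ships one, would normally do this for you.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages as a ChatML prompt string (hypothetical helper)."""
    parts = []
    for message in messages:
        # Each turn: start token, role, newline, content, end token, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a pirate! Yardy harr harr!"},
    {"role": "user", "content": "Where are you currently!"},
]
prompt = format_chatml(messages)
print(prompt)
```

The trailing `<|im_start|>assistant` line leaves the assistant turn open so that generation continues from there and stops when the model emits `<|im_end|>`.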