Update README.md
README.md CHANGED
@@ -18,7 +18,7 @@ This is a Speech Language Model trained for generating audio continuations over

  18  This is a Speech Language Model, fine-tuned from [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) over a vocabulary of 500
  19  speech tokens extracted from the 11th layer of [mhubert-25hz](https://huggingface.co/slprl/mhubert-base-25hz). It was trained as part of
  20  ["*Slamming*: Training a Speech Language Model on One GPU in a Day"], focusing on efficient training. For a stronger model trained with
- 21  slightly more compute - 2*A100 for 2 days, see [slam_scaled](https://huggingface.co/slprl/
+ 21  slightly more compute - 2*A100 for 2 days, see [slam_scaled](https://huggingface.co/slprl/slam_scaled).
  22  
  23  The model was trained by next-token prediction over a subset of LibriSpeech, Libri-Light and a synthetic dataset,
  24  [sTinyStories](https://huggingface.co/datasets/slprl/sTinyStories). It was then trained with DPO over
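The README above describes audio being represented as a vocabulary of 500 discrete speech tokens extracted from mhubert-25hz, which the language model then consumes like ordinary text tokens. As a minimal sketch of that idea only: the token string template `<unit_N>` and the duplicate-collapsing step are assumptions for illustration (common in unit-based speech LMs), not details given in this diff.

```python
# Hypothetical mapping from discrete HuBERT unit ids to LM token strings.
# The actual token format used by this model is NOT specified in the README diff.
VOCAB_SIZE = 500  # size of the speech-token vocabulary named in the README


def units_to_tokens(unit_ids):
    """Map speech unit ids (0..499) to placeholder token strings."""
    for u in unit_ids:
        if not 0 <= u < VOCAB_SIZE:
            raise ValueError(f"unit id {u} outside vocabulary of size {VOCAB_SIZE}")
    # Collapse consecutive duplicate units -- a common (assumed) preprocessing
    # step in unit-based speech language modeling.
    deduped = [u for i, u in enumerate(unit_ids) if i == 0 or u != unit_ids[i - 1]]
    return [f"<unit_{u}>" for u in deduped]


print(units_to_tokens([3, 3, 17, 499]))  # -> ['<unit_3>', '<unit_17>', '<unit_499>']
```

The resulting token sequence would then be fed to the Qwen2.5-0.5B backbone for next-token prediction, as the README describes.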