Tags: Audio-to-Audio · Transformers · Safetensors · speech_language_model
gallilmaimon committed commit 630fe3a (verified) · 1 parent: f932655

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -18,7 +18,7 @@ This is a Speech Lanaguage Model trained for generating audio contiuations over
 This is a Speech Lanaguage Model, fine-tuned from [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) over a vocabulary of 500
 speech tokens extracted from the 11-th layer of [mhubert-25hz](https://huggingface.co/slprl/mhubert-base-25hz). It was trained as part of
 ["*Slamming*: Training a Speech Language Model on One GPU in a Day"], focusing on efficient training. For a stronger model trained with
-slightly more compute - 2*A100 for 2 days, see [slam_scaled](https://huggingface.co/slprl/slam).
+slightly more compute - 2*A100 for 2 days, see [slam_scaled](https://huggingface.co/slprl/slam_scaled).

 The model was trained by next-token prediction over a subset of LibriSpeech, Libri-Light and a synthetic data
 [sTinyStories](https://huggingface.co/datasets/slprl/sTinyStories). It was then trained with DPO over
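The README above describes a language model that operates over a vocabulary of 500 discrete speech tokens (mHuBERT units) rather than text. As a minimal sketch of that idea, discrete unit ids can be serialized into special token strings for the LM and recovered afterwards; the `<unit_k>` string format and the helper names below are illustrative assumptions, not the model's actual token format.

```python
# Illustrative sketch: a speech LM like the one described above works on
# discrete unit ids in [0, 500) extracted from mHuBERT layer 11.
# NOTE: the "<unit_k>" token format here is a hypothetical stand-in for
# whatever special tokens the real model adds to the Qwen vocabulary.

def units_to_tokens(units):
    """Serialize discrete speech unit ids into LM token strings."""
    return [f"<unit_{u}>" for u in units]

def tokens_to_units(tokens):
    """Recover unit ids from serialized token strings."""
    return [int(t[len("<unit_"):-1]) for t in tokens]

units = [17, 402, 3, 499]  # example unit ids from a 500-entry codebook
tokens = units_to_tokens(units)
assert tokens_to_units(tokens) == units  # round-trip is lossless
```

Next-token prediction then proceeds over these serialized tokens exactly as with text, and the generated tokens are mapped back to units for vocoding into audio.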