mrohan/cast-interleaved-7b

llama


mrohan committed 23 days ago

Commit bd8a84d (verified) · 1 parent: 32caeee

Update README.md

Files changed (1)

README.md +0 -117

@@ -1,117 +0,0 @@
-# SPIRIT-LM Expressive Interleaved (Corrected Teacher, Libri-Light)
-**SPIRIT-LM Expressive Interleaved (Corrected)** is a fine-tuned version of the 7B SPIRIT-LM teacher model adapted to the **Libri-Light** domain. It supports **interleaved speech and text inputs**, and was used as the **teacher model for distilling TinyWave**.
-This checkpoint was fine-tuned for 10k steps with **LoRA adapters** on synthetic interleaved data created from Libri-Light and Whisper transcriptions. The resulting model improves alignment with the target distribution and provides stronger supervision for expressive speech–text generation.
-> 📖 This checkpoint is part of the *TinyWave* distillation framework. See [arXiv:2506.23670](https://arxiv.org/abs/2506.23670) for details.
----
-## 🧠 Model Purpose
-| Role             | Distillation Teacher                     |
-|------------------|-------------------------------------------|
-| Base Model       | `spirit-lm-expressive-7b` (SPIRIT-LM)     |
-| Fine-tuned on    | Libri-Light (10k steps with LoRA)         |
-| Input Modalities | Interleaved speech + text                 |
-| Output           | Speech tokens                             |
-| Used for         | Training `tinywave/interleaved-expressive-2b` |
----
-## 🔧 Usage
-### 1. Install SPIRIT-LM and Load Expressive Tokenizer
-```bash
-git clone https://github.com/facebookresearch/spiritlm
-cd spiritlm
-pip install -e '.[eval]'
-```
-```python
-from spiritlm.speech_tokenizer import spiritlm_expressive
-speech_tokenizer = spiritlm_expressive()
-```
----
-### 2. Inference (Speech or Interleaved)
-```python
-from transformers import LlamaForCausalLM, AutoTokenizer
-from spiritlm.speech_tokenizer import spiritlm_expressive
-import torchaudio
-import torch
-MODEL_PATH = "tinywave/expressive-spirit-lm-interleaved-librilight"
-tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
-model = LlamaForCausalLM.from_pretrained(MODEL_PATH, device_map="auto", torch_dtype=torch.bfloat16)
-# Speech tokenizer that converts audio into an expressive token string
-speech_tokenizer = spiritlm_expressive()
-def get_inference(audio_path):
-    """Speech continuation: encode an audio file and sample a continuation."""
-    # The sample rate is discarded; the audio is assumed to already be at the
-    # rate the speech tokenizer expects.
-    audio, _ = torchaudio.load(audio_path)
-    input_values = audio.view(1, 1, -1).to(speech_tokenizer.hubert_model.device).float()
-    tokens = speech_tokenizer.encode_string(input_values)
-    input_ids = tokenizer(tokens, return_tensors="pt").input_ids.to(model.device)
-    output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.9, top_p=0.9)
-    # Returns the prompt plus continuation as a decoded token string
-    return tokenizer.decode(output[0])
-def get_inference_text(prompt):
-    """Text-to-speech interleaving: append [Speech] so the model continues in speech."""
-    input_ids = tokenizer(prompt + " [Speech]", return_tensors="pt").input_ids.to(model.device)
-    output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.9, top_p=0.9)
-    return tokenizer.decode(output[0])
-```
----
-## 🎧 Inference Modes
-### 💬 Text + Speech Interleaving
-Input:
-```text
-"The astronaut stepped outside the capsule— [Speech]"
-```
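-Output:
-Expressive speech continuation. The usage code returns it as a decoded token string; producing a WAV file requires a separate speech-decoding step that is not shown here.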
----
-### 🔄 Speech Continuation
-Input: `speech.wav`
-Output: Semantically and stylistically aligned spoken continuation.
----
-## 📂 Files
-* `pytorch_model.bin`: LoRA-adapted SPIRIT-LM 7B weights
-* `config.json`, `tokenizer.json`: Compatible with Hugging Face Transformers
-* Compatible with `spiritlm_expressive` tokenizer only
----
-## 📎 Citation
-```bibtex
-@article{nouriborji2025tinywave,
-  title={Efficient Interleaved Speech Modeling through Knowledge Distillation},
-  author={Nouriborji, Mohammadmahdi and Rohanian, Morteza},
-  journal={arXiv preprint arXiv:2506.23670},
-  year={2025}
-}
-```
----
-## 🔗 Related
-* 🔬 Paper: [arXiv:2506.23670](https://arxiv.org/abs/2506.23670)
-* 🧠 Student model: [`tinywave/interleaved-expressive-2b`](https://huggingface.co/tinywave/interleaved-expressive-2b)
-* 🌐 [Project Website](https://mohammadmahdinoori.github.io/tinywave-landing/)
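
The removed usage code returns the model output as a single decoded string in which text and speech tokens are interleaved. A minimal sketch of how such a string might be split by modality is given below. It assumes that spans are introduced by `[Text]` and `[Speech]` markers and that speech units look like `[Hu12]`, `[Pi3]`, or `[St7]`; only the `[Speech]` marker appears in the README, so the other formats are assumptions. The helper names `split_interleaved` and `parse_speech_units` are hypothetical and not part of SPIRIT-LM or Transformers.

```python
import re

# Assumed formats: only "[Speech]" is confirmed by the README above.
MODALITY_MARKER = re.compile(r"\[(Text|Speech)\]")
SPEECH_UNIT = re.compile(r"\[(Hu|Pi|St)(\d+)\]")


def split_interleaved(sequence):
    """Split a decoded string into (modality, content) segments."""
    # With one capture group, re.split returns
    # [prefix, marker, content, marker, content, ...].
    parts = MODALITY_MARKER.split(sequence)
    segments = []
    prefix = parts[0].strip()
    if prefix:
        # Assumption: anything before the first marker is plain text.
        segments.append(("Text", prefix))
    for marker, content in zip(parts[1::2], parts[2::2]):
        segments.append((marker, content.strip()))
    return segments


def parse_speech_units(content):
    """Extract (kind, index) pairs from a speech segment."""
    return [(kind, int(index)) for kind, index in SPEECH_UNIT.findall(content)]


if __name__ == "__main__":
    decoded = "The astronaut stepped outside the capsule [Speech][Hu12][Pi3][Hu40]"
    for modality, content in split_interleaved(decoded):
        if modality == "Speech":
            print(modality, parse_speech_units(content))
        else:
            print(modality, content)
```

Separating the segments this way makes it easier to check whether a generation actually switched to speech after the `[Speech]` marker before handing the speech units to a decoder.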