---
license: apache-2.0
tags:
- audio
- speech
- language-model
- auristream
library_name: transformers
---
# AuriStream7BDeep_40Pred_BigAudioDataset_30k

**AuriStream** is a speech language model by **Greta Tuckute** and **Klemen Kotar**.

This model predicts cochlear tokens from a tokenizer such as [WavCochCausalV8192](https://huggingface.co/TuKoResearch/WavCochCausalV8192).
## Model Details

| Parameter | Value |
|-----------|-------|
| Parameters | ~8.41B |
| Layers | 96 |
| Hidden Size | 2560 |
| Attention Heads | 32 |
| Vocab Size | 8192 |
| Prediction Steps | 40 |

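
The figures in the table can be cross-checked with some quick arithmetic. The sketch below is only a back-of-the-envelope estimate. It assumes a standard GPT-style block (about `12 * d^2` weights per layer), one untied output head per prediction step, and a single input embedding. None of these assumptions come from this card, and the actual layout is defined in the shared AuriStream-base code, which this sketch has not been checked against.

```python
# Values taken from the Model Details table.
N_LAYERS = 96
HIDDEN = 2560
N_HEADS = 32
VOCAB = 8192
PRED_STEPS = 40

# Follows directly from the table: dimension of each attention head.
head_dim = HIDDEN // N_HEADS

# ASSUMPTION: GPT-style block with 4*d^2 attention weights and 8*d^2 MLP
# weights; biases and normalization parameters are ignored.
block_params = 12 * HIDDEN**2

# ASSUMPTION: one untied (HIDDEN x VOCAB) output head per prediction step,
# plus a single (VOCAB x HIDDEN) input embedding.
head_params = PRED_STEPS * HIDDEN * VOCAB
embed_params = VOCAB * HIDDEN

total = N_LAYERS * block_params + head_params + embed_params

print(f"head_dim = {head_dim}")
print(f"estimated parameters = {total / 1e9:.2f}B")
```

Under these assumptions the estimate lands close to the ~8.41B listed above, which would also explain why the count exceeds the "7B" in the checkpoint name. Treat it as a plausibility check, not a description of the architecture.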
## Usage

```python
from transformers import AutoConfig, AutoModel

MODEL_ID = "TuKoResearch/AuriStream7BDeep_40Pred_BigAudioDataset_30k"

# trust_remote_code=True is required because the model class is custom code
# hosted on the Hub, not a class that ships with the transformers library.
model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)

# Or load only the config first, for example to inspect hyperparameters
# before loading the weights.
config = AutoConfig.from_pretrained(MODEL_ID, trust_remote_code=True)
```
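
With roughly 8.41B parameters, it can help to estimate the weight footprint before loading. The following is a minimal sketch, not part of the official usage: `weight_memory_gib` and `load_model` are hypothetical helper names, the estimate covers weights only (no activations or buffers), and whether the checkpoint behaves well in reduced precision is an assumption that has not been verified here.

```python
N_PARAMS = 8.41e9  # approximate count from the Model Details table

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Approximate size of the weights alone, in GiB, for the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


def load_model(dtype_name: str = "bfloat16"):
    """Hypothetical helper: load the checkpoint in the requested dtype.

    Calling this downloads the full set of weights.
    """
    # Imported lazily so the estimate above works without these packages.
    import torch
    from transformers import AutoModel

    return AutoModel.from_pretrained(
        "TuKoResearch/AuriStream7BDeep_40Pred_BigAudioDataset_30k",
        trust_remote_code=True,
        # Recent transformers releases also accept this argument as `dtype`.
        torch_dtype=getattr(torch, dtype_name),
    )


if __name__ == "__main__":
    for name in BYTES_PER_PARAM:
        print(f"{name}: ~{weight_memory_gib(N_PARAMS, name):.1f} GiB")
```

By this estimate the weights need about 31 GiB in `float32` and about 16 GiB in `bfloat16` or `float16`, before any memory used during inference.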
## Base Model Code

This checkpoint uses shared model code from [TuKoResearch/AuriStream-base](https://huggingface.co/TuKoResearch/AuriStream-base).

## Tokenizer

This model uses cochlear tokens from [WavCochCausalV8192](https://huggingface.co/TuKoResearch/WavCochCausalV8192).