---
license: apache-2.0
tags:
- voxtral
- asr
- speech-to-text
- fine-tuning
- tonic
pipeline_tag: automatic-speech-recognition
base_model: {{base_model}}
{{#if has_hub_dataset_id}}
datasets:
- {{dataset_name}}
{{/if}}
{{#if author_name}}
author: {{author_name}}
{{/if}}
{{#if training_config_type}}
training_config: {{training_config_type}}
{{/if}}
{{#if trainer_type}}
trainer_type: {{trainer_type}}
{{/if}}
{{#if batch_size}}
batch_size: {{batch_size}}
{{/if}}
{{#if gradient_accumulation_steps}}
gradient_accumulation_steps: {{gradient_accumulation_steps}}
{{/if}}
{{#if learning_rate}}
learning_rate: {{learning_rate}}
{{/if}}
{{#if max_epochs}}
max_epochs: {{max_epochs}}
{{/if}}
{{#if max_seq_length}}
max_seq_length: {{max_seq_length}}
{{/if}}
{{#if hardware_info}}
hardware: "{{hardware_info}}"
{{/if}}
language:
- hi
- en
- fr
- de
- it
- pt
- nl
library_name: peft
---

# {{model_name}}

{{model_description}}

## Usage

```python
import torch
from transformers import AutoProcessor, VoxtralForConditionalGeneration

# Requires a transformers release that includes Voxtral support.
repo_id = "{{repo_name}}"
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

processor = AutoProcessor.from_pretrained(repo_id)
model = VoxtralForConditionalGeneration.from_pretrained(repo_id, torch_dtype=dtype).to(device)
model.eval()

# Build a transcription request for a local audio file.
inputs = processor.apply_transcription_request(
    language="en", audio="sample.wav", model_id=repo_id
)
inputs = inputs.to(device, dtype=dtype)

with torch.no_grad():
    generated_ids = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the prompt.
prompt_len = inputs.input_ids.shape[1]
text = processor.batch_decode(generated_ids[:, prompt_len:], skip_special_tokens=True)[0]
print(text)
```
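
The metadata declares `library_name: peft`, so the repository may hold adapter weights instead of a full checkpoint. In that case the adapter can be attached to the base model explicitly. The helper below is a minimal sketch, not the project's own loading code: `load_adapter_model` is a hypothetical name, and it assumes a LoRA-style adapter and a `transformers` release that provides `VoxtralForConditionalGeneration`.

```python
def load_adapter_model(base_model_id: str, adapter_id: str, merge: bool = True):
    """Load a base Voxtral checkpoint and attach a PEFT adapter to it.

    Hypothetical helper: assumes the adapter repo was produced with PEFT (LoRA).
    """
    # Imported lazily so that defining the helper needs no heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import VoxtralForConditionalGeneration

    dtype = torch.float16 if torch.cuda.is_available() else torch.float32
    base = VoxtralForConditionalGeneration.from_pretrained(base_model_id, torch_dtype=dtype)
    model = PeftModel.from_pretrained(base, adapter_id)

    # Folding the LoRA weights into the base model removes adapter overhead
    # at inference time; keep merge=False to continue training the adapter.
    return model.merge_and_unload() if merge else model
```

Calling `load_adapter_model("{{base_model}}", "{{repo_name}}")` would return a model that can be passed to the same `generate` call as in the usage example.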

## Training Configuration

- Base model: {{base_model}}
{{#if training_config_type}}- Config: {{training_config_type}}{{/if}}
{{#if trainer_type}}- Trainer: {{trainer_type}}{{/if}}

## Training Parameters

{{#if batch_size}}- Batch size: {{batch_size}}{{/if}}
{{#if gradient_accumulation_steps}}- Gradient accumulation steps: {{gradient_accumulation_steps}}{{/if}}
{{#if learning_rate}}- Learning rate: {{learning_rate}}{{/if}}
{{#if max_epochs}}- Max epochs: {{max_epochs}}{{/if}}
{{#if max_seq_length}}- Max sequence length: {{max_seq_length}}{{/if}}
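
The batch size and gradient accumulation settings listed above jointly determine how many examples contribute to each optimizer step. The sketch below shows that arithmetic. The function name, the `num_devices` parameter, and the example values are illustrative assumptions, not values taken from this training run.

```python
def effective_batch_size(batch_size: int, gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples that contribute to a single optimizer step."""
    if min(batch_size, gradient_accumulation_steps, num_devices) < 1:
        raise ValueError("all arguments must be positive integers")
    # Gradients are summed over the accumulation steps (and across devices)
    # before the optimizer updates the weights once.
    return batch_size * gradient_accumulation_steps * num_devices


# Hypothetical values: per-device batch of 4, accumulated over 8 steps.
print(effective_batch_size(4, 8))  # 32
```

When comparing runs, keeping this product constant while trading batch size against accumulation steps holds the number of examples per optimizer step fixed and changes only the per-step memory use.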

## Hardware

{{#if hardware_info}}- {{hardware_info}}{{/if}}

## Notes

- This repository contains a Voxtral model fine-tuned for automatic speech recognition (ASR).