Spaces:

Tonic
/

VoxFactory

Running

App Files Files Community

VoxFactory / templates /model_card.md

Joseph Pollack

adds model repo dataset id to the model card

7cafe2c unverified 4 months ago

preview code

raw

history blame contribute delete

2.02 kB

	---
	license: apache-2.0
	tags:
	- voxtral
	- asr
	- speech-to-text
	- fine-tuning
	- tonic

	pipeline_tag: automatic-speech-recognition
	base_model: {{base_model}}
	{{#if has_hub_dataset_id}}
	datasets:
	- {{dataset_name}}
	{{/if}}
	{{#if author_name}}
	author: {{author_name}}
	{{/if}}
	{{#if training_config_type}}
	training_config: {{training_config_type}}
	{{/if}}
	{{#if trainer_type}}
	trainer_type: {{trainer_type}}
	{{/if}}
	{{#if batch_size}}
	batch_size: {{batch_size}}
	{{/if}}
	{{#if gradient_accumulation_steps}}
	gradient_accumulation_steps: {{gradient_accumulation_steps}}
	{{/if}}
	{{#if learning_rate}}
	learning_rate: {{learning_rate}}
	{{/if}}
	{{#if max_epochs}}
	max_epochs: {{max_epochs}}
	{{/if}}
	{{#if max_seq_length}}
	max_seq_length: {{max_seq_length}}
	{{/if}}
	{{#if hardware_info}}
	hardware: "{{hardware_info}}"
	{{/if}}
	language:
	- hi
	- en
	- fr
	- de
	- it
	- pt
	- nl
	library_name: peft
	---

	# {{model_name}}

	{{model_description}}

	## Usage

	```python
	import torch
	from transformers import AutoProcessor, AutoModelForSeq2SeqLM
	import soundfile as sf

	processor = AutoProcessor.from_pretrained("{{repo_name}}")
	model = AutoModelForSeq2SeqLM.from_pretrained(
	"{{repo_name}}",
	torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32
	)

	audio, sr = sf.read("sample.wav")
	inputs = processor(audio, sampling_rate=sr, return_tensors="pt")
	with torch.no_grad():
	generated_ids = model.generate(**inputs, max_new_tokens=256)
	text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(text)
	```

	## Training Configuration

	- Base model: {{base_model}}
	{{#if training_config_type}}- Config: {{training_config_type}}{{/if}}
	{{#if trainer_type}}- Trainer: {{trainer_type}}{{/if}}

	## Training Parameters

	- Batch size: {{batch_size}}
	- Grad accumulation: {{gradient_accumulation_steps}}
	- Learning rate: {{learning_rate}}
	- Max epochs: {{max_epochs}}
	- Sequence length: {{max_seq_length}}

	## Hardware

	- {{hardware_info}}

	## Notes

	- This repository contains a fine-tuned Voxtral ASR model.