language:
  - ar
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: Tunisian Checkpoint1.5
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: custom_tunisian_dataset
          type: dataset
          args: 'config: ar, split: test'
        metrics:
          - name: Wer
            type: wer
            value: 51.80948419301165
          - name: Cer
            type: cer
            value: 25.522807875749052

Model Card for Tunisian Checkpoint1.5

Fine-tuning Whisper on a custom Tunisian dataset

Model Details

Model Description

This model is a fine-tuned version of openai/whisper-small on the tunisian_custom dataset (2 hours and 30 minutes of audio). It achieves the following results on the evaluation set:

Train Loss: 0.0029
Evaluation Loss: 1.1970311403274536
Wer: 51.80948419301165
Cer: 25.522807875749052
Developed by: Ameni Khabthani
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Model type: Automatic speech recognition (ASR)
Language(s) (NLP): Arabic (ar), Tunisian dialect
License: apache-2.0
Finetuned from model [optional]: openai/whisper-small
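
For readers unfamiliar with the Wer and Cer figures reported above, the following is a minimal sketch of how word error rate and character error rate are commonly computed from an edit distance. It is not the evaluation script used for this model: the function names and the toy strings are illustrative assumptions, the result is scaled to a percentage on the assumption that the reported values are percentages, and any text normalization applied during the real evaluation is unknown.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (word lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent for one reference/hypothesis pair."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (spaces count as characters)."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Toy example (hypothetical strings, not taken from the dataset).
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER: {wer(reference, hypothesis):.2f}%")
print(f"CER: {cer(reference, hypothesis):.2f}%")
```

Corpus-level scores are usually obtained by summing the edit distances over all utterances and dividing by the total reference length, rather than by averaging per-utterance rates.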

Model Sources [optional]

Repository: [More Information Needed]
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

per_device_train_batch_size=8
gradient_accumulation_steps=8
learning_rate=5e-5
warmup_steps=100
max_steps=3000
gradient_checkpointing=True
fp16=True
save_steps=500
eval_steps=500
per_device_eval_batch_size=8
predict_with_generate=True
generation_max_length=225
logging_steps=50
optim="adamw_bnb_8bit"
weight_decay=0.01
dropout=0.1
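
To make the training schedule easier to read, the sketch below collects the listed hyperparameters in a dictionary and derives a few quantities from them. The dictionary keys match keyword names accepted by `Seq2SeqTrainingArguments` in the transformers library, but the actual training script is not shown in this card, so treat the mapping as an assumption. `NUM_DEVICES` is also an assumption, since the number of GPUs is not stated, and `dropout` is kept separate because it is typically a model-configuration value rather than a trainer argument.

```python
# Hyperparameters as listed in the card.
hparams = {
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 8,
    "learning_rate": 5e-5,
    "warmup_steps": 100,
    "max_steps": 3000,
    "gradient_checkpointing": True,
    "fp16": True,
    "save_steps": 500,
    "eval_steps": 500,
    "per_device_eval_batch_size": 8,
    "predict_with_generate": True,
    "generation_max_length": 225,
    "logging_steps": 50,
    "optim": "adamw_bnb_8bit",
    "weight_decay": 0.01,
}
dropout = 0.1  # assumed to be set on the model config, not the trainer

NUM_DEVICES = 1  # assumption: the card does not say how many GPUs were used

# Examples contributing to each optimizer update.
effective_batch_size = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
    * NUM_DEVICES
)

# Total training examples processed over the whole run.
examples_seen = effective_batch_size * hparams["max_steps"]

# Number of checkpoints saved and evaluations run during training.
num_checkpoints = hparams["max_steps"] // hparams["save_steps"]
num_evaluations = hparams["max_steps"] // hparams["eval_steps"]

# Share of the run spent in learning-rate warmup.
warmup_fraction = hparams["warmup_steps"] / hparams["max_steps"]

print(f"effective batch size: {effective_batch_size}")
print(f"examples seen: {examples_seen}")
print(f"checkpoints: {num_checkpoints}, evaluations: {num_evaluations}")
print(f"warmup fraction: {warmup_fraction:.1%}")
```

Under the single-device assumption, each optimizer step covers 64 examples, so the 3000-step run processes 192,000 examples and saves a checkpoint every 500 steps.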

Training regime: fp16 mixed precision

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]