language:
  - lt
tags:
  - whisper
  - faster-whisper
  - ctranslate2
  - automatic-speech-recognition
  - lithuanian
  - medical
license: apache-2.0
base_model: openai/whisper-large-v3
library_name: ctranslate2
pipeline_tag: automatic-speech-recognition

English | Lietuvių

English

LT_AI_Medical — Lithuanian Medical Whisper

A Whisper large-v3 model fine-tuned for transcribing Lithuanian medical speech, provided in CTranslate2 format for use with faster-whisper.

Model Details

Usage

With faster-whisper

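from faster_whisper import WhisperModel

# Load the CTranslate2 model from the Hugging Face Hub (this setup needs a CUDA GPU)
model = WhisperModel("VSSA-SDSA/LT_AI_Medical", device="cuda", compute_type="float16")

# transcribe() returns a lazy generator of segments plus an info object (unused here)
segments, _ = model.transcribe(
    "audio.wav",
    language="lt",    # fix the language instead of auto-detecting it
    beam_size=5,
    vad_filter=True,  # skip non-speech audio before decoding
)

# Decoding actually runs here, as the generator is consumed
text = " ".join(segment.text.strip() for segment in segments)
print(text)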
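
The snippet above assumes a CUDA GPU. As a minimal sketch for machines without one, the variant below loads the same model on CPU with int8 quantization and returns each segment with its timestamps. The helper names `format_timestamp`, `format_segment`, and `transcribe_with_timestamps` are illustrative and not part of this repository. `device="cpu"` and `compute_type="int8"` are standard faster-whisper options, but their speed and accuracy with this model have not been measured here.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segment(start: float, end: float, text: str) -> str:
    """Format one segment as '[start -> end] text'."""
    return f"[{format_timestamp(start)} -> {format_timestamp(end)}] {text.strip()}"


def transcribe_with_timestamps(audio_path: str,
                               model_id: str = "VSSA-SDSA/LT_AI_Medical") -> list[str]:
    """Transcribe on CPU and return one timestamped line per segment."""
    # Imported lazily so the formatting helpers work without faster-whisper installed.
    from faster_whisper import WhisperModel

    # Assumption: int8 on CPU is an acceptable trade-off; not benchmarked for this model.
    model = WhisperModel(model_id, device="cpu", compute_type="int8")
    segments, _ = model.transcribe(
        audio_path, language="lt", beam_size=5, vad_filter=True
    )
    # Each segment exposes start and end (seconds) and text.
    return [format_segment(s.start, s.end, s.text) for s in segments]


# Example call (downloads the model on first use):
# for line in transcribe_with_timestamps("audio.wav"):
#     print(line)
```

Timestamped lines make it easier to check a transcript against the recording, which can help when reviewing dictated reports.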

With the provided transcribe.py script

git clone https://github.com/VSSA-AtvirasKodas-LT/LT_AI_Medical
cd LT_AI_Medical
pip install -r requirements.txt
python transcribe.py sample.wav

Intended Use

This model is designed for transcribing Lithuanian medical speech, particularly:

  • Radiology reports
  • Family medicine consultations

Limitations

  • Trained primarily on medical vocabulary, so it may perform worse on general Lithuanian speech
  • Performance may degrade on accents or dialects outside the training distribution
  • Whisper decodes audio in 30-second windows; on longer recordings the model may hallucinate text unless VAD filtering is enabled (vad_filter=True in faster-whisper)
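
To make the last point concrete, here is a hedged sketch of transcription settings for longer recordings. `vad_filter`, `vad_parameters` (with `min_silence_duration_ms`), and `condition_on_previous_text` are existing faster-whisper options. The specific values and the helper names `long_audio_options` and `transcribe_long` are assumptions chosen for illustration, not settings validated for this model.

```python
def long_audio_options(min_silence_ms: int = 500) -> dict:
    """Build keyword arguments for WhisperModel.transcribe on long recordings."""
    if min_silence_ms < 0:
        raise ValueError("min_silence_ms must be non-negative")
    return {
        "language": "lt",
        "beam_size": 5,
        # Remove non-speech audio before decoding.
        "vad_filter": True,
        # Assumed value: silences at least this long split the audio into chunks.
        "vad_parameters": {"min_silence_duration_ms": min_silence_ms},
        # Do not feed previous output back as a prompt; this can reduce
        # repetition loops at the cost of less consistency across windows.
        "condition_on_previous_text": False,
    }


def transcribe_long(audio_path: str,
                    model_id: str = "VSSA-SDSA/LT_AI_Medical") -> str:
    """Transcribe a long recording with the options above and return plain text."""
    # Imported lazily so long_audio_options can be used without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_id, device="cuda", compute_type="float16")
    segments, _ = model.transcribe(audio_path, **long_audio_options())
    return " ".join(s.text.strip() for s in segments)


# Example call (downloads the model on first use):
# print(transcribe_long("audio.wav"))
```

Keeping the options in one function lets the same settings be reused across scripts and tuned in a single place.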

License

This model is released under the Apache 2.0 license, inherited from the base Whisper large-v3 model.


Lietuvių

LT_AI_Medical — Lietuviškas medicininis Whisper modelis

Apmokytas Whisper large-v3 modelis lietuvių kalbos medicininio kalbinio teksto atpažinimui, optimizuotas faster-whisper bibliotekai (CTranslate2 formatas).

Modelio informacija

Naudojimas

Su faster-whisper

from faster_whisper import WhisperModel

model = WhisperModel("VSSA-SDSA/LT_AI_Medical", device="cuda", compute_type="float16")

segments, _ = model.transcribe(
    "audio.wav",
    language="lt",
    beam_size=5,
    vad_filter=True,
)

text = " ".join(segment.text.strip() for segment in segments)
print(text)

Su pridėtu transcribe.py skriptu

git clone https://github.com/VSSA-AtvirasKodas-LT/LT_AI_Medical
cd LT_AI_Medical
pip install -r requirements.txt
python transcribe.py sample.wav

Paskirtis

Šis modelis skirtas lietuvių kalbos medicininio kalbinio teksto transkribavimui, ypač:

  • Radiologijos ataskaitoms
  • Šeimos medicinos konsultacijoms

Apribojimai

  • Apmokytas daugiausia su medicininiu žodynu — gali prasčiau atpažinti bendros lietuvių kalbos tekstą
  • Našumas gali pablogėti su akcentais ar tarmėmis, nepatenkančiomis į apmokymo duomenis
  • Ilgesnis nei 30 sekundžių garsas gali sukelti haliucinacijas be tinkamo VAD filtravimo

Licencija

Šis modelis išleistas pagal Apache 2.0 licenciją, paveldėtą iš bazinio Whisper large-v3 modelio licencijos.