eolang committed · Commit 2d12b21 · verified · 1 Parent(s): 3e22be3

Delete README (1).md

Files changed (1)
  1. README (1).md +0 -80
README (1).md DELETED
@@ -1,80 +0,0 @@
1
- ---
2
- license: apache-2.0
3
- base_model: Jacaranda-Health/ASR-STT
4
- tags:
5
- - speech-to-text
6
- - automatic-speech-recognition
7
- - quantized
8
- - 8bit
9
- language:
10
- - en
11
- pipeline_tag: automatic-speech-recognition
12
- ---
13
-
14
- # ASR-STT 8BIT Quantized
15
-
16
- This is a 8bit quantized version of [Jacaranda-Health/ASR-STT](https://huggingface.co/Jacaranda-Health/ASR-STT).
17
-
18
- ## Model Details
19
- - **Base Model**: Jacaranda-Health/ASR-STT
20
- - **Quantization**: 8bit
21
- - **Size Reduction**: 73.1% smaller than original
22
- - **Original Size**: 2913.89 MB
23
- - **Quantized Size**: 784.94 MB
24
-
25
- ## Usage
26
-
- ```python
- from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, BitsAndBytesConfig
- import torch
- import librosa
-
- # Load processor
- processor = AutoProcessor.from_pretrained("eolang/ASR-STT-8bit")
-
- # Configure quantization
- quantization_config = BitsAndBytesConfig(
-     load_in_8bit=True,
-     llm_int8_threshold=6.0,
-     llm_int8_has_fp16_weight=False
-
- )
-
- # Load quantized model
- model = AutoModelForSpeechSeq2Seq.from_pretrained(
-     "eolang/ASR-STT-8bit",
-     quantization_config=quantization_config,
-     device_map="auto"
- )
-
- # Transcription function
- def transcribe(filepath):
-     audio, sr = librosa.load(filepath, sr=16000)
-     inputs = processor(audio, sampling_rate=sr, return_tensors="pt")
-
-     # Convert to half precision for quantized models
-     if torch.cuda.is_available():
-         inputs = {k: v.cuda().half() for k, v in inputs.items()}
-     else:
-         inputs = {k: v.half() for k, v in inputs.items()}
-
-     with torch.no_grad():
-         generated_ids = model.generate(inputs["input_features"])
-
-     return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
-
- # Example usage
- transcription = transcribe("path/to/audio.wav")
- print(transcription)
- ```
-
- ## Performance
- - Faster inference due to reduced precision
- - Lower memory usage
- - Maintained transcription quality
-
- ## Requirements
- - transformers
- - torch
- - bitsandbytes
- - librosa
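
The deleted README states that transcription quality is "maintained" but gives no figure. One common way to check such a claim is to compare word error rate (WER) between the original and the quantized model on audio with known reference transcripts. The sketch below is only an illustration under that assumption: `word_error_rate` and the sample strings are hypothetical and are not part of the README or of any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, for illustration only.
reference_text = "the patient reported a mild headache"
quantized_output = "the patient reported mild headache"
print(f"WER: {word_error_rate(reference_text, quantized_output):.3f}")
```

Running the same function on the output of the base model and of the 8-bit model over the same audio files would give two WER values whose difference quantifies any quality loss from quantization.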