Bombek1 committed (verified) · Commit 684df35 · Parent(s): 52b0fdf

Upload README.md with huggingface_hub

Files changed (1): README.md (+123 −0)
---
tags:
- whisper
- speech
- audio
- litert
- tflite
- edge
- on-device
license: mit
base_model: openai/whisper-tiny
pipeline_tag: automatic-speech-recognition
---

# whisper-tiny - LiteRT

This is a [LiteRT](https://ai.google.dev/edge/litert) (formerly TensorFlow Lite) conversion of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) for efficient on-device inference.

## Model Details

| Property | Value |
|----------|-------|
| **Original Model** | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) |
| **Format** | LiteRT (.tflite) |
| **File Size** | 31.4 MB |
| **Task** | Speech Recognition (Encoder Only) |
| **Max Sequence Length** | 3000 (log-mel frames, i.e. 30 s of audio) |
| **Output Dimension** | 384 |
| **Pooling Mode** | N/A (encoder output) |

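The 3000 in the table is the number of log-mel frames the encoder consumes per inference. A quick sketch of how that follows from Whisper's standard front end (the 80 mel bins and 160-sample hop are Whisper's usual settings, assumed here rather than stated in this card):

```python
# Fixed input geometry assumed for whisper-tiny: 80 mel bins x 3000 frames,
# i.e. 30 s of 16 kHz audio with a hop length of 160 samples.
SAMPLE_RATE = 16_000
HOP_LENGTH = 160
CHUNK_SECONDS = 30

n_samples = SAMPLE_RATE * CHUNK_SECONDS   # 480000 audio samples per chunk
n_frames = n_samples // HOP_LENGTH        # 3000 mel frames
print((1, 80, n_frames))                  # expected encoder input shape
```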
## Performance

Benchmarked on Intel CPU (WSL2):

| Metric | Value |
|--------|-------|
| **Inference Latency** | 144.7 ms |
| **Throughput** | 6.9/sec |
| **Cosine Similarity vs Original** | 1.0000 ✅ |

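The two timing rows are consistent with each other: for single-stream inference (one invocation at a time), throughput is simply the inverse of latency.

```python
# Single-stream throughput is the inverse of per-inference latency.
latency_ms = 144.7
throughput_per_sec = 1000.0 / latency_ms
print(f"{throughput_per_sec:.1f}/sec")  # matches the 6.9/sec in the table
```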
## Quick Start

```python
import numpy as np
import librosa
from ai_edge_litert.interpreter import Interpreter
from transformers import WhisperProcessor

# Load the LiteRT model
interpreter = Interpreter(model_path="openai_whisper-tiny_encoder.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Load the matching processor (feature extractor) from the original model
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")

def encode_audio(audio_path: str) -> np.ndarray:
    """Extract encoder features from an audio file."""
    # Resample to the 16 kHz rate Whisper expects
    audio, sr = librosa.load(audio_path, sr=16000)
    # Log-mel features, padded/truncated to the fixed 3000-frame input
    input_features = processor(audio, sampling_rate=16000, return_tensors="np").input_features

    interpreter.set_tensor(input_details[0]["index"], input_features.astype(np.float32))
    interpreter.invoke()

    # Encoder hidden states (output dimension 384)
    return interpreter.get_tensor(output_details[0]["index"])

# Example
# features = encode_audio("audio.wav")
```

**Note**: This is the encoder-only model. For full ASR, you need the decoder as well.

## Files

- `openai_whisper-tiny_encoder.tflite` - The LiteRT model file

## Conversion Details

- **Conversion Tool**: [ai-edge-torch](https://github.com/google-ai-edge/ai-edge-torch)
- **Conversion Date**: 2026-01-12
- **Source Framework**: PyTorch → LiteRT
- **Validation**: Cosine similarity 1.0000 vs original

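The validation metric can be reproduced with a small helper. Here `a` and `b` stand for the encoder outputs of the LiteRT model and the original PyTorch model on the same input; the `(1, 1500, 384)` demo array is only illustrative of an encoder-output shape.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened feature tensors."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Identical outputs give a similarity of 1.0.
x = np.random.default_rng(0).standard_normal((1, 1500, 384))
print(round(cosine_similarity(x, x), 4))  # 1.0
```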
## Intended Use

- **Mobile Applications**: On-device speech recognition and voice interfaces
- **Edge Devices**: IoT, embedded systems, Raspberry Pi
- **Offline Processing**: Privacy-preserving inference
- **Low-latency Applications**: Real-time processing

## Limitations

- Fixed input length (3000 log-mel frames, i.e. 30 s of audio)
- CPU inference (GPU delegate requires setup)
- Processor (feature extractor) loaded separately from the original model
- Float32 precision

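Because of the fixed input length, audio longer than 30 s has to be processed in windows. A minimal sketch using simple non-overlapping windows (production pipelines typically add overlap and merge the results):

```python
import numpy as np

SAMPLE_RATE = 16_000
CHUNK_SAMPLES = 30 * SAMPLE_RATE  # 30 s per window, the encoder's fixed length

def chunk_audio(audio: np.ndarray, chunk: int = CHUNK_SAMPLES) -> list[np.ndarray]:
    """Split audio into fixed-size windows, zero-padding the last one."""
    windows = []
    for start in range(0, len(audio), chunk):
        window = audio[start:start + chunk]
        if len(window) < chunk:
            window = np.pad(window, (0, chunk - len(window)))
        windows.append(window)
    return windows

# 45 s of audio -> two 30 s windows (the second half zero-padded)
windows = chunk_audio(np.zeros(45 * SAMPLE_RATE, dtype=np.float32))
print(len(windows), windows[0].shape)  # 2 (480000,)
```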
## License

This model inherits the license from the original:
- **License**: MIT ([source](https://huggingface.co/openai/whisper-tiny))

## Citation

```bibtex
@misc{radford2022whisper,
  title={Robust Speech Recognition via Large-Scale Weak Supervision},
  author={Alec Radford and Jong Wook Kim and others},
  year={2022},
  eprint={2212.04356},
  archivePrefix={arXiv},
}
```

## Acknowledgments

- Original model by [openai](https://huggingface.co/openai)
- Conversion using [ai-edge-torch](https://github.com/google-ai-edge/ai-edge-torch)

---

*Converted by [Bombek1](https://huggingface.co/Bombek1)*