nsakib161 committed
Commit 4c176fe · 1 Parent(s): ec2fbed

Apply Anti-Gravity Configuration Fix

Files changed (1)
  1. README.md +70 -139
README.md CHANGED
@@ -1,9 +1,36 @@
- # Whisper Backend - Transcription API

- FastAPI backend for Quran recitation transcription using Faster-Whisper model fine-tuned for Quranic Arabic.

  ## 🚀 Quick Start

  ```bash
  # Create virtual environment
  python -m venv venv
@@ -13,172 +40,76 @@ source venv/bin/activate # Windows: venv\Scripts\activate
  pip install -r requirements.txt

  # Start the server
- python -m uvicorn main:app --host 0.0.0.0 --port 8000
  ```

- The API will be available at `http://localhost:8000`

  ## 📚 API Documentation

  Once running, visit:
- - **Swagger UI**: http://localhost:8000/docs
- - **ReDoc**: http://localhost:8000/redoc

  ## 🔌 Endpoints

  ### Health Check
- ```bash
- GET /
- GET /health
- ```
-
- Returns server status and model information.
  ### Transcribe Audio
- ```bash
- POST /transcribe
- Content-Type: multipart/form-data
- ```
-
- **Request:**
- - `file`: Audio file (MP3, WAV, WEBM, FLAC, etc.)
-
- **Response:**
- ```json
- {
-   "transcription": "بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ",
-   "segments": [
-     {
-       "start": 0.0,
-       "end": 3.5,
-       "text": "بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ"
-     }
-   ],
-   "language": "ar",
-   "language_probability": 0.99,
-   "processing_time": 1.23
- }
- ```
-
- ### Batch Transcription
- ```bash
- POST /transcribe-batch
- Content-Type: multipart/form-data
- ```
-
- Accepts multiple audio files and returns transcriptions for each.
  ## ⚙️ Configuration

- Edit `config.py` to customize settings:
-
- ```python
- class Settings(BaseModel):
-     # Model configuration
-     whisper_model: str = "ModyAsh/faster-whisper-base-ar-quran"
-     language: str = "ar"
-     compute_type: str = "int8"  # int8, float16, float32
-
-     # Transcription parameters
-     beam_size: int = 5
-     vad_filter: bool = True
-     vad_min_silence_duration_ms: int = 500
-
-     # File constraints
-     max_file_size_mb: int = 25
-     allowed_audio_formats: list = [
-         "mp3", "wav", "m4a", "flac", "ogg", "webm"
-     ]
- ```
-
- ## 🎯 Model Information
-
- **Model**: `ModyAsh/faster-whisper-base-ar-quran`
- - Fine-tuned for Quranic Arabic recitation
- - Based on Faster-Whisper (optimized Whisper implementation)
- - Supports Arabic language with high accuracy for Quranic text

- **Performance**:
- - **Device**: Auto-detects CUDA/CPU
- - **Compute Type**: INT8 quantization for faster inference
- - **VAD Filter**: Voice Activity Detection to filter silence
- ## 🔧 CORS Configuration

- The backend is configured to accept requests from:
- - `http://localhost:3000` (development)
- - `http://localhost:3001`
-
- To add more origins, edit `config.py`:
-
- ```python
- cors_origins: str = "http://localhost:3000,http://localhost:3001,https://yourdomain.com"
- ```

- ## 📁 Project Structure
-
- ```
- whisper-backend/
- ├── main.py          # FastAPI application and endpoints
- ├── config.py        # Configuration and settings
- ├── utils.py         # Utility functions
- └── requirements.txt # Python dependencies
  ```
- ## πŸ› Troubleshooting
129
-
130
- **Model download fails**
131
- - Check internet connection
132
- - Ensure sufficient disk space (~500MB)
133
- - Model downloads automatically on first run
134
 
135
- **Out of memory errors**
136
- - Reduce `beam_size` in config
137
- - Use `int8` compute type
138
- - Process smaller audio files
139
-
140
- **Slow transcription**
141
- - Enable CUDA if you have a GPU
142
- - Reduce `beam_size` for faster processing
143
- - Use `int8` compute type
144
-
145
- **CORS errors**
146
- - Add frontend URL to `cors_origins` in config
147
- - Restart the server after config changes
148
-
149
- ## πŸ“Š Performance Tips
150
-
151
- 1. **GPU Acceleration**: Install CUDA for faster processing
152
- 2. **Compute Type**: Use `int8` for speed, `float32` for accuracy
153
- 3. **Beam Size**: Lower values = faster, higher values = more accurate
154
- 4. **VAD Filter**: Reduces processing time by skipping silence
155
-
156
- ## πŸ”’ Security Notes
157
-
158
- - File size limited to 25MB by default
159
- - Only audio formats are accepted
160
- - Temporary files are cleaned up after processing
161
- - CORS is configured for specific origins
162
 
163
- ## 📚 Dependencies

- - **FastAPI**: Modern web framework
- - **Faster-Whisper**: Optimized Whisper implementation
- - **Uvicorn**: ASGI server
- - **Pydantic**: Data validation

- ## 🧪 Testing

- ```bash
- # Health check
- curl http://localhost:8000/health

- # Transcribe audio
- curl -X POST http://localhost:8000/transcribe \
-      -F "file=@audio.mp3"
  ```

  ---
-
  For more information, see the [main project README](../README.md).
- # ishraq-al-quran-backend
+ ---
+ title: Ishraq Quran Backend
+ emoji: 📖
+ colorFrom: green
+ colorTo: blue
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ ---
+
+ # Quran Recitation Transcription API

+ FastAPI backend for Quran recitation transcription using the `Faster-Whisper` model fine-tuned for Quranic Arabic.

  ## 🚀 Quick Start
+ The easiest way to get started is to use the provided setup script:
+
+ ```bash
+ # Clone the repository (if you haven't already)
+ git clone <repository-url>
+ cd ishraq-al-quran-backend
+
+ # Run the setup script
+ python setup.py
+ ```
+
+ The script will check your environment, install dependencies, and create a default `.env` file.
+
+ ### Manual Setup
+
+ If you prefer to set up manually:
+
  ```bash
  # Create virtual environment
  python -m venv venv

  pip install -r requirements.txt

  # Start the server
+ uvicorn main:app --host 0.0.0.0 --port 7860 --reload
  ```

+ The API will be available at `http://localhost:7860`
  ## 📚 API Documentation

  Once running, visit:
+ - **Swagger UI**: [http://localhost:7860/docs](http://localhost:7860/docs)
+ - **ReDoc**: [http://localhost:7860/redoc](http://localhost:7860/redoc)

  ## 🔌 Endpoints

  ### Health Check
+ - `GET /`: Basic status check
+ - `GET /health`: Detailed health check and model status

  ### Transcribe Audio
+ - `POST /transcribe`: Transcribe a single audio file
+ - `POST /transcribe-batch`: Transcribe multiple audio files simultaneously

+ **Supported Formats:** MP3, WAV, M4A, FLAC, OGG, WEBM, AAC, OPUS.
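Both transcription endpoints take the audio as a `multipart/form-data` upload in a form field named `file` (per the request description in the previous revision of this README). As a rough stdlib-only sketch of what a client actually sends, the helper below is illustrative and not part of this repo:

```python
# Illustrative multipart/form-data builder for POST /transcribe.
# The form field name "file" matches the request description in the
# earlier revision of this README; everything else here is a sketch.
import uuid


def build_multipart(filename: str, payload: bytes, field: str = "file"):
    """Assemble a multipart body and its Content-Type header value."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


body, ctype = build_multipart("recitation.mp3", b"\x00\x01")
# Send with urllib.request.Request("http://localhost:7860/transcribe",
# data=body, headers={"Content-Type": ctype}, method="POST")
```

In practice an HTTP library (e.g. `requests.post(url, files={"file": fh})`) performs this assembly for you; the sketch only makes the wire format visible.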
  ## ⚙️ Configuration

+ The application uses environment variables for configuration. You can customize these in the `.env` file (see `.env.example` for all options).

+ | Variable | Description | Default |
+ |----------|-------------|---------|
+ | `PORT` | Server port | `7860` |
+ | `CORS_ORIGINS` | Allowed CORS origins | `http://localhost:3000,http://localhost:5173` |
+ | `WHISPER_MODEL` | Hugging Face model ID | `Habib-HF/tarbiyah-ai-whisper-medium-merged` |
+ | `COMPUTE_TYPE` | Precision (`float32`, `float16`, `int8`) | `float32` |
+ | `MAX_FILE_SIZE` | Maximum upload size in MB | `100` |
+ | `DEVICE` | Computation device (`cuda` or `cpu`) | auto-detect |
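A `.env` that spells out the documented defaults might look like the sketch below; `.env.example` in the repo remains the authoritative list of options:

```bash
# Example .env — values mirror the defaults documented in the table above
PORT=7860
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
WHISPER_MODEL=Habib-HF/tarbiyah-ai-whisper-medium-merged
COMPUTE_TYPE=float32
MAX_FILE_SIZE=100
# DEVICE is auto-detected when unset; set to cuda or cpu to force it
```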
+ ## 🐳 Docker Deployment

+ To run the API using Docker:

+ ```bash
+ # Build and start the container
+ docker-compose up -d
  ```

+ The Docker setup includes a persistent volume for model caching and automatic health checks.
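The compose file itself isn't reproduced here, but a minimal equivalent consistent with that description (app on port 7860, a cached-model volume, a health check against `/health`) would look roughly like this hypothetical sketch; the service, volume, and cache-path names are made up, and the repo's own `docker-compose.yml` is authoritative:

```yaml
# Hypothetical docker-compose.yml sketch, not the file shipped in the repo.
services:
  api:
    build: .
    ports:
      - "7860:7860"
    env_file: .env
    volumes:
      - model-cache:/root/.cache/huggingface   # persist downloaded models
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7860/health"]
      interval: 30s
volumes:
  model-cache:
```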
+ ## 🎯 Model Information

+ **Model**: [`Habib-HF/tarbiyah-ai-whisper-medium-merged`](https://huggingface.co/Habib-HF/tarbiyah-ai-whisper-medium-merged)
+ - Specifically merged and optimized for Quranic Arabic recitation styles.
+ - High accuracy for tajweed and specific Quranic terminology.

+ ## 🧪 Testing & Examples

+ - **Test Script**: Run `python test_api.py` to verify all endpoints.
+ - **Client Examples**: See [client_examples.py](client_examples.py) for implementation examples in Python, JavaScript, and cURL.

+ ## 📁 Project Structure

+ ```
+ ishraq-al-quran-backend/
+ ├── main.py            # FastAPI application & endpoints
+ ├── config.py          # Configuration management
+ ├── utils.py           # Processing utilities
+ ├── setup.py           # Initial setup script
+ ├── test_api.py        # API verification suite
+ ├── client_examples.py # Library of client implementations
+ └── requirements.txt   # Project dependencies
+ ```
  ```

  ---

  For more information, see the [main project README](../README.md).