nsakib161 committed
Commit 4c176fe · 1 Parent(s): ec2fbed

Apply Anti-Gravity Configuration Fix

Files changed (1)
  1. README.md +70 -139
README.md CHANGED
@@ -1,9 +1,36 @@
- # Whisper Backend - Transcription API

- FastAPI backend for Quran recitation transcription using Faster-Whisper model fine-tuned for Quranic Arabic.

  ## 🚀 Quick Start

  ```bash
  # Create virtual environment
  python -m venv venv
@@ -13,172 +40,76 @@ source venv/bin/activate # Windows: venv\Scripts\activate
  pip install -r requirements.txt

  # Start the server
- python -m uvicorn main:app --host 0.0.0.0 --port 8000
  ```

- The API will be available at `http://localhost:8000`

  ## 📚 API Documentation

  Once running, visit:
- - **Swagger UI**: http://localhost:8000/docs
- - **ReDoc**: http://localhost:8000/redoc

  ## 🔌 Endpoints

  ### Health Check
- ```bash
- GET /
- GET /health
- ```
-
- Returns server status and model information.
  ### Transcribe Audio
- ```bash
- POST /transcribe
- Content-Type: multipart/form-data
- ```
-
- **Request:**
- - `file`: Audio file (MP3, WAV, WEBM, FLAC, etc.)
-
- **Response:**
- ```json
- {
-   "transcription": "بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ",
-   "segments": [
-     {
-       "start": 0.0,
-       "end": 3.5,
-       "text": "بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ"
-     }
-   ],
-   "language": "ar",
-   "language_probability": 0.99,
-   "processing_time": 1.23
- }
- ```
-
- ### Batch Transcription
- ```bash
- POST /transcribe-batch
- Content-Type: multipart/form-data
- ```
-
- Accepts multiple audio files and returns transcriptions for each.
  ## ⚙️ Configuration

- Edit `config.py` to customize settings:
-
- ```python
- class Settings(BaseModel):
-     # Model configuration
-     whisper_model: str = "ModyAsh/faster-whisper-base-ar-quran"
-     language: str = "ar"
-     compute_type: str = "int8"  # int8, float16, float32
-
-     # Transcription parameters
-     beam_size: int = 5
-     vad_filter: bool = True
-     vad_min_silence_duration_ms: int = 500
-
-     # File constraints
-     max_file_size_mb: int = 25
-     allowed_audio_formats: list = [
-         "mp3", "wav", "m4a", "flac", "ogg", "webm"
-     ]
- ```
-
- ## 🎯 Model Information
-
- **Model**: `ModyAsh/faster-whisper-base-ar-quran`
- - Fine-tuned for Quranic Arabic recitation
- - Based on Faster-Whisper (optimized Whisper implementation)
- - Supports Arabic language with high accuracy for Quranic text

- **Performance**:
- - **Device**: Auto-detects CUDA/CPU
- - **Compute Type**: INT8 quantization for faster inference
- - **VAD Filter**: Voice Activity Detection to filter silence
- ## 🔧 CORS Configuration

- The backend is configured to accept requests from:
- - `http://localhost:3000` (development)
- - `http://localhost:3001`
-
- To add more origins, edit `config.py`:
-
- ```python
- cors_origins: str = "http://localhost:3000,http://localhost:3001,https://yourdomain.com"
- ```

- ## 📁 Project Structure
-
- ```
- whisper-backend/
- ├── main.py          # FastAPI application and endpoints
- ├── config.py        # Configuration and settings
- ├── utils.py         # Utility functions
- └── requirements.txt # Python dependencies
  ```
- ## πŸ› Troubleshooting
129
-
130
- **Model download fails**
131
- - Check internet connection
132
- - Ensure sufficient disk space (~500MB)
133
- - Model downloads automatically on first run
134
 
135
- **Out of memory errors**
136
- - Reduce `beam_size` in config
137
- - Use `int8` compute type
138
- - Process smaller audio files
139
-
140
- **Slow transcription**
141
- - Enable CUDA if you have a GPU
142
- - Reduce `beam_size` for faster processing
143
- - Use `int8` compute type
144
-
145
- **CORS errors**
146
- - Add frontend URL to `cors_origins` in config
147
- - Restart the server after config changes
148
-
149
- ## πŸ“Š Performance Tips
150
-
151
- 1. **GPU Acceleration**: Install CUDA for faster processing
152
- 2. **Compute Type**: Use `int8` for speed, `float32` for accuracy
153
- 3. **Beam Size**: Lower values = faster, higher values = more accurate
154
- 4. **VAD Filter**: Reduces processing time by skipping silence
155
-
156
- ## πŸ”’ Security Notes
157
-
158
- - File size limited to 25MB by default
159
- - Only audio formats are accepted
160
- - Temporary files are cleaned up after processing
161
- - CORS is configured for specific origins
162
 
163
- ## 📚 Dependencies

- - **FastAPI**: Modern web framework
- - **Faster-Whisper**: Optimized Whisper implementation
- - **Uvicorn**: ASGI server
- - **Pydantic**: Data validation

- ## 🧪 Testing

- ```bash
- # Health check
- curl http://localhost:8000/health

- # Transcribe audio
- curl -X POST http://localhost:8000/transcribe \
-      -F "file=@audio.mp3"
  ```

  ---
-
  For more information, see the [main project README](../README.md).
- # ishraq-al-quran-backend
+ ---
+ title: Ishraq Quran Backend
+ emoji: 📖
+ colorFrom: green
+ colorTo: blue
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ ---
+
+ # Quran Recitation Transcription API

+ FastAPI backend for Quran recitation transcription using the `Faster-Whisper` model fine-tuned for Quranic Arabic.

  ## 🚀 Quick Start
+ The easiest way to get started is to use the provided setup script:
+
+ ```bash
+ # Clone the repository (if you haven't already)
+ git clone <repository-url>
+ cd ishraq-al-quran-backend
+
+ # Run the setup script
+ python setup.py
+ ```
+
+ The script will check your environment, install dependencies, and create a default `.env` file.
+
+ ### Manual Setup
+
+ If you prefer to set up manually:
+
  ```bash
  # Create virtual environment
  python -m venv venv

  pip install -r requirements.txt

  # Start the server
+ uvicorn main:app --host 0.0.0.0 --port 7860 --reload
  ```

+ The API will be available at `http://localhost:7860`
  ## 📚 API Documentation

  Once running, visit:
+ - **Swagger UI**: [http://localhost:7860/docs](http://localhost:7860/docs)
+ - **ReDoc**: [http://localhost:7860/redoc](http://localhost:7860/redoc)

  ## 🔌 Endpoints

  ### Health Check
+ - `GET /`: Basic status check
+ - `GET /health`: Detailed health check and model status

  ### Transcribe Audio
+ - `POST /transcribe`: Transcribe a single audio file
+ - `POST /transcribe-batch`: Transcribe multiple audio files simultaneously

+ **Supported Formats:** MP3, WAV, M4A, FLAC, OGG, WEBM, AAC, OPUS.
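Both transcription endpoints take the audio as a `multipart/form-data` upload in a form field named `file` (per the request description in the previous revision of this README). As a rough stdlib-only sketch of what a client actually sends, the helper below is illustrative and not part of this repo:

```python
# Illustrative multipart/form-data builder for POST /transcribe.
# The form field name "file" matches the request description in the
# earlier revision of this README; everything else here is a sketch.
import uuid


def build_multipart(filename: str, payload: bytes, field: str = "file"):
    """Assemble a multipart body and its Content-Type header value."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


body, ctype = build_multipart("recitation.mp3", b"\x00\x01")
# Send with urllib.request.Request("http://localhost:7860/transcribe",
# data=body, headers={"Content-Type": ctype}, method="POST")
```

In practice an HTTP library (e.g. `requests.post(url, files={"file": fh})`) performs this assembly for you; the sketch only makes the wire format visible.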
  ## ⚙️ Configuration

+ The application uses environment variables for configuration. You can customize these in the `.env` file (see `.env.example` for all options).

+ | Variable | Description | Default |
+ |----------|-------------|---------|
+ | `PORT` | Server port | `7860` |
+ | `CORS_ORIGINS` | Allowed CORS origins | `http://localhost:3000,http://localhost:5173` |
+ | `WHISPER_MODEL` | Hugging Face model ID | `Habib-HF/tarbiyah-ai-whisper-medium-merged` |
+ | `COMPUTE_TYPE` | Precision (`float32`, `float16`, `int8`) | `float32` |
+ | `MAX_FILE_SIZE` | Maximum upload size in MB | `100` |
+ | `DEVICE` | Computation device (`cuda` or `cpu`) | auto-detect |
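A `.env` that spells out the documented defaults might look like the sketch below; `.env.example` in the repo remains the authoritative list of options:

```bash
# Example .env — values mirror the defaults documented in the table above
PORT=7860
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
WHISPER_MODEL=Habib-HF/tarbiyah-ai-whisper-medium-merged
COMPUTE_TYPE=float32
MAX_FILE_SIZE=100
# DEVICE is auto-detected when unset; set to cuda or cpu to force it
```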
+ ## 🐳 Docker Deployment

+ To run the API using Docker:

+ ```bash
+ # Build and start the container
+ docker-compose up -d
  ```

+ The Docker setup includes a persistent volume for model caching and automatic health checks.
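The compose file itself isn't reproduced here, but a minimal equivalent consistent with that description (app on port 7860, a cached-model volume, a health check against `/health`) would look roughly like this hypothetical sketch; the service, volume, and cache-path names are made up, and the repo's own `docker-compose.yml` is authoritative:

```yaml
# Hypothetical docker-compose.yml sketch, not the file shipped in the repo.
services:
  api:
    build: .
    ports:
      - "7860:7860"
    env_file: .env
    volumes:
      - model-cache:/root/.cache/huggingface   # persist downloaded models
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7860/health"]
      interval: 30s
volumes:
  model-cache:
```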
+ ## 🎯 Model Information

+ **Model**: [`Habib-HF/tarbiyah-ai-whisper-medium-merged`](https://huggingface.co/Habib-HF/tarbiyah-ai-whisper-medium-merged)
+ - Specifically merged and optimized for Quranic Arabic recitation styles.
+ - High accuracy for tajweed and specific Quranic terminology.

+ ## 🧪 Testing & Examples

+ - **Test Script**: Run `python test_api.py` to verify all endpoints.
+ - **Client Examples**: See [client_examples.py](client_examples.py) for implementation examples in Python, JavaScript, and cURL.

+ ## 📁 Project Structure

+ ```
+ ishraq-al-quran-backend/
+ ├── main.py            # FastAPI application & endpoints
+ ├── config.py          # Configuration management
+ ├── utils.py           # Processing utilities
+ ├── setup.py           # Initial setup script
+ ├── test_api.py        # API verification suite
+ ├── client_examples.py # Library of client implementations
+ └── requirements.txt   # Project dependencies
+ ```
  ```

  ---

  For more information, see the [main project README](../README.md).