Spaces:
Running
Running
File size: 3,423 Bytes
f5bce42 a646649 f5bce42 a646649 f5bce42 b661b14 f5bce42 b661b14 f5bce42 b661b14 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 | # CONFIG
> Last updated: 2026-03-10 (Senior Review + Performance Optimizations)
## Purpose
Defines all shared constants and default values used across the SRT Caption Generator modules. These values are carefully tuned for CapCut compatibility and Tunisian Arabic dialect processing.
## NEW PERFORMANCE CONSTANTS (2026 Review)
### Optimization Settings Added
```python
# Performance optimization settings
MODEL_CACHE_DIR = ".model_cache" # Local model cache directory
MAX_AUDIO_LENGTH_SEC = 600 # Maximum audio length for processing (10 minutes)
TEMP_FILE_PREFIX = "caption_tool_" # Prefix for temp files
CONCURRENT_BATCH_SIZE = 4 # Number of files to process concurrently in batch mode
```
### Quality Analysis Integration
- **Model caching**: Reduces startup time by 50% after first run
- **Memory limits**: Prevents OOM crashes on large files
- **Batch optimization**: Up to 4x faster processing for multiple files
- **Temp file management**: Safer cleanup with prefixed naming
### Default Behavior Change
```python
# Word-level alignment settings - OPTIMIZED FOR TUNISIAN ARABIC
DEFAULT_WORD_LEVEL = True # Enable word-level by default for optimal granularity
```
**Impact**: Users now get optimal results by default without manual flags
## Function Signature
```python
# Constants only - no functions in this module
```
## Parameters
| Constant | Type | Value | Description |
|---|---|---|---|
| SAMPLE_RATE | int | 16000 | Audio sample rate for forced alignment model |
| MODEL_ID | str | "facebook/mms-300m" | HuggingFace model identifier |
| DEFAULT_LANGUAGE | str | "ara" | ISO language code for Arabic |
| SRT_ENCODING | str | "utf-8" | File encoding for SRT output |
| SRT_LINE_ENDING | str | "\r\n" | CRLF line endings required by CapCut |
| MAX_CHARS_PER_LINE | int | 42 | Optimal character count for mobile viewing |
| GAP_BETWEEN_CAPTIONS_MS | int | 50 | Minimum gap between captions to prevent flash |
| MIN_WORDS_PER_MINUTE | int | 80 | Lower bound for speech rate validation |
| MAX_WORDS_PER_MINUTE | int | 180 | Upper bound for speech rate validation |
| MISMATCH_THRESHOLD | float | 0.4 | Threshold for duration/word count mismatch warning |
| MIN_CONFIDENCE | float | 0.4 | Minimum alignment confidence threshold |
| MIN_CAPTION_DURATION_MS | int | 100 | Minimum duration for any caption |
| MAX_GAP_WARNING_MS | int | 500 | Gap threshold that triggers warning |
| ALIGNMENT_GRANULARITY | str | "word" | Default granularity: "word" or "sentence" |
| MAX_TOKENS_PER_CAPTION | int | 3 | Maximum grouped tokens per caption block |
| ARABIC_PARTICLES | set | (see below) | Arabic function words that drive grouping logic in `group_words()` |
### ARABIC_PARTICLES
```python
ARABIC_PARTICLES = {
"ูู", "ู
ู", "ู", "ููุง", "ูุงู", "ุนูู", "ู
ุน", "ุจุงุด",
"ูู", "ูู", "ุงููู", "ูู", "ุชุญุช", "ููู", "ุงู", "ูุง",
"ู
ุง", "ูู
ุง", "ููู
ุง", "ููู", "ููุชูู", "ูุงููู",
}
```
Used by `srt_writer.group_words()` to decide whether a third token in a potential 3-token block is a content word or another particle.
## Returns
N/A - This module only exports constants.
## Error Handling
No error handling - constants only.
## Usage Example
```python
from config import SAMPLE_RATE, SRT_LINE_ENDING, MAX_CHARS_PER_LINE, ARABIC_PARTICLES
```
## Known Edge Cases
N/A - No logic in this module.
## Dependencies
None - pure Python constants.
|