fine v.1.0
f5bce42

CLI

Last updated: 2026-03-09

Purpose

Main command-line interface for the SRT Caption Generator, aimed at Tunisian Arabic content creators. It provides single-file and batch processing modes, with error handling and progress reporting in both.

Function Signature

def main() -> None:

Parameters

Command-line arguments (via argparse):

Argument      Type  Required      Default                  Description
--audio       str   Yes (single)  -                        Path to audio file (mp3, wav, m4a, aac)
--script      str   Yes (single)  -                        Path to script text file (UTF-8)
--output      str   No            output/{audio_name}.srt  Custom output SRT file path
--language    str   No            "ara"                    Alignment language code
--offset      int   No            0                        Global timestamp offset in milliseconds
--max-chars   int   No            42                       Maximum characters per caption; longer captions are auto-split at word boundaries
--word-level  flag  No            False                    Use word-level alignment instead of sentence-level
--batch       flag  No            False                    Enable batch processing mode
--input-dir   str   Yes (batch)   -                        Input directory for batch processing
--output-dir  str   Yes (batch)   -                        Output directory for batch processing
--verbose     flag  No            False                    Print detailed alignment information
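
As a rough illustration of how the arguments listed above could be declared with argparse, here is a minimal sketch. The helper names build_parser, check_mode_requirements, and default_output_path are hypothetical and are not taken from align.py; the sketch also assumes that mode-specific requirements are checked after parsing rather than through argparse's required= option.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: declare the documented arguments and defaults."""
    parser = argparse.ArgumentParser(
        prog="align.py",
        description="SRT Caption Generator CLI (illustrative sketch)",
    )
    # Single-file mode
    parser.add_argument("--audio", type=str, help="Path to audio file (mp3, wav, m4a, aac)")
    parser.add_argument("--script", type=str, help="Path to script text file (UTF-8)")
    parser.add_argument("--output", type=str, default=None, help="Custom output SRT file path")
    # Options shared by both modes
    parser.add_argument("--language", type=str, default="ara", help="Alignment language code")
    parser.add_argument("--offset", type=int, default=0, help="Global offset in milliseconds")
    parser.add_argument("--max-chars", type=int, default=42, help="Maximum characters per caption")
    parser.add_argument("--word-level", action="store_true", help="Use word-level alignment")
    parser.add_argument("--verbose", action="store_true", help="Print detailed information")
    # Batch mode
    parser.add_argument("--batch", action="store_true", help="Enable batch processing mode")
    parser.add_argument("--input-dir", type=str, help="Input directory for batch processing")
    parser.add_argument("--output-dir", type=str, help="Output directory for batch processing")
    return parser


def check_mode_requirements(args: argparse.Namespace) -> list[str]:
    """Return the required arguments that are missing for the selected mode."""
    if args.batch:
        required = {"--input-dir": args.input_dir, "--output-dir": args.output_dir}
    else:
        required = {"--audio": args.audio, "--script": args.script}
    return [name for name, value in required.items() if not value]


def default_output_path(audio_path: str) -> Path:
    """Derive the documented default, output/{audio_name}.srt, from the audio path."""
    return Path("output") / f"{Path(audio_path).stem}.srt"
```

Checking requirements after parsing keeps --audio and --script optional in batch mode, which matches the "Yes (single)" and "Yes (batch)" entries in the Required column.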

Returns

main() itself returns None. The process exits with code 0 on success and 1 on error.

Error Handling

Condition                   Behavior
Missing required arguments  Clean error message, exit code 1
File not found              Descriptive error with file path
Alignment failure           Model download hints or alignment error details
Keyboard interrupt          Clean "Process interrupted" message
Unexpected errors           Stack trace in --verbose mode, clean message otherwise
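
One way to map the conditions above onto exit codes is a thin wrapper around the real work. The sketch below is an assumption about structure, not the actual code in align.py: run_cli is a hypothetical name, and returning 1 on a keyboard interrupt is a guess, since the exit code for that case is not documented here.

```python
import sys
import traceback
from typing import Callable


def run_cli(task: Callable[[], None], verbose: bool = False) -> int:
    """Run `task` and translate failures into a process exit code (0 or 1)."""
    try:
        task()
    except FileNotFoundError as exc:
        # Include the offending path when the exception carries one.
        print(f"Error: file not found: {exc.filename or exc}", file=sys.stderr)
        return 1
    except KeyboardInterrupt:
        # Assumption: an interrupt is reported as a failure (exit code 1).
        print("Process interrupted", file=sys.stderr)
        return 1
    except Exception as exc:
        if verbose:
            traceback.print_exc()  # full stack trace only in verbose mode
        else:
            print(f"Error: {exc}", file=sys.stderr)
        return 1
    return 0
```

A caller would typically end with sys.exit(run_cli(work, verbose=args.verbose)), where work is whatever function performs the alignment.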

Usage Example

# Basic single file processing
python3 align.py --audio input/video_01.mp3 --script input/video_01.txt

# Custom output path and offset
python3 align.py --audio input/video.wav --script input/script.txt --output custom.srt --offset -200

# Verbose output with debugging
python3 align.py --audio input/video.m4a --script input/script.txt --verbose

# Word-level alignment with custom character limit
python3 align.py --audio input/video.wav --script input/script.txt --word-level --max-chars 30

# Auto-split long captions without word-level alignment
python3 align.py --audio input/video.mp3 --script input/script.txt --max-chars 25

# Batch processing
python3 align.py --batch --input-dir input/ --output-dir output/
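
To make the effect of --max-chars concrete, the following sketch splits caption text greedily at word boundaries. It is only an illustration: split_caption is a hypothetical function, the tool's actual splitting strategy is not described in this document, and the sketch handles text only (it does not redistribute timestamps across the resulting pieces).

```python
def split_caption(text: str, max_chars: int = 42) -> list[str]:
    """Greedily split `text` into pieces of at most `max_chars`, breaking only at spaces.

    A single word longer than `max_chars` is kept whole rather than cut mid-word.
    """
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


print(split_caption("one two three four", max_chars=9))
# → ['one two', 'three', 'four']
```

Note that len() counts Unicode code points, so Arabic diacritics written as separate combining marks count toward the limit in this sketch.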

Known Edge Cases

  • First run model download: Shows "📥 Downloading alignment model..." message, may take several minutes
  • Missing ffmpeg: Clear installation instructions provided in error message
  • Corrupted files: Validation catches issues early with descriptive messages
  • Timestamp offset edge cases: Prevents negative timestamps, maintains minimum caption duration
  • Memory constraints: Large files processed efficiently via streaming alignment
  • Mixed file encodings: UTF-8 validation prevents garbled Arabic text processing
  • Keyboard interruption: Graceful cleanup of temporary files
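
The timestamp offset edge case listed above can be sketched as follows. The function names and the 100 ms minimum duration are assumptions made for illustration; the actual minimum used by the tool is not stated in this document.

```python
def apply_offset(
    start_ms: int,
    end_ms: int,
    offset_ms: int,
    min_duration_ms: int = 100,  # hypothetical value, not taken from the tool
) -> tuple[int, int]:
    """Shift a caption by `offset_ms`, clamp the start at zero, keep a minimum duration."""
    new_start = max(0, start_ms + offset_ms)
    new_end = max(new_start + min_duration_ms, end_ms + offset_ms)
    return new_start, new_end


def format_srt_timestamp(ms: int) -> str:
    """Format non-negative milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


print(apply_offset(1000, 2500, -200))  # → (800, 2300)
print(apply_offset(100, 250, -200))    # → (0, 100)
print(format_srt_timestamp(3_723_004))  # → 01:02:03,004
```

Clamping matters because the SRT timestamp format has no way to express a negative time, so a large negative --offset would otherwise produce an invalid file.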

Dependencies

  • Python 3.10+: Required for the X | Y union type-hint syntax and pathlib features
  • ctc-forced-aligner: Core alignment library
  • torch + torchaudio: PyTorch ecosystem for model inference
  • ffmpeg: System dependency for audio processing
  • All local modules: validator, normalize, aligner, srt_writer, config
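
A preflight check for the dependencies above could look like the sketch below. It is an assumption about how such a check might be written, not a description of the tool's own validation; find_missing_dependencies is a hypothetical name. The import name of the ctc-forced-aligner package is not given in this document, so it is left out of the default list and can be passed in explicitly.

```python
import importlib.util
import shutil
import sys


def find_missing_dependencies(
    modules: tuple[str, ...] = ("torch", "torchaudio"),
) -> list[str]:
    """Return human-readable problems; an empty list means everything was found."""
    problems: list[str] = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    # ffmpeg is a system binary, so look for it on PATH rather than importing it.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    for name in modules:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python module not installed: {name}")
    return problems


if __name__ == "__main__":
    for problem in find_missing_dependencies():
        print(problem, file=sys.stderr)
```

Using find_spec avoids importing torch just to confirm it is installed, which keeps the check fast.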