torch
gradio>=3.0
numpy
soundfile
# optional (for audio synthesis later): gTTS pydub TTS