---
title: Hf Speech2text Tool
emoji: 💻
colorFrom: indigo
colorTo: red
sdk: gradio
sdk_version: 5.13.2
app_file: app.py
pinned: false
tags:
- tool
short_description: Reads an audio file and returns its transcript.
---

# Speech2text tool for your agent

Uses the Hugging Face API under the hood. A simple tool for prototyping agents that can extract text from audio.

This tool:
1. opens and reads an audio file
2. calls the Hugging Face API with your HF token to get a transcript
3. returns the transcript as a string

Useful for implementing voice commands.

# Usage

```python
import os

from smolagents import CodeAgent, HfApiModel, Tool

# Placeholders: supply your own token and audio file.
# Reading the token from an HF_TOKEN environment variable is an assumption.
hf_token = os.environ["HF_TOKEN"]
filepath = "path/to/audio.wav"

# Load the tool from the Hub (runs remote code, hence trust_remote_code=True).
hf_speech2text_tool = Tool.from_hub(
    "GTimothee/hf_text2speech_tool",
    token=hf_token,
    trust_remote_code=True,
)

model = HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct", token=hf_token)
agent = CodeAgent(tools=[hf_speech2text_tool], model=model)

# The tool's inputs are passed to the agent through additional_args.
output = agent.run(
    "Use your tools to read the audio file and return the transcription.",
    additional_args={
        "audio_filepath": filepath,
        "hf_token": hf_token,
        "model_for_transcription": "whisper-small.en",
    },
)
```
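
For readers who want to see what the three steps in the tool description amount to, here is a minimal sketch. It is not the tool's actual source: the function name `transcribe` and the optional `client` parameter are hypothetical, and it assumes a recent `huggingface_hub` release in which `InferenceClient.automatic_speech_recognition` returns an object with a `.text` attribute.

```python
from pathlib import Path


def transcribe(audio_filepath, hf_token, model_for_transcription, client=None):
    """Read an audio file and return its transcript as a string.

    Hypothetical helper mirroring the tool's three steps; `client` is an
    optional override so the function can be tried with a stub.
    """
    if client is None:
        # Imported lazily so a stub client can be used without the dependency.
        from huggingface_hub import InferenceClient

        client = InferenceClient(token=hf_token)

    # Step 1: open and read the audio file as raw bytes.
    audio_bytes = Path(audio_filepath).read_bytes()

    # Step 2: call the Hugging Face API to get a transcript.
    result = client.automatic_speech_recognition(
        audio_bytes, model=model_for_transcription
    )

    # Step 3: return the transcript string.
    return result.text
```

Passing the client in as an argument keeps the file handling and the API call separate, which makes the function easy to check offline before spending API calls on it.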