Justus Tobias

justus-tobias

https://justus-tobias.de

j-tobias

AI & ML interests

Multimodal Learning, Representation Learning, Audio Processing

Recent Activity

liked a Space over 1 year ago

smolagents/smolagents-leaderboard

liked a model over 1 year ago

Zyphra/Zonos-v0.1-hybrid

Organizations

None yet

upvoted a paper about 1 month ago

SAAS: Self-Aware Reinforcement Learning for Over-Search Mitigation in Agentic Search

Paper • 2605.29796 • Published May 28 • 25

liked a Space over 1 year ago

smolagents LLM leaderboard

🏆

142

A leaderboard for LLMs powering smolagents

liked 3 models over 1 year ago

updated a Space over 1 year ago

Heartbeat

💜

upvoted a paper over 1 year ago

Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs

Paper • 2411.02256 • Published Nov 4, 2024 • 1

liked a model over 1 year ago

tencent/HunyuanVideo

Text-to-Video • Updated Mar 6, 2025 • 831 • 2.2k

upvoted a paper over 1 year ago

AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset

Paper • 2311.15308 • Published Nov 26, 2023 • 2

updated a Space almost 2 years ago

Moshi

💨

Create interactive spoken dialogue using audio input

upvoted a paper almost 2 years ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 123

liked 2 Spaces almost 2 years ago

Open ASR Leaderboard

🏆

1.39k

Compare speech recognition models on benchmark scores

Seamless M4T

📞

949

updated a dataset almost 2 years ago

justus-tobias/TestDataset

Updated Aug 15, 2024 • 1

liked a Space almost 2 years ago

gradio_pdf V0.10.0

🚀

Ask questions about PDF documents

liked a model almost 2 years ago

facebook/wav2vec2-base-960h

Automatic Speech Recognition • 94.4M • Updated Nov 14, 2022 • 1.41M • 399

liked 2 datasets almost 2 years ago

openslr/librispeech_asr

Viewer • Updated Jul 25, 2025 • 585k • 80.8k • 228

MLCommons/peoples_speech

Viewer • Updated Nov 20, 2024 • 8.05M • 22.7k • 271

liked a Space almost 2 years ago

AudioLDM2 Text2Audio Text2Music Generation

🔊

307

Generate audio and waveform video from text

liked a Space about 2 years ago

Exbert

🌍

176

Explore BERT model's attention mechanisms