Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation Paper • 2211.06687 • Published Nov 12, 2022 • 6
FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System Paper • 2603.10420 • Published 7 days ago • 6
Article Fine-Tune W2V2-Bert for low-resource ASR with 🤗 Transformers Jan 19, 2024 • 47
facebook/timesformer-base-finetuned-k400 Video Classification • Updated Jan 2, 2023 • 78.8k • 43
HuggingFaceTB/SmolVLM2-256M-Video-Instruct Image-Text-to-Text • 0.3B • Updated Apr 8, 2025 • 143k • 98
HuggingFaceTB/SmolVLM2-500M-Video-Instruct Image-Text-to-Text • Updated Apr 8, 2025 • 276k • 123