ASID-Caption

community
https://asid-caption.github.io/
AI & ML interests

Video Understanding, Audio-Visual, Multimodal LLMs, Video Captioning, Instruction Tuning, Dataset Curation, Qwen-based, Open-source, Fully-Open-MLLMs

Recent Activity

lyhisme submitted a paper 10 days ago: Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought
lyhisme updated a model 24 days ago: AudioVisual-Caption/ASID-Captioner-7B
lyhisme updated a model 24 days ago: AudioVisual-Caption/ASID-Captioner-3B

Papers

Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions

AudioVisual-Caption's models (2)

AudioVisual-Caption/ASID-Captioner-7B

Image-Text-to-Text • 9B parameters • Updated 24 days ago • 110 downloads • 5 likes

AudioVisual-Caption/ASID-Captioner-3B

Image-Text-to-Text • 5B parameters • Updated 24 days ago • 459 downloads • 37 likes