Zhisheng Zheng

zhisheng01

https://zhishengzheng.com/

zhisheng147

AI & ML interests

LLM, Speech and Audio Processing

Recent Activity

updated a dataset 18 days ago

zhisheng01/s2sisometric

Viewer • Updated 18 days ago • 1.79M • 61

published a dataset 19 days ago

zhisheng01/s2sisometric

Viewer • Updated 18 days ago • 1.79M • 61

updated a dataset 20 days ago

zhisheng01/mp3d-ambisonics

Viewer • Updated 20 days ago • 21k • 81

updated a model 20 days ago

zhisheng01/omni-distilled

Updated 20 days ago

published a model 20 days ago

zhisheng01/omni-distilled

Updated 20 days ago

published a dataset 27 days ago

zhisheng01/mp3d-ambisonics

Viewer • Updated 20 days ago • 21k • 81

liked 2 datasets about 1 month ago

m-a-p/MTG

Updated Aug 4, 2025 • 11.2k • 1

ailsntua/Chordonomicon

Viewer • Updated May 15, 2025 • 680k • 48.8k • 36

upvoted a paper 2 months ago

LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

Paper • 2603.03269 • Published Mar 3 • 63

liked a dataset 2 months ago

agkphysics/AudioSet

Viewer • Updated Oct 16, 2025 • 3.57M • 53.6k • 94

upvoted 2 papers 3 months ago

Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception

Paper • 2602.11858 • Published Feb 12 • 63

MOVA: Towards Scalable and Synchronized Video-Audio Generation

Paper • 2602.08794 • Published Feb 9 • 159

upvoted 2 papers 4 months ago

Qwen3-TTS Technical Report

Paper • 2601.15621 • Published Jan 22 • 75

MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization

Paper • 2601.01554 • Published Jan 4 • 60

liked a dataset 4 months ago

SparkAudio/voxbox

Viewer • Updated Apr 15, 2025 • 23.8M • 17.1k • 70

upvoted a paper 6 months ago

VIDEOP2R: Video Understanding from Perception to Reasoning

Paper • 2511.11113 • Published Nov 14, 2025 • 112

upvoted a paper 7 months ago

STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence

Paper • 2510.24693 • Published Oct 28, 2025 • 19

liked a model 7 months ago

jordand/whisper-d-v1a

Updated Nov 1, 2024 • 1.35k • 46

upvoted a paper 7 months ago

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published Oct 1, 2025 • 42

upvoted a paper 8 months ago

StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs

Paper • 2509.22220 • Published Sep 26, 2025 • 66
