Zeyu Zhang

SteveZeyuZhang

https://steve-zeyu-zhang.github.io/

steve-zeyu-zhang

AI & ML interests

Geometric Learning, Generative AI, Computer Vision, Robotics, AI for Health

Recent Activity

authored a paper 10 days ago

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

published a dataset 11 days ago

AIGeeksGroup/4D-Human-1K

updated a dataset 15 days ago

SteveZeyuZhang/kernel

SteveZeyuZhang's activity

authored a paper 10 days ago

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Paper • 2604.24764 • Published 12 days ago • 116

submitted a paper to Daily Papers 17 days ago

UniMesh: Unifying 3D Mesh Understanding and Generation

Paper • 2604.17472 • Published 20 days ago • 11

submitted a paper to Daily Papers 18 days ago

HSG: Hyperbolic Scene Graph

Paper • 2604.17454 • Published 20 days ago • 1

submitted a paper to Daily Papers 2 months ago

MWM: Mobile World Models for Action-Conditioned Consistent Prediction

Paper • 2603.07799 • Published Mar 8

authored a paper 2 months ago

GeoWorld: Geometric World Models

Paper • 2602.23058 • Published Feb 26 • 8

submitted a paper to Daily Papers 2 months ago

GeoWorld: Geometric World Models

Paper • 2602.23058 • Published Feb 26 • 8

authored 2 papers 2 months ago

OmniOCR: Generalist OCR for Ethnic Minority Languages

Paper • 2602.21042 • Published Feb 24 • 2

OCR-Agent: Agentic OCR with Capability and Memory Reflection

Paper • 2602.21053 • Published Feb 24 • 3

submitted 2 papers to Daily Papers 2 months ago

OCR-Agent: Agentic OCR with Capability and Memory Reflection

Paper • 2602.21053 • Published Feb 24 • 3

OmniOCR: Generalist OCR for Ethnic Minority Languages

Paper • 2602.21042 • Published Feb 24 • 2

submitted 2 papers to Daily Papers 3 months ago

StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation

Paper • 2602.16915 • Published Feb 18

MMA: Multimodal Memory Agent

Paper • 2602.16493 • Published Feb 18 • 9

authored 8 papers 3 months ago

GeneralVLA: Generalizable Vision-Language-Action Models with Knowledge-Guided Trajectory Planning

Paper • 2602.04315 • Published Feb 4 • 1

V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval

Paper • 2602.06034 • Published Feb 5 • 8

SafeMo: Linguistically Grounded Unlearning for Trustworthy Text-to-Motion Generation

Paper • 2601.00590 • Published Jan 2

WebCryptoAgent: Agentic Crypto Trading with Web Informatics

Paper • 2601.04687 • Published Jan 8

MMCLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training

Paper • 2407.19546 • Published Jul 28, 2024
