Zhiyuan Ma

zhizhi111

AI & ML interests

Multimodal, Dialog Systems

Recent Activity

upvoted a collection about 20 hours ago

MAIR Published Papers

authored a paper 2 days ago

LMD: Faster Image Reconstruction with Latent Masking Diffusion

authored a paper 2 days ago

UltraMedical: Building Specialized Generalists in Biomedicine

upvoted a collection about 20 hours ago

MAIR Published Papers

Collection

1 item • Updated 5 days ago • 1

authored 19 papers 2 days ago

Safe-SD: Safe and Traceable Stable Diffusion with Text Prompt Trigger for Invisible Generative Watermarking

Paper • 2407.13188 • Published Jul 18, 2024

Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

Paper • 2410.11795 • Published Oct 15, 2024 • 18

DH-VTON: Deep Text-Driven Virtual Try-On via Hybrid Attention Learning

Paper • 2410.12501 • Published Oct 16, 2024

VideoDirector: Precise Video Editing via Text-to-Video Models

Paper • 2411.17592 • Published Nov 26, 2024

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193

Dream3DAvatar: Text-Controlled 3D Avatar Reconstruction from a Single Image

Paper • 2509.13013 • Published Sep 16, 2025

Towards Cross-View Point Correspondence in Vision-Language Models

Paper • 2512.04686 • Published Dec 4, 2025

Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach

Paper • 2412.03017 • Published Dec 4, 2024

Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text

Paper • 1908.07721 • Published Aug 21, 2019

CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability

Paper • 2602.03012 • Published Feb 3 • 3

CRAFT: Calibrated Reasoning with Answer-Faithful Traces via Reinforcement Learning for Multi-Hop Question Answering

Paper • 2602.01348 • Published Feb 1

One2Scene: Geometric Consistent Explorable 3D Scene Generation from a Single Image

Paper • 2602.19766 • Published Feb 23

One-Step Effective Diffusion Network for Real-World Image Super-Resolution

Paper • 2406.08177 • Published Oct 24, 2024

VideoAfford: Grounding 3D Affordance from Human-Object-Interaction Videos via Multimodal Large Language Model

Paper • 2602.09638 • Published Feb 10

I2E: From Image Pixels to Actionable Interactive Environments for Text-Guided Image Editing

Paper • 2601.03741 • Published Apr 7

Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models

Paper • 2403.17589 • Published Mar 26, 2024 • 1
