Jinrui Zhang

zjr2000

AI & ML interests

None yet

Recent Activity

upvoted a paper 27 days ago

GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

liked a dataset 27 days ago

VCLab-PolyU/GGT-100K

upvoted an article about 2 months ago

NEO-unify: Building Native Multimodal Unified Models End to End

Organizations

None yet

upvoted a paper 27 days ago

GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

Paper • 2605.31039 • Published about 1 month ago • 46

upvoted an article about 2 months ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • sensenova • Mar 5 • 167

upvoted 4 papers 4 months ago

dLLM: Simple Diffusion Language Modeling

Paper • 2602.22661 • Published Feb 26 • 153

Caption Anything: Interactive Image Description with Diverse Multimodal Controls

Paper • 2305.02677 • Published May 4, 2023 • 1

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning

Paper • 2307.16525 • Published Jul 31, 2023 • 1

LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos

Paper • 2411.19772 • Published Nov 29, 2024 • 2

upvoted 3 papers 5 months ago

Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm

Paper • 2602.11543 • Published Feb 12 • 6

GENIUS: Generative Fluid Intelligence Evaluation Suite

Paper • 2602.11144 • Published Feb 11 • 55

Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis

Paper • 2602.03139 • Published Feb 3 • 45

upvoted a paper 7 months ago

ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

Paper • 2511.14349 • Published Nov 18, 2025 • 18

upvoted a collection 7 months ago

Nemotron-Pre-Training-Datasets

Collection • Large scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 17 days ago • 173

upvoted a paper 8 months ago

From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model

Paper • 2510.19871 • Published Oct 22, 2025 • 30

upvoted a paper 10 months ago

AudioStory: Generating Long-Form Narrative Audio with Large Language Models

Paper • 2508.20088 • Published Aug 27, 2025 • 21

upvoted 2 papers about 1 year ago

TIIF-Bench: How Does Your T2I Model Follow Your Instructions?

Paper • 2506.02161 • Published Jun 2, 2025 • 13

VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank

Paper • 2505.14460 • Published May 20, 2025 • 34
