Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27 • 215
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards Paper • 2512.00473 • Published 29 days ago • 25
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16 • 108
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions Paper • 2403.07392 • Published Mar 12, 2024 • 1
RT-DATR:Real-time Unsupervised Domain Adaptive Detection Transformer with Adversarial Feature Learning Paper • 2504.09196 • Published Apr 12 • 1
RT-DATR:Real-time Unsupervised Domain Adaptive Detection Transformer with Adversarial Feature Learning Paper • 2504.09196 • Published Apr 12 • 1