Alpha-VLLM

company

https://github.com/Alpha-VLLM

Recent Activity

Cxxs authored a paper 16 days ago

High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation

Cxxs submitted a paper 16 days ago

High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation

YupengZhou submitted a paper about 2 months ago

Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation

Papers

Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

Cxxs

authored a paper 16 days ago

High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation

Paper • 2606.12575 • Published 18 days ago • 13

Cxxs

submitted a paper to Daily Papers 16 days ago

High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation

Paper • 2606.12575 • Published 18 days ago • 13

YupengZhou

submitted a paper to Daily Papers about 2 months ago

Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation

Paper • 2604.25819 • Published Apr 28 • 17

authored 8 papers 2 months ago

UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

Paper • 2512.21675 • Published Dec 25, 2025 • 28

Accelerating Masked Image Generation by Learning Latent Controlled Dynamics

Paper • 2602.23996 • Published Feb 27 • 9

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published Mar 10 • 49

Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Paper • 2603.27460 • Published Mar 29 • 72

Training-Free Acceleration for Document Parsing Vision-Language Model with Hierarchical Speculative Decoding

Paper • 2602.12957 • Published Feb 13

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published Apr 22 • 244

submitted a paper to Daily Papers 4 months ago

The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training

Paper • 2603.10444 • Published Mar 11 • 12

in Alpha-VLLM/Lumina-Image-2.0 4 months ago

I would like to obtain your contact information and customize a model

#19 opened 4 months ago by

authored a paper 4 months ago

PyVision-RL: Forging Open Agentic Vision Models via RL

Paper • 2602.20739 • Published Feb 24 • 31

submitted a paper to Daily Papers 4 months ago

PyVision-RL: Forging Open Agentic Vision Models via RL

Paper • 2602.20739 • Published Feb 24 • 31

authored 3 papers 6 months ago

UniMedVL: Unifying Medical Multimodal Understanding and Generation through Observation-Knowledge-Analysis

Paper • 2510.15710 • Published Oct 17, 2025 • 8

Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey and Benchmark

Paper • 2402.02242 • Published Feb 3, 2024

dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models

Paper • 2512.19433 • Published Dec 22, 2025 • 4

updated a collection 6 months ago

Lumina-DiMOO Family

Open-Sourced Large Diffusion Language Model for Multi-Modal Generation and Understanding • 3 items • Updated Mar 2 • 5