Collections
Discover the best community collections!
Collections trending this week
-
sean0042/KorMedMCQA
Viewer • Updated • 7.49k • 1.19k • 35 -
KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations
Paper • 2403.01469 • Published -
seongsubae/KorMedMCQA-V
Viewer • Updated • 1.84k • 103 • 7 -
KorMedMCQA-V: A Multimodal Benchmark for Evaluating Vision-Language Models on the Korean Medical Licensing Examination
Paper • 2602.13650 • Published
-
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data
Paper • 2603.25319 • Published • 32 -
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale
Paper • 2603.25040 • Published • 128 -
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding
Paper • 2603.22458 • Published • 134 -
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model
Paper • 2603.21986 • Published • 123
-
ViTAR: Vision Transformer with Any Resolution
Paper • 2403.18361 • Published • 55 -
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Paper • 2401.09417 • Published • 62 -
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Paper • 2103.14030 • Published • 5