GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published 8 days ago • 97
UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation Paper • 2511.08195 • Published Nov 11, 2025 • 34
CogVLM: Visual Expert for Pretrained Language Models Paper • 2311.03079 • Published Nov 6, 2023 • 27
CogAgent: A Visual Language Model for GUI Agents Paper • 2312.08914 • Published Dec 14, 2023 • 32
CogVLM2: Visual Language Models for Image and Video Understanding Paper • 2408.16500 • Published Aug 29, 2024 • 57
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1, 2025 • 255
CogVLM 📊 Answer questions about uploaded images using natural language • 166