mlpc-ucsd

university

https://pages.ucsd.edu/~ztu/

Recent Activity

JamesSand authored a paper 4 days ago

No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions

zwcolin authored a paper 7 days ago

Playful Agentic Robot Learning

JamesSand submitted a paper 11 days ago

No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions

Papers

PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

Pose Recognition with Cascade Transformers

JamesSand authored a paper 4 days ago

No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions

Paper • 2606.13044 • Published 16 days ago • 10

zwcolin authored a paper 7 days ago

Playful Agentic Robot Learning

Paper • 2606.19419 • Published 10 days ago • 48

JamesSand submitted a paper to Daily Papers 11 days ago

No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions

Paper • 2606.13044 • Published 16 days ago • 10

authored a paper 21 days ago

Stateful Visual Encoders for Vision-Language Models

Paper • 2606.04433 • Published 24 days ago • 8

submitted a paper to Daily Papers 23 days ago

Stateful Visual Encoders for Vision-Language Models

Paper • 2606.04433 • Published 24 days ago • 8

authored 9 papers 3 months ago

BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions

Paper • 2308.09936 • Published Aug 19, 2023 • 1

Matryoshka Query Transformer for Large Vision-Language Models

Paper • 2405.19315 • Published May 29, 2024 • 1

MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

Paper • 2410.08182 • Published Oct 10, 2024

Verbalized Representation Learning for Interpretable Few-Shot Generalization

Paper • 2411.18651 • Published Nov 27, 2024

Interleaving Reasoning for Better Text-to-Image Generation

Paper • 2509.06945 • Published Sep 8, 2025 • 16

TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models

Paper • 2509.25143 • Published Sep 29, 2025

ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping

Paper • 2510.08457 • Published Oct 9, 2025 • 14

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

Paper • 2512.10863 • Published Dec 11, 2025 • 22

OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

Paper • 2604.08539 • Published Apr 9 • 51

submitted a paper to Daily Papers 4 months ago

PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction

Paper • 2603.05888 • Published Mar 6 • 2

authored 5 papers 5 months ago

Language Models Meet World Models: Embodied Experiences Enhance Language Models

Paper • 2305.10626 • Published May 18, 2023 • 1

Language Models as Science Tutors

Paper • 2402.11111 • Published Feb 16, 2024

On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning

Paper • 2210.10763 • Published Oct 19, 2022 • 1

OmniControlNet: Dual-stage Integration for Conditional Image Generation

Paper • 2406.05871 • Published Jun 9, 2024

YOLO-Count: Differentiable Object Counting for Text-to-Image Generation

Paper • 2508.00728 • Published Aug 1, 2025