RLAIF-V

community

Recent Activity

Yirany authored a paper 1 day ago

LLaVA-UHD v4: What Makes Efficient Visual Encoding in MLLMs?

Paper • 2605.08985 • Published 6 days ago • 19

Yirany submitted a paper to Daily Papers 3 days ago

LLaVA-UHD v4: What Makes Efficient Visual Encoding in MLLMs?

Paper • 2605.08985 • Published 6 days ago • 19

Yirany authored a paper 4 days ago

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

Paper • 2604.27393 • Published 15 days ago • 68

submitted a paper to Daily Papers 7 days ago

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

Paper • 2604.27393 • Published 15 days ago • 68

authored a paper 8 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16, 2025 • 57

authored 4 papers 11 months ago

Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages

Paper • 2308.12038 • Published Aug 23, 2023 • 2

A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs

Paper • 2411.17265 • Published Nov 26, 2024 • 1

EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents

Paper • 2501.11858 • Published Jan 21, 2025 • 7

RLPR: Extrapolating RLVR to General Domains without Verifiers

Paper • 2506.18254 • Published Jun 23, 2025 • 35

updated 2 models 11 months ago

openbmb/RLPR-Llama3.1-8B-Inst

Text Generation • 8B • Updated Jun 30, 2025 • 46 • 4

openbmb/RLPR-Gemma2-2B-it

Text Generation • 3B • Updated Jun 30, 2025 • 58 • 4

updated a model 11 months ago

openbmb/RLPR-Llama3.1-8B-Inst

Text Generation • 8B • Updated Jun 30, 2025 • 46 • 4

published a model 11 months ago

openbmb/RLPR-Llama3.1-8B-Inst

Text Generation • 8B • Updated Jun 30, 2025 • 46 • 4

updated a model 11 months ago

RLAIF-V/RLPR-Qwen2.5-7B-Base

8B • Updated Jun 22, 2025 • 2 • 1

updated 2 datasets 11 months ago

RLAIF-V/RLPR-Benchmarks

Viewer • Updated Jun 22, 2025 • 638 • 145

RLAIF-V/RLPR-Train-Dataset

Viewer • Updated Jun 22, 2025 • 77.7k • 18

updated a dataset 11 months ago

RLAIF-V/RLPR-Train-Dataset

Viewer • Updated Jun 22, 2025 • 77.7k • 18

updated a model 11 months ago

RLAIF-V/RLPR-Qwen2.5-7B-Base

8B • Updated Jun 22, 2025 • 2 • 1

updated a dataset 11 months ago

RLAIF-V/RLPR-Benchmarks

Viewer • Updated Jun 22, 2025 • 638 • 145

authored a paper over 1 year ago

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3, 2025 • 62