Kaicheng Yang

Kaichengalex

https://kaichengyang0828.github.io/Kaicheng-Yang0828.github.io/

kaichengyang0828

AI & ML interests

Multimodal Representation Learning / Vision-Language Pretraining / DeepResearch

Recent Activity

upvoted a paper 15 days ago

InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning

Paper • 2606.12195 • Published 17 days ago • 23

upvoted 2 papers 18 days ago

Visual Para-Thinker++: A Single-Policy Multi-Agent Framework for Visual Reasoning

Paper • 2606.09290 • Published 19 days ago • 7

SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

Paper • 2606.09304 • Published 19 days ago • 6

liked a dataset 21 days ago

nvidia/NitroGen

Updated Jan 12 • 1.84k • 213

updated a collection 21 days ago

VideoDataset

3 items • Updated 21 days ago

updated a collection 26 days ago

Vision-Language Dataset

4 items • Updated 26 days ago

upvoted an article 29 days ago

Article

Codex Is Driving Open-Sourcing and Training Workflows for AI Models

burtenshaw, evalstate • Dec 11, 2025 • 16

upvoted a paper about 1 month ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published May 27 • 75

updated a collection about 1 month ago

VideoDataset

3 items • Updated 21 days ago

liked a dataset about 1 month ago

mvp-lab/LLaVA-OneVision-2-Data

Viewer • Updated May 11 • 24 • 160k • 30

liked a Space about 1 month ago

Paddleocr 3.5 Transformers Demo

Run OCR & Doc-Parsing with PaddleOCR 3.5 and Transformers

upvoted 3 papers about 2 months ago

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published May 12 • 194

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

Paper • 2604.27393 • Published Apr 30 • 80

Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction

Paper • 2605.05242 • Published May 3 • 126

liked a dataset about 2 months ago

raivn/VideoNet

Viewer • Updated May 6 • 5k • 413 • 8

upvoted a paper 2 months ago

Near-Future Policy Optimization

Paper • 2604.20733 • Published Apr 22 • 77

authored a paper 2 months ago

UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards

Paper • 2604.14967 • Published Apr 16 • 15

updated a collection 2 months ago

UniDoc-RL

4 items • Updated Apr 17