Kaicheng Yang

Kaichengalex

https://kaichengyang0828.github.io/Kaicheng-Yang0828.github.io/

kaichengyang0828

AI & ML interests

Multimodal Representation Learning / Vision-Language Pretraining / DeepResearch

Recent Activity

liked a model 3 days ago

uer/gpt2-chinese-cluecorpussmall

updated a collection 5 days ago

Vision-Language Dataset

upvoted a paper 22 days ago

InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning

upvoted a paper 22 days ago

InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning

Paper • 2606.12195 • Published 24 days ago • 23

upvoted 2 papers 25 days ago

Visual Para-Thinker++: A Single-Policy Multi-Agent Framework for Visual Reasoning

Paper • 2606.09290 • Published 26 days ago • 7

SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

Paper • 2606.09304 • Published 26 days ago • 6

upvoted an article about 1 month ago

Article

Codex Is Advancing the Open-Sourcing and Training Workflows of AI Models

burtenshaw, evalstate • Dec 11, 2025 • 16

upvoted a paper about 1 month ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published May 27 • 75

upvoted 3 papers about 2 months ago

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published May 12 • 194

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

Paper • 2604.27393 • Published Apr 30 • 81

Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction

Paper • 2605.05242 • Published May 3 • 126

upvoted a paper 2 months ago

Near-Future Policy Optimization

Paper • 2604.20733 • Published Apr 22 • 77

upvoted 4 papers 3 months ago

UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards

Paper • 2604.14967 • Published Apr 16 • 15

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 168

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published Apr 6 • 236

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published Mar 29 • 149

upvoted 3 papers 4 months ago

upvoted a paper 5 months ago

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52

upvoted 3 papers 6 months ago

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 196

Action100M: A Large-scale Video Action Dataset

Paper • 2601.10592 • Published Jan 15 • 33

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

Paper • 2601.10305 • Published Jan 15 • 37
