Byeongho Heo PRO

bhheo

https://sites.google.com/view/byeongho-heo/home

Recent Activity

liked a dataset about 2 months ago

ianncity/KIMI-K2.5-1000000x

liked a dataset about 2 months ago

Jackrong/Kimi-K2.5-Reasoning-1M-Cleaned

upvoted a paper 11 days ago

Retrieve, Don't Retrain: Extending Vision Language Action Models to New Tasks at Test Time

Paper • 2606.15631 • Published 13 days ago • 16

upvoted a paper about 2 months ago

RLDX-1 Technical Report

Paper • 2605.03269 • Published May 5 • 126

upvoted a collection 3 months ago

MuCo

MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model [CVPR 2026] • 4 items • Updated Apr 13 • 2

upvoted a paper 3 months ago

Grounding World Simulation Models in a Real-World Metropolis

Paper • 2603.15583 • Published Mar 16 • 155

upvoted 3 papers 8 months ago

Exploring Conditions for Diffusion models in Robotic Control

Paper • 2510.15510 • Published Oct 17, 2025 • 40

Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs

Paper • 2510.13251 • Published Oct 15, 2025 • 14

RL makes MLLMs see better than SFT

Paper • 2510.16333 • Published Oct 18, 2025 • 49

upvoted a paper 12 months ago

Token Bottleneck: One Token to Remember Dynamics

Paper • 2507.06543 • Published Jul 9, 2025 • 20

upvoted 2 collections about 1 year ago

HyperCLOVA X SEED

HyperCLOVA X SEED is NAVER's lightweight open-source lineup with a strong focus on Korean language performance • 6 items • Updated Dec 24, 2025 • 42

ProLIP

Official ProLIP weights, Probabilistic Language-Image Pre-Training (ICLR 2025) • 7 items • Updated Apr 18, 2025 • 10

upvoted a paper over 1 year ago

MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation

Paper • 2411.19067 • Published Nov 28, 2024 • 8

upvoted a collection over 1 year ago

Cosmos-Tokenizer1

⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/cosmos3 • 22 items • Updated 15 days ago • 44

upvoted 2 papers over 1 year ago

Unified Speech-Text Pretraining for Spoken Dialog Modeling

Paper • 2402.05706 • Published Feb 8, 2024 • 7

Rethinking Spatial Dimensions of Vision Transformers

Paper • 2103.16302 • Published Mar 30, 2021 • 2

upvoted 2 collections over 1 year ago

RDNet

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs [ECCV 2024] • 9 items • Updated Oct 16, 2024 • 3

rope-vit

Rotary Position Embedding for Vision Transformer [ECCV 2024] • 22 items • Updated Oct 16, 2024 • 5

upvoted a paper over 1 year ago

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

Paper • 2403.19588 • Published Mar 28, 2024 • 4

upvoted a paper almost 2 years ago

Rotary Position Embedding for Vision Transformer

Paper • 2403.13298 • Published Mar 20, 2024 • 6