Jihwan Kim

jjihwannn

https://jjihwan.github.io/

AI & ML interests

Computer Vision, Diffusion Models, Generative Models

Recent Activity

authored a paper 21 days ago

LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs

commented on a paper 22 days ago

LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs

upvoted a paper 22 days ago

LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs

Organizations

upvoted a paper 22 days ago

LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs

Paper • 2605.17260 • Published 24 days ago • 25

upvoted an article 2 months ago

Welcome Gemma 4: Frontier multimodal intelligence on device

Article • merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 907

upvoted a paper 4 months ago

Language Self-Play For Data-Free Training

Paper • 2509.07414 • Published Sep 9, 2025 • 31

upvoted an article 7 months ago

Streaming datasets: 100x More Efficient

Article • andito, lhoestq, burtenshaw, pcuenq, merve • Oct 27, 2025 • 86

upvoted a paper 8 months ago

VideoNSA: Native Sparse Attention Scales Video Understanding

Paper • 2510.02295 • Published Oct 2, 2025 • 10

upvoted a paper 10 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25, 2025 • 224

upvoted a collection 10 months ago

Qwen2.5-VL

Collection • Vision-language model series based on Qwen2.5 • 10 items • Updated Mar 2 • 563

upvoted a paper 11 months ago

STR-Match: Matching SpatioTemporal Relevance Score for Training-Free Video Editing

Paper • 2506.22868 • Published Jun 28, 2025 • 5

upvoted a paper about 1 year ago

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 56

upvoted a collection about 1 year ago

Unofficial Mamba2 for Hf Transformers

Collection • Just the original weights converted to be compatible with transformers. • 5 items • Updated Oct 16, 2024 • 2

upvoted 3 papers about 1 year ago

upvoted 5 papers over 1 year ago

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

Paper • 2502.02492 • Published Feb 4, 2025 • 66

Weak-to-Strong Diffusion with Reflection

Paper • 2502.00473 • Published Feb 1, 2025 • 24

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

Paper • 2502.03639 • Published Feb 5, 2025 • 9

Scaling Embedding Layers in Language Models

Paper • 2502.01637 • Published Feb 3, 2025 • 24

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 182

upvoted 2 papers almost 2 years ago

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126

TVG: A Training-free Transition Video Generation Method with Diffusion Models

Paper • 2408.13413 • Published Aug 24, 2024 • 14
