
Song-ha Jo

SJ2048

AI & ML interests

None yet

Organizations

None yet

Recent Activity

updated a model 14 days ago

SJ2048/qwen35ae-v6-trajectory

Updated 14 days ago • 1

published a model 18 days ago

SJ2048/qwen35ae-v6-trajectory

Updated 14 days ago • 1

upvoted 2 collections 6 months ago

Emu3

Collection

Emu3: Next-Token Prediction is All You Need • 7 items • Updated Feb 4 • 81

Ministral 3

Collection

A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 168

upvoted 3 papers over 1 year ago

liked 3 models over 1 year ago

google/gemma-7b

Text Generation • 9B • Updated Jun 27, 2024 • 27.1k • 3.35k

meta-llama/Llama-2-7b

Text Generation • Updated Apr 17, 2024 • 184 • 4.5k

microsoft/Phi-3-small-128k-instruct

Text Generation • 7B • Updated Dec 10, 2025 • 1.5k • 182

liked 2 models almost 2 years ago

google/gemma-2-9b

Text Generation • 9B • Updated Aug 7, 2024 • 71.2k • 709

microsoft/Phi-3-small-8k-instruct

Text Generation • 7B • Updated Dec 10, 2025 • 15.3k • 178

upvoted a paper almost 2 years ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22, 2024 • 262

liked a model almost 2 years ago

meta-llama/Meta-Llama-3-8B

Text Generation • 8B • Updated Sep 27, 2024 • 1.3M • 6.57k

upvoted a paper almost 2 years ago

Shortened LLaMA: A Simple Depth Pruning for Large Language Models

Paper • 2402.02834 • Published Feb 5, 2024 • 17

liked a dataset almost 2 years ago

mit-han-lab/awq-model-zoo

Updated Apr 13, 2025 • 2.06k • 19

upvoted 4 papers almost 2 years ago

ThinK: Thinner Key Cache by Query-Driven Pruning

Paper • 2407.21018 • Published Jul 30, 2024 • 32

POA: Pre-training Once for Models of All Sizes

Paper • 2408.01031 • Published Aug 2, 2024 • 27

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 175

Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19, 2024 • 157
