rbgo (Rajdeep Borgohain)

upvoted an article 10 months ago

Article

Ettin Suite: SoTA Paired Encoders and Decoders

+4

orionweller, kdricci, mmarone, NohTow, dlawrie, vandurme

•

Jul 16, 2025

• 81

upvoted a paper 11 months ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Paper • 2507.12415 • Published Jul 16, 2025 • 43

upvoted an article 12 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

+21

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf

•

Jul 8, 2025

• 780

upvoted a collection about 1 year ago

Gemma 3 Release

Collection

28 items • Updated Mar 12 • 643

upvoted an article about 1 year ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

+2

ariG23498, merve, pcuenq, reach-vb

•

Mar 12, 2025

• 497

upvoted 2 articles over 1 year ago

Article

Inside the family of Smol models

Kseniase

•

Feb 27, 2025

• 14

Article

SmolLM - blazingly fast and remarkably powerful

+1

loubnabnl, anton-l, eliebak

•

Jul 16, 2024

• 459

upvoted a paper over 1 year ago

Kanana: Compute-efficient Bilingual Language Models

Paper • 2502.18934 • Published Feb 26, 2025 • 66

upvoted a collection over 1 year ago

Phi-4

Collection

Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated Jul 10, 2025 • 212

upvoted 2 papers over 1 year ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19, 2025 • 219

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10, 2025 • 153

upvoted 2 articles over 1 year ago

Article

Mastering Long Contexts in LLMs with KVPress

nvidia

•

Jan 23, 2025

• 76

Article

Open-R1: a fully open reproduction of DeepSeek-R1

+1

eliebak, lvwerra, lewtun

•

Jan 28, 2025

• 889

upvoted 4 collections over 1 year ago

upvoted a paper over 1 year ago

KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models

Paper • 2412.06071 • Published Dec 8, 2024 • 11

upvoted an article over 1 year ago

Article

Timm ❤️ Transformers: Use any timm model with transformers

+3

ariG23498, rwightman, qubvel-hf, pcuenq, reach-vb

•

Jan 16, 2025

• 55

upvoted a paper over 1 year ago

Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 124

Qwen2.5-VL

Qwen2.5-1M

DeepSeek-V2

DeepSeek-LLM
