
Noob

noobmldude

AI & ML interests

Explainable AI

Recent Activity

liked a dataset 4 days ago

ibm-research/AssetOpsBench

liked a model 6 days ago

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4

liked a model about 2 months ago

XiaomiMiMo/MiMo-V2-Flash


Organizations

upvoted 4 collections 2 months ago

upvoted a collection 4 months ago

LLM Training

Collection

54 items • Updated Oct 13, 2025 • 5

upvoted a paper 6 months ago

Shadow-FT: Tuning Instruct via Base

Paper • 2505.12716 • Published May 19, 2025 • 4

upvoted 2 collections 6 months ago

Model Merging

Collection

Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 254

💜 Kotlin ML Pack

Collection

A collection of datasets, fine-tuned models and benchmarks to train your models for perfect Kotlin code generation. • 9 items • Updated Jun 11, 2024 • 27

upvoted an article 7 months ago

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

Article • ariG23498, sergiopaniego, reach-vb, pcuenq, ArthurZ, SaylorTwift, cyrilvallez • Sep 11, 2025 • 188

upvoted a paper 7 months ago

TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar

Paper • 2510.14972 • Published Oct 16, 2025 • 35

upvoted a collection 12 months ago

H-Net

Collection

The family of hierarchical networks (H-Nets) from https://arxiv.org/abs/2507.07955 • 6 items • Updated Mar 2 • 20

upvoted 2 articles 12 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780

Welcome Gemma 2 - Google’s new open LLM

Article • philschmid, osanseviero, pcuenq, lewtun, tomaarsen, reach-vb • Jun 27, 2024 • 132

upvoted a paper about 1 year ago

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published Jun 19, 2025 • 133

upvoted an article about 1 year ago

RegMix: Data Mixture as Regression for Language Model Pre-training

Article • SivilTaram • Jul 11, 2024 • 16

upvoted a paper about 1 year ago

Spectrum: Targeted Training on Signal to Noise Ratio

Paper • 2406.06623 • Published Jun 7, 2024 • 16

upvoted an article about 1 year ago

Selective fine-tuning of Language Models with Spectrum

Article • anakin87 • Sep 3, 2024 • 36

upvoted 2 papers about 1 year ago

Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 69

Can LLMs Generate High-Quality Test Cases for Algorithm Problems? TestCase-Eval: A Systematic Evaluation of Fault Coverage and Exposure

Paper • 2506.12278 • Published Jun 13, 2025 • 16

upvoted an article about 1 year ago

StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation

Article • yuxiang630, cassanof, ganler, YifengDing, StringChaos, harmdevries, lvwerra, arjunguha, lingming • Apr 29, 2024 • 79
