Kirill Gelvan

Kirili4ik

https://github.com/Kirili4ik

AI & ML interests

NLP, DL for Audio, Generative Models

Recent Activity

liked a Space 12 days ago

AlexWortega/same-data-different-losses

upvoted a paper 8 days ago

OpenThoughts-Agent: Data Recipes for Agentic Models

Paper • 2606.24855 • Published 11 days ago • 46

upvoted a collection about 1 month ago

Mellum 2

Mellum2 model weights • 6 items • Updated Jun 1 • 125

upvoted a collection 3 months ago

SWE-rebench-V2

SWE-rebench-V2 is a curated dataset of software-engineering tasks derived from real GitHub issues and pull requests. • 3 items • Updated Mar 3 • 20

upvoted 2 articles 4 months ago

Mixture of Experts (MoEs) in Transformers

Article • ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap +5 • Feb 26 • 169

Mixture of Experts Explained

Article • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq +4 • Dec 11, 2023 • 1.15k

upvoted an article 6 months ago

We Got Claude to Fine-Tune an Open Source LLM

Article • burtenshaw, evalstate • Dec 4, 2025 • 629

upvoted an article 8 months ago

Granite 4.0 Nano: Just how small can you go?

Article • ibm-granite • Oct 28, 2025 • 125

upvoted a collection 8 months ago

🦫 PIPer

All the resources for our paper "PIPer: On-Device Environment Setup via Online Reinforcement Learning"! • 9 items • Updated Oct 1, 2025 • 3

upvoted a paper 9 months ago

PIPer: On-Device Environment Setup via Online Reinforcement Learning

Paper • 2509.25455 • Published Sep 29, 2025 • 38

upvoted an article about 1 year ago

CircleGuardBench: New Standard for Evaluating AI Moderation Models

Article • whitecircle • May 7, 2025 • 59

upvoted a paper about 1 year ago

Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers

Paper • 2504.20752 • Published Apr 29, 2025 • 96

upvoted an article over 1 year ago

Introduction to State Space Models (SSM)

Article • lbourdois • Jul 19, 2024 • 233

upvoted a collection over 1 year ago

📊 Commit Message Generation Evaluation 🔍

All the resources for our "Towards Realistic Evaluation of Commit Message Generation by Matching Online and Offline Settings" study on CMG metrics! • 7 items • Updated Mar 14, 2025 • 2

upvoted a paper almost 3 years ago

Wuerstchen: Efficient Pretraining of Text-to-Image Models

Paper • 2306.00637 • Published Jun 1, 2023 • 13