tkob

RuggingHace

1 15 12

AI & ML interests

None yet

Recent Activity

liked a model about 13 hours ago

MiniMaxAI/MiniMax-H3

updated a model 6 days ago

MiniMaxAI/MiniMax-H3

upvoted an article 2 months ago

Continuous batching from first principles

Organizations

liked a model about 13 hours ago

MiniMaxAI/MiniMax-H3

Image-Text-to-Video • 33B • Updated about 9 hours ago • 1.2k

updated a model 6 days ago

MiniMaxAI/MiniMax-H3

Image-Text-to-Video • 33B • Updated about 9 hours ago • 1.2k

upvoted an article 2 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 427

New activity in facebook/layerskip-llama2-13B 5 months ago

layerskip-llama2-13B access denied

#2 opened 5 months ago by RuggingHace

upvoted 2 articles 5 months ago

Article

Ulysses Sequence Parallelism: Training with Million-Token Contexts

kashif, stas • Mar 9 • 33

Article

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 174

upvoted an article 6 months ago

Article

Custom Kernels for All from Codex and Claude

burtenshaw, sayakpaul, ariG23498, evalstate • Feb 13 • 80

liked a model 6 months ago

MiniMaxAI/MiniMax-M2.5

Text Generation • 229B • Updated Mar 10 • 772k • 1.5k

upvoted 2 articles 6 months ago

Article

Training Design for Text-to-Image Models: Lessons from Ablations

Photoroom • Feb 3 • 77

Article

Text-to-image Architectural Experiments

Photoroom • Nov 13, 2025 • 61

upvoted a paper 6 months ago

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 108

upvoted 2 articles 6 months ago

Article

Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek

huggingface • Jan 27 • 45

Article

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Jan 27 • 80

liked a model 7 months ago

MiniMaxAI/MiniMax-M2.1

Text Generation • 229B • Updated Feb 13 • 12.8k • 1.36k

upvoted an article 9 months ago

Article

You could have designed state of the art positional encoding

FL33TW00D-HF • Nov 25, 2024 • 492

liked a Space 9 months ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

📝

Evaluate multilingual models using FineTasks

liked a model 9 months ago

bigscience/bloom

Text Generation • 176B • Updated 5 days ago • 10.7k • 5.03k

liked 2 Spaces 9 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.4k

Explore and download the FineWeb web‑scale text dataset

The Smol Training Playbook

📚

3.25k

The secrets to building world-class LLMs

liked a model 9 months ago

MiniMaxAI/MiniMax-M2

Text Generation • 229B • Updated Dec 23, 2025 • 108k • 1.5k
