Makar Gorokhovik

Lofftavelglarn

9 21

AI & ML interests

Hallucination Detection

Recent Activity

liked a Space about 1 month ago

AlexWortega/same-data-different-losses

liked a model 3 months ago

talkie-lm/talkie-1930-13b-it

liked a model 3 months ago

Qwen/Qwen3.6-27B

Organizations

None yet

liked a Space about 1 month ago

Weight-Space Geometry of Offline Reasoning Training

🧭

Interactive weight-space geometry of six reasoning losses

liked 2 models 3 months ago

talkie-lm/talkie-1930-13b-it

Updated Apr 23 • 288

Qwen/Qwen3.6-27B

Image-Text-to-Text • 28B • Updated Apr 24 • 6.01M • 2.07k

liked a model 4 months ago

google/gemma-4-E4B-it

Any-to-Any • 8B • Updated 7 days ago • 5.73M • 1.41k

upvoted 2 papers 4 months ago

Hyperagents

Paper • 2603.19461 • Published Mar 19 • 51

GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

Paper • 2603.13875 • Published Mar 14 • 36

liked a Space 5 months ago

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

📝

267

Visualize synthetic‑data experiments as an interactive bookshelf

upvoted a paper 5 months ago

SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

Paper • 2602.23866 • Published Feb 27 • 92

liked a model 5 months ago

unsloth/Qwen3.5-4B-GGUF

Image-Text-to-Text • 4B • Updated Mar 2 • 1.16M • 346

liked a dataset 5 months ago

peteromallet/dataclaw-peteromallet

Viewer • Updated Feb 25 • 549 • 183 • 301

liked 2 models 5 months ago

LiquidAI/LFM2-24B-A2B-GGUF

Text Generation • 24B • Updated Mar 30 • 4.87k • 140

Qwen/Qwen2.5-72B-Instruct

Text Generation • 73B • Updated Jan 12, 2025 • 525k • 965

upvoted 3 articles 6 months ago

Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek

Article • huggingface • Jan 27 • 45

One Year Since the “DeepSeek Moment”

Article • huggingface • Jan 20 • 63

From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels

Article • drbh, danieldk • Aug 18, 2025 • 109

liked 3 models 6 months ago

liked a dataset 6 months ago

KbsdJames/Omni-MATH

Viewer • Updated Oct 12, 2024 • 4.43k • 3.89k • 132

liked a Space 7 months ago

The Ultra-Scale Playbook

🌌

3.95k

The ultimate guide to training LLM on large GPU Clusters
