amico

AI & ML interests

None yet

Recent Activity

liked a model about 12 hours ago

baidu/Unlimited-OCR

liked a model 5 days ago

Datadog/Toto-2.0-313m

Organizations

None yet

upvoted a collection about 12 hours ago

Qwen-AgentWorld

3 items • Updated 3 days ago • 51

upvoted a paper 11 days ago

Flash-KMeans: Fast and Memory-Efficient Exact K-Means

Paper • 2603.09229 • Published Mar 10 • 84

upvoted an article 22 days ago

Article

Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains

JetBrains • 25 days ago • 32

upvoted an article 29 days ago

Article

ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM

ibm-research • about 1 month ago • 17

upvoted a collection about 2 months ago

Gemma 4

15 items • Updated 16 days ago • 992

upvoted a paper about 2 months ago

Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding

Paper • 2604.26779 • Published Apr 29 • 14

upvoted an article 2 months ago

Article

The PR you would have opened yourself

pcuenq, awni • Apr 16 • 72

upvoted a collection 2 months ago

VRAG

6 items • Updated Apr 2 • 12

upvoted 4 papers 3 months ago

TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published Apr 6 • 116

Synthetic Sandbox for Training Machine Learning Engineering Agents

Paper • 2604.04872 • Published Apr 6 • 14

Hyperagents

Paper • 2603.19461 • Published Mar 19 • 51

Mixture-of-Depths Attention

Paper • 2603.15619 • Published Mar 16 • 81

upvoted an article 4 months ago

Article

How NVIDIA Builds Open Data for AI

nvidia • Mar 10 • 17

upvoted a paper 4 months ago

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 266

upvoted an article 4 months ago

Article

OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments

christian-washington, ajasuja, santosh-iima, lewtun, burtenshaw • Feb 12 • 35

upvoted 2 articles 5 months ago

Article

Optimizing GLM4-MoE for Production: 65% Faster TTFT with SGLang

novita • Jan 22 • 10

Article

Differential Transformer V2

microsoft • Jan 20 • 52

upvoted a paper 5 months ago

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Paper • 2601.09708 • Published Jan 14 • 56

upvoted a collection 7 months ago

Mistral Large 3

A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 100

upvoted an article 7 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 411