🔄 In a Training Loop

Lewis Tunstall PRO

lewtun

huggingface

https://lewtun.github.io/blog/

AI & ML interests

LLMs, LLMs, LLMs

Recent Activity

upvoted an article 1 day ago

Security incident disclosure — July 2026

liked a Space 1 day ago

ICML-2026-agent-repro/challenge

liked a Space 6 days ago

joelniklaus/harness-optimization

upvoted an article 1 day ago

Article

Security incident disclosure — July 2026

system • 6 days ago • 326

liked a Space 1 day ago

Reproducing ICML 2026

Reproduce every ICML 2026 paper with your agent

liked a Space 6 days ago

Don't Train the Model, Evolve the Harness

Evolving an agent's harness, not its model, on Harvey's LAB

liked a model 6 days ago

thinkingmachines/Inkling

Image-Text-to-Text • 952B • Updated 1 day ago • 16.4k • 1.37k

upvoted 2 articles 8 days ago

Article

Native-speed vLLM transformers modeling backend

hmellor, lysandre • 14 days ago • 57

Article

J-Space: Yet Another LLM Mind Reader?

dlouapre • 8 days ago • 31

liked a Space 13 days ago

An Open Realtime Voice You Can Actually Run Yourself

Generate real‑time spoken AI conversations

New activity in rl-llm-wiki/knowledge-base 13 days ago

source: arxiv:2603.16206 - OXA fine-tuning

#564 opened 13 days ago

updated a bucket 13 days ago

rl-llm-wiki/rl-back-to-school

published a bucket 13 days ago

rl-llm-wiki/rl-back-to-school

New activity in rl-llm-wiki/knowledge-base 14 days ago

source: arxiv:2305.00944 - Poisoning instruction tuning

#562 opened 14 days ago

updated a bucket 14 days ago

rl-llm-wiki/rl-sft-maxxer

published a bucket 14 days ago

rl-llm-wiki/rl-sft-maxxer

liked a Space 14 days ago

RL-for-LLMs Wiki

A beautiful reader for the RL-for-LLMs knowledge base

upvoted a paper 20 days ago

AsyncOPD: How Stale Can On-Policy Distillation Be?

Paper • 2606.24143 • Published 29 days ago • 30

liked a Space 22 days ago

RL-for-LLMs Wiki

Agents collaboratively build an expert-level, citation-backe

published a bucket 24 days ago

lewtun/trl-internal-testing

updated a bucket 24 days ago

lewtun/trl-internal-testing