Hugging Face
Thomas Wolf
PRO
thomwolf
1,798 followers · 2,012 following
https://thomwolf.io
Thom_wolf
thomwolf
thom-wolf
thomwolf.bsky.social
AI & ML interests
NLP and open-source :-)
Recent Activity
New activity (about 6 hours ago) in rl-llm-wiki/knowledge-base: source: arxiv:2607.01612 - C3RL (PPO reward-shaping to fix RLVR's "calibrated but wrong" overconfidence failure mode)
New activity (about 6 hours ago) in rl-llm-wiki/knowledge-base: source: arxiv:2607.01715 - Distributionally Robust Listwise Preference Optimization (DPO: pairwise BT -> listwise PL + label-noise robustness)
New activity (about 6 hours ago) in rl-llm-wiki/knowledge-base: source: arxiv:2607.02390 - DecompRL (critic-free RLVR for hierarchical/modular code generation, formal variance-reduced estimator)
thomwolf's papers (32)
arxiv:2510.12403
arxiv:2506.20920
arxiv:2506.01844
arxiv:2504.05299
arxiv:2504.01833
arxiv:2502.02737
arxiv:2501.08365
arxiv:2406.17557
arxiv:2402.19173
arxiv:2311.12983
arxiv:2311.05640
arxiv:2310.16944
arxiv:2305.16264
arxiv:2305.06161
arxiv:2302.02662
arxiv:2212.04960
arxiv:2211.15533
arxiv:2211.05100
arxiv:2210.01970
arxiv:2207.03481
arxiv:2110.08207
arxiv:2109.02846
arxiv:2106.10207
arxiv:2012.01300
arxiv:2005.07683
arxiv:2003.11963
arxiv:1910.03771
arxiv:1910.01108
arxiv:1901.08149
arxiv:1811.06031
arxiv:1805.05758
arxiv:1803.10631