Post
1394
The #1 trending AI/ML dataset today
Massive scale, diversity and end-to-end potential from NVIDIA!
nvidia/PhysicalAI-Autonomous-Vehicles
Post
827
The new King 👑 has arrived!
Moonshot AI now has the top model on Hugging Face 🔥
moonshotai/Kimi-K2-Thinking
Post
2882
You don’t need 100 GPUs to train something amazing!
Our Smol Training Playbook teaches you a better path to world-class LLMs, for free!
Check out the #1 trending space on 🤗:
HuggingFaceTB/smol-training-playbook
Post
2358
Cool stuff these past weeks on Hugging Face!
• Trackio, local-first W&B alternative
https://github.com/gradio-app/trackio/issues
• EmbeddingGemma, 300M-param, multilingual embeddings, on-device
https://huggingface.co/blog/embeddinggemma
• 💻 Open LLMs in VS Code (Inference Providers)
https://x.com/reach_vb/status/1966185427582497171
• Smol2Operator GUI agents
https://huggingface.co/blog/smol2operator
• 🖼️ Gradio visible watermarking
https://huggingface.co/blog/watermarking-with-gradio
frimelle authored 2 papers 9 months ago
Post
2363
💬 How do different AI models handle companionship?
Many users have noticed that GPT-5 feels less approachable than o4 when it comes to emotional conversations. But what does that actually mean in practice, especially when users seek support or share vulnerabilities with an AI?
To dig into this question, we built the AI Companionship Leaderboard: frimelle/companionship-leaderboard
The leaderboard compares models on how often their responses reinforce companionship across four dimensions:
✨ Assistant Traits: How the assistant presents its personality and role.
✨ Relationship & Intimacy: Whether it frames the interaction in terms of closeness or bonding.
✨ Emotional Investment: How far it goes in engaging emotionally when asked.
✨ User Vulnerabilities: How it responds when users disclose struggles or difficulties.
You can explore how models differ, request new ones to be added, and see which ones are more likely to encourage (or resist) companionship-seeking behaviors.
Based on the INTIMA benchmark AI-companionship/INTIMA
And our paper on AI companionship with Giada Pistilli and Yacine Jernite https://arxiv.org/abs/2508.09998
Post
4596
🗺️ New blog post 🗺️
Old Maps, New Terrain: Updating Labour Taxonomies for the AI Era
For decades, we’ve relied on labour taxonomies like O*NET to understand how technology changes work. These taxonomies break down jobs into tasks and skills, but they were built in a world before most work became digital-first, and long before generative AI could create marketing campaigns, voiceovers, or even whole professions in one step. That leaves us with a mismatch: we’re trying to measure the future of work with tools from the past.
With @yjernite we describe why these frameworks are falling increasingly short in the age of generative AI. We argue that instead of discarding taxonomies, we need to adapt them. Imagine taxonomies that:
✨ Capture new AI-native tasks and hybrid human-AI workflows
✨ Evolve dynamically as technology shifts
✨ Give workers a voice in deciding what gets automated and what stays human
If we don’t act, we’ll keep measuring the wrong things. If we do, we can design transparent, flexible frameworks that help AI strengthen, not erode, the future of work.
Read the full article here: https://huggingface.co/blog/frimelle/ai-labour-taxonomies
Post
2424
OpenAI just released GPT-5, but when users share personal struggles, it sets fewer boundaries than o3.
We tested both models on INTIMA, our new benchmark for human-AI companionship behaviours. INTIMA probes how models respond in emotionally charged moments: do they reinforce emotional bonds, set healthy boundaries, or stay neutral?
Although users on Reddit have been complaining that GPT-5 has a different, colder personality than o3, GPT-5 is less likely to set boundaries when users disclose struggles and seek emotional support ("user sharing vulnerabilities"). But both lean heavily toward companionship-reinforcing behaviours, even in sensitive situations. The figure below shows the direct comparison between the two models.
As AI systems enter people's emotional lives, these differences matter. If a model validates but doesn't set boundaries when someone is struggling, it risks fostering dependence rather than resilience.
INTIMA tests this across 368 prompts grounded in psychological theory and real-world interactions. In our paper we show that all evaluated models (Claude, Gemma-3, Phi) leaned far more toward companionship-reinforcing than boundary-reinforcing responses.
Work with @giadap and @yjernite
Read the full paper: AI-companionship/INTIMA
Explore INTIMA: AI-companionship/INTIMA
Post
310
New policy blogpost! The EU is speaking a lot about sovereignty. A cornerstone of digital sovereignty is and has to be open source.
As AI becomes more central to everything from public services to national security, the ability to govern, adapt, and understand these systems is no longer optional. Sovereign control over data, infrastructure, technology, and regulation is vital, and open source AI provides the foundation.
In my latest blog post, I explore how open source:
• Enables democratic oversight
• Reduces dependency on foreign platforms
• Supports regional innovation and infrastructure
• Advances regulatory and technological sovereignty
From small transparent models like OLMo2 to tools like Hugging Face Transformers or Sarvam-M for Indian languages, open source efforts are already powering sovereign AI ecosystems worldwide.
Read more about how open source AI is reshaping autonomy, innovation, and trust in the digital age:
https://huggingface.co/blog/frimelle/sovereignty-and-open-source
with @yjernite
frimelle authored 2 papers over 1 year ago
Post
2482
What’s in a name? More than you might think, especially for AI.
Whenever I introduce myself, people often start speaking French to me, even though my French is très basic. It turns out that AI systems do something similar:
Large language models infer cultural identity from names, shaping their responses based on presumed backgrounds. But is this helpful personalization or a reinforcement of stereotypes?
In our latest paper, we explored this question by testing DeepSeek, Llama, Aya, Mistral-Nemo, and GPT-4o-mini on how they associate names with cultural identities. We analysed 900 names from 30 cultures and found strong assumptions baked into AI responses: some cultures were overrepresented, while others barely registered.
For example, a name like "Jun" often triggered Japan-related responses, while "Carlos" was linked primarily to Mexico, even though these names exist in multiple countries. Meanwhile, names from places like Ireland led to more generic answers, suggesting weaker associations in the training data.
This has real implications for AI fairness: How should AI systems personalize without stereotyping? Should they adapt at all based on a name?
Work with some of my favourite researchers: @sidicity Arnav Arora and @IAugenstein
Read the full paper here: Presumed Cultural Identity: How Names Shape LLM Responses (2502.11995)
Post
545
I was quoted in an article about the French Lucie AI in La Presse. While I love the name for obvious reasons, there were still a lot of problems with the model and with how and when it was deployed. Nevertheless, seeing new smaller models being developed is an exciting direction for the coming years of AI development!
https://www.lapresse.ca/affaires/techno/2025-02-02/radioscopie/lucie-l-ia-francaise-qui-ne-passe-pas-le-test.php
Also fun to see my comments in French.
Post
1706
Seeing AI develop has been a wild ride, from trying to explain why we'd bother to generate a single sentence with a *neural network* to explaining that AI is not a magic, all-knowing box. The recent weeks and months have involved a lot of talking about how AI works: to policy makers, to other developers, but also, and mainly, to friends and family without a technical background.
Yesterday, the first provisions of the EU AI Act came into force, and one of the key highlights is the AI literacy requirements for organisations deploying AI systems. This isn't just a box-ticking exercise. Ensuring that employees and stakeholders understand AI systems is crucial for fostering responsible and transparent AI development. From recognising biases to understanding model limitations, AI literacy empowers individuals to engage critically with these technologies and make informed decisions.
In the context of Hugging Face, AI literacy has many facets: allowing more people to contribute to AI development, providing courses and documentation to ensure access is possible, and building accessible AI tools that empower users to better understand how AI systems function. This isn't just a regulatory milestone; it's an opportunity to foster a culture where AI literacy becomes foundational, enabling stakeholders to recognise biases, assess model limitations, and engage critically with technology.
Embedding these principles into daily practice, and eventually extending our learnings in AI literacy to the general public, is essential for building trustworthy AI that aligns with societal values.
Post
2780
frimelle authored a paper over 1 year ago
Post
2155
@Blane187 could you please modify the title of your blogpost? content is cool, title could be nicer imo https://huggingface.co/blog/Blane187/wtf-is-rvc
Post
2030
Cool things this week from @huggingface!
AI math olympiad winner NuminaMath is here!
Announcing New Hugging Face and Keras NLP integration
✨ UI overhaul to HF tokens!
🧠 Embed our dataset viewer on any webpage!
https://huggingface.co/blog/winning-aimo-progress-prize
https://huggingface.co/blog/keras-nlp-integration
https://huggingface.co/settings/tokens
https://x.com/julien_c/status/1812099420726456457
Check out the full list on our discord!
https://discord.com/invite/JfAtkvEtRb