AI & ML interests

None defined yet.

Recent Activity

IlyasMoutawwakil posted an update 3 days ago
Transformers v5 just landed! 🚀
It significantly unifies and reduces modeling code across architectures, while opening the door to a whole new class of performance optimizations.

My favorite new feature? 🤔
The new dynamic weight loader + converter. Here's why 👇

Over the last few months, the core Transformers maintainers built an incredibly fast weight loader, capable of converting tensors on the fly while loading them in parallel threads. This means we're no longer constrained by how parameters are laid out inside the safetensors weight files.

In practice, this unlocks two big things:
- Much more modular modeling code. You can now clearly see how architectures build on top of each other (DeepSeek v2 → v3, Qwen v2 → v3 → MoE, etc.). This makes shared bottlenecks obvious and lets us optimize the right building blocks once, for all model families.
- Performance optimizations beyond what torch.compile can do alone. torch.compile operates on the computation graph, but it can't change parameter layouts. With the new loader, we can restructure weights at load time: fusing MoE expert projections, merging attention QKV projections, and enabling more compute-dense kernels that simply weren't possible before (a small sketch of the idea follows this list).
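For intuition, here's a minimal, self-contained sketch of the kind of load-time restructuring this enables: merging separate Q/K/V projection weights into one fused matrix. The tensor names and shapes are purely illustrative, not the actual Transformers loader internals:

```python
import torch

# Toy state dict with separate Q/K/V projections (illustrative names only).
hidden = 64
state_dict = {
    "attn.q_proj.weight": torch.randn(hidden, hidden),
    "attn.k_proj.weight": torch.randn(hidden, hidden),
    "attn.v_proj.weight": torch.randn(hidden, hidden),
}

# Load-time rewrite: concatenate along the output dim -> (3 * hidden, hidden),
# so three small GEMMs become one larger, more compute-dense GEMM.
fused = torch.cat(
    [state_dict.pop(f"attn.{p}_proj.weight") for p in ("q", "k", "v")], dim=0
)
state_dict["attn.qkv_proj.weight"] = fused

# At runtime: one matmul, then split the result back into q, k, v.
x = torch.randn(2, hidden)
q, k, v = (x @ fused.T).split(hidden, dim=-1)
```

torch.compile can't perform this rewrite itself, because the parameter layout is fixed before the graph is traced; doing it in the loader makes the fused layout native to the model.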

Personally, I'm honored to have contributed in this direction, including the work on optimizing MoE implementations and making modeling code more torch-exportable, so these optimizations can be ported cleanly across runtimes.

Overall, Transformers v5 is a strong signal of where the community and industry are converging: Modularity and Performance, without sacrificing Flexibility.

Transformers v5 turns its signature from_pretrained into an entrypoint where you can mix and match (a rough sketch follows the list):
- Parallelism
- Quantization
- Custom kernels
- Flash/Paged attention
- Continuous batching
- ...
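
As an illustration of that composition (the checkpoint id is a placeholder, and argument names follow recent transformers releases, e.g. dtype replacing torch_dtype, so they may differ slightly in your version):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Placeholder checkpoint; swap in your own model.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B",
    dtype=torch.bfloat16,                     # precision
    device_map="auto",                        # device placement
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # quantization
    attn_implementation="flash_attention_2",  # attention backend
)
```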

Kudos to everyone involved! I highly recommend:
- Release notes: https://github.com/huggingface/transformers/releases/tag/v5.0.0
- Blog post: https://huggingface.co/blog/transformers-v5
IlyasMoutawwakil posted an update 7 days ago
After 2 months of refinement, I'm happy to announce that a lot of Transformers' modeling code is now significantly more torch.compile- and export-friendly 🔥

Why it had to be done 👇
PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single Dynamo-traced graph!

Transformers models are now easier to:
βš™οΈ Compile end-to-end with torch.compile backends
πŸ“¦ Export reliably via torch.export and torch.onnx.export
πŸš€ Deploy to ONNX / ONNX Runtime, Intel Corporation's OpenVINO, NVIDIA AutoDeploy (TRT-LLM), AMD's Quark, Meta's Executorch and more hardware-specific runtimes.

This work aims at unblocking entire TorchDynamo-based toolchains that rely on exporting Transformers across runtimes and accelerators.
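
As a quick smoke test of both paths (the checkpoint is a placeholder, and behavior depends on your transformers/PyTorch versions; disabling the KV cache keeps the captured graph simple):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; any dynamo-traceable model should work similarly.
model = AutoModelForCausalLM.from_pretrained("gpt2", use_cache=False).eval()
tok = AutoTokenizer.from_pretrained("gpt2")
inputs = tok("Hello, world", return_tensors="pt")

# End-to-end compilation with a torch.compile backend.
compiled = torch.compile(model)
with torch.no_grad():
    out = compiled(**inputs)

# Single-graph capture via torch.export.
exported = torch.export.export(
    model, args=(), kwargs={"input_ids": inputs["input_ids"]}
)
print(exported)
```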

We are doubling down on Transformers' commitment to being a first-class citizen of the PyTorch ecosystem: more exportable, more optimizable, and easier to deploy everywhere.

There are definitely some edge cases we still haven't addressed, so don't hesitate to try compiling/exporting your favorite transformers and to open issues/PRs.

PR in the comments! More updates coming soon!
regisss in AIEnergyScore/README about 1 month ago
meg posted an update 3 months ago
🤖 Did you know your voice might be cloned without your consent from just *one sentence* of audio?
That's not great. So with @frimelle, we brainstormed a new idea for developers who want to curb malicious use: ✨ The Voice Consent Gate ✨
Details, code, here: https://huggingface.co/blog/voice-consent-gate
meg posted an update 5 months ago
🤖 As AI-generated content is shared in movies, TV, and across the web, there's one simple piece of low-hanging fruit 🍇 to help people know what's real: visible watermarks. With the Gradio team, I've made sure it's trivially easy to add this disclosure to images, video, and chatbot text. See how: https://huggingface.co/blog/watermarking-with-gradio
Thanks in particular to @abidlabs and Yuvraj Sharma for the code collaboration.
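
The blog post shows the Gradio-side integration; purely for intuition, a visible watermark on an image is just a compositing step. A minimal Pillow sketch (illustrative only, not the Gradio API; the function name is made up):

```python
from PIL import Image, ImageDraw

def stamp_disclosure(img: Image.Image, text: str = "AI-generated") -> Image.Image:
    """Overlay a simple visible text disclosure in the bottom-right corner."""
    out = img.convert("RGBA")
    overlay = Image.new("RGBA", out.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)
    w, h = out.size
    # Default font; a real deployment would pick a legible size and contrast.
    draw.text((w - 110, h - 20), text, fill=(255, 255, 255, 220))
    return Image.alpha_composite(out, overlay).convert("RGB")

# Demo on a plain gray image; in practice this runs on generated outputs.
stamp_disclosure(Image.new("RGB", (320, 240), "gray")).save("stamped.png")
```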
yjernite posted an update 5 months ago
Tremendous quality-of-life upgrade on the Hugging Face Hub: we now have auto-complete emojis 🤗 🥳 👏 🙌 🎉

Get ready for lots more very serious analysis on a whole range of topics from yours truly, now that we have unlocked this full range of expression 😄 🤔 🗣 🙊
meg posted an update 6 months ago
🤖 ICYMI: Yesterday, Hugging Face and OpenAI partnered to bring open source GPT to the public. This is a Big Deal in "AI world".

0. Common ground setting: OpenAI is the ChatGPT people. An "open source" model is one whose weights are available; that means the model can be "yours".
1. You don't have to interact with the company directly, nor give them your interactions, to use the system. The company can't "surveil" you.
2. You can evaluate the unique contributions of their SOTA model much more rigorously than you can when there are collections of models+code behind a closed API. You can find out specifically what the model can and can't do.
3. And you can directly customize it for whatever you'd like. Fine-tuning, wherein you give the model data that's tailored to your use cases and train it some more on that data, is trivial* when you have the model weights (see the sketch after this list).
*Provided you have the compute.
4. You can directly benchmark whatever you'd like. Biases? Energy usage? Strengths/weaknesses? Go for it. You wants it, you gots it: this transparency helps people understand SOTA *in general*, not just for this model, and points to, e.g., what's going on with closed Google models as well.
5. One of the most powerful things about "openness" that I've learned is that it cultivates ecosystems of collaborators building on top of one another's brilliance to make systems that are significantly better than they would be if created in isolation.
But, caveat wrt my own philosophy...
6. I do not take it as a given that advancing LLMs is good, and have a lot more to say wrt where I think innovation should focus more. For example, a focus on *data* -- curation, measurement, consent, credit, compensation, safety -- would deeply improve technology for everyone.
7. The transparency this release provides is massive for people who want to *learn* about LLMs. For the next generation of technologists to advance over the current, they MUST be able to learn about what's happening now. (cont...)
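
To make point 3 concrete, here's a minimal fine-tuning sketch, assuming the open-weights checkpoint is on the Hub as "openai/gpt-oss-20b", with toy samples standing in for your own data (and, per the footnote, enough compute):

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_id = "openai/gpt-oss-20b"  # assumption: the open-weights checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Toy samples standing in for data tailored to your use case.
texts = ["Domain-specific example one.", "Domain-specific example two."]

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True,
                    padding="max_length", max_length=64)
    enc["labels"] = enc["input_ids"].copy()  # causal LM: predict the inputs
    return enc

train_ds = Dataset.from_dict({"text": texts}).map(
    tokenize, batched=True, remove_columns=["text"]
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned",
                           per_device_train_batch_size=1, num_train_epochs=1),
    train_dataset=train_ds,
)
trainer.train()
```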
meg posted an update 6 months ago
🤖 👾 Thanks so much to BBC News and the stellar Suranjana Tewari for having me on to talk about the US–China relationship in AI, and what it means for AI ethics.