Nicolas PREVOT
Nomiia
1 follower · 3 following
AI & ML interests
None yet
Recent Activity
upvoted an article 6 days ago: Introduction to Trimming ✂
liked a Space about 1 year ago: CATIE-AQ/FAT5-report
reacted to lbourdois's post with ❤️ about 1 year ago:
We introduce FAT5 (Flash Attention T5) ⚡ An implementation of T5 in PyTorch with the UL2 objective, optimized for GPGPU for both training and inference thanks to 13 different optimizations. The main one is a CUDA kernel we designed to extend Flash Attention by @tridao with RPE biases; it also supports other positional encodings such as RoPE, ALiBi or FIRE. The resulting kernel is 2 times faster than an SDPA implementation. We also use Triton kernels to optimize certain parts of the architecture, such as the cross-entropy and the RMSNorm layer. The various kernels have been carefully built to be compatible with BF16 and torch.compile, to go even faster and achieve efficient pretraining. All other optimizations are described in a 📝 blog post available on @huggingface 🤗: https://huggingface.co/spaces/CATIE-AQ/FAT5-report. As a proof of concept, this methodology enabled us to pretrain a French FAT5 with 147M parameters in a reasonable time (1,461 hours for 419B tokens), with limited resources (1 A100, i.e. a computational budget of ~€1,900) and a low carbon footprint (13.5 kg CO2 eq). The model's weights are also available on Hugging Face: https://huggingface.co/CATIE-AQ/FAT5-small. The model is not very useful in practice: it is a PoC, not an instruction-tuned model (that is planned for later). All the code is available on GitHub if you want to pretrain your own model in your own language or for a specific domain: https://github.com/catie-aq/flashT5 ⭐ Finally, this was a joint project with @BorisAlbar at hf.co/CATIE-AQ.
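To put the figures quoted in the post into perspective, here is a minimal back-of-the-envelope sketch in Python. The five input numbers come from the post itself; the function name `run_summary` and the derived metrics are illustrative assumptions rather than anything the authors published, and the arithmetic assumes the 1,461 hours are wall-clock hours on the single A100.

```python
# Figures quoted in the post. Everything derived from them below is a rough
# estimate, not a number reported by the authors.
PARAMETERS = 147e6   # model size
TOKENS = 419e9       # tokens seen during pretraining
GPU_HOURS = 1_461    # assumed to be wall-clock hours on one A100
BUDGET_EUR = 1_900   # approximate compute budget
CO2_KG = 13.5        # reported carbon footprint, kg CO2 eq


def run_summary(parameters, tokens, gpu_hours, budget_eur, co2_kg):
    """Derive simple per-unit rates from the headline training figures.

    Hypothetical helper written for this illustration only.
    """
    return {
        "tokens_per_second": tokens / (gpu_hours * 3600),
        "tokens_per_parameter": tokens / parameters,
        "eur_per_gpu_hour": budget_eur / gpu_hours,
        "co2_g_per_gpu_hour": co2_kg * 1000 / gpu_hours,
    }


summary = run_summary(PARAMETERS, TOKENS, GPU_HOURS, BUDGET_EUR, CO2_KG)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the run averages a little under 80,000 tokens per second, about 2,850 tokens per parameter, and roughly €1.30 per GPU hour.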
Organizations
models: 0 (none public yet)
datasets: 0 (none public yet)