Anthonny Olime

Aviv-anthonnyolime

4 196 631

AI & ML interests

None yet

Recent Activity

liked a model 8 days ago

deepseek-ai/DeepSeek-V4-Pro-DSpark

liked a model 9 days ago

LiquidAI/LFM2.5-230M

updated a collection 10 days ago

Model - Misc

Organizations

upvoted a collection 3 months ago

TIPSv2

Collection

TIPSv2 foundational vision-language models. Webpage: https://gdm-tipsv2.github.io/ • 9 items • Updated Apr 14 • 38

upvoted 2 papers 3 months ago

DMax: Aggressive Parallel Decoding for dLLMs

Paper • 2604.08302 • Published Apr 9 • 54

Attention Residuals

Paper • 2603.15031 • Published Mar 16 • 189

upvoted a paper 4 months ago

FireRed-Image-Edit-1.0 Technical Report

Paper • 2602.13344 • Published Feb 12 • 8

upvoted a collection 4 months ago

TADA

Collection

TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment | https://huggingface.co/papers/2602.23068 • 7 items • Updated Mar 24 • 71

upvoted 4 articles 4 months ago

Article

NEO-unify: Building Native Multimodal Unified Models End to End

sensenova • Mar 5 • 167

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

drbh, danieldk, Narsil, pcuenq, pagezyhf, merve, reach-vb • Jun 12, 2025 • 164

Article

Mixture of Experts (MoEs) in Transformers

ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 169

Article

PRX Part 3 — Training a Text-to-Image Model in 24h!

Photoroom • Mar 3 • 67

upvoted 2 papers 5 months ago

Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 105

Qwen3-TTS Technical Report

Paper • 2601.15621 • Published Jan 22 • 77

upvoted a collection 7 months ago

Ministral 3

Collection

A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 170

upvoted an article 7 months ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 312

upvoted a paper 7 months ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 248

upvoted an article 7 months ago

Article

From GRPO to DAPO and GSPO: What, Why, and How

NormalUhr • Aug 9, 2025 • 128

upvoted 2 articles 8 months ago

Article

Text-to-image Architectural Experiments

Photoroom • Nov 13, 2025 • 60

Article

The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix

codelion • Nov 3, 2025 • 65

upvoted an article 10 months ago

Article

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

ariG23498, sergiopaniego, reach-vb, pcuenq, ArthurZ, SaylorTwift, cyrilvallez • Sep 11, 2025 • 188

upvoted a paper 11 months ago

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Paper • 2411.05007 • Published Nov 7, 2024 • 25

upvoted an article 11 months ago

Article

From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels

drbh, danieldk • Aug 18, 2025 • 104
