Gustavo A Lujan PRO

lujangusface

AI & ML interests

None yet

Recent Activity

published an article 23 days ago

SpecJAX: A Speculative Decoding Library for TPUs

published an article 26 days ago

We Pitted the Cheapest TPU Against an NVIDIA L4. Here's What 6 Experiments Revealed.

published an article 28 days ago

1.37x Faster on Alibaba's 80B Code Model: EAGLE3 for Qwen3-Coder-Next

Organizations

published an article 23 days ago

Article

SpecJAX: A Speculative Decoding Library for TPUs

lujangusface • 23 days ago

published an article 26 days ago

Article

We Pitted the Cheapest TPU Against an NVIDIA L4. Here's What 6 Experiments Revealed.

lujangusface • 26 days ago • 1

published an article 28 days ago

Article

1.37x Faster on Alibaba's 80B Code Model: EAGLE3 for Qwen3-Coder-Next

lujangusface • 28 days ago

updated a model 28 days ago

thoughtworks/Qwen3-Coder-Next-Eagle3

Text Generation • 0.1B • Updated 28 days ago • 548 • 1

published a model 28 days ago

thoughtworks/Qwen3-Coder-Next-Eagle3

Text Generation • 0.1B • Updated 28 days ago • 548 • 1

published an article 29 days ago

Article

1.7x Faster on a 218B Model: EAGLE3 Speculative Decoding for GLM-4.7

lujangusface • 29 days ago • 1

updated a model 29 days ago

thoughtworks/GLM-4.7-FP8-Eagle3

Text Generation • 0.6B • Updated 29 days ago • 33 • 1

published a model 29 days ago

thoughtworks/GLM-4.7-FP8-Eagle3

Text Generation • 0.6B • Updated 29 days ago • 33 • 1

updated 2 models about 1 month ago

thoughtworks/MiniMax-M2.5-Eagle3

Text Generation • 0.2B • Updated Apr 12 • 4.54k • 5

thoughtworks/GLM-4.7-Flash-Eagle3

Text Generation • 0.1B • Updated Apr 12 • 333 • 2

published an article about 1 month ago

Article

2x Faster on a 229B MoE: EAGLE3 Speculative Decoding for MiniMax-M2.5

lujangusface • Apr 9 • 3

published a model about 1 month ago

thoughtworks/MiniMax-M2.5-Eagle3

Text Generation • 0.2B • Updated Apr 12 • 4.54k • 5

upvoted an article about 1 month ago

Article

Google Released Gemma-4 Four Days Ago. We Already Made It 1.72× Faster.

lujangusface • Apr 7 • 2

published an article about 1 month ago

Article

Google Released Gemma-4 Four Days Ago. We Already Made It 1.72× Faster.

lujangusface • Apr 7 • 2

updated a model about 1 month ago

thoughtworks/Gemma-4-31B-Eagle3

Text Generation • 0.6B • Updated Apr 7 • 451 • 5

published a model about 1 month ago

thoughtworks/Gemma-4-31B-Eagle3

Text Generation • 0.6B • Updated Apr 7 • 451 • 5
