
Abhay kumar

akanyaani

AI & ML interests

LLMs, GenAI, Transformers

Recent Activity

authored a paper about 3 hours ago

Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention

Paper • 2606.20945 • Published 10 days ago • 75

upvoted a paper 4 days ago

Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention

Paper • 2606.20945 • Published 10 days ago • 75

liked a model 27 days ago

FrontiersMind/Nandi-Mini-V1.1-600M-Intermediate-Checkpoint-400GT

Text Generation • 0.6B • Updated 28 days ago • 365 • 8

liked 4 models about 1 month ago

liked a model 2 months ago

FrontiersMind/Nandi-Mini-150M-Tool-Calling

Text Generation • 0.2B • Updated May 18 • 2.95k • 52

liked 2 models 3 months ago

FrontiersMind/Nandi-Mini-150M-Instruct

Text Generation • 0.2B • Updated May 18 • 240 • 52

FrontiersMind/Nandi-Mini-150M

Text Generation • 0.2B • Updated May 15 • 2.77k • 140

upvoted an article 5 months ago

Alyah ⭐️: Toward Robust Evaluation of Emirati Dialect Capabilities in Arabic LLMs

Article • tiiuae • Jan 27 • 26

liked a Space 5 months ago

Falcon-H1-Tiny: A series of extremely small, yet powerful language models redefining capabilities at small scale

📝

Generate text using extremely small yet powerful language models

upvoted a paper 6 months ago

Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers

Paper • 2601.04890 • Published Jan 8 • 44

liked a model 6 months ago

tiiuae/Falcon-H1R-7B

Text Generation • 8B • Updated Jan 21 • 880 • 218

liked a model 8 months ago

Fortytwo-Network/Strand-Rust-Coder-14B-v1

Text Generation • 15B • Updated Jan 5 • 551 • 182

upvoted 2 papers about 1 year ago

Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding

Paper • 2506.16035 • Published Jun 19, 2025 • 89

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published Apr 3, 2025 • 90

upvoted an article about 1 year ago

Introducing smolagents: simple agents that write actions in code.

Article • m-ric, merve, thomwolf • Dec 31, 2024 • 1.2k

authored a paper about 1 year ago

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published Apr 3, 2025 • 90

upvoted a paper about 1 year ago

Variance Control via Weight Rescaling in LLM Pre-training

Paper • 2503.17500 • Published Mar 21, 2025 • 5
