Article: KV Caching Explained: Optimizing Transformer Inference Efficiency (Jan 30, 2025)
Article: You could have designed state of the art positional encoding (Nov 25, 2024)
Article: Efficient LLM Pretraining: Packed Sequences and Masked Attention (Oct 7, 2024)
Paper: Llama 2: Open Foundation and Fine-Tuned Chat Models (arXiv 2307.09288, published Jul 18, 2023)