Piotr

piotr-ai

AI & ML interests

None yet

Recent Activity

published a model 6 days ago

piotr-ai/polanka_3.7b_exp_wip_260706

updated a model 6 days ago

piotr-ai/polanka_3.7b_exp_wip_260706

liked a model about 1 month ago

google/diffusiongemma-26B-A4B-it

Organizations

None yet

upvoted a collection about 1 month ago

Gemma 4 QAT Q4_0

19 items • Updated Jun 5 • 142

upvoted a collection 3 months ago

Gemma 4

15 items • Updated Jun 10 • 1.02k

upvoted a collection 6 months ago

FLUX.2

Our second generation of FLUX • 21 items • Updated Apr 6 • 257

upvoted an article 6 months ago

The Optimal Architecture for Small Language Models

Article • codelion • Dec 26, 2025 • 121

upvoted a collection 11 months ago

DeepSeek-V3.1

3 items • Updated Mar 2 • 264

upvoted 2 collections about 1 year ago

LLaMA-Omni

13 items • Updated May 17, 2025 • 20

NextCoder

NextCoder family of code-editing LMs developed with Selective Knowledge Transfer and its training data. • 6 items • Updated Jul 9, 2025 • 80

upvoted a paper over 1 year ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 210

upvoted an article over 1 year ago

Introducing EuroBERT: A High-Performance Multilingual Encoder Model

Article • EuroBERT • Mar 10, 2025 • 149

upvoted 2 collections over 1 year ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19, 2025 • 190

SYNTHETIC-1

A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Oct 7, 2025 • 67

upvoted a paper over 1 year ago

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published Jan 29, 2025 • 58

upvoted a collection over 1 year ago

Cosmos-Predict1

⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/cosmos3 • 14 items • Updated about 1 month ago • 304

upvoted 2 papers over 1 year ago

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1, 2025 • 110

Transformers Can Navigate Mazes With Multi-Step Prediction

Paper • 2412.05117 • Published Dec 6, 2024 • 5

upvoted 3 collections over 1 year ago

Common Models

The first generation of models pretrained on Common Corpus. • 5 items • Updated Dec 5, 2024 • 43

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 309

LayerSkip

Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 • 8 items • Updated Nov 21, 2024 • 49

upvoted a paper almost 2 years ago

The AdEMAMix Optimizer: Better, Faster, Older

Paper • 2409.03137 • Published Sep 5, 2024 • 6

upvoted a collection almost 2 years ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 675