Ivan Baldo

ibaldonl

https://www.netlabs.us/

AI & ML interests

MLOps, Scalability, Performance, On-Premises

Recent Activity

liked a dataset 12 days ago

harborframework/terminal-bench-2-leaderboard

upvoted a collection 15 days ago

new activity 15 days ago

deucebucket/Qwen3.6-35B-A3B-Cerebellum-GGUF:16gb card owners: post your numbers

Organizations

None yet

upvoted a collection 15 days ago

GSQ

GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling, https://huggingface.co/papers/2604.18556 • 9 items • Updated May 25 • 9

upvoted 2 collections 2 months ago

DFlash

Block Diffusion for Flash Speculative Decoding • 23 items • Updated about 9 hours ago • 139

Qwen3.6

4 items • Updated Apr 22 • 419

upvoted 2 collections 3 months ago

Gemma 4

15 items • Updated 18 days ago • 998

ParoQuant

Pairwise Rotation Quantization for Efficient Reasoning LLM Inference • 24 items • Updated 21 days ago • 27

upvoted an article 4 months ago

FINAL Bench: The Real Bottleneck to AGI Is Self-Correction

Article • FINAL-Bench • Feb 21 • 20

upvoted a collection 4 months ago

Qwen3.5

21 items • Updated Mar 9 • 1.7k

upvoted an article 5 months ago

🔢 INT4 vs FP4: The Future of 4-Bit Quantization

Article • onekq • Nov 19, 2025 • 7

upvoted a collection 9 months ago

Inference Optimized Checkpoints (with Model Optimizer)

A collection of generative models quantized and optimized for inference with Model Optimizer. • 79 items • Updated 2 days ago • 176

upvoted an article about 1 year ago

Gemma 3n fully available in the open-source ecosystem!

Article • ariG23498, pcuenq, sergiopaniego, reach-vb, FL33TW00D-HF, Xenova, Steveeeeeeen, kashif • Jun 26, 2025 • 121

upvoted 4 collections about 1 year ago

Qwen3

84 items • Updated Dec 31, 2025 • 1.82k

Qwen3-Reranker

3 items • Updated Dec 31, 2025 • 71

Qwen3-Embedding

6 items • Updated Dec 31, 2025 • 173

Gemma 3 QAT

Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated Mar 12 • 219

upvoted a paper about 2 years ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 630