🏗️ Building on HF

Sergio Paniego PRO

sergiopaniego

https://sergiopaniego.github.io/

sergiopaniego
sergiopaniego
sergio-paniego-blanco

AI & ML interests

None yet

Recent Activity

updated a dataset about 13 hours ago

agents-course/final-certificates

updated a dataset about 13 hours ago

agents-course/course-certificates-of-excellence

updated a dataset 3 days ago

huggingface-projects/Deep-RL-Course-Certification

Organizations

sergiopaniego's collections (9)

Bringing Autonomous Driving RL to OpenEnv and TRL resources

Blog: https://huggingface.co/blog/sergiopaniego/bringing-carla-to-openenv-trl/

Runtime error

RL

CARLA Environment Server

🚗

Control a CARLA driving simulation with custom actions
Runtime error

RL

CARLA Environment Server

🚗

Control a CARLA driving simulator with custom actions
Sleeping

Agents

Carla Grpo Trolley

🚀

Visualize your program’s I/O activity in real time
sergiopaniego/Qwen3-0.6B-carla-trolley-escape

0.8B • Updated Feb 26 • 6

Amazing design resources

Running

118

HFBA

🤗

Hugging Face brand assets — Huggies, logos, and graphics
Running

14

HF Thumbnail Crafter

🎨

Create custom thumbnails for your videos

GUI Grounding datasets

bevaya/ScreenSpot

Viewer • Updated Apr 10, 2024 • 1.27k • 3.04k • 50
OS-Copilot/OS-Atlas-data

Updated Dec 4, 2024 • 3.71k • 46

👁 Vision comparison ftw

Spaces to compare vision models — there’s no single best model, only the best one for your specific use case.

Running

Agents

42

comparevlms

🏃

Compare Vision Language Models
Running on Zero

Agents

68

OCR Time Machine

📚

Extract text from images and XML files using OCR models
Running

Agents

26

Compare Docvqa Models

🦀

Compare different visual question answering
Running on CPU Upgrade

Agents

23

Compare Clip Siglip

🏃

Compare strong zero-shot image classification models

Vision Language Models: 2025 Update

This collection includes all the models, datasets and Spaces mentioned in the blog Vision Language Models: 2025 Update

Qwen/Qwen2.5-Omni-7B

Any-to-Any • 11B • Updated Apr 30, 2025 • 653k • 1.91k
Running

Agents

Featured

372

Qwen2.5 Omni 7B Demo

🏆

Chat with text, audio, images, and video, get spoken replies
Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 173
openbmb/MiniCPM-o-2_6

Any-to-Any • 9B • Updated Oct 5, 2025 • 438k • 1.29k

📝 Research & Long-Form Blog Posts

In-depth technical articles and research pieces published by Hugging Face

Running

3.91k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLMs on large GPU clusters
Running on CPU Upgrade

Featured

3.22k

The Smol Training Playbook

📚

The secrets to building world-class LLMs
Running

330

Evaluation Guidebook

📝

Explore LLM benchmark scores over time
Running

231

FineVision: Open Data is All You Need

📝

A new open-source dataset for training VLMs

Vision reasoning datasets

deepcs233/Visual-CoT

Preview • Updated Mar 11, 2025 • 2.52k • 63
lmms-lab/multimodal-open-r1-8k-verified

Viewer • Updated Jan 27, 2025 • 7.69k • 1.15k • 76
leonardPKU/GEOQA_R1V_Train_8K

Viewer • Updated Feb 11, 2025 • 8.03k • 194 • 14
leonardPKU/clevr_cogen_a_train

Viewer • Updated Feb 2, 2025 • 70k • 476 • 41

My vision Spaces

Vision Spaces created by me

Running on Zero

Agents

Featured

119

VLM Object Understanding

🦀

Explore object detection, visual grounding, keypoint Detecti
Runtime error

Agents

4

VQA Autonomous Driving SmolVLM2

🌖

Visual Question Answering - Autonomous Driving - SmolVLM2

😎 Awesome vision Spaces

Spaces where I've collaborated or that I consider unique!

Running

Agents

42

comparevlms

🏃

Compare Vision Language Models
Runtime error

Agents

4

Gemma3 License Plate Detection

📈

Gemma 3 for license plate detection
Running on Zero

Agents

Featured

143

Gemma 3n E4B It

⚡

Chat with an AI that understands text, images, audio, and video
Running on Zero

Agents

Featured

43

Moondream3

🏢

Image and video tasks with moondream3.
