-
The Ultra-Scale Playbook
3.67k • The ultimate guide to training LLM on large GPU Clusters
-
The Smol Training Playbook
2.95k • The secrets to building world-class LLMs
-
Evaluation Guidebook
266 • Display benchmark evaluation data for LLMs
-
FineVision: Open Data is All You Need
217 • A new open-source dataset for training VLMs
Sergio Paniego PRO
AI & ML interests
Recent Activity
Organizations
-
Sleeping • 41
comparevlms
41 • Compare Vision Language Models
-
Runtime error • 4
Gemma3 License Plate Detection
4 • Gemma 3 for license plate detection
-
Running on Zero • Featured • 142
Gemma 3n E4B It
142 • Generate text responses to images, videos, and audio
-
Running on Zero • Featured • 37
Moondream3
37 • Image and video tasks with moondream3.
-
Running on Zero • 66
OCR Time Machine
66 • Extract text from images and XML files using OCR models
-
Sleeping • 26
Compare Docvqa Models
26 • Compare different visual question answering
-
Running on CPU Upgrade • 23
Compare Clip Siglip
23 • Compare strong zero-shot image classification models
-
Qwen/Qwen2.5-Omni-7B
Any-to-Any • 11B • Updated • 209k • 1.85k
-
Running • Featured • 365
Qwen2.5 Omni 7B Demo
365 • Generate text and speech responses from text, audio, images, or video input
-
Qwen2.5-Omni Technical Report
Paper • 2503.20215 • Published • 169
-
openbmb/MiniCPM-o-2_6
Any-to-Any • 9B • Updated • 89.8k • 1.28k