> - Quality ≈ 3-4B dense, yet faster than Qwen3-1.7B
> - MoE designed to run on phones/laptops (llama.cpp / vLLM)
> - Pre-trained on 12T tokens → strong math/code/IF
Liquid just released two VLMs: 450M and 1.6B params!
They're super fast and leverage SigLIP2 NaFlex encoders to handle native resolutions without distortion, making them ideal for on-device deployment in constrained environments like phones.
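To make the "native resolutions without distortion" point concrete, here is a minimal sketch of the patch-grid idea behind NaFlex-style encoders: rather than warping every image to a fixed square, the image is mapped to a variable patch grid that preserves its aspect ratio under a patch budget. This is an illustrative approximation, not the actual SigLIP2 implementation; the function name and defaults are assumptions.

```python
import math

def naflex_patch_grid(height, width, patch_size=16, max_patches=256):
    """Sketch: choose a patch grid that keeps the image's aspect ratio.

    A fixed-resolution encoder would squash a 16:9 image into a square;
    a NaFlex-style encoder instead resizes to the nearest patch-size
    multiples that fit a patch budget, so proportions are preserved.
    """
    # Scale so the total patch count roughly matches the budget.
    scale = math.sqrt(max_patches * patch_size**2 / (height * width))
    rows = max(1, round(height * scale / patch_size))
    cols = max(1, round(width * scale / patch_size))
    # Nudge down if rounding overshot the budget.
    while rows * cols > max_patches:
        if rows >= cols:
            rows -= 1
        else:
            cols -= 1
    return rows, cols
```

For a 1280x720 frame this yields a wide grid (more columns than rows) instead of a square one, which is why text and fine detail survive better at native aspect ratios.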
They're available today on Hugging Face, with inference and fine-tuning Colab notebooks.
Breaking news in Clinical AI: Introducing the OpenMed NER Model Discovery App on Hugging Face
OpenMed is back! Finding the right biomedical NER model just became as precise as a PCR assay!
I'm thrilled to unveil my comprehensive OpenMed Named Entity Recognition Model Discovery App that puts 384 specialized biomedical AI models at your fingertips.
Why This Matters in Healthcare AI: Traditional clinical text mining required hours of manual model evaluation. My Discovery App instantly connects researchers, clinicians, and data scientists with the exact NER models they need for their biomedical entity extraction tasks.
What You Can Discover:
- Pharmacological Models: extract chemical compounds, drug interactions, and pharmaceutical entities from clinical notes
- Genomics & Proteomics: identify DNA sequences, RNA transcripts, gene variants, protein complexes, and cell lines
- Pathology & Disease Detection: recognize pathological formations, cancer types, and disease entities in medical literature
- Anatomical Recognition: map anatomical systems, tissue types, organ structures, and cellular components
- Clinical Entity Extraction: detect organism species, amino acids, protein families, and multi-tissue structures
Advanced Features:
- Intelligent Entity Search: find models by specific biomedical entities (e.g., "Show me models detecting CHEM + DNA + Protein")
- Domain-Specific Filtering: browse by Oncology, Pharmacology, Genomics, Pathology, Hematology, and more
- Model Architecture Insights: compare BERT, RoBERTa, and DeBERTa implementations
- Real-Time Search: auto-filtering as you type, no search buttons needed
- Clinical-Grade UI: clean, intuitive interface designed for medical professionals
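The entity-search feature above boils down to set containment: return only the models whose supported entities cover everything in the query (e.g., "CHEM + DNA + Protein"). Here is a minimal sketch of that filtering logic; the catalog entries and model names below are made-up illustrative data, not the app's real 384-model index.

```python
# Hypothetical catalog: each entry lists the entity types a model detects.
CATALOG = [
    {"model": "openmed-ner-chem-demo",    "entities": {"CHEM"}},
    {"model": "openmed-ner-bio-demo",     "entities": {"CHEM", "DNA", "Protein"}},
    {"model": "openmed-ner-disease-demo", "entities": {"Disease", "Cancer"}},
]

def find_models(catalog, wanted):
    """Return models that detect ALL of the requested entity types."""
    wanted = set(wanted)
    # Subset test: the model qualifies only if its entity set covers the query.
    return [m["model"] for m in catalog if wanted <= m["entities"]]
```

Because the filter is a pure function over an in-memory catalog, it can re-run on every keystroke, which is what makes the auto-filtering "real-time search" experience feasible without a search button.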
Ready to revolutionize your biomedical NLP pipeline?
Try it now: OpenMed/openmed-ner-models
Built with: Gradio, Transformers, Advanced Entity Mapping
Super excited to launch Hugging Face Sheets: Spreadsheets meet AI and unstructured data.
A few months ago, we started imagining new ways to build and transform datasets with the latest open-source models.
Today, I'm thrilled to introduce our first step in this direction.
In a nutshell:
- Effortlessly run prompts and models over your data.
- Agentic search for accuracy and real-time information.
- Familiar, minimalistic interface for interacting with data.
- Human feedback 2.0: your input directly improves generated data.
- Access hundreds of open models and leading inference providers.
I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.
The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.
I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.
Announcing Global-MMLU: an improved, open MMLU dataset with evaluation coverage across 42 languages, built with Argilla and the Hugging Face community.
+200 contributors used Argilla to identify MMLU questions where regional, dialect, or cultural knowledge was required to answer correctly. 85% of the questions required Western-centric knowledge!
Thanks to this annotation process, the open dataset contains two subsets:
1. Culturally Agnostic: no specific regional or cultural knowledge is required.
2. Culturally Sensitive: requires dialect, cultural, or geographic knowledge to answer correctly.
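The split above can be sketched as a simple rule over the annotations: a question lands in the Culturally Sensitive subset if any annotator flagged regional, dialect, or cultural knowledge as required, and in Culturally Agnostic otherwise. The field names below are illustrative assumptions, not the actual Global-MMLU schema.

```python
def split_subsets(questions):
    """Partition annotated questions into the two hypothetical subsets."""
    agnostic, sensitive = [], []
    for q in questions:
        # "Sensitive" if any cultural-knowledge flag was raised by annotators.
        if any(q["flags"].values()):
            sensitive.append(q)
        else:
            agnostic.append(q)
    return agnostic, sensitive
```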
Moreover, we provide high-quality translations for 25 of the 42 languages, thanks again to the community and professional annotators using Argilla on the Hub.
I hope this leads to a better understanding of the limitations and challenges of making open AI useful for many languages.
Build datasets for AI on the Hugging Face Hub: 10x easier than ever!
Today, I'm excited to share our biggest feature since we joined Hugging Face.
Here's how it works:
1. Pick a dataset: upload your own or choose from 240K open datasets.
2. Paste the Hub dataset ID into Argilla and set up your labeling interface.
3. Share the URL with your team or the whole community!
And the best part? It's:
- No code: no Python needed
- Integrated: all within the Hub
- Scalable: from solo labeling to 100s of contributors
I am incredibly proud of the team for shipping this after weeks of work and many quick iterations.
Let's make this sentence obsolete: "Everyone wants to do the model work, not the data work."