unalign
community
AI & ML interests
None defined yet.
mlabonneย
authored 2
papers 3 months ago
Post
10335
New family of 1B models just dropped!
> LiquidAI/LFM2.5-1.2B-Base: 10T โ 28T tokens
> LiquidAI/LFM2.5-1.2B-Instruct: new large-scale multi-stage RL
> LiquidAI/LFM2.5-1.2B-JP: our most polite model
> LiquidAI/LFM2.5-VL-1.6B: multi-image multilingual
> LiquidAI/LFM2.5-Audio-1.5B: 8x times faster, no quality loss
Super proud of this release ๐ค
> LiquidAI/LFM2.5-1.2B-Base: 10T โ 28T tokens
> LiquidAI/LFM2.5-1.2B-Instruct: new large-scale multi-stage RL
> LiquidAI/LFM2.5-1.2B-JP: our most polite model
> LiquidAI/LFM2.5-VL-1.6B: multi-image multilingual
> LiquidAI/LFM2.5-Audio-1.5B: 8x times faster, no quality loss
Super proud of this release ๐ค
fernandofernandesย
authored 4
papers 5 months ago
Spectrum: Targeted Training on Signal to Noise Ratio
Paper โข 2406.06623 โข Published โข 16
Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation
Paper โข 2406.14971 โข Published
Training-Free Tokenizer Transplantation via Orthogonal Matching Pursuit
Paper โข 2506.06607 โข Published โข 3
LFM2 Technical Report
Paper โข 2511.23404 โข Published โข 59
Post
8436
LiquidAI/LFM2-8B-A1B just dropped!
8.3B params with only 1.5B active/token ๐
> Quality โ 3โ4B dense, yet faster than Qwen3-1.7B
> MoE designed to run on phones/laptops (llama.cpp / vLLM)
> Pre-trained on 12T tokens โ strong math/code/IF
8.3B params with only 1.5B active/token ๐
> Quality โ 3โ4B dense, yet faster than Qwen3-1.7B
> MoE designed to run on phones/laptops (llama.cpp / vLLM)
> Pre-trained on 12T tokens โ strong math/code/IF
Post
3888
โ๏ธ New drop of tiny task-specific models!
Want to do data extraction, translation, RAG, tool use, or math on a Raspberry Pi? We got you covered! โ
These tiny models were fine-tuned to perform narrow tasks extremely well, making them competitive with much larger models.
You can deploy them today on-device or even on GPUs for big data operations!
LiquidAI/liquid-nanos-68b98d898414dd94d4d5f99a
Want to do data extraction, translation, RAG, tool use, or math on a Raspberry Pi? We got you covered! โ
These tiny models were fine-tuned to perform narrow tasks extremely well, making them competitive with much larger models.
You can deploy them today on-device or even on GPUs for big data operations!
LiquidAI/liquid-nanos-68b98d898414dd94d4d5f99a
Post
6986
Liquid just released two 450M and 1.6B param VLMs!
They're super fast and leverage SigLIP2 NaFlex encoders to handle native resolutions without distortion. It's ideal for on-device deployment in constrained environments like phones.
It's available today on Hugging Face, with an inference and a fine-tuning Colab notebooks.
LiquidAI/LFM2-VL-450M
LiquidAI/LFM2-VL-1.6B
They're super fast and leverage SigLIP2 NaFlex encoders to handle native resolutions without distortion. It's ideal for on-device deployment in constrained environments like phones.
It's available today on Hugging Face, with an inference and a fine-tuning Colab notebooks.
LiquidAI/LFM2-VL-450M
LiquidAI/LFM2-VL-1.6B
Post
5766
Based on a new hybrid architecture, these 350M, 700M, and 1.2B models are both fast and performant, ideal for on-device deployment.
I recommend fine-tuning them to power your next edge application. We already provide Colab notebooks to guide you. More to come soon!
๐ Blog post: https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models
๐ค Models: LiquidAI/lfm2-686d721927015b2ad73eaa38
Post
19585
โ๏ธ AutoAbliteration
I made a Colab notebook to automatically abliterate models.
It's quite general, so you can do interesting stuff like blocking a given language in the model outputs.
๐ป Colab: https://colab.research.google.com/drive/1RmLv-pCMBBsQGXQIM8yF-OdCNyoylUR1?usp=sharing
I made a Colab notebook to automatically abliterate models.
It's quite general, so you can do interesting stuff like blocking a given language in the model outputs.
๐ป Colab: https://colab.research.google.com/drive/1RmLv-pCMBBsQGXQIM8yF-OdCNyoylUR1?usp=sharing
Post
6625
โ๏ธ Gemma 3 Abliterated
I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.
I experimented with different recipes and improved the abliteration technique I wrote about last year.
It's still experimental but the refusal rate is super low in my tests. Enjoy!
mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated
I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.
I experimented with different recipes and improved the abliteration technique I wrote about last year.
It's still experimental but the refusal rate is super low in my tests. Enjoy!
mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated
Post
7260
๐ LLM Course 2025 edition!
I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.
The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.
I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.
Thanks everyone, hope you'll enjoy it!
๐ป LLM Course: https://huggingface.co/blog/mlabonne/llm-course
I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.
The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.
I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.
Thanks everyone, hope you'll enjoy it!
๐ป LLM Course: https://huggingface.co/blog/mlabonne/llm-course
mlabonneย
authored a
paper over 1 year ago
Post
19725
Large models are surprisingly bad storytellers.
I asked 8 LLMs to "Tell me a bedtime story about bears and waffles."
Claude 3.5 Sonnet and GPT-4o gave me the worst stories: no conflict, no moral, zero creativity.
In contrast, smaller models were quite creative and wrote stories involving talking waffle trees and bears ostracized for their love of waffles.
Here you can see a comparison between Claude 3.5 Sonnet and NeuralDaredevil-8B-abliterated. They both start with a family of bears but quickly diverge in terms of personality, conflict, etc.
I mapped it to the hero's journey to have some kind of framework. Prompt engineering can definitely help here, but it's still disappointing that the larger models don't create better stories right off the bat.
Do you know why smaller models outperform the frontier models here?
I asked 8 LLMs to "Tell me a bedtime story about bears and waffles."
Claude 3.5 Sonnet and GPT-4o gave me the worst stories: no conflict, no moral, zero creativity.
In contrast, smaller models were quite creative and wrote stories involving talking waffle trees and bears ostracized for their love of waffles.
Here you can see a comparison between Claude 3.5 Sonnet and NeuralDaredevil-8B-abliterated. They both start with a family of bears but quickly diverge in terms of personality, conflict, etc.
I mapped it to the hero's journey to have some kind of framework. Prompt engineering can definitely help here, but it's still disappointing that the larger models don't create better stories right off the bat.
Do you know why smaller models outperform the frontier models here?
Post
19749
โ๏ธ Uncensor any LLM with abliteration
I wrote an article about abliteration and how NeuralDaredevil-8B was created. Beyond removing alignment, I believe it's an interesting technique with a lot of potential. It's basically fine-tuning without retraining.
In this article, we see how it works, implement it in Google Colab, and heal the abliterated model to recover the performance drop due to this technique. The final model is an uncensored and high-quality model with the highest MMLU score on the Open LLM Leaderboard (8B category).
https://huggingface.co/blog/mlabonne/abliteration
I wrote an article about abliteration and how NeuralDaredevil-8B was created. Beyond removing alignment, I believe it's an interesting technique with a lot of potential. It's basically fine-tuning without retraining.
In this article, we see how it works, implement it in Google Colab, and heal the abliterated model to recover the performance drop due to this technique. The final model is an uncensored and high-quality model with the highest MMLU score on the Open LLM Leaderboard (8B category).
https://huggingface.co/blog/mlabonne/abliteration
Post
12870
๐ AutoMerger created the best 7B model on the Open LLM Leaderboard
By randomly combining top models from the Open LLM Leaderboard, AutoMerger created YamshadowExperiment28-7B. The model is three weeks old and has been at the top of the leaderboard for a week now. It was created through a simple SLERP merge of:
- automerger/YamShadow-7B (another top model created by AutoMerger)
- yam-peleg/Experiment28-7B (a top model from @yam-peleg )
1/ On the Open LLM Leaderboard, it managed to outperform the excellent M7-7b model, which has been the #1 7B model for a while now.
2/ On the YALL leaderboard, YamshadowExperiment28-7B is ranked as the 9th best-performing automerge (but note that the scores are very close to each other). Compared to others, it does not perform particularly well on AGIEval or Bigbench.
3/ Thanks to @sam-paech , I have scores on EQ-Bench, where it managed to outperform all of my previous models. It even surpasses recent models such as DBRX instruct, Qwen1.5 32B Chat, and Cohere's Command R+.
Surprisingly, it does not support ChatML or Mistral Instruct, unlike my other merges (which are part of its family tree). Alpaca works well 99% of the time, but the model can sometimes produce a lot of "INST" tokens for no reason.
In my experiments, YamshadowExperiment28-7B doesn't seem smarter than other successful merges like AlphaMonarch. On the contrary, I found several mathematical or reasoning problems where it fails.
Considering these results, it looks like it might overfit the Open LLM Leaderboard. I guess it's anything but surprising when you randomly merge 156 models.
๐ค Model: automerger/YamshadowExperiment28-7B
๐ AutoMerger: mlabonne/AutoMerger
By randomly combining top models from the Open LLM Leaderboard, AutoMerger created YamshadowExperiment28-7B. The model is three weeks old and has been at the top of the leaderboard for a week now. It was created through a simple SLERP merge of:
- automerger/YamShadow-7B (another top model created by AutoMerger)
- yam-peleg/Experiment28-7B (a top model from @yam-peleg )
1/ On the Open LLM Leaderboard, it managed to outperform the excellent M7-7b model, which has been the #1 7B model for a while now.
2/ On the YALL leaderboard, YamshadowExperiment28-7B is ranked as the 9th best-performing automerge (but note that the scores are very close to each other). Compared to others, it does not perform particularly well on AGIEval or Bigbench.
3/ Thanks to @sam-paech , I have scores on EQ-Bench, where it managed to outperform all of my previous models. It even surpasses recent models such as DBRX instruct, Qwen1.5 32B Chat, and Cohere's Command R+.
Surprisingly, it does not support ChatML or Mistral Instruct, unlike my other merges (which are part of its family tree). Alpaca works well 99% of the time, but the model can sometimes produce a lot of "INST" tokens for no reason.
In my experiments, YamshadowExperiment28-7B doesn't seem smarter than other successful merges like AlphaMonarch. On the contrary, I found several mathematical or reasoning problems where it fails.
Considering these results, it looks like it might overfit the Open LLM Leaderboard. I guess it's anything but surprising when you randomly merge 156 models.
๐ค Model: automerger/YamshadowExperiment28-7B
๐ AutoMerger: mlabonne/AutoMerger
Post
9603
โก AutoQuant
AutoQuant is the evolution of my previous AutoGGUF notebook (https://colab.research.google.com/drive/1P646NEg33BZy4BfLDNpTz0V0lwIU3CHu). It allows you to quantize your models in five different formats:
- GGUF: perfect for inference on CPUs (and LM Studio)
- GPTQ/EXL2: fast inference on GPUs
- AWQ: super fast inference on GPUs with vLLM (https://github.com/vllm-project/vllm)
- HQQ: extreme quantization with decent 2-bit and 3-bit models
Once the model is converted, it automatically uploads it on the Hugging Face Hub. To quantize a 7B model, GGUF only needs a T4 GPU, while the other methods require an A100 GPU.
Here's an example of a model I quantized using HQQ and AutoQuant: mlabonne/AlphaMonarch-7B-2bit-HQQ
I hope you'll enjoy it and quantize lots of models! :)
๐ป AutoQuant: https://colab.research.google.com/drive/1b6nqC7UZVt8bx4MksX7s656GXPM-eWw4
AutoQuant is the evolution of my previous AutoGGUF notebook (https://colab.research.google.com/drive/1P646NEg33BZy4BfLDNpTz0V0lwIU3CHu). It allows you to quantize your models in five different formats:
- GGUF: perfect for inference on CPUs (and LM Studio)
- GPTQ/EXL2: fast inference on GPUs
- AWQ: super fast inference on GPUs with vLLM (https://github.com/vllm-project/vllm)
- HQQ: extreme quantization with decent 2-bit and 3-bit models
Once the model is converted, it automatically uploads it on the Hugging Face Hub. To quantize a 7B model, GGUF only needs a T4 GPU, while the other methods require an A100 GPU.
Here's an example of a model I quantized using HQQ and AutoQuant: mlabonne/AlphaMonarch-7B-2bit-HQQ
I hope you'll enjoy it and quantize lots of models! :)
๐ป AutoQuant: https://colab.research.google.com/drive/1b6nqC7UZVt8bx4MksX7s656GXPM-eWw4