---
license: other
license_name: lfm-nanotron-prism-research
license_link: LICENSE.md
language:
- en
tags:
- lfm
- prism
- gspo
- hybrid-architecture
- tool-use
- Thinking
pipeline_tag: text-generation
library_name: transformers
---
![image](https://cdn-uploads.huggingface.co/production/uploads/63adf1fa42fd3b8dbaeb0c92/9RwVQ2zsEqvFDNaGkOBTO.png)
<div align="center">
# lfm-Nanotron: 2.6B-PRISM-SFT-GSPO-AutoRoundV2
</div>
<div align="center">
**LFM architecture model: SFT + GSPO RL + PRISM**
[![Model](https://img.shields.io/badge/Model-2.6B-blue)]()
[![Architecture](https://img.shields.io/badge/Architecture-LFM2%20Hybrid-green)]()
[![Context](https://img.shields.io/badge/Context-128K-orange)]()
</div>
## Model Description
**lfm-Nanotron**: Limited Edition 2.6B PRISM Model Access. Unlock a cutting-edge nano-sized AI model!
This is **lfm-Nanotron**, a nano-sized, 2.6B-parameter hybrid-architecture language model fine-tuned with techniques you won't find in mainstream releases:
- SFT (Test-Time Supervised Fine-Tuning): adaptive optimization at inference
- GSPO (Group Sequence Policy Optimization): RL-enhanced reasoning, instruction following, thinking, tool calling, and logic
- PRISM (Projected Refusal Isolation via Subspace Modification): state-of-the-art removal of over-refusal and propaganda from LLMs
- 128K Context Window: handle massive prompts with ease
- Agentic Tool Calling: built for multi-turn, thinking, and instruction-following tasks
### Architecture Details
| Parameter | Value |
|-----------|-------|
| Parameters | ~2.6B |
| Hidden Size | 2048 |
| Layers | 30 (22 Conv + 8 Full Attention) |
| Attention Heads | 32 |
| KV Heads | 8 (GQA) |
| Vocabulary | 65,536 |
| Max Context | 128,000 tokens |
| Architecture | Hybrid Conv + Attention (LFM2) |
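
The table values allow a rough estimate of KV-cache memory at long context. The sketch below is an illustration under stated assumptions, not a measurement: it assumes the head dimension is the hidden size divided by the number of attention heads, that the cache stores 16-bit values, and that only the 8 full-attention layers keep a per-token cache (the convolution layers are assumed to hold only a small fixed-size state). The function name `kv_cache_bytes` is hypothetical.

```python
# Values copied from the Architecture Details table.
HIDDEN_SIZE = 2048
ATTENTION_HEADS = 32
KV_HEADS = 8            # grouped-query attention (GQA)
ATTENTION_LAYERS = 8    # only the full-attention layers are counted
MAX_CONTEXT = 128_000

# Assumptions, not stated in this README:
HEAD_DIM = HIDDEN_SIZE // ATTENTION_HEADS  # assumes heads split the hidden size evenly
BYTES_PER_VALUE = 2                        # assumes a 16-bit KV cache


def kv_cache_bytes(tokens: int, kv_heads: int = KV_HEADS) -> int:
    """Estimate KV-cache size in bytes for a given number of cached tokens."""
    # Factor 2 covers one key vector and one value vector per KV head and layer.
    per_token = 2 * ATTENTION_LAYERS * kv_heads * HEAD_DIM * BYTES_PER_VALUE
    return per_token * tokens


per_token = kv_cache_bytes(1)
full_context = kv_cache_bytes(MAX_CONTEXT)
print(f"per token: {per_token / 1024:.0f} KiB")
print(f"at {MAX_CONTEXT:,} tokens: {full_context / 2**20:.0f} MiB")
```

Under these assumptions the cache costs 16 KiB per token, or about 2000 MiB at the full 128,000-token context. Calling `kv_cache_bytes(MAX_CONTEXT, kv_heads=ATTENTION_HEADS)` shows the 4x larger figure that the same layers would need without GQA.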
### Available Quantizations
| File | Quantization | Size | Use Case |
|------|-------------|------|----------|
| `lfm2-nanotron-ttft-gspo-prism-bf16.gguf` | BF16 | ~4.8GB | Full precision, best quality |
| `lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf` | Q4_K_M (+W4A16) | ~1.5GB | Balanced quality/size |
| `lfm2-nanotron-ttft-gspo-prism-Q2_K.gguf` | Q2_K (+W2A16)| ~0.9GB | Maximum compression |
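
As a rough aid for choosing among the files above, the following sketch picks the largest listed file that fits a memory budget. It is only a sketch: the sizes are the approximate values from the table, the helper `pick_quantization` and its 1 GB default headroom are hypothetical, and real memory use also depends on context length and runtime overhead.

```python
# Approximate file sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "lfm2-nanotron-ttft-gspo-prism-bf16.gguf": 4.8,
    "lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf": 1.5,
    "lfm2-nanotron-ttft-gspo-prism-Q2_K.gguf": 0.9,
}


def pick_quantization(memory_gb, headroom_gb=1.0):
    """Return the largest listed file that fits, or None if nothing fits.

    headroom_gb is an arbitrary illustrative reserve for the KV cache
    and runtime overhead; tune it for your own setup.
    """
    budget = memory_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    # max() on (size, name) tuples selects the largest file that fits.
    return max(fitting)[1] if fitting else None


for memory in (8, 4, 2):
    print(memory, "GB ->", pick_quantization(memory))
```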
## Usage
### With llama.cpp
```bash
./llama-cli -m lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf -p "Your prompt here" --temp 0.3 --min-p 0.15 --repeat-penalty 1.05
```
### Recommended Generation Parameters
```json
{
"temperature": 0.3,
"min_p": 0.15,
"repeat_penalty": 1.05
}
```
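
To show what the `temperature` and `min_p` values do, here is a minimal pure-Python sketch of min-p filtering: after temperature scaling, a token survives only if its probability is at least `min_p` times the probability of the most likely token. The function `min_p_filter` is hypothetical and is not taken from llama.cpp or any other runtime; the order in which runtimes apply temperature and min-p differs, and applying temperature first here is a simplifying assumption. The repeat penalty is left out.

```python
import math


def min_p_filter(logits, temperature=0.3, min_p=0.15):
    """Return {token_index: probability} for tokens that survive min-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep tokens whose probability is at least min_p times the top probability.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}

    # Renormalize the survivors so they sum to 1.
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}


toy_logits = [2.0, 1.0, 0.0]  # made-up logits for a three-token vocabulary
print(min_p_filter(toy_logits))                   # recommended settings
print(min_p_filter(toy_logits, temperature=1.0))  # higher temperature for comparison
```

With these toy logits, the recommended settings leave only the top token, while a temperature of 1.0 keeps the top two. This illustrates why a low temperature combined with a fairly high `min_p` gives focused, low-variance output.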
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{lfm2-nanotron-2026,
title={lfm2-Nanotron: Test-Time Fine-Tuned LFM2 with GSPO+PRISM},
author={Exobit (Eric Elbaz)},
year={2026},
publisher={Hugging Face},
url={https://huggingface.co/Ex0bit/lfm2-Nanotron}
}
```
## License
This model is released under a custom research license. See LICENSE.md for details.
## Acknowledgments
- [@mlabonne](https://huggingface.co/mlabonne) & [@liquidai](https://huggingface.co/LiquidAI) for the LFM2 architecture
- [@anakin87](https://huggingface.co/anakin87) for inspiring the idea
- The open-source AI community