---
license: other
license_name: lfm-nanotron-prism-research
license_link: LICENSE.md
language:
- en
tags:
- lfm
- prism
- gspo
- hybrid-architecture
- tool-use
- Thinking
pipeline_tag: text-generation
library_name: transformers
---

![image](https://cdn-uploads.huggingface.co/production/uploads/63adf1fa42fd3b8dbaeb0c92/9RwVQ2zsEqvFDNaGkOBTO.png)
# lfm-Nanotron: 2.6B-PRISM-SFT-GSPO-AutoRoundV2
**LFM architecture model: SFT + GSPO RL + PRISM**

[![Model](https://img.shields.io/badge/Model-2.6B-blue)]() [![Architecture](https://img.shields.io/badge/Architecture-LFM2%20Hybrid-green)]() [![Context](https://img.shields.io/badge/Context-128K-orange)]()
## Model Description

**lfm-Nanotron**: limited-edition 2.6B PRISM model. Unlock a cutting-edge nano-sized AI model!

This is **lfm-Nanotron**, a nano-sized 2.6B-parameter hybrid-architecture language model fine-tuned with advanced techniques you won't find in mainstream releases:

- **SFT (Test-Time Supervised Fine-Tuning):** adaptive optimization at inference
- **GSPO (Group Sequence Policy Optimization):** RL-enhanced reasoning, instruction following, thinking, tool calling, and logic (a background sketch of the objective appears in the appendix)
- **PRISM (Projected Refusal Isolation via Subspace Modification):** state-of-the-art removal of over-refusal/propaganda behavior from LLMs (see the subspace-ablation sketch in the appendix)
- **128K Context Window:** handle massive prompts with ease
- **Agentic Tool Calling:** built for multi-turn, thinking, and instruction-following tasks (see the tool-calling example in the appendix)

### Architecture Details

| Parameter | Value |
|-----------|-------|
| Parameters | ~2.6B |
| Hidden Size | 2048 |
| Layers | 30 (22 Conv + 8 Full Attention) |
| Attention Heads | 32 |
| KV Heads | 8 (GQA) |
| Vocabulary | 65,536 |
| Max Context | 128,000 tokens |
| Architecture | Hybrid Conv + Attention (LFM2) |

### Available Quantizations

| File | Quantization | Size | Use Case |
|------|-------------|------|----------|
| `lfm2-nanotron-ttft-gspo-prism-bf16.gguf` | BF16 | ~4.8GB | Full precision, best quality |
| `lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf` (+W4A16) | Q4_K_M | ~1.5GB | Balanced quality/size |
| `lfm2-nanotron-ttft-gspo-prism-Q2_K.gguf` | Q2_K (+W2A16) | ~0.9GB | Maximum compression |

## Usage

### With llama.cpp

```bash
./llama-cli -m lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf \
  -p "Your prompt here" \
  --temp 0.3 --min-p 0.15 --repeat-penalty 1.05
```

### Recommended Generation Parameters

```json
{
  "temperature": 0.3,
  "min_p": 0.15,
  "repeat_penalty": 1.05
}
```

These parameters map directly onto a `transformers` generation call; see the sketch in the appendix.

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{lfm2-nanotron-2026,
  title={lfm2-Nanotron: Test-Time Fine-Tuned LFM2 with GSPO+PRISM},
  author={Exobit (Eric Elbaz)},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/Ex0bit/lfm2-Nanotron}
}
```

## License

This model is released under a custom research license. See [LICENSE.md](LICENSE.md) for details.

## Acknowledgments

- [@mlabonne](https://huggingface.co/mlabonne) & [@LiquidAI](https://huggingface.co/LiquidAI) for the LFM2 architecture
- [@anakin87](https://huggingface.co/anakin87) for inspiring the idea
- The open-source AI community
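## Appendix: Usage and Technique Sketches

### Inference with transformers

The recommended generation parameters above translate directly into a standard `transformers` generation call. Below is a minimal sketch, assuming a recent `transformers` release with native LFM2 support and that non-GGUF weights are published under `Ex0bit/lfm2-Nanotron` (the repository named in the citation); adjust the repo id if the weights live elsewhere.

```python
# Minimal inference sketch. Assumes a recent `transformers` with native LFM2
# support and BF16 weights at Ex0bit/lfm2-Nanotron (assumed repo layout).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ex0bit/lfm2-Nanotron"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize GQA in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The model card's recommended sampling parameters.
outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```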
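### Tool Calling

The card advertises agentic tool calling, but the exact tool schema expected by the chat template is not documented here. The following sketch therefore relies only on the generic `tools=` hook of `apply_chat_template`, which renders whatever tool format the tokenizer's template defines; `get_weather` is a hypothetical example function.

```python
# Tool-calling sketch via the generic `tools=` argument of
# `apply_chat_template`. The chat template shipped with the tokenizer decides
# how the tool schema is rendered; `get_weather` is hypothetical.
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """
    Return the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"Sunny in {city}"

tokenizer = AutoTokenizer.from_pretrained("Ex0bit/lfm2-Nanotron")
messages = [{"role": "user", "content": "What is the weather in Paris?"}]

prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],  # converted to a JSON schema from signature/docstring
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)  # inspect how the template injects the tool definitions
```

Printing the rendered prompt first is a cheap way to confirm the template actually supports tools before wiring up a full agent loop.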
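### GSPO Objective

GSPO is only named in this card, so as background: the published GSPO formulation (Zheng et al., 2025) replaces GRPO's per-token importance ratios with a single length-normalized sequence-level ratio, clipped PPO-style against group-normalized advantages. Below is a minimal sketch of that loss under those published definitions; it is illustrative only and not the training code used for this model.

```python
# Sketch of the GSPO loss: sequence-level importance ratio
# s_i = exp((logp_new - logp_old) / |y_i|), clipped, with group-normalized
# advantages. The tiny clip range follows the GSPO paper's reported scale.
import torch

def gspo_loss(logp_new, logp_old, lengths, rewards, eps=3e-4):
    # logp_new / logp_old: (G,) summed sequence log-probs under the current /
    # rollout policy; lengths: (G,) token counts; rewards: (G,) group rewards.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    ratio = torch.exp((logp_new - logp_old.detach()) / lengths)
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps)
    return -torch.min(ratio * adv, clipped * adv).mean()

# Toy usage with a group of G=4 sampled responses:
G = 4
loss = gspo_loss(
    logp_new=torch.randn(G, requires_grad=True),
    logp_old=torch.randn(G),
    lengths=torch.tensor([32.0, 57.0, 41.0, 64.0]),
    rewards=torch.tensor([1.0, 0.0, 0.0, 1.0]),
)
loss.backward()
```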
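### PRISM-Style Subspace Ablation

PRISM itself is not documented beyond its expansion, but the name suggests the directional-ablation family of edits: estimate a refusal direction in activation space and project it out of the weights that write to the residual stream. The sketch below shows only that generic idea, not the author's actual procedure; every name in it is hypothetical.

```python
# Generic subspace-ablation sketch in the spirit of PRISM's expansion
# ("Projected Refusal Isolation via Subspace Modification"). NOT the author's
# published method: remove a unit "refusal direction" r from a matrix that
# writes into the residual stream, W <- (I - r r^T) W.
import torch

def ablate_direction(weight: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
    # weight: (d_model, d_in) matrix writing to the residual stream;
    # r: (d_model,) refusal direction estimated from contrastive activations.
    r = r / r.norm()
    return weight - torch.outer(r, r @ weight)

# Toy check: the projected weights have no remaining component along r.
d_model, d_in = 2048, 512
W = torch.randn(d_model, d_in)
r = torch.randn(d_model)
W_ablated = ablate_direction(W, r)
print(torch.allclose((r / r.norm()) @ W_ablated, torch.zeros(d_in), atol=1e-3))
```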