---
license: other
license_name: lfm-nanotron-prism-research
license_link: LICENSE.md
language:
- en
tags:
- lfm
- prism
- gspo
- hybrid-architecture
- tool-use
- thinking
pipeline_tag: text-generation
library_name: transformers
---

<div align="center">

# lfm-Nanotron: 2.6B-PRISM-SFT-GSPO-AutoRoundV2

</div>

<div align="center">

**LFM architecture model: SFT + GSPO RL + PRISM**

</div>

## Model Description

**lfm-Nanotron** is a limited-edition, nano-sized, 2.6B-parameter hybrid-architecture language model fine-tuned with techniques rarely found in mainstream releases:

- SFT (Test-Time Supervised Fine-Tuning): adaptive optimization at inference
- GSPO (Group Sequence Policy Optimization): RL-enhanced reasoning, instruction following, thinking, tool calling, and logic
- PRISM (Projected Refusal Isolation via Subspace Modification): state-of-the-art over-refusal/propaganda removal from LLMs
- 128K context window: handles very long prompts with ease
- Agentic tool calling: built for multi-turn, thinking, and instruction-following tasks (a minimal sketch follows this list)
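
A minimal tool-calling sketch is shown below. It assumes the bundled chat template follows the standard transformers tool-use convention; the repo id is taken from the citation at the bottom of this card, and `get_weather` is a hypothetical tool used only for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ex0bit/lfm2-Nanotron"  # repo id assumed from the citation below
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Hypothetical tool: the chat template turns the signature and docstring
# into a JSON schema the model can emit calls against.
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 21°C"

inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is the weather in Paris right now?"}],
    tools=[get_weather],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

If the template supports tools, the generation should contain a structured tool call naming `get_weather`, which your agent loop then executes and feeds back as a `tool` message.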

### Architecture Details

| Parameter | Value |
|-----------|-------|
| Parameters | ~2.6B |
| Hidden Size | 2048 |
| Layers | 30 (22 Conv + 8 Full Attention) |
| Attention Heads | 32 |
| KV Heads | 8 (GQA) |
| Vocabulary | 65,536 |
| Max Context | 128,000 tokens |
| Architecture | Hybrid Conv + Attention (LFM2) |
### Available Quantizations

| File | Quantization | Size | Use Case |
|------|--------------|------|----------|
| `lfm2-nanotron-ttft-gspo-prism-bf16.gguf` | BF16 | ~4.8GB | Full precision, best quality |
| `lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf` | Q4_K_M (+W4A16) | ~1.5GB | Balanced quality/size |
| `lfm2-nanotron-ttft-gspo-prism-Q2_K.gguf` | Q2_K (+W2A16) | ~0.9GB | Maximum compression |

## Usage

### With llama.cpp

```bash
./llama-cli -m lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf -p "Your prompt here" --temp 0.3 --min-p 0.15 --repeat-penalty 1.05
```
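
The same GGUF files can also be driven from Python. A minimal sketch using the llama-cpp-python bindings, assuming that package is installed and the Q4_K_M file has been downloaded locally; the prompt is illustrative:

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Point model_path at whichever quantization you downloaded.
llm = Llama(model_path="lfm2-nanotron-ttft-gspo-prism-Q4_K_M.gguf", n_ctx=8192)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GQA in two sentences."}],
    temperature=0.3,    # recommended settings, see below
    min_p=0.15,
    repeat_penalty=1.05,
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```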

### Recommended Generation Parameters

```json
{
  "temperature": 0.3,
  "min_p": 0.15,
  "repeat_penalty": 1.05
}
```
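
To apply these settings with the transformers weights (the card's `library_name`), a sketch is below. The repo id is assumed from the citation at the bottom; note that llama.cpp's `repeat_penalty` is called `repetition_penalty` in transformers, and `min_p` requires a reasonably recent transformers release:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ex0bit/lfm2-Nanotron"  # assumed from the citation below
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="bfloat16"
)

inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Your prompt here"}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,           # sampling must be on for temperature/min_p to apply
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,  # llama.cpp's repeat_penalty
)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```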

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{lfm2-nanotron-2026,
  title={lfm2-Nanotron: Test-Time Fine-Tuned LFM2 with GSPO+PRISM},
  author={Exobit (Eric Elbaz)},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/Ex0bit/lfm2-Nanotron}
}
```

## License

This model is released under a custom research license. See [LICENSE.md](LICENSE.md) for details.

## Acknowledgments

- [@mlabonne](https://huggingface.co/mlabonne) & [@LiquidAI](https://huggingface.co/LiquidAI) for the LFM2 architecture
- [@anakin87](https://huggingface.co/anakin87) for inspiring the idea
- The open-source AI community