Cleo Nano is a decoder-only Transformer model developed by **Inserloft** under the vision of **Jesus Heriberto Corona**. This version (v3.1) features surgical fine-tuning for bilingual stability (English/Spanish) and hallucination control.
## Model Details
- **Architecture:** Decoder-Only GPT (Custom)
- **Layers:** 8
- **Embedding Dim:** 384
- **Attention Heads:** 12
- **Context Window:** 256 tokens
- **Parameters:** ~15M
- **Training Data:** Mix of Wikipedia, Python code (CodeFeedback), and identity-anchoring data.
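The hyperparameters above can be sketched as a minimal decoder-only model in PyTorch. This is an illustrative reconstruction, not the actual `CleoNanoV3` source: the class and layer layout here are assumptions, and `vocab_size` is a placeholder (the model card does not state the tokenizer's vocabulary size).

```python
import torch
import torch.nn as nn

class CleoNanoConfig:
    # Hyperparameters from the model card; vocab_size is an assumption.
    n_layer = 8
    n_embd = 384
    n_head = 12
    block_size = 256
    vocab_size = 4096  # placeholder; use your tokenizer's actual size

class Block(nn.Module):
    """One pre-norm Transformer decoder block (self-attention + MLP)."""
    def __init__(self, cfg):
        super().__init__()
        self.ln1 = nn.LayerNorm(cfg.n_embd)
        self.attn = nn.MultiheadAttention(cfg.n_embd, cfg.n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(cfg.n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(cfg.n_embd, 4 * cfg.n_embd),
            nn.GELU(),
            nn.Linear(4 * cfg.n_embd, cfg.n_embd),
        )

    def forward(self, x, mask):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + a
        x = x + self.mlp(self.ln2(x))
        return x

class CleoNanoV3(nn.Module):
    """Sketch of a decoder-only GPT matching the listed hyperparameters."""
    def __init__(self, cfg=CleoNanoConfig):
        super().__init__()
        self.cfg = cfg
        self.tok_emb = nn.Embedding(cfg.vocab_size, cfg.n_embd)
        self.pos_emb = nn.Embedding(cfg.block_size, cfg.n_embd)
        self.blocks = nn.ModuleList(Block(cfg) for _ in range(cfg.n_layer))
        self.ln_f = nn.LayerNorm(cfg.n_embd)
        self.head = nn.Linear(cfg.n_embd, cfg.vocab_size, bias=False)

    def forward(self, idx):
        T = idx.size(1)
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask: True marks positions a token may NOT attend to.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=idx.device), 1)
        for blk in self.blocks:
            x = blk(x, mask)
        return self.head(self.ln_f(x))
```

With a 384-dim embedding split across 12 heads, each head is 32-dimensional; the 8 blocks account for the bulk of the ~15M parameters.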
## Usage
To use this model, you need the custom `CleoNanoV3` architecture defined in PyTorch; it is not a stock `transformers` architecture. The weights can be loaded with `torch.load()`, or via Hugging Face's `from_pretrained` if you use the provided mapping logic.
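A minimal loading helper for the `torch.load()` path might look like the following. The function name and checkpoint filename are hypothetical; it assumes the checkpoint is a plain `state_dict` keyed to match the `CleoNanoV3` module.

```python
import torch
import torch.nn as nn

def load_cleo_weights(model: nn.Module, ckpt_path: str) -> nn.Module:
    """Load a plain state_dict checkpoint onto the custom architecture.

    weights_only=True avoids unpickling arbitrary Python objects.
    """
    state_dict = torch.load(ckpt_path, map_location="cpu", weights_only=True)
    model.load_state_dict(state_dict)
    model.eval()  # inference mode: disable dropout, etc.
    return model

# Usage (hypothetical filename):
# model = load_cleo_weights(CleoNanoV3(), "cleo_nano_v3.pt")
```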
### Capabilities
1. **Bilingual Chat:** Responds to general queries in both Spanish and English.
2. **Code Generation:** Specialized in Python snippets (sums, loops, classes).
3. **Identity Preservation:** Strong grounding on its origin and creator.
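To exercise these capabilities, text generation can be sketched as a simple autoregressive sampling loop. This is an assumed helper, not part of the released code; it only requires that `model(idx)` returns logits of shape `(batch, time, vocab)`, and it crops the prompt to the model's 256-token context window.

```python
import torch

@torch.no_grad()
def generate(model, idx, max_new_tokens=50, block_size=256, temperature=0.8):
    # idx: (batch, time) tensor of token ids from your tokenizer.
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]            # crop to the context window
        logits = model(idx_cond)[:, -1, :] / temperature
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)     # append and continue
    return idx
```

Tokenization and decoding depend on the tokenizer shipped with the model, which this sketch leaves out.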