---
language:
- es
- en
license: mit
tags:
- gpt2
- code
- bilingual
- inserloft
model_name: Cleo Nano v3.1 Bilingual
---
# Cleo Nano v3.1 (Bilingual Optimization)
Cleo Nano is a decoder-only Transformer model developed by **Inserloft** under the vision of **Jesus Heriberto Corona**. Version 3.1 adds targeted fine-tuning for bilingual (English/Spanish) stability and hallucination control.
## Model Details
- **Architecture:** Decoder-Only GPT (Custom)
- **Layers:** 8
- **Embedding Dim:** 384
- **Attention Heads:** 12
- **Context Window:** 256 tokens
- **Parameters:** ~15M
- **Training Data:** Mix of Wikipedia, Python Code (CodeFeedback), and Identity Anchoring.
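As a rough sanity check of the ~15M figure, the specs above can be plugged into a standard GPT-2-style parameter estimate. The vocabulary size is not listed in this card, so the value below is an assumption; biases and LayerNorm weights are omitted for simplicity.

```python
# Rough parameter estimate for Cleo Nano v3.1 from the specs above.
# Assumptions (not stated in the card): vocab size of 2048, tied
# input/output embeddings, GPT-2-style blocks with a 4x MLP expansion.
d_model = 384
n_layers = 8
n_ctx = 256
vocab = 2048  # assumed; not listed in the model card

attn = 4 * d_model * d_model       # Q, K, V, and output projections
mlp = 2 * (d_model * 4 * d_model)  # up- and down-projection (4x expansion)
blocks = n_layers * (attn + mlp)
embeddings = vocab * d_model + n_ctx * d_model  # token + positional

total = blocks + embeddings
print(f"~{total / 1e6:.1f}M parameters")  # → ~15.0M parameters
```

Under these assumptions the count lands close to the stated ~15M; a different tokenizer vocabulary would shift the total accordingly.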
## Usage
To use this model, you need the custom `CleoNanoV3` architecture defined in PyTorch. The weights can be loaded with `torch.load()`, or via Hugging Face's `from_pretrained` when paired with the provided state-dict mapping logic.
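The mapping logic itself is not reproduced in this card, so the snippet below is only a hypothetical sketch of what such a state-dict remap might look like: renaming Hugging Face-style parameter prefixes to custom `CleoNanoV3` names before calling `load_state_dict`. Every key pattern shown is an illustrative assumption, not the model's actual naming.

```python
# Hypothetical state-dict key remapping: HF-style names -> custom names.
# All key prefixes below are illustrative assumptions; adapt them to the
# parameter names your CleoNanoV3 implementation actually uses.
def remap_keys(state_dict):
    """Return a new dict with HF-style parameter prefixes renamed."""
    renames = {
        "transformer.wte.":  "tok_emb.",   # token embeddings (assumed name)
        "transformer.wpe.":  "pos_emb.",   # positional embeddings
        "transformer.h.":    "blocks.",    # transformer layers
        "transformer.ln_f.": "ln_final.",  # final layer norm
        "lm_head.":          "head.",      # output projection
    }
    out = {}
    for key, tensor in state_dict.items():
        for old, new in renames.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break
        out[key] = tensor
    return out

# Demo with dummy values standing in for tensors:
demo = {"transformer.wte.weight": 1, "transformer.h.0.attn.weight": 2}
print(remap_keys(demo))  # → {'tok_emb.weight': 1, 'blocks.0.attn.weight': 2}
# With real weights, something like:
# model.load_state_dict(remap_keys(torch.load("cleo_nano_v31.pt")))
```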
### Capabilities
1. **Bilingual Chat:** Responds to general queries in both Spanish and English.
2. **Code Generation:** Specialized in short Python snippets (sums, loops, classes).
3. **Identity Preservation:** Strong grounding on its origin and creator.
---
Developed by [Inserloft](https://inserloft.dev/)