---
language:
- en
license: mit
tags:
- text-generation
library_name: transformers
---
# Nanochat
Nanochat is a small language model by Andrej Karpathy, converted here to the Hugging Face Transformers format.
## Model Details
- **Architecture**: GPT-style transformer with RoPE, QK normalization, ReLU², and logit softcapping (see the sketch after this list)
- **Parameters**: ~393M
- **Hidden Size**: 1280
- **Layers**: 20
- **Attention Heads**: 10
- **Vocabulary**: 65536 tokens
- **Context Length**: 2048 tokens
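The less common pieces of that stack are the ReLU² activation in the MLP, tanh softcapping of the output logits, and normalization of queries and keys before attention. A minimal sketch of what those operations look like (the cap value and the RMS variant of QK norm are illustrative assumptions, not taken from the released config):

```python
import torch

def relu_squared(x: torch.Tensor) -> torch.Tensor:
    # ReLU² activation used in the MLP: ReLU followed by squaring
    return torch.relu(x) ** 2

def softcap(logits: torch.Tensor, cap: float = 15.0) -> torch.Tensor:
    # Logit softcapping: smoothly bound logits to (-cap, cap) via tanh
    return cap * torch.tanh(logits / cap)

def qk_norm(q: torch.Tensor, k: torch.Tensor, eps: float = 1e-6):
    # QK normalization: RMS-normalize query and key vectors before attention
    q = q * torch.rsqrt(q.pow(2).mean(dim=-1, keepdim=True) + eps)
    k = k * torch.rsqrt(k.pow(2).mean(dim=-1, keepdim=True) + eps)
    return q, k
```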
## Usage
### With Transformers (PyTorch)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The custom model code ships with the checkpoint, so trust_remote_code is required
model = AutoModelForCausalLM.from_pretrained("<model-path>", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("<model-path>")

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))
```
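The call above decodes greedily. For more varied completions, the standard Transformers sampling arguments apply (the values below are illustrative, tune to taste):

```python
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,    # sample instead of greedy decoding
    temperature=0.8,
    top_k=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```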
### Converting to MLX
To use with Apple's MLX framework:
```bash
mlx_lm.convert --hf-path <model-path> --mlx-path nanochat-mlx --trust-remote-code
mlx_lm.generate --model nanochat-mlx --prompt "Once upon a time"
```
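Once converted, the model can also be driven from Python. A minimal sketch using the `mlx_lm` Python API (the exact `generate` signature varies somewhat between `mlx_lm` versions):

```python
from mlx_lm import load, generate

# Load the converted weights and tokenizer from the local MLX directory
model, tokenizer = load("nanochat-mlx")

# Generate a short continuation of the prompt
text = generate(model, tokenizer, prompt="Once upon a time", max_tokens=50, verbose=True)
print(text)
```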
## Citation
Original model by Andrej Karpathy: https://github.com/karpathy/nanochat