---
license: mit
---
# Custom Llama-style Model
This repository contains a single `.pt` checkpoint file from a fine-tuned model.
**This model is NOT directly usable with `transformers.AutoModel.from_pretrained()` yet.** It needs to be converted to the Hugging Face format first.
## Training Details
- **Framework:** [modded-nanoGPT-SOAP](https://github.com/nikhilvyas/modded-nanogpt-SOAP)
- **Architecture:** This model uses modern Llama-style features (listed below) and is NOT a standard GPT-2.
- **Positional Embeddings:** Rotary Position Embeddings (RoPE)
- **Normalization:** RMSNorm
- **Bias:** Linear layers trained with `bias=False`.
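These architectural choices matter for the conversion: a GPT-2-style loader would expect learned positional embeddings, LayerNorm, and bias terms, none of which exist in this checkpoint. As an illustration of the normalization used, here is a minimal pure-Python sketch of RMSNorm, assuming the common formulation with a learned per-channel scale and a small epsilon (the exact epsilon used in training is not recorded here):

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by the root-mean-square of its elements.

    Unlike LayerNorm there is no mean subtraction and no bias term,
    which matches the bias-free layout of this checkpoint.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for w, v in zip(weight, x)]
```

When remapping weights, keep in mind that each RMSNorm layer contributes only a single scale vector to the state dict, with no corresponding bias key.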
## Model Configuration
These are the hyperparameters needed to perform the conversion:
- `n_layer`: 12
- `n_head`: 12
- `n_embd`: 768
- `vocab_size`: 50257
- `block_size`: 1024
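As a sketch, these nanoGPT-style fields map onto the corresponding Hugging Face `LlamaConfig` arguments roughly as follows. Note that `intermediate_size` (the MLP width) is not listed above; `4 * n_embd` is the usual nanoGPT default, but it is an assumption here and should be verified against the checkpoint's tensor shapes:

```python
# nanoGPT-style hyperparameters, as listed in this model card.
nanogpt_cfg = {
    "n_layer": 12,
    "n_head": 12,
    "n_embd": 768,
    "vocab_size": 50257,
    "block_size": 1024,
}

# Rough mapping to Hugging Face LlamaConfig keyword arguments.
llama_kwargs = {
    "num_hidden_layers": nanogpt_cfg["n_layer"],
    "num_attention_heads": nanogpt_cfg["n_head"],
    "hidden_size": nanogpt_cfg["n_embd"],
    "vocab_size": nanogpt_cfg["vocab_size"],
    "max_position_embeddings": nanogpt_cfg["block_size"],
    "intermediate_size": 4 * nanogpt_cfg["n_embd"],  # ASSUMPTION: verify against checkpoint shapes
}
```

With these kwargs you could build `LlamaConfig(**llama_kwargs)` and then remap the checkpoint's state-dict keys onto the Llama module names; the source key names depend on the exact modded-nanoGPT version, so inspect the `.pt` file's keys before writing the remapping.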
## Tokenizer
The model was trained with the standard `gpt2` tokenizer.