---
license: mit
---

# Custom Llama-style Checkpoint

This repository contains a single `.pt` checkpoint file from a fine-tuned model.

**This model is NOT directly usable with `transformers.AutoModel.from_pretrained()` yet.** It needs to be converted to the Hugging Face format first.

## Training Details

- **Framework:** [modded-nanogpt-SOAP](https://github.com/nikhilvyas/modded-nanogpt-SOAP)
- **Architecture:** This model uses modern Llama-style components and is NOT a standard GPT-2:
  - **Positional Embeddings:** Rotary Position Embeddings (RoPE)
  - **Normalization:** RMSNorm
  - **Bias:** Linear layers trained with `bias=False`.
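
For reference, RMSNorm differs from LayerNorm in that it normalizes by the root mean square only, without subtracting the mean and without a bias term. A minimal NumPy sketch (the function and variable names are illustrative, not taken from the training code):

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: rescale by the reciprocal root mean square of the last axis.

    Unlike LayerNorm, no mean is subtracted and no bias is added;
    `weight` is a learned per-feature scale.
    """
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

# With a unit weight vector, the output has RMS close to 1 along the last axis.
x = np.array([[3.0, 4.0]])
y = rms_norm(x, np.ones(2))
```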

## Model Configuration

These are the hyperparameters needed to perform the conversion:

- `n_layer`: 12  
- `n_head`: 12  
- `n_embd`: 768 
- `vocab_size`: 50257 
- `block_size`: 1024
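
If you convert to the Hugging Face Llama format, these fields would map roughly onto `LlamaConfig` argument names as sketched below. This mapping is an assumption to verify against the training code; in particular, values such as `intermediate_size` and `rms_norm_eps` must also be set to match how the model was trained.

```python
# nanoGPT-style hyperparameters from the list above.
nanogpt_config = {
    "n_layer": 12,
    "n_head": 12,
    "n_embd": 768,
    "vocab_size": 50257,
    "block_size": 1024,
}

# Rough mapping onto Hugging Face LlamaConfig argument names
# (assumed correspondence, not verified against this checkpoint).
hf_config_kwargs = {
    "num_hidden_layers": nanogpt_config["n_layer"],
    "num_attention_heads": nanogpt_config["n_head"],
    "hidden_size": nanogpt_config["n_embd"],
    "vocab_size": nanogpt_config["vocab_size"],
    "max_position_embeddings": nanogpt_config["block_size"],
}
```

From there, one plausible path is `LlamaConfig(**hf_config_kwargs)` followed by copying the checkpoint tensors into the resulting model's state dict, renaming keys as needed.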

## Tokenizer

The model was trained with the standard `gpt2` tokenizer.