---
license: mit
---

# Custom Llama-style Checkpoint

This repository contains a single `.pt` checkpoint file from a fine-tuned model.

**This model is NOT directly usable with `transformers.AutoModel.from_pretrained()` yet.** It needs to be converted to the Hugging Face format first.

## Training Details

- **Framework:** [modded-nanogpt-SOAP](https://github.com/nikhilvyas/modded-nanogpt-SOAP)
- **Architecture:** This model uses modern Llama-style components and is NOT a standard GPT-2:
  - **Positional Embeddings:** Rotary Position Embeddings (RoPE)
  - **Normalization:** RMSNorm
  - **Bias:** Linear layers trained with `bias=False`.
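
For reference, RMSNorm differs from LayerNorm in that it normalizes by the root mean square only, without subtracting the mean and without a bias term. A minimal NumPy sketch (the function and variable names are illustrative, not taken from the training code):

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: rescale by the reciprocal root mean square of the last axis.

    Unlike LayerNorm, no mean is subtracted and no bias is added;
    `weight` is a learned per-feature scale.
    """
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

# With a unit weight vector, the output has RMS close to 1 along the last axis.
x = np.array([[3.0, 4.0]])
y = rms_norm(x, np.ones(2))
```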

## Model Configuration

These are the hyperparameters needed to perform the conversion:

- `n_layer`: 12  
- `n_head`: 12  
- `n_embd`: 768 
- `vocab_size`: 50257 
- `block_size`: 1024
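
If you convert to the Hugging Face Llama format, these fields would map roughly onto `LlamaConfig` argument names as sketched below. This mapping is an assumption to verify against the training code; in particular, values such as `intermediate_size` and `rms_norm_eps` must also be set to match how the model was trained.

```python
# nanoGPT-style hyperparameters from the list above.
nanogpt_config = {
    "n_layer": 12,
    "n_head": 12,
    "n_embd": 768,
    "vocab_size": 50257,
    "block_size": 1024,
}

# Rough mapping onto Hugging Face LlamaConfig argument names
# (assumed correspondence, not verified against this checkpoint).
hf_config_kwargs = {
    "num_hidden_layers": nanogpt_config["n_layer"],
    "num_attention_heads": nanogpt_config["n_head"],
    "hidden_size": nanogpt_config["n_embd"],
    "vocab_size": nanogpt_config["vocab_size"],
    "max_position_embeddings": nanogpt_config["block_size"],
}
```

From there, one plausible path is `LlamaConfig(**hf_config_kwargs)` followed by copying the checkpoint tensors into the resulting model's state dict, renaming keys as needed.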

## Tokenizer

The model was trained with the standard `gpt2` tokenizer.