# Mini GPT1 Clone

This is a decoder-only transformer model (GPT-1-style) trained from scratch using PyTorch.

## Model Details

- **Architecture**: Decoder-only Transformer
- **Layers**: 6
- **Embedding Size**: 512
- **Heads**: 8
- **Feedforward Dim**: 2048
- **Sequence Length**: 256
- **Vocab Size**: 35,000
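The model definition itself is not included in this card, but the hyperparameters above map directly onto a standard decoder-only transformer. Below is a minimal PyTorch sketch under that assumption; the class and variable names are illustrative, not the repo's actual training code:

```python
import torch
import torch.nn as nn

# Hyperparameters from the Model Details list above.
VOCAB_SIZE = 35_000
MAX_SEQ_LEN = 256
D_MODEL = 512
N_HEADS = 8
D_FF = 2048
N_LAYERS = 6

class MiniGPT1(nn.Module):
    """Illustrative decoder-only transformer; not the repo's exact code."""

    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos_emb = nn.Embedding(MAX_SEQ_LEN, D_MODEL)  # learned positions, as in GPT-1
        block = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=N_HEADS, dim_feedforward=D_FF, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(block, num_layers=N_LAYERS)
        self.lm_head = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, idx):  # idx: (batch, seq_len) token ids
        seq_len = idx.size(1)
        pos = torch.arange(seq_len, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask: -inf above the diagonal stops attention to future tokens.
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=idx.device), diagonal=1
        )
        x = self.blocks(x, mask=mask)
        return self.lm_head(x)  # (batch, seq_len, vocab) logits
```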
## Tokenizer

The tokenizer was trained using `ByteLevelBPETokenizer` from the `tokenizers` library.
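The card does not specify the training corpus or tokenizer settings; the sketch below shows the typical training call, with the corpus path, `min_frequency`, and special tokens as placeholder assumptions (only `vocab_size` comes from the card):

```python
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["corpus.txt"],  # hypothetical corpus; the actual data is not documented here
    vocab_size=35_000,     # matches the model's vocab size
    min_frequency=2,       # assumed frequency cutoff
    special_tokens=["<s>", "</s>", "<unk>", "<pad>"],  # assumed special tokens
)
# Produces the tokenizer.json consumed by PreTrainedTokenizerFast below.
tokenizer.save("tokenizer/tokenizer.json")
```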
## Inference Example

```python
from transformers import PreTrainedTokenizerFast, AutoModelForCausalLM

# Load the tokenizer from the local tokenizer.json and the weights from the Hub.
tokenizer = PreTrainedTokenizerFast(tokenizer_file="tokenizer/tokenizer.json")
model = AutoModelForCausalLM.from_pretrained("dilip025/mini-gpt1")
model.eval()

prompt = "Once upon a time,"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Greedy decoding; max_length counts the prompt tokens as well.
outputs = model.generate(input_ids, max_length=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
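`generate` decodes greedily by default; pass `do_sample=True` (optionally with `temperature` or `top_k`) for more varied completions.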
## License

MIT