NanoCodeGPT

A GPT-style language model built from scratch in PyTorch: no pretrained weights, no APIs, just math and gradient descent.

Trained on 8,000 Python functions to complete code given a prompt.

Model Architecture

Property	Value
Type	Decoder-only Transformer (GPT)
Parameters	~10M
Layers	6 Transformer blocks
Attention heads	6 per block
Embedding dim	384
Context length	256 tokens
Tokenizer	GPT-2 BPE (50,257 vocab)
Activation	GELU
Training steps	5,000
Final val loss	2.97
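
As a rough sanity check on the numbers above, the parameter count can be estimated by hand. This is only a sketch: it assumes a standard GPT-2-style block (fused QKV projection, 4x MLP expansion, biases everywhere, two LayerNorms per block), learned position embeddings, and an output head tied to the token embedding. None of these details are confirmed here. Under those assumptions the six blocks alone hold about 10.6M parameters, which lines up with the ~10M listed, while the 50,257 x 384 token embedding table would add roughly 19.3M more if it is counted.

```python
# Back-of-the-envelope parameter count.
# ASSUMPTION: GPT-2-style layout; the actual implementation may differ.
N_LAYER = 6
D_MODEL = 384
VOCAB_SIZE = 50_257
CONTEXT_LENGTH = 256


def block_params(d: int) -> int:
    """Parameters in one Transformer block under the assumed layout."""
    attn = (3 * d * d + 3 * d) + (d * d + d)      # QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)   # expand to 4d, project back to d
    norms = 2 * (2 * d)                           # two LayerNorms (scale + shift)
    return attn + mlp + norms


blocks = N_LAYER * block_params(D_MODEL)
token_emb = VOCAB_SIZE * D_MODEL
pos_emb = CONTEXT_LENGTH * D_MODEL
final_norm = 2 * D_MODEL
total_tied = blocks + token_emb + pos_emb + final_norm  # output head shares token_emb

print(f"Transformer blocks: {blocks:,}")      # 10,646,784
print(f"Token embedding:    {token_emb:,}")   # 19,298,688
print(f"Total (tied head):  {total_tied:,}")  # 30,044,544
```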

Training

Dataset: flytech/python-codes-25k (first 8k examples)
Hardware: Google Colab T4 GPU (free tier)
Time: ~58 minutes
Optimizer: AdamW (lr=3e-4)
Batch size: 32
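
To make the batch size and context length concrete, here is a minimal sketch of how next-token training batches are commonly sampled from one long token stream. The `get_batch` helper and its random-window strategy are assumptions for illustration, not necessarily how this model's data loader works. It is written in plain Python lists rather than tensors so it runs without a GPU.

```python
import random

BLOCK_SIZE = 256     # context length
BATCH_SIZE = 32
TRAIN_STEPS = 5_000


def get_batch(tokens, block_size=BLOCK_SIZE, batch_size=BATCH_SIZE, rng=random):
    """Sample random windows; each target row is its input row shifted by one token.

    Hypothetical helper: requires len(tokens) > block_size.
    """
    xs, ys = [], []
    for _ in range(batch_size):
        start = rng.randrange(0, len(tokens) - block_size)
        xs.append(tokens[start:start + block_size])
        ys.append(tokens[start + 1:start + block_size + 1])
    return xs, ys


# Token positions processed per optimizer step and over the whole run.
tokens_per_step = BATCH_SIZE * BLOCK_SIZE            # 8,192
total_tokens_seen = tokens_per_step * TRAIN_STEPS    # 40,960,000
```

With these settings each step covers 8,192 token positions, so 5,000 steps amount to roughly 41M positions, which likely means several passes over an 8k-example dataset.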

Example Outputs

Prompt: def fibonacci(n):

def fibonacci(n):
    if n < 0:
        print("Incorrect input")
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)

Prompt: def binary_search(arr, target):

def binary_search(arr, target):
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        else:
            low = mid + 1
    return -1
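
Completions like the ones above come from an autoregressive loop: the model predicts one token, the token is appended to the sequence, and the process repeats. The sketch below shows that loop with greedy decoding. The decoding strategy actually used for these samples is not stated, and `next_token_logits` is a hypothetical stand-in for the trained model's forward pass, so treat this as an illustration rather than the real inference code.

```python
from typing import Callable, List, Sequence

CONTEXT_LENGTH = 256  # the model can only attend to the last 256 tokens


def generate(
    next_token_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int,
    context_length: int = CONTEXT_LENGTH,
) -> List[int]:
    """Greedy autoregressive decoding (assumed strategy, for illustration only)."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        window = ids[-context_length:]          # crop to the context window
        logits = next_token_logits(window)      # one score per vocabulary entry
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        ids.append(next_id)
    return ids


def toy_model(window: Sequence[int]) -> List[float]:
    """Stand-in model over a 5-token vocabulary: always favors (last + 1) % 5."""
    logits = [0.0] * 5
    logits[(window[-1] + 1) % 5] = 1.0
    return logits


print(generate(toy_model, [0], max_new_tokens=4))  # [0, 1, 2, 3, 4]
```

In practice the prompt would first be encoded with the GPT-2 BPE tokenizer and the generated ids decoded back to text.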

Built by

Hassaan Raza LinkedIn · GitHub
