NanoCodeGPT

A GPT-style language model built entirely from scratch using PyTorch — no pretrained weights, no APIs, just math and gradient descent.

Trained on 8,000 Python functions to complete code given a prompt.


Model Architecture

Property          Value
Type              Decoder-only Transformer (GPT)
Parameters        ~10M
Layers            6 Transformer blocks
Attention heads   6 per block
Embedding dim     384
Context length    256 tokens
Tokenizer         GPT-2 BPE (50,257 vocab)
Activation        GELU
Training steps    5,000
Final val loss    2.97
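
As a rough check on the table above, the sketch below collects the listed hyperparameters into a config object and estimates the parameter count from them. The `GPTConfig` and `count_params` names are hypothetical, and the layout is an assumption rather than something this card states: GPT-2 style blocks with biases, a 4x MLP expansion, learned position embeddings, and an output head tied to the token embedding. Under those assumptions the Transformer blocks alone come to about 10.6M parameters, while the 50,257-entry embedding table adds about 19.3M more, so the ~10M figure may be counting non-embedding parameters only.

```python
from dataclasses import dataclass


@dataclass
class GPTConfig:
    # Values taken from the architecture table.
    n_layer: int = 6
    n_head: int = 6
    n_embd: int = 384
    block_size: int = 256   # context length in tokens
    vocab_size: int = 50257  # GPT-2 BPE vocabulary


def count_params(cfg: GPTConfig) -> dict:
    """Estimate parameter counts for an ASSUMED GPT-2 style layout."""
    d = cfg.n_embd
    # Attention: fused QKV projection plus output projection, with biases.
    attn = (d * 3 * d + 3 * d) + (d * d + d)
    # MLP: d -> 4d -> d with biases (the 4x expansion is an assumption).
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
    # Two LayerNorms per block, each with a weight and a bias vector.
    norms = 2 * (2 * d)
    # All blocks plus one final LayerNorm.
    blocks = cfg.n_layer * (attn + mlp + norms) + 2 * d
    token_embedding = cfg.vocab_size * d
    position_embedding = cfg.block_size * d
    return {
        "blocks": blocks,
        "token_embedding": token_embedding,
        "position_embedding": position_embedding,
        # Tied output head: the token embedding is counted once.
        "total_tied": blocks + token_embedding + position_embedding,
    }


cfg = GPTConfig()
counts = count_params(cfg)
head_dim = cfg.n_embd // cfg.n_head  # 384 / 6 = 64 dimensions per head

for name, value in counts.items():
    print(f"{name:>20}: {value:,}")
```

If the real model unties the output head or drops biases, the totals shift accordingly, but the split between block parameters and embedding parameters stays roughly the same.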

Training

  • Dataset: flytech/python-codes-25k (first 8k examples)
  • Hardware: Google Colab T4 GPU (free tier)
  • Time: ~58 minutes
  • Optimizer: AdamW (lr=3e-4)
  • Batch size: 32
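
The settings above imply a token budget and a throughput that are easy to work out. The sketch below does that arithmetic, assuming every step processes a full batch of 32 sequences at the full 256-token context with no gradient accumulation, and assuming the reported validation loss is a mean cross-entropy in nats per token (the PyTorch default). Neither assumption is confirmed by this card.

```python
import math

# Figures quoted in this card.
STEPS = 5_000
BATCH_SIZE = 32
CONTEXT_LEN = 256
TRAIN_MINUTES = 58
VAL_LOSS = 2.97

# Tokens processed per optimizer step (assumes full batches, full context).
tokens_per_step = BATCH_SIZE * CONTEXT_LEN   # 8,192
total_tokens = STEPS * tokens_per_step       # 40,960,000

# Average throughput over the whole run.
tokens_per_second = total_tokens / (TRAIN_MINUTES * 60)

# Perplexity is exp(loss) when the loss is per-token cross-entropy in nats.
perplexity = math.exp(VAL_LOSS)

print(f"tokens per step : {tokens_per_step:,}")
print(f"total tokens    : {total_tokens:,}")
print(f"tokens / second : {tokens_per_second:,.0f}")
print(f"val perplexity  : {perplexity:.1f}")
```

Because batches are drawn from only 8,000 examples, the roughly 41M tokens processed likely include many repeated passes over the same data.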

Example Outputs

Prompt: def fibonacci(n):

def fibonacci(n):
    if n < 0:
        print("Incorrect input")
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)

Prompt: def binary_search(arr, target):

def binary_search(arr, target):
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        else:
            low = mid + 1
    return -1
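
Completions like the ones above come from running the model autoregressively: predict one token, append it, and repeat. This card does not state the decoding settings, so the sketch below assumes plain greedy decoding and uses a toy stand-in for the model. `greedy_complete`, `next_logits`, and `toy_model` are hypothetical names for illustration, not part of the released code. The only detail taken from the card is the 256-token context window, which the loop enforces by cropping the input.

```python
from typing import Callable, List, Optional, Sequence

CONTEXT_LEN = 256  # context length from the architecture table


def greedy_complete(
    next_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: List[int],
    max_new_tokens: int,
    stop_id: Optional[int] = None,
) -> List[int]:
    """Extend prompt_ids one token at a time, always taking the top logit."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # The model only sees the most recent CONTEXT_LEN tokens.
        window = ids[-CONTEXT_LEN:]
        logits = next_logits(window)
        # Greedy choice: index of the largest logit.
        next_id = max(range(len(logits)), key=lambda i: logits[i])
        if stop_id is not None and next_id == stop_id:
            break
        ids.append(next_id)
    return ids


def toy_model(window: Sequence[int]) -> List[float]:
    """Stand-in for the real model: favors (last token + 1) mod 5."""
    vocab_size = 5
    target = (window[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


completed = greedy_complete(toy_model, [0], max_new_tokens=4)
stopped = greedy_complete(toy_model, [0], max_new_tokens=10, stop_id=3)
print(completed)
print(stopped)
```

With the real model, `next_logits` would wrap a forward pass that returns the logits for the last position, and the token ids would be encoded and decoded with the GPT-2 BPE tokenizer. Sampling with a temperature or top-k cutoff would replace the greedy `max` step.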

Built by

Hassaan Raza LinkedIn · GitHub
