| ---
|
| language: en
|
| tags:
|
| - gpt
|
| - language-model
|
| - gpu
|
| - cuda
|
| - ai-systems
|
| - pytorch
|
| license: mit
|
| ---
|
|
|
| # KernelGPT — GPU/AI Systems Performance
|
|
|
| A GPT-style decoder-only transformer trained from scratch on GPU/AI systems performance engineering text.
|
|
|
| ## Model Specs
|
|
|
| Property | Value |
|----------|-------|
| Parameters | ~125M |
| Architecture | Decoder-only Transformer |
| Embedding dim | 768 |
| Attention heads | 12 |
| Layers | 8 |
| Context length | 512 tokens |
| Vocab size | 32,000 (SentencePiece BPE) |
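
The defining feature of a decoder-only transformer is causally masked self-attention: each position can attend only to itself and earlier positions. A minimal sketch using the dimensions above (768-dim embeddings, 12 heads, 512-token context); this is illustrative, not the repo's actual implementation:

```python
import torch
import torch.nn.functional as F
from torch import nn

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal mask (illustrative sketch)."""

    def __init__(self, dim=768, n_heads=12, max_len=512):
        super().__init__()
        assert dim % n_heads == 0
        self.n_heads = n_heads
        self.head_dim = dim // n_heads  # 768 / 12 = 64 per head
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # Lower-triangular mask: position t may only attend to positions <= t
        mask = torch.tril(torch.ones(max_len, max_len, dtype=torch.bool))
        self.register_buffer("mask", mask)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (B, T, C) -> (B, n_heads, T, head_dim)
        shape = (B, T, self.n_heads, self.head_dim)
        q, k, v = (t.view(shape).transpose(1, 2) for t in (q, k, v))
        att = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        att = att.masked_fill(~self.mask[:T, :T], float("-inf"))
        att = F.softmax(att, dim=-1)
        out = (att @ v).transpose(1, 2).reshape(B, T, C)
        return self.proj(out)
```

The mask is what makes the model autoregressive: changing a token at position t cannot change the outputs at positions before t.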
|
|
|
| ## Training
|
|
|
| Setting | Value |
|---------|-------|
| Training steps | 162,000 |
| Validation loss | 4.3889 |
| Optimizer | AdamW |
| Learning rate | 3e-4 (cosine decay) |
| Batch size | 1 (effective 4 via gradient accumulation) |
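
A micro-batch of 1 with an effective batch of 4 means gradients are accumulated over four forward/backward passes before each optimizer step. A sketch of that loop with AdamW and cosine decay, using a stand-in model and dummy data rather than the repo's actual training code:

```python
import torch
from torch import nn

# Illustrative loop: micro-batch 1, gradient accumulation 4,
# AdamW with cosine learning-rate decay (mirrors the settings above).
model = nn.Linear(16, 16)  # stand-in for the transformer
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=162_000)
accum_steps = 4

def micro_batch():
    x = torch.randn(1, 16)  # batch size 1
    return x, x             # dummy autoencoding target

for step in range(8):       # the real run does 162,000 steps
    opt.zero_grad()
    for _ in range(accum_steps):
        x, y = micro_batch()
        loss = nn.functional.mse_loss(model(x), y)
        (loss / accum_steps).backward()  # scale so grads average over 4
    opt.step()
    sched.step()
```

Dividing the loss by `accum_steps` makes the accumulated gradient equal the mean over the four micro-batches, so the update matches a true batch of 4.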
|
|
|
| ## Training Data
|
|
|
| - **FineWeb** (general web text)
|
| - **arXiv papers** (cs.DC, cs.AR, cs.LG, cs.PF categories — GPU/AI/systems)
|
| - **Wikipedia** (ML/systems filtered articles)
|
| - **GPU-specific crawl** (NVIDIA docs, GitHub READMEs, arXiv abstracts)
|
|
|
Topics cover all 20 chapters of *AI Performance Engineering*, including CUDA internals,
KV cache tuning, LLM inference, distributed training, and GPU cluster scaling.
|
|
|
| ## Usage
|
|
|
```python
import torch
import sentencepiece as spm
from huggingface_hub import hf_hub_download

# Download the checkpoint and tokenizer from the Hub
ckpt_path = hf_hub_download("saiakula/KernelGPT", "pytorch_model.pt")
tok_path = hf_hub_download("saiakula/KernelGPT", "tokenizer.model")

# Load tokenizer
sp = spm.SentencePieceProcessor(model_file=tok_path)

# Load model weights
# (requires TinyGPT src — clone https://github.com/your-username/TinyGPT)
checkpoint = torch.load(ckpt_path, map_location="cpu")
```
|
|
|
| ## Acknowledgments
|
|
|
| - Inspired by Andrej Karpathy's nanoGPT
|
| - Training topics based on *AI Performance Engineering* by Chris Fregly
|
|
|