Fix architecture: add RMSNorm, change GELU to ReLU (matches Julia training) fc98b5b DavinciDreams committed 5 days ago
Mask phantom token index 28, add valid_vocab to generation 809ac78 DavinciDreams committed 5 days ago
Switch to Python/FastAPI server (RandyGPT pattern) 4927bea DavinciDreams Claude Opus 4.6 committed 5 days ago