Upload gpt_modern_1b_class.script.pt
JiRackPyTorch 1B Model Definition
FIXED: Improved numerical stability (FP32 attention, better weight initialization).
FIXED: Corrected gradient checkpointing usage.
FIXED: Added Dropout layers.
FIXED: Auto-detect the device when handling RoPE buffers.
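
The four fixes listed above can be illustrated with a minimal sketch. This is not the code in gpt_modern_1b_class.script.pt. The names `SketchBlock`, `SketchModel`, `rope_tables`, and `apply_rope` are hypothetical, and the sizes, the dropout rate of 0.1, and the init std of 0.02 are assumed values chosen only for illustration. It shows one plausible reading of each item: attention scores and softmax computed in FP32, a small-std weight init, dropout on attention and residual paths, non-reentrant gradient checkpointing during training only, and RoPE tables registered as buffers so they follow the module's device.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint


def rope_tables(head_dim: int, max_len: int, base: float = 10000.0):
    """Precompute cos/sin tables for rotary position embeddings."""
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(max_len).float(), inv_freq)  # (max_len, head_dim/2)
    return angles.cos(), angles.sin()


def apply_rope(x, cos, sin):
    """Rotate channel pairs of x (B, H, T, D) by position-dependent angles."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return out.flatten(-2)


class SketchBlock(nn.Module):
    """Hypothetical attention block; all hyperparameters are assumptions."""

    def __init__(self, dim=64, n_heads=4, max_len=128, p_drop=0.1):
        super().__init__()
        self.n_heads, self.head_dim = n_heads, dim // n_heads
        self.norm = nn.LayerNorm(dim)
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn_drop = nn.Dropout(p_drop)
        self.resid_drop = nn.Dropout(p_drop)
        cos, sin = rope_tables(self.head_dim, max_len)
        # Buffers move with the module on .to(device); no manual device checks.
        self.register_buffer("rope_cos", cos, persistent=False)
        self.register_buffer("rope_sin", sin, persistent=False)
        # Small-std normal init (0.02 is an assumed value).
        for lin in (self.qkv, self.proj):
            nn.init.normal_(lin.weight, mean=0.0, std=0.02)

    def forward(self, x):
        B, T, C = x.shape
        qkv = self.qkv(self.norm(x)).view(B, T, 3, self.n_heads, self.head_dim)
        q, k, v = (t.transpose(1, 2) for t in qkv.unbind(dim=2))  # (B, H, T, D)
        cos, sin = self.rope_cos[:T], self.rope_sin[:T]
        # RoPE, scores, and softmax run in FP32 even if x is FP16/BF16.
        q = apply_rope(q.float(), cos, sin)
        k = apply_rope(k.float(), cos, sin)
        scores = (q @ k.transpose(-2, -1)) / (self.head_dim ** 0.5)
        causal = torch.ones(T, T, dtype=torch.bool, device=x.device).triu(1)
        scores = scores.masked_fill(causal, float("-inf"))
        attn = self.attn_drop(F.softmax(scores, dim=-1))
        out = (attn @ v.float()).to(x.dtype)
        out = out.transpose(1, 2).reshape(B, T, C)
        return x + self.resid_drop(self.proj(out))


class SketchModel(nn.Module):
    """Stack of blocks with optional gradient checkpointing."""

    def __init__(self, n_layers=2, use_checkpoint=True, **block_kwargs):
        super().__init__()
        self.blocks = nn.ModuleList(
            [SketchBlock(**block_kwargs) for _ in range(n_layers)]
        )
        self.use_checkpoint = use_checkpoint

    def forward(self, x):
        for block in self.blocks:
            if self.use_checkpoint and self.training:
                # Recompute activations in backward; only useful while training.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                x = block(x)
        return x
```

Checkpointing is gated on `self.training` because recomputation only saves memory when a backward pass follows. Registering the RoPE tables with `persistent=False` keeps them out of the state dict, which is one common choice for tables that can be recomputed; whether the actual model does this is not stated in the commit message.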