Add Deep-NanoGPT experiment (Phase 1 & 2): resumable training, inference, 72-layer models 671ce97 AdriBat1 committed on Jan 2