Nanochat D24 Nebius Run

This repository contains the saved model artifacts from a successful Nanochat training run completed on Nebius using H200 GPUs.

Companion repository

The codebase, report, and supporting project documentation for this run live in the companion GitHub repository:

github.com/CodingWCal/nanochat-nebius-run

Overview

This model represents a full Nanochat run that was completed, validated, and preserved after training. After the run finished, I was able to launch the browser chat interface successfully, confirm the model loaded correctly, and back up the final checkpoint and tokenizer assets before shutting down the remote GPU machine.

For me, this project was not just about getting a model to train. It was about following the process all the way through from setup and execution to validation, backup, and preservation of the final artifacts. That made this run especially meaningful because it reflects a cleaner and more complete outcome than an earlier attempt where the final save process did not go as planned.

What is included

This repository is intended to store the core artifacts needed to preserve and reload the final trained result:

model_000483.pt (final model checkpoint weights, saved at step 483)

meta_000483.json (checkpoint metadata for the same step, e.g. configuration and training state)

optim_000483_rank*.pt (per-rank optimizer state shards from the multi-GPU run)

tokenizer.pkl (pickled tokenizer object)

token_bytes.pt (tensor of token byte data used by the tokenizer)
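The checkpoint and metadata files above can be reloaded with standard PyTorch and json calls. The sketch below is illustrative only: it saves a tiny stand-in checkpoint so it runs on its own, and the tensor name and metadata fields are assumptions, not the actual nanochat schema. The real model class and loading code live in the companion repository.

```python
import json
import os
import tempfile

import torch

# Create a tiny stand-in checkpoint so the sketch is self-contained;
# in practice these files come from this repository.
tmp = tempfile.mkdtemp()
torch.save({"wte.weight": torch.zeros(4, 2)}, os.path.join(tmp, "model_000483.pt"))
with open(os.path.join(tmp, "meta_000483.json"), "w") as f:
    json.dump({"step": 483}, f)

# Reload the artifacts the way a consumer of this repo might:
# the .pt file holds the model state dict, the .json file the run metadata.
state_dict = torch.load(os.path.join(tmp, "model_000483.pt"), map_location="cpu")
with open(os.path.join(tmp, "meta_000483.json")) as f:
    meta = json.load(f)

print(sorted(state_dict.keys()), meta["step"])
```

The `map_location="cpu"` argument lets the checkpoint load on a machine without a GPU, which is useful after the remote H200 VM has been shut down.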

Training context

The run was completed on a Nebius GPU VM using the official Nanochat speedrun workflow.
