# Checkpoint Upload

This model checkpoint was automatically uploaded from a distributed training run.

## Model Details

- Training step: 21
- Architecture: Llama-style model
- Hidden size: 2048
- Layers: 36
- Vocabulary size: 151,936

## Checkpoint Information

- Originally saved as a distributed checkpoint across 4 ranks
- Consolidated into a single checkpoint for easier use
- Contains model weights, optimizer states, and training configuration

## Usage

```python
import torch

# Load the consolidated checkpoint onto the CPU
checkpoint = torch.load('pytorch_model.bin', map_location='cpu')

# The checkpoint contains the model state dict; you still need to
# initialize the matching model architecture and load these weights into it
```

## Note

This is a raw training checkpoint. For inference, you may need to:

1. Initialize the correct model architecture
2. Load the weights into it
3. Convert to the desired format (e.g., Hugging Face Transformers format)
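Since the exact key layout of the consolidated file is not documented here, a minimal, self-contained sketch of inspecting it may help. The snippet below builds a tiny dummy checkpoint with an *assumed* layout (`"model"`, `"optimizer"`, `"step"` keys are hypothetical, not confirmed by this card) purely to demonstrate the load-and-inspect pattern; with the real file you would load `pytorch_model.bin` instead and check its keys first.

```python
import torch

# Build a tiny dummy checkpoint to illustrate an assumed layout;
# the real file's keys may differ -- inspect them before relying on any.
dummy = {
    "model": {"embed.weight": torch.zeros(4, 2)},   # hypothetical weights key
    "optimizer": {"state": {}, "param_groups": []},  # hypothetical optimizer key
    "step": 21,
}
torch.save(dummy, "checkpoint_demo.pt")

# Same load pattern as in Usage above, applied to the dummy file
ckpt = torch.load("checkpoint_demo.pt", map_location="cpu")

# Pull out just the model weights; fall back to the whole dict in case
# the weights are stored at the top level rather than under a "model" key.
state_dict = ckpt.get("model", ckpt)
print(sorted(state_dict.keys()))
```

Printing `sorted(state_dict.keys())` (or `checkpoint.keys()`) on the real file is usually the quickest way to discover which of these assumptions hold.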
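For step 1 of the note above (initializing an architecture), one possible starting point is a Transformers `LlamaConfig` built from the Model Details listed in this card. This is only a sketch: hyperparameters not stated here (attention heads, intermediate size, RoPE settings, etc.) are left at library defaults and would need to be adjusted to match the actual training run, and the checkpoint's parameter names may need remapping before they load cleanly.

```python
from transformers import LlamaConfig

# Config mirroring the Model Details section; every field NOT set here
# falls back to LlamaConfig defaults, which may not match the run that
# produced this checkpoint -- verify against the training configuration.
config = LlamaConfig(
    hidden_size=2048,
    num_hidden_layers=36,
    vocab_size=151936,
)
print(config.hidden_size, config.num_hidden_layers, config.vocab_size)
```

With a config that actually matches training, `LlamaForCausalLM(config)` followed by `model.load_state_dict(...)` would complete steps 1 and 2; any key-name mismatch reported there indicates the checkpoint's naming scheme differs from the Transformers one.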