# Checkpoint Upload

This model checkpoint was automatically uploaded from a distributed training run.

## Model Details

- Training step: 21
- Architecture: Llama-style model
- Hidden size: 2048
- Layers: 36
- Vocabulary size: 151,936

## Checkpoint Information

- Originally saved as a distributed checkpoint across 4 ranks
- Consolidated into a single checkpoint for easier use
- Contains model weights, optimizer states, and training configuration

## Usage

```python
import torch

# Load the consolidated checkpoint onto the CPU
checkpoint = torch.load('pytorch_model.bin', map_location='cpu')

# The checkpoint contains the model state dict; you still need to
# initialize the matching model architecture and load these weights into it
```

## Note

This is a raw training checkpoint. For inference, you may need to:

1. Initialize the correct model architecture
2. Load the weights into it
3. Convert to the desired format (e.g., Hugging Face Transformers format)
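Since the exact key layout of the consolidated file is not documented here, a minimal, self-contained sketch of inspecting it may help. The snippet below builds a tiny dummy checkpoint with an *assumed* layout (`"model"`, `"optimizer"`, `"step"` keys are hypothetical, not confirmed by this card) purely to demonstrate the load-and-inspect pattern; with the real file you would load `pytorch_model.bin` instead and check its keys first.

```python
import torch

# Build a tiny dummy checkpoint to illustrate an assumed layout;
# the real file's keys may differ -- inspect them before relying on any.
dummy = {
    "model": {"embed.weight": torch.zeros(4, 2)},   # hypothetical weights key
    "optimizer": {"state": {}, "param_groups": []},  # hypothetical optimizer key
    "step": 21,
}
torch.save(dummy, "checkpoint_demo.pt")

# Same load pattern as in Usage above, applied to the dummy file
ckpt = torch.load("checkpoint_demo.pt", map_location="cpu")

# Pull out just the model weights; fall back to the whole dict in case
# the weights are stored at the top level rather than under a "model" key.
state_dict = ckpt.get("model", ckpt)
print(sorted(state_dict.keys()))
```

Printing `sorted(state_dict.keys())` (or `checkpoint.keys()`) on the real file is usually the quickest way to discover which of these assumptions hold.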
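For step 1 of the note above (initializing an architecture), one possible starting point is a Transformers `LlamaConfig` built from the Model Details listed in this card. This is only a sketch: hyperparameters not stated here (attention heads, intermediate size, RoPE settings, etc.) are left at library defaults and would need to be adjusted to match the actual training run, and the checkpoint's parameter names may need remapping before they load cleanly.

```python
from transformers import LlamaConfig

# Config mirroring the Model Details section; every field NOT set here
# falls back to LlamaConfig defaults, which may not match the run that
# produced this checkpoint -- verify against the training configuration.
config = LlamaConfig(
    hidden_size=2048,
    num_hidden_layers=36,
    vocab_size=151936,
)
print(config.hidden_size, config.num_hidden_layers, config.vocab_size)
```

With a config that actually matches training, `LlamaForCausalLM(config)` followed by `model.load_state_dict(...)` would complete steps 1 and 2; any key-name mismatch reported there indicates the checkpoint's naming scheme differs from the Transformers one.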