# Checkpoint Upload
This model checkpoint was automatically uploaded from a distributed training run.
## Model Details
- Training step: 21
- Architecture: Llama-style model
- Hidden size: 2048
- Layers: 36
- Vocabulary size: 151,936
## Checkpoint Information
- Originally saved as distributed checkpoint across 4 ranks
- Consolidated into single checkpoint for easier use
- Contains model weights, optimizer states, and training configuration
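A common layout for such a consolidated checkpoint is a top-level dict with separate entries for the model state dict, optimizer states, and training configuration. The exact key names in this file are an assumption; the sketch below builds a toy checkpoint with that hypothetical layout and shows how to inspect the top-level sections before touching any weights:

```python
import torch

# Hypothetical layout of a consolidated training checkpoint; the actual
# key names in this file are an assumption and should be verified.
toy_checkpoint = {
    'model': {'embed.weight': torch.zeros(8, 4)},    # model state dict
    'optimizer': {'state': {}, 'param_groups': []},  # optimizer states
    'config': {'hidden_size': 2048, 'num_layers': 36},
}
torch.save(toy_checkpoint, 'toy_checkpoint.bin')

# Reload on CPU and list the top-level sections
loaded = torch.load('toy_checkpoint.bin', map_location='cpu',
                    weights_only=False)
for key in loaded:
    print(key)
```

Listing the keys first is the quickest way to confirm which entry actually holds the model weights before initializing an architecture.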
## Usage
```python
import torch

# Load the checkpoint onto CPU. weights_only=False is required because
# the file also stores optimizer states and the training configuration,
# not just tensors.
checkpoint = torch.load('pytorch_model.bin', map_location='cpu',
                        weights_only=False)

# Inspect the top-level keys to locate the model state dict
print(checkpoint.keys())

# You'll need to initialize the appropriate model architecture
# and load these weights into it
```
## Note
This is a raw training checkpoint. For inference, you may need to:
1. Initialize the correct model architecture
2. Load the weights properly
3. Convert to the desired format (e.g., Hugging Face Transformers format)
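As a sketch of these steps, assuming the weights follow Hugging Face Llama naming conventions and the model state dict lives under a `'model'` key (neither is guaranteed for a raw training checkpoint), conversion might look like:

```python
import os
import torch
from transformers import LlamaConfig, LlamaForCausalLM

# Config built from the dimensions listed above. Remaining fields
# (attention heads, intermediate size, etc.) are left at Llama defaults
# here, which is an assumption -- they must match the original
# training configuration for the weights to load correctly.
config = LlamaConfig(
    hidden_size=2048,
    num_hidden_layers=36,
    vocab_size=151936,
)

if os.path.exists('pytorch_model.bin'):
    model = LlamaForCausalLM(config)

    checkpoint = torch.load('pytorch_model.bin', map_location='cpu',
                            weights_only=False)
    # The key holding the weights is an assumption; inspect the
    # checkpoint's top-level keys to find the real one.
    state_dict = checkpoint.get('model', checkpoint)

    # strict=False reports mismatched keys instead of failing outright,
    # which helps diagnose naming differences in a raw checkpoint
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    print('missing keys:', missing)
    print('unexpected keys:', unexpected)

    # Save in Hugging Face Transformers format
    model.save_pretrained('converted_model')
```

Checking the `missing`/`unexpected` key lists before saving is the practical way to verify the weight names actually line up with the chosen architecture.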