# Checkpoint Upload

This model checkpoint was automatically uploaded from a distributed training run.
## Model Details

- Training step: 21
- Architecture: Llama-style model
- Hidden size: 2048
- Layers: 36
- Vocabulary size: 151,936
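
Since the card names a Llama-style architecture, these values can be mirrored in a Hugging Face `LlamaConfig` as a starting point. This is a minimal sketch, not the run's actual configuration: fields not listed on this card (attention heads, intermediate size, etc.) are left at `LlamaConfig` defaults and would need to be filled in from the real training setup.

```python
from transformers import LlamaConfig

# Hypothetical config mirroring the values listed above. Every field
# not shown here falls back to LlamaConfig defaults and must be set
# to match the actual training run before the weights will load.
config = LlamaConfig(
    hidden_size=2048,
    num_hidden_layers=36,
    vocab_size=151936,
)
```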
## Checkpoint Information

- Originally saved as a distributed checkpoint across 4 ranks
- Consolidated into a single checkpoint for easier use
- Contains model weights, optimizer states, and training configuration
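
This card does not record how the consolidation was performed, but PyTorch's distributed checkpoint (DCP) utilities offer one standard route. A minimal sketch, assuming the original shards were written with `torch.distributed.checkpoint` (the directory path here is a hypothetical placeholder):

```python
from torch.distributed.checkpoint.format_utils import dcp_to_torch_save

# Convert a sharded DCP checkpoint directory into a single
# torch.save()-style file. "checkpoint_dir/" is a placeholder for
# wherever the 4 rank shards were written.
dcp_to_torch_save("checkpoint_dir/", "pytorch_model.bin")
```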
## Usage

```python
import torch

# Load the consolidated checkpoint on CPU. (On PyTorch >= 2.6 you may
# need weights_only=False if the file stores non-tensor objects; only
# do that for checkpoints you trust.)
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")

# Inspect the top-level keys: the file bundles model weights,
# optimizer states, and the training configuration.
print(list(checkpoint.keys()))

# You still need to initialize the matching model architecture and
# load these weights into it (see the note below).
```
## Note

This is a raw training checkpoint. For inference, you may need to:

1. Initialize the correct model architecture
2. Load the weights properly
3. Convert to the desired format (e.g., Hugging Face Transformers format)
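
Putting those steps together, here is a hedged sketch using `transformers`. It assumes the state-dict key names line up with the library's Llama implementation, which may not hold for a custom training stack; keys may need remapping first, and the config values beyond those listed on this card are placeholders.

```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM

# Step 1: initialize an architecture matching the Model Details above.
# Unlisted fields fall back to LlamaConfig defaults (an assumption).
config = LlamaConfig(
    hidden_size=2048,
    num_hidden_layers=36,
    vocab_size=151936,
)
model = LlamaForCausalLM(config)

# Step 2: load the weights. The state dict may sit under a top-level
# key such as "model" (the exact layout is not documented here), and
# key names may need remapping to transformers' conventions, which is
# why strict=False is used in this sketch.
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
state_dict = checkpoint.get("model", checkpoint)
model.load_state_dict(state_dict, strict=False)

# Step 3: save in Hugging Face Transformers format for easy reuse.
model.save_pretrained("converted_model/")
```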