derikk committed
Commit 12a29e9 · verified · 1 Parent(s): 35faced

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +34 -0
README.md ADDED
@@ -0,0 +1,34 @@
+ # Checkpoint Upload
+
+ This model checkpoint was automatically uploaded from a distributed training run.
+
+ ## Model Details
+ - Training step: 21
+ - Architecture: Llama-style model
+ - Hidden size: 2048
+ - Layers: 36
+ - Vocabulary size: 151,936
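
As a rough sanity check on the figures listed above, the size of a single token-embedding matrix follows directly from the vocabulary size and hidden size. The sketch below is only that arithmetic. The README does not state whether input and output embeddings are tied or which dtype the weights use, so the fp32 figure is an assumption, and the helper names are illustrative.

```python
# Figures copied from the Model Details list.
HIDDEN_SIZE = 2048
NUM_LAYERS = 36  # listed for completeness; not used in this calculation
VOCAB_SIZE = 151_936


def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Number of weights in one vocab_size x hidden_size embedding matrix."""
    return vocab_size * hidden_size


def fp32_bytes(num_params: int) -> int:
    """Storage needed if every weight is a 4-byte float (an assumption)."""
    return num_params * 4


emb = embedding_params(VOCAB_SIZE, HIDDEN_SIZE)
print(f"embedding parameters: {emb:,}")  # 311,164,928
print(f"fp32 size: {fp32_bytes(emb) / 2**30:.2f} GiB")
```

If the embedding tensor in the loaded checkpoint has a different shape, the details above probably do not describe that file.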
+
+ ## Checkpoint Information
+ - Originally saved as a distributed checkpoint across 4 ranks
+ - Consolidated into a single checkpoint for easier use
+ - Contains model weights, optimizer states, and training configuration
+
+ ## Usage
+
+ ```python
+ import torch
+
+ # Load the consolidated checkpoint onto the CPU
+ checkpoint = torch.load('pytorch_model.bin', map_location='cpu')
+
+ # The checkpoint contains the model state dict.
+ # Initialize the matching model architecture,
+ # then load these weights into it.
+ ```
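
Because the checkpoint is described as holding optimizer states and training configuration alongside the weights, the loaded object may be a nested dictionary rather than a flat state dict. The sketch below shows one way to inspect it and pull out the model weights. The key names in `CANDIDATE_MODEL_KEYS` are guesses, since the README does not document the layout, and a plain dictionary stands in for the result of `torch.load` so the example runs without the file.

```python
from collections.abc import Mapping

# Hypothetical key names; the actual checkpoint layout is not documented.
CANDIDATE_MODEL_KEYS = ("model", "model_state_dict", "state_dict")


def summarize(checkpoint: Mapping) -> dict:
    """Map each top-level key to the type name of its value."""
    return {key: type(value).__name__ for key, value in checkpoint.items()}


def find_model_state(checkpoint: Mapping) -> Mapping:
    """Return the nested model state dict if a candidate key holds one.

    Falls back to the checkpoint itself, assuming it is already a flat
    mapping from parameter names to tensors.
    """
    for key in CANDIDATE_MODEL_KEYS:
        value = checkpoint.get(key)
        if isinstance(value, Mapping):
            return value
    return checkpoint


# Stand-in for `checkpoint = torch.load(...)`; values are placeholders.
example = {"model": {"embed.weight": [0.0]}, "optimizer": {"state": {}}, "step": 21}
print(summarize(example))
print(list(find_model_state(example)))
```

Running `summarize` on the real checkpoint first shows which top-level keys exist, which is safer than assuming any of the names above.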
+
+ ## Note
+ This is a raw training checkpoint. For inference, you may need to:
+ 1. Initialize the correct model architecture
+ 2. Load the weights into that model
+ 3. Convert the checkpoint to the desired format (e.g., Hugging Face Transformers format)
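
Steps 1 and 2 often fail on parameter-name mismatches rather than on the tensors themselves. The sketch below illustrates one common cause and a way to check for it before loading. It assumes the names may carry wrapper prefixes such as `module.` (added by `DistributedDataParallel`) or `_orig_mod.` (added by `torch.compile`). Whether this checkpoint has them is not stated, and the parameter names in the example are made up.

```python
# Wrapper prefixes that training-time wrappers commonly add to parameter names.
WRAPPER_PREFIXES = ("module.", "_orig_mod.")


def strip_wrapper_prefixes(state_dict: dict) -> dict:
    """Return a copy of state_dict with leading wrapper prefixes removed."""
    cleaned = {}
    for name, tensor in state_dict.items():
        stripped = True
        while stripped:  # prefixes can be stacked, so repeat until none match
            stripped = False
            for prefix in WRAPPER_PREFIXES:
                if name.startswith(prefix):
                    name = name[len(prefix):]
                    stripped = True
        cleaned[name] = tensor
    return cleaned


def compare_keys(expected, loaded):
    """Return (missing, unexpected) parameter names, each sorted."""
    expected, loaded = set(expected), set(loaded)
    return sorted(expected - loaded), sorted(loaded - expected)


# Hypothetical names; real ones come from the model and the checkpoint.
raw = {"module.embed.weight": 1, "module.norm.weight": 2}
expected = ["embed.weight", "norm.weight", "lm_head.weight"]
missing, unexpected = compare_keys(expected, strip_wrapper_prefixes(raw))
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real model, `model.load_state_dict(state_dict, strict=False)` reports the same two lists as `missing_keys` and `unexpected_keys`, so a check like this mainly helps to see whether the prefixes are the problem.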