xingqiang committed on
Commit
45d0e02
·
verified ·
1 Parent(s): e4a17bd

Upload README.md - Upload Mac M3 GRPO model

Files changed (1)
  1. README.md +38 -0
README.md ADDED
@@ -0,0 +1,38 @@
+ # Mac M3 GRPO Model
+
+ Pure PyTorch implementation of GRPO (Generative Reinforcement Learning with Preference Optimization) for Mac M3.
+
+ ## Model Details
+ - **Model Type**: GRPO (DreamerV3-inspired)
+ - **Framework**: PyTorch
+ - **Vocabulary Size**: 102
+ - **Embedding Dimension**: 64
+ - **Latent Dimension**: 32
+ - **Compatible With**: Mac M3, MPS acceleration
+
+ ## Usage
+
+ ```python
+ import torch
+ from examples.mac_m3_grpo import WorldModel, PolicyNetwork
+
+ # Initialize the model
+ world_model = WorldModel(
+     vocab_size=102,
+     embed_dim=64,
+     latent_dim=32
+ )
+
+ # Load the weights
+ world_model.load_state_dict(torch.load("world_model.pt"))
+
+ # Create policy
+ policy = PolicyNetwork(world_model)
+
+ # Generate text
+ # [Your generation code here]
+ ```
+
+ ## Training Details
+ This model was trained with reinforcement learning and preference optimization,
+ using a DreamerV3-inspired approach adapted for text generation.
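
Note on the Usage snippet: it calls `torch.load("world_model.pt")` without `map_location`. If the checkpoint was saved from an MPS device (the README does not say), loading it on a machine without MPS may fail. A minimal sketch of device-aware loading follows. `TinyStandIn`, `pick_device`, and `load_weights` are hypothetical names used only for illustration; `TinyStandIn` is not the repository's `WorldModel`, whose layers are not shown in this commit, and it merely reuses the dimensions listed under Model Details.

```python
import os
import tempfile

import torch
from torch import nn


def pick_device() -> torch.device:
    """Prefer Apple's MPS backend when PyTorch reports it as available."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


class TinyStandIn(nn.Module):
    """Hypothetical stand-in module; NOT the repository's WorldModel."""

    def __init__(self, vocab_size: int = 102, embed_dim: int = 64, latent_dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.to_latent = nn.Linear(embed_dim, latent_dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.to_latent(self.embed(token_ids))


def load_weights(model: nn.Module, path: str, device: torch.device) -> nn.Module:
    # map_location remaps the saved tensors onto `device`, so a checkpoint
    # written on one device type can be opened on another.
    # weights_only=True restricts unpickling to tensors and plain containers.
    state = torch.load(path, map_location=device, weights_only=True)
    model.load_state_dict(state)
    return model.to(device).eval()


device = pick_device()

# Round-trip through a temporary checkpoint so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "stand_in.pt")
    torch.save(TinyStandIn().state_dict(), ckpt)
    model = load_weights(TinyStandIn(), ckpt, device)

# Encode three token ids into latent vectors without tracking gradients.
with torch.no_grad():
    latents = model(torch.tensor([[1, 5, 9]], device=device))
print(latents.shape)
```

With the real classes, the same pattern would presumably read `load_weights(WorldModel(vocab_size=102, embed_dim=64, latent_dim=32), "world_model.pt", pick_device())`, assuming `world_model.pt` holds a plain state dict as the README's `load_state_dict` call suggests.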