# Mac M3 GRPO Model

Pure PyTorch implementation of GRPO (Group Relative Policy Optimization) for Mac M3.

## Model Details

- **Model Type**: GRPO (DreamerV3-inspired)
- **Framework**: PyTorch
- **Vocabulary Size**: 102
- **Embedding Dimension**: 64
- **Latent Dimension**: 32
- **Compatible With**: Mac M3, MPS acceleration
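
The dimensions above suggest the following tensor shapes. This is a minimal stand-in to illustrate the interfaces only; the real classes live in `examples.mac_m3_grpo`, and their actual architecture may differ:

```python
import torch
import torch.nn as nn

class WorldModel(nn.Module):
    """Hypothetical sketch: token embedding followed by a latent encoder."""
    def __init__(self, vocab_size=102, embed_dim=64, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)  # 102 x 64 lookup table
        self.encoder = nn.Linear(embed_dim, latent_dim)   # 64 -> 32 projection

    def forward(self, tokens):
        # tokens: (batch, seq) int64 ids -> (batch, seq, latent_dim) latents
        return self.encoder(self.embed(tokens))

tokens = torch.randint(0, 102, (1, 10))  # a batch of 10 token ids
latents = WorldModel()(tokens)
print(latents.shape)  # torch.Size([1, 10, 32])
```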

## Usage

```python
import torch
from examples.mac_m3_grpo import WorldModel, PolicyNetwork

# Initialize the model
world_model = WorldModel(
    vocab_size=102,
    embed_dim=64,
    latent_dim=32
)

# Load the weights (map_location keeps this portable across devices)
world_model.load_state_dict(torch.load("world_model.pt", map_location="cpu"))

# Create policy
policy = PolicyNetwork(world_model)

# Generate text
# [Your generation code here]
```
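
Since the model targets MPS acceleration on Apple silicon, a typical device-selection pattern (standard PyTorch, not specific to this repo) looks like:

```python
import torch

# Prefer the Apple-silicon MPS backend, fall back to CPU elsewhere
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Hypothetical example: move any nn.Module and its inputs to the chosen device
model = torch.nn.Linear(64, 32).to(device)
x = torch.randn(1, 64, device=device)
print(model(x).shape)  # torch.Size([1, 32])
```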

## Training Details

This model was trained using reinforcement learning with preference optimization, similar to the approach used in DreamerV3 but adapted for text generation.
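
As a rough illustration of the group-relative idea behind GRPO (not this repo's actual training loop): rewards for a group of sampled completions are normalized within the group, and the normalized scores serve as per-sample advantages:

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Normalize each sample's reward against its group's mean and std."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

rewards = torch.tensor([1.0, 3.0, 2.0, 2.0])  # rewards for 4 sampled completions
adv = group_relative_advantages(rewards)
print(adv.sum().item())  # ~0.0: advantages are centered within the group
```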