Spaces: anthonym21/slipstream-governance-openenv
anthonym21 committed on Jan 20

Commit 57cb6b6

2 parents: 2b427d2, ef2991b

Merge remote main with 46-anchor vocabulary update

Files changed (1)

README.md

@@ -120,17 +120,24 @@ Teach the model the Slipstream format using the [Slipstream-TQT dataset](https:/
 Align the model using this environment's reward signal:
-```bash
-# See: slipstream_training/grpo_gemma3_4b_colab.ipynb
 ```
-The notebook connects to this Space and uses the reward signal to train the model to:
-- Refuse covert channel temptations
-- Resist adversarial attack prompts
-- Maintain protocol correctness
-**Result:** [anthonym21/gemma-3-4b-it-slipstream-grpo](https://huggingface.co/anthonym21/gemma-3-4b-it-slipstream-grpo)
 ### Stage 3: Quantization (Optional)
 Distill the aligned model for efficient deployment.
@@ -185,8 +192,8 @@ slipstream_governance_env/
 │   ├── anchors.json              # Allowed anchor list
 │   └── vocab.json                # Known vocabulary
 ├── slipstream_training/
-│   ├── sft_gemma3_4b_colab.ipynb  # Stage 1: SFT notebook
-│   └── grpo_gemma3_4b_colab.ipynb # Stage 2: GRPO notebook
 ├── models.py                     # Pydantic models
 ├── client.py                     # Python client
 └── Dockerfile                    # HF Spaces deployment

 Align the model using this environment's reward signal:
+```python
+from trl import GRPOTrainer, GRPOConfig
+# Environment provides reward signal
+def reward_fn(completions, **kwargs):
+    rewards = []
+    for completion in completions:
+        result = client.step({"message": completion})
+        rewards.append(result["reward"])
+    return rewards
+trainer = GRPOTrainer(
+    model="anthonym21/gemma-3-4b-it-slipstream-sft",
+    reward_funcs=reward_fn,
+    ...
+)
 ```
 ### Stage 3: Quantization (Optional)
 Distill the aligned model for efficient deployment.
 │   ├── anchors.json              # Allowed anchor list
 │   └── vocab.json                # Known vocabulary
 ├── slipstream_training/
+│   ├── sft_gemma3_4b_colab.ipynb # SFT notebook
+│   └── grpo_slipstream_governance.py # GRPO script
 ├── models.py                     # Pydantic models
 ├── client.py                     # Python client
 └── Dockerfile                    # HF Spaces deployment
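
Note on the Stage 2 snippet added in this commit: `reward_fn` calls `client.step(...)`, but the snippet never defines `client`. The sketch below shows one way to make that dependency explicit by building the reward function around a client object. `StubClient` and `make_reward_fn` are hypothetical names used only for illustration. The stub's scoring rule is invented so the example runs offline, and it does not reflect the real environment's reward or the actual `client.py` interface. It also assumes completions arrive as plain strings.

```python
from typing import Any, Callable, Dict, List


class StubClient:
    """Hypothetical stand-in for the environment client (not the real client.py API)."""

    def step(self, action: Dict[str, Any]) -> Dict[str, Any]:
        # Toy rule for illustration only: non-empty messages score 1.0, empty ones -1.0.
        message = action["message"]
        return {"reward": 1.0 if message.strip() else -1.0}


def make_reward_fn(client: Any) -> Callable[..., List[float]]:
    """Return a reward function that closes over an explicit client object."""

    def reward_fn(completions: List[str], **kwargs: Any) -> List[float]:
        rewards = []
        for completion in completions:
            # Same call shape as the README snippet: one step per completion.
            result = client.step({"message": completion})
            rewards.append(float(result["reward"]))
        return rewards

    return reward_fn


reward_fn = make_reward_fn(StubClient())
print(reward_fn(["hello", ""]))  # -> [1.0, -1.0]
```

With a real client in place of the stub, the resulting `reward_fn` could be passed as `reward_funcs=reward_fn`, as in the README snippet.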