Update README.md
Browse files

README.md CHANGED

@@ -13,6 +13,19 @@ tags:
 base_model:
 - openai-community/gpt2
 ---
+
+# MICROD v1.0 (micro-distill-grpo-vae)
+This model was made with the Micro Distillery app, available at:
+
+webxos.netlify.app/MICROD
+
+- Model Distillation Training: Simulate GRPO optimization with VAE filtering for small LLMs (42M-345M params).
+- Policy Experimentation: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
+- VAE Filtering: Apply latent-space compression to improve distillation quality.
+- Sandbox Testing: Execute safe Python code with feedback masking.
+- Export & Deployment: Generate deployable models for inference in various frameworks.
+- Offline Usage: PWA supports offline training simulation and exports.
+
 <div id="app">
 <!-- TOP BAR -->
 <div class="top-bar">
@@ -27,19 +40,6 @@ base_model:
 
 
 
-# MICROD v1.0 (micro-distill-grpo-vae)
-This model was made with the Micro Distillery app, available at:
-
-webxos.netlify.app/MICROD
-
-- Model Distillation Training: Simulate GRPO optimization with VAE filtering for small LLMs (42M-345M params).
-- Policy Experimentation: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
-- VAE Filtering: Apply latent-space compression to improve distillation quality.
-- Sandbox Testing: Execute safe Python code with feedback masking.
-- Export & Deployment: Generate deployable models for inference in various frameworks.
-- Offline Usage: PWA supports offline training simulation and exports.
-```
-
 ## Model Description
 This is a distilled language model trained using Group Relative Policy Optimization (GRPO) with VAE filtering.
 **MICROD v1.0 (micro-distill-grpo-vae)** is a small template model designed to be built upon for custom ground-up builds. It is distilled into a
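The GRPO-style training the model card describes can be illustrated with a toy sketch. This is not the Micro Distillery implementation; the reward values, group size, and KL-penalty weight below are hypothetical, chosen only to show how a group-relative advantage and a KL-style penalty are computed.

```python
import math

def group_relative_advantages(rewards):
    """Normalize each reward against its own group's mean and std (GRPO-style)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 for uniform groups
    return [(r - mean) / std for r in rewards]

def kl_penalty(logp_policy, logp_ref, beta=0.1):
    """Mean per-token (log p - log q) difference scaled by beta, a common
    KL-penalty approximation; beta=0.1 is an illustrative default."""
    diffs = [p - q for p, q in zip(logp_policy, logp_ref)]
    return beta * sum(diffs) / len(diffs)

# One group of 4 sampled completions with hypothetical scalar rewards.
advantages = group_relative_advantages([0.2, 0.8, 0.5, 0.5])
print(advantages)  # centered on 0: above-mean samples get positive advantage

penalty = kl_penalty([-1.0, -0.5], [-1.2, -0.4])
print(penalty)
```

Because advantages are computed relative to the group, larger group sizes (one of the knobs the feature list mentions) give a lower-variance baseline at the cost of more sampled completions per prompt.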
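The "VAE filtering" step can likewise be sketched as reconstruction-error-based sample selection. The fixed down-projection below is a stand-in for a trained VAE encoder/decoder, and the latent size `k`, threshold, and sample vectors are illustrative only: samples that survive a round trip through the compressed latent space with low error are kept for distillation.

```python
def encode(v, k=2):
    """Stand-in 'encoder': keep only the first k latent dimensions."""
    return v[:k]

def decode(z, dim):
    """Stand-in 'decoder': pad the latent code back to full dimensionality."""
    return z + [0.0] * (dim - len(z))

def reconstruction_error(v, k=2):
    recon = decode(encode(v, k), len(v))
    return sum((a - b) ** 2 for a, b in zip(v, recon))

def vae_filter(samples, k=2, threshold=0.5):
    """Keep samples whose round trip through the latent space loses little information."""
    return [v for v in samples if reconstruction_error(v, k) <= threshold]

samples = [
    [1.0, 2.0, 0.1, 0.0],  # most mass in the first two dims -> survives
    [0.0, 0.0, 3.0, 3.0],  # mass outside the latent dims -> filtered out
]
print(len(vae_filter(samples)))  # 1
```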
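The "Sandbox Testing" feature can be sketched as running user snippets with a restricted builtin namespace. This is a simulation aid in the spirit of the feature list, not the app's actual sandbox, and restricting `__builtins__` is not a real security boundary (a production sandbox needs process-level isolation); the allowed-function set below is hypothetical.

```python
def run_sandboxed(code, allowed=None):
    """Execute a snippet with only an explicit allow-list of builtins visible."""
    if allowed is None:
        allowed = {"abs": abs, "min": min, "max": max, "len": len}
    env = {"__builtins__": dict(allowed)}
    exec(code, env)  # names outside the allow-list raise NameError
    return {k: v for k, v in env.items() if k != "__builtins__"}

result = run_sandboxed("x = max(1, 2) + len('abc')")
print(result["x"])  # 5
```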