Update README.md

README.md CHANGED

@@ -17,6 +17,15 @@ tags:
 This model was made with the Micro Distillery app available at:
 webxos.netlify.app/MICROD

+
+Model Distillation Training: Simulate GRPO optimization with VAE filtering for small LLMs (42M–345M parameters).
+Policy Experimentation: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
+VAE Filtering: Apply latent-space compression to improve distillation quality.
+Sandbox Testing: Execute Python code in a safe sandbox with feedback masking.
+Export & Deployment: Generate deployable models for inference across frameworks.
+Offline Usage: The PWA supports offline training simulation and exports.
+
+
 ## Model Description
 This is a distilled language model trained using Group Relative Policy Optimization (GRPO) with VAE filtering.
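For readers unfamiliar with the terms above, the sketch below illustrates the two core ideas in plain Python: a group-relative advantage with a KL penalty toward a reference (teacher) policy, and a reconstruction-error filter standing in for VAE filtering. This is a minimal illustration under stated assumptions, not the Micro Distillery implementation; every name and default here (`grpo_objective`, `kl_coef=0.04`, `vae_filter`, the 0.5 threshold) is hypothetical.

```python
import numpy as np

def grpo_advantages(group_rewards):
    # Group-relative advantage: score each completion against the
    # mean/std of its own sampled group (no learned value baseline).
    r = np.asarray(group_rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + 1e-8)

def grpo_objective(logp_new, logp_old, logp_ref, group_rewards, kl_coef=0.04):
    # Sequence-level simplification of a GRPO-style objective:
    # importance-weighted advantage minus a KL penalty toward the
    # reference (teacher) policy, using the nonnegative "k3" estimator.
    adv = grpo_advantages(group_rewards)
    ratio = np.exp(logp_new - logp_old)
    delta = logp_ref - logp_new
    kl = np.exp(delta) - delta - 1.0   # >= 0 for every sample
    return float((ratio * adv - kl_coef * kl).mean())

def vae_filter(samples, recon_errors, threshold=0.5):
    # One plausible reading of "VAE filtering" (an assumption): drop
    # training samples whose VAE reconstruction error exceeds a threshold.
    keep = np.asarray(recon_errors) < threshold
    return [s for s, k in zip(samples, keep) if k]

# Toy usage: one group of 4 sampled completions.
rewards = [0.2, 0.9, 0.4, 0.7]
lp_new = np.array([-1.1, -0.8, -1.0, -0.9])
lp_old = np.array([-1.2, -0.9, -1.0, -1.0])
lp_ref = np.array([-1.0, -0.9, -1.1, -0.9])
print(grpo_objective(lp_new, lp_old, lp_ref, rewards))
print(vae_filter(["a", "b", "c", "d"], [0.1, 0.8, 0.3, 0.6]))
```

Normalizing rewards within each sampled group removes the need for a learned value baseline, which is the usual motivation for GRPO over PPO-style training.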