Update README.md

README.md CHANGED

@@ -17,6 +17,15 @@ tags:
 This model was made with the Micro Distillery app available at:
 webxos.netlify.app/MICROD

+
+Model Distillation Training: Simulate GRPO optimization with VAE filtering for small LLMs (42M–345M parameters).
+Policy Experimentation: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
+VAE Filtering: Apply latent-space compression to improve distillation quality.
+Sandbox Testing: Execute Python code in a safe sandbox with feedback masking.
+Export & Deployment: Generate deployable models for inference across frameworks.
+Offline Usage: The PWA supports offline training simulation and exports.
+
+
 ## Model Description
 This is a distilled language model trained using Group Relative Policy Optimization (GRPO) with VAE filtering.
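For readers unfamiliar with the terms above, the sketch below illustrates the two core ideas in plain Python: a group-relative advantage with a KL penalty toward a reference (teacher) policy, and a reconstruction-error filter standing in for VAE filtering. This is a minimal illustration under stated assumptions, not the Micro Distillery implementation; every name and default here (`grpo_objective`, `kl_coef=0.04`, `vae_filter`, the 0.5 threshold) is hypothetical.

```python
import numpy as np

def grpo_advantages(group_rewards):
    # Group-relative advantage: score each completion against the
    # mean/std of its own sampled group (no learned value baseline).
    r = np.asarray(group_rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + 1e-8)

def grpo_objective(logp_new, logp_old, logp_ref, group_rewards, kl_coef=0.04):
    # Sequence-level simplification of a GRPO-style objective:
    # importance-weighted advantage minus a KL penalty toward the
    # reference (teacher) policy, using the nonnegative "k3" estimator.
    adv = grpo_advantages(group_rewards)
    ratio = np.exp(logp_new - logp_old)
    delta = logp_ref - logp_new
    kl = np.exp(delta) - delta - 1.0   # >= 0 for every sample
    return float((ratio * adv - kl_coef * kl).mean())

def vae_filter(samples, recon_errors, threshold=0.5):
    # One plausible reading of "VAE filtering" (an assumption): drop
    # training samples whose VAE reconstruction error exceeds a threshold.
    keep = np.asarray(recon_errors) < threshold
    return [s for s, k in zip(samples, keep) if k]

# Toy usage: one group of 4 sampled completions.
rewards = [0.2, 0.9, 0.4, 0.7]
lp_new = np.array([-1.1, -0.8, -1.0, -0.9])
lp_old = np.array([-1.2, -0.9, -1.0, -1.0])
lp_ref = np.array([-1.0, -0.9, -1.1, -0.9])
print(grpo_objective(lp_new, lp_old, lp_ref, rewards))
print(vae_filter(["a", "b", "c", "d"], [0.1, 0.8, 0.3, 0.6]))
```

Normalizing rewards within each sampled group removes the need for a learned value baseline, which is the usual motivation for GRPO over PPO-style training.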