webxos committed
Commit 6328f90 · verified · 1 parent: 7929118

Update README.md

Files changed (1):
  1. README.md +9 -0
README.md CHANGED
@@ -17,6 +17,15 @@ tags:
 This model was made with the Micro Distillery app available at:
 webxos.netlify.app/MICROD
 
+
+Model Distillation Training: Simulate GRPO optimization with VAE filtering for small LLMs (42M-345M params).
+Policy Experimentation: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
+VAE Filtering: Apply latent-space compression to improve distillation quality.
+Sandbox Testing: Execute safe Python code with feedback masking.
+Export & Deployment: Generate deployable models for inference in various frameworks.
+Offline Usage: PWA supports offline training simulation and exports.
+
+
 ## Model Description
 This is a distilled language model trained using Group Relative Policy Optimization (GRPO) with VAE filtering.
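
The added feature list names GRPO's main knobs (group size, KL penalty) without showing how they fit together. Below is a minimal sketch of a group-relative policy loss in PyTorch; the function name `grpo_loss` and the `group_size`/`kl_coeff` defaults are illustrative assumptions, not the Micro Distillery app's actual API.

```python
import torch

def grpo_loss(logprobs, ref_logprobs, rewards, group_size=8, kl_coeff=0.1):
    """Toy GRPO step (illustrative, not the app's implementation).

    Advantages are rewards normalized within each group of completions
    sampled for the same prompt, plus a KL penalty toward the reference
    (teacher) policy. All inputs have shape (num_prompts * group_size,).
    """
    # Reshape so each row holds one prompt's group of sampled completions.
    r = rewards.view(-1, group_size)
    # Group-relative advantage: how much better each sample is than its
    # own group's mean, scaled by the group's std (no learned critic).
    adv = (r - r.mean(dim=1, keepdim=True)) / (r.std(dim=1, keepdim=True) + 1e-8)
    adv = adv.view(-1)
    # Policy-gradient term: push up log-probs of above-average samples.
    pg = -(adv.detach() * logprobs).mean()
    # Simple KL estimate penalizing drift from the reference policy.
    kl = (logprobs - ref_logprobs).mean()
    return pg + kl_coeff * kl
```

Because advantages are normalized within each sampled group, no learned value network is needed, which is part of what makes GRPO practical at the 42M-345M parameter scale the README targets.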
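"VAE Filtering" is read here as scoring candidate distillation samples in a VAE's latent space and keeping the well-reconstructed, in-distribution ones. A sketch under that assumption; the `TinyVAE` class, embedding sizes, and `keep_quantile` threshold are made up for illustration.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Minimal VAE over fixed-size embeddings of training samples."""
    def __init__(self, dim=256, latent=32):
        super().__init__()
        self.enc = nn.Linear(dim, 2 * latent)  # outputs mean and log-variance
        self.dec = nn.Linear(latent, dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

@torch.no_grad()
def filter_samples(vae, embeddings, keep_quantile=0.8):
    """Keep samples the VAE reconstructs well (low error = in-distribution)."""
    recon, _, _ = vae(embeddings)
    err = ((recon - embeddings) ** 2).mean(dim=-1)
    cutoff = torch.quantile(err, keep_quantile)
    return embeddings[err <= cutoff]
```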
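"Sandbox Testing: Execute safe Python code with feedback masking" suggests running generated snippets against a stripped-down builtins table and masking what flows back to the trainer. A rough sketch under that reading; note that `exec` with restricted builtins is not a true security boundary, and the app's real sandbox presumably does more.

```python
import contextlib
import io

SAFE_BUILTINS = {"abs": abs, "len": len, "range": range, "print": print}

def run_sandboxed(code: str) -> str:
    """Execute untrusted code with restricted builtins and return only a
    masked summary of the outcome as training feedback."""
    out = io.StringIO()
    try:
        with contextlib.redirect_stdout(out):
            exec(code, {"__builtins__": SAFE_BUILTINS})
        return out.getvalue()[:500]  # mask feedback: truncate captured output
    except Exception as exc:
        return f"error: {type(exc).__name__}"  # hide traceback details
```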