Update README.md
README.md

This model was made with the Micro Distillery app available at:

webxos.netlify.app/MICROD

- **Model type**: micro-distill-grpo-vae
- **License**: Apache 2.0

## Model Description
This is a distilled language model trained using Group Relative Policy Optimization (GRPO) with VAE filtering.
…making it runnable on modest hardware like CPUs or even browsers via TensorFlow.

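The core GRPO step described above can be sketched in a few lines. This is a hedged illustration only, not the app's actual code: the function name, array shapes, and reward values are assumptions. The key idea is that each prompt gets a group of sampled completions, and each completion's advantage is its reward normalized against its own group's mean and standard deviation, so no learned value function is needed.

```python
# Hypothetical sketch of the group-relative advantage used in GRPO.
# Names and shapes are assumptions, not the Micro Distillery app's code.
import numpy as np

def group_relative_advantages(rewards: np.ndarray) -> np.ndarray:
    """rewards: shape (num_prompts, group_size), one scalar reward per
    sampled completion. Each completion is scored relative to its own
    group's statistics."""
    mean = rewards.mean(axis=1, keepdims=True)
    std = rewards.std(axis=1, keepdims=True) + 1e-8  # avoid divide-by-zero
    return (rewards - mean) / std

# Two prompts, three completions each (illustrative rewards).
adv = group_relative_advantages(np.array([[1.0, 0.0, 2.0],
                                          [0.5, 0.5, 0.5]]))
```

Note that a group with identical rewards (the second row) produces all-zero advantages: when every sample in the group does equally well, there is no relative signal to learn from.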
## Usage
- **Model Distillation Training**: Simulate GRPO optimization with VAE filtering for small LLMs (42M-345M params).
- **Policy Experimentation**: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
- **VAE Filtering**: Apply latent-space compression to improve distillation quality.
- **Sandbox Testing**: Execute safe Python code with feedback masking.
- **Export & Deployment**: Generate deployable models for inference in various frameworks.
- **Offline Usage**: The PWA supports offline training simulation and exports.

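The KL penalty mentioned in the list above is typically estimated per sampled token from the policy's and a frozen reference model's log-probabilities. A minimal sketch follows; the particular estimator (the always-non-negative "k3" form common in RLHF codebases) is an assumption here, not necessarily what the app uses.

```python
# Illustrative per-token KL penalty for RLHF-like training.
# The estimator choice is an assumption, not the app's actual code.
import math

def kl_penalty(logp_policy: float, logp_ref: float) -> float:
    """Single-sample estimate of KL(policy || reference); always >= 0,
    and exactly 0 when the two log-probabilities agree."""
    log_ratio = logp_ref - logp_policy
    return math.exp(log_ratio) - log_ratio - 1.0
```

Subtracting a multiple of this penalty from the reward keeps the distilled policy from drifting too far from the reference during optimization; the group-size and penalty-weight settings exposed by the app would trade off exploration against that constraint.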
## Citation