Update README.md
README.md
CHANGED
---
license: mit
language:
- en
pipeline_tag: text-generation
library_name: transformers.js
tags:
- grpo
- vae
- pytorch
base_model:
- openai-community/gpt2
---
<div id="app">
<!-- TOP BAR -->
<button id="invertBtn" class="btn-ghost">- **License**: Apache 2.0</button>
</div>
---

# MICROD v1.0 (micro-distill-grpo-vae)

This model was made with the Micro Distillery app, available at: webxos.netlify.app/MICROD

- **Model Distillation Training**: Simulate GRPO optimization with VAE filtering for small LLMs (42M-345M params); a rough sketch of the group-relative update follows this list.
- **Policy Experimentation**: Test group sizes, KL penalties, and cache reuse for RLHF-like training.
- **VAE Filtering**: Apply latent-space compression to improve distillation quality.
- **Sandbox Testing**: Execute safe Python code with feedback masking.
- **Export & Deployment**: Generate deployable models for inference in various frameworks.
- **Offline Usage**: The PWA supports offline training simulation and exports.
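
The GRPO settings listed above (group sizes, KL penalties) refer to a group-relative policy update. As a minimal illustrative sketch only, assuming a plain PyTorch setup where per-sample log-probabilities and rewards have already been collected (this is not the Micro Distillery implementation), the update could look like:

```python
import torch

def grpo_loss(logprobs, ref_logprobs, rewards, group_size=4, kl_coef=0.1):
    """Group-relative policy loss: rewards are normalized within each sampled
    group, and a KL-style penalty keeps the policy near the reference model."""
    # Reshape flat per-sample tensors into (num_groups, group_size).
    rewards = rewards.view(-1, group_size)
    logprobs = logprobs.view(-1, group_size)
    ref_logprobs = ref_logprobs.view(-1, group_size)

    # Group-relative advantage: center and scale rewards inside each group.
    adv = (rewards - rewards.mean(dim=1, keepdim=True)) / (
        rewards.std(dim=1, keepdim=True) + 1e-8
    )

    # Policy-gradient term weighted by the (detached) group-relative advantage.
    pg_loss = -(adv.detach() * logprobs).mean()

    # Crude sample-based estimate of KL(policy || reference).
    kl_penalty = (logprobs - ref_logprobs).mean()

    return pg_loss + kl_coef * kl_penalty

# Toy usage: 2 prompts with 4 sampled completions each (group_size=4).
logprobs = torch.randn(8, requires_grad=True)
ref_logprobs = torch.randn(8)
rewards = torch.randn(8)
grpo_loss(logprobs, ref_logprobs, rewards).backward()
```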

## Model Description
This is a distilled language model trained using Group Relative Policy Optimization (GRPO) with VAE filtering.
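
The card does not spell out how the VAE filter is applied; one plausible reading, sketched here as an assumption rather than a description of MICROD's internals, is a small autoencoder that compresses sample embeddings and keeps only the distillation samples it reconstructs well:

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Minimal VAE used only to score samples by reconstruction quality."""
    def __init__(self, dim=128, latent=16):
        super().__init__()
        self.enc = nn.Linear(dim, latent * 2)  # outputs mean and log-variance
        self.dec = nn.Linear(latent, dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def filter_samples(vae, embeddings, keep_ratio=0.8):
    """Keep the fraction of samples whose embeddings reconstruct best."""
    with torch.no_grad():
        recon, _, _ = vae(embeddings)
        errors = (recon - embeddings).pow(2).mean(dim=-1)
    k = max(1, int(keep_ratio * len(errors)))
    keep = errors.argsort()[:k]  # lowest reconstruction error first
    return embeddings[keep], keep

# Toy usage with random stand-ins for teacher embeddings.
vae = TinyVAE()
kept, idx = filter_samples(vae, torch.randn(32, 128))
```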

**MICROD v1.0 (micro-distill-grpo-vae)** is a small template model designed to be built upon for custom, ground-up builds. It is distilled into a small set of files that users can use as a template for their own agents, and it is designed for educational use and micro-scaling.

Use **MICROD v1.0 (micro-distill-grpo-vae)** in your own custom projects and train it from the ground up.
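
The metadata above targets transformers.js and declares openai-community/gpt2 as the base model; this repo's exact model id is not given here, so as a hedged Python illustration, a GPT-2-style checkpoint (or your own exported MICROD files) could be loaded with the transformers library like this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The declared base model; swap in the path to your exported MICROD checkpoint.
model_id = "openai-community/gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Micro-distillation lets small models", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```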

## Model Details
- **Model type**: micro-distill-grpo-vae