ProgramerSalar committed on
Commit
6c42947
·
verified ·
1 Parent(s): d481e5e

Update README.md

Files changed (1)
  1. README.md +53 -3
README.md CHANGED
@@ -1,3 +1,53 @@
- ---
- license: mit
- ---
+ ---
+ license: apache-2.0
+ tags:
+ - vae
+ - video-generation
+ - education
+ - fine-tuning
+ - pytorch
+ ---
+
+ # 🎓 Causal VAE Fine-Tuning Experiments (Indian Math Curriculum)
+
+ **Developing the "Imagination Engine" for [Zulense](https://huggingface.co/zulense)**
+
+ This repository contains experimental checkpoints for a **Causal VAE (Variational Autoencoder)** fine-tuned specifically on Indian educational content (NCERT Math).
+
+ The goal of these experiments is to adapt standard video generation VAEs to better reconstruct "blackboard style" line art, diagrams, and text-heavy educational videos, which often suffer from artifacts in general-purpose models.
+
+ ## 📂 Checkpoint Manifest
+
+ We are releasing two distinct checkpoints representing different stages of our training curriculum.
+
+ ### 1. `FineTune_2_checkpoint.pth` (Recommended)
+ * **Target Domain:** **Class 5 Numeracy & Foundation**
+ * **Status:** ✅ **Improved Stability**
+ * **Experiment Notes:** This run focused on simpler, foundational concepts (Class 5 curriculum) to stabilize the loss.
+ * **Improvements:** Significantly reduced `kl_divergence` and reconstruction loss compared to the V1 baseline.
+ * **Use Case:** Better at handling the basic shapes and slower temporal movements typical of primary-school teaching videos.
+
+ ### 2. `checkpoint-0.pth` (Legacy / Research Artifact)
+ * **Target Domain:** **Class 8 Geometry & Algebra**
+ * **Status:** ⚠️ **Unstable / High Loss**
+ * **Experiment Notes:** This was our initial attempt at modeling complex Class 8 geometry.
+ * **Known Issues:** The model struggled with high-frequency details (text/grid lines), resulting in higher `vae_loss` and unstable KL divergence.
+ * **Why we kept it:** Retained for comparative analysis, to show the jump in visual complexity between primary-school and middle-school content.
+
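The notes above refer to `kl_divergence`, reconstruction loss, and `vae_loss` without defining them. As a rough illustration only, the sketch below assumes the common VAE objective: a mean-squared-error reconstruction term plus a weighted KL term against a standard normal prior. It is written in plain Python so the arithmetic is visible. The MSE choice and the `kl_weight` default are assumptions for illustration, not values taken from the training code behind these checkpoints.

```python
import math


def kl_divergence(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


def reconstruction_loss(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def vae_loss(x, x_hat, mu, logvar, kl_weight=1e-4):
    """Reconstruction term plus weighted KL term.

    kl_weight is a hypothetical default, not the value used for these runs.
    """
    return reconstruction_loss(x, x_hat) + kl_weight * kl_divergence(mu, logvar)
```

Under this formulation, a drop in either term lowers `vae_loss`, and a KL term that swings between steps is what an "unstable KL divergence" would look like in the logs.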
+ ## 🔬 Technical Context
+
+ Standard video VAEs are optimized for photorealism. Our experiments suggest that for **educational video synthesis**:
+ 1. **Text Preservation:** Standard VAEs struggle to reconstruct the sharp text found in math explanations.
+ 2. **Curriculum Learning:** Fine-tuning on simpler visual concepts (Class 5) before complex ones (Class 8) yields better convergence.
+
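The curriculum-learning point can be pictured as a staged schedule in which each stage starts from the weights the previous stage produced. The sketch below is only a schematic of that ordering: `run_curriculum`, `fake_train_stage`, and the stage labels are hypothetical names, and the stand-in training function merely records which stages it has seen rather than fine-tuning anything.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def run_curriculum(
    stages: Sequence[str],
    train_stage: Callable[[str, Optional[Any]], Any],
    init_state: Optional[Any] = None,
) -> Tuple[Any, List[str]]:
    """Train on each stage in order, warm-starting from the previous result."""
    state = init_state
    completed: List[str] = []
    for stage in stages:
        state = train_stage(stage, state)  # previous state seeds the next stage
        completed.append(stage)
    return state, completed


def fake_train_stage(stage: str, state: Optional[List[str]]) -> List[str]:
    """Stand-in for real fine-tuning: records the stages the weights have seen."""
    seen = list(state) if state is not None else []
    return seen + [stage]


# Easier material first (Class 5), harder material second (Class 8).
final_state, completed = run_curriculum(["class_5", "class_8"], fake_train_stage)
```

In a real run, `train_stage` would wrap the fine-tuning loop and `state` would be the model weights or a checkpoint path.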
+ ## 💻 Usage (PyTorch)
+
+ ```python
+ import torch
+
+ # Load the Causal VAE checkpoint onto the CPU
+ checkpoint_path = "FineTune_2_checkpoint.pth"  # Use the stable Class 5 checkpoint
+ state_dict = torch.load(checkpoint_path, map_location="cpu")
+
+ print(f"Loaded checkpoint: {checkpoint_path}")
+ # Note: these weights are only usable with the matching Causal VAE architecture definition, whose load_state_dict() consumes state_dict