# TC-LeJEPA (Text-Conditioned LeJEPA)

An ablation that adds text conditioning to the predictor in [LeJEPA](https://arxiv.org/abs/2511.08544). We do **not** predict text: we condition the JEPA predictor on text and keep the original two-term objective `L = (1-λ)·L_inv + λ·SIGReg`.
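
As a minimal sketch of how the two-term objective above combines its parts, the helper below mixes an invariance term and a SIGReg term with weight `λ`. The function name `lejepa_objective` and its scalar arguments are hypothetical and used only for illustration; the repository's own loss lives in `tclejepa_src` and may be organised differently.

```python
def lejepa_objective(l_inv: float, sigreg: float, lam: float) -> float:
    """Combine the two terms as L = (1 - lam) * L_inv + lam * SIGReg.

    l_inv and sigreg are assumed to be already-computed scalar losses;
    how each is computed is outside the scope of this sketch.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * l_inv + lam * sigreg


# Placeholder values, not measured results.
total = lejepa_objective(l_inv=2.0, sigreg=4.0, lam=0.5)
```

Setting `lam=0.0` leaves only the invariance term and `lam=1.0` leaves only SIGReg, so `λ` acts as a single knob trading one term against the other.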

## Variants

| Variant      | Conditioning                            |
|--------------|-----------------------------------------|
| `baseline`   | Vanilla LeJEPA, MLP predictor           |
| `film`       | FiLM on MLP predictor, text → (γ, β)    |
| `xattn`      | Patch tokens cross-attend to text       |
| `wrong_text` | `xattn` with permuted label-to-text map |
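
Two rows of the table can be illustrated in plain Python. The sketch below shows the FiLM operation (a per-feature scale `γ` and shift `β`) and one way to build a permuted label-to-text map. The names `film` and `permute_label_text` are hypothetical, `γ` and `β` are passed in directly rather than produced from text, and the shuffle-then-rotate scheme is an assumption: the source only says the map is permuted.

```python
import random


def film(hidden, gamma, beta):
    """FiLM modulation: h_i -> gamma_i * h_i + beta_i for each feature i."""
    if not (len(hidden) == len(gamma) == len(beta)):
        raise ValueError("hidden, gamma and beta must have the same length")
    return [g * h + b for h, g, b in zip(hidden, gamma, beta)]


def permute_label_text(label_to_text, seed=0):
    """Return a map in which every label receives another label's text.

    Assumption: labels are shuffled with a fixed seed, then each label takes
    the text of its successor in the shuffled order, so no label keeps its own.
    """
    labels = list(label_to_text)
    if len(labels) < 2:
        raise ValueError("need at least two labels to permute")
    rng = random.Random(seed)
    rng.shuffle(labels)
    n = len(labels)
    return {labels[i]: label_to_text[labels[(i + 1) % n]] for i in range(n)}


# Illustrative prompts only; the actual prompt template is not given here.
prompts = {0: "a photo of an apple", 1: "a photo of a bear", 2: "a photo of a clock"}
wrong = permute_label_text(prompts, seed=0)
```

With `gamma` all ones and `beta` all zeros, `film` returns its input unchanged, which is a convenient sanity check for the conditioning path.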

Backbone: ViT-Small/16 at 128×128. Text tower: OpenCLIP ViT-B/32 (frozen). Dataset: CIFAR-100.
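
For orientation, the patch grid implied by a `/16` backbone at 128×128 can be worked out directly. The helper `num_patch_tokens` is a hypothetical name, and the count below covers patch tokens only; whether the model adds a class token or other extra tokens is not stated here. CIFAR-100 images are natively 32×32, so the 128×128 resolution presumably comes from resizing in the data pipeline.

```python
def num_patch_tokens(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


tokens = num_patch_tokens(image_size=128, patch_size=16)  # 8 x 8 grid = 64 patches
```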

## Reading order

1. `comparison.md`: results, figures, and answers to the four research questions.
2. `tclejepa_src/modules.py`: `TCLeJEPAModel`, predictor variants, `SIGReg`.
3. `tclejepa_src/train.py`: training loop (same loss for every variant).
4. `tclejepa_src/evaluate.py`: linear probe, SIGReg↔accuracy correlation, t-SNE steering.

## Artifacts

Checkpoints, figures, logs, and `comparison.{md,json}` live at **https://huggingface.co/adipanda/lejepa**.

## Running

```bash
set -a && source .env && set +a  # loads WANDB_API_KEY and HF_TOKEN
uv sync
EPOCHS=30 BS=512 WORKERS=12 ./run_all.sh  # runs all 4 variants sequentially
uv run python -m tclejepa_src.evaluate  # produces comparison.{json,md} + figures
```