
TC-LeJEPA (Text-Conditioned LeJEPA)

An ablation that adds text conditioning to the LeJEPA predictor. We do not predict text; we condition the JEPA predictor on text and keep the original two-term objective L = (1-λ)·L_inv + λ·SIGReg.
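To make the weighting concrete, here is a minimal sketch of how the two terms combine. The function name `tc_lejepa_loss` and the range check on λ are assumptions for illustration; the sketch takes both terms as already-computed scalars and does not show how `tclejepa_src` computes L_inv or SIGReg.

```python
def tc_lejepa_loss(l_inv: float, sigreg: float, lam: float) -> float:
    """Combine the two terms as L = (1 - lam) * l_inv + lam * sigreg.

    Hypothetical helper: l_inv and sigreg are assumed to be computed elsewhere.
    """
    # Assumption: lam is a mixing weight in [0, 1], as the (1 - lam) form suggests.
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * l_inv + lam * sigreg


# lam = 0 keeps only L_inv, lam = 1 keeps only SIGReg.
print(tc_lejepa_loss(2.0, 4.0, 0.0))   # 2.0
print(tc_lejepa_loss(2.0, 4.0, 1.0))   # 4.0
print(tc_lejepa_loss(2.0, 4.0, 0.25))  # 2.5
```

Because the text only enters through the predictor, this combination is the same for every variant.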

Variants

| Variant | Conditioning |
|---|---|
| baseline | vanilla LeJEPA, MLP predictor |
| film | FiLM on MLP predictor, text → (γ, β) |
| xattn | Patch tokens cross-attend to text |
| wrong_text | xattn with permuted label-text map |
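The three conditioning mechanisms can be sketched in a few lines of NumPy. This is an illustrative sketch, not the code in `tclejepa_src/modules.py`: the function names, the single-head attention, the residual connection, and the choice to forbid fixed points in the permuted map are all assumptions.

```python
import numpy as np


def film(h, text_emb, w_gamma, w_beta):
    """FiLM: scale and shift features with text-derived (gamma, beta)."""
    gamma = text_emb @ w_gamma  # (B, D)
    beta = text_emb @ w_beta    # (B, D)
    return gamma * h + beta


def cross_attend(patches, text_tokens, w_q, w_k, w_v):
    """Single-head cross-attention: patches are queries, text is keys/values."""
    q = patches @ w_q       # (B, N, d)
    k = text_tokens @ w_k   # (B, T, d)
    v = text_tokens @ w_v   # (B, T, D)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])  # (B, N, T)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over text tokens
    return patches + attn @ v, attn               # residual update (assumed)


def permute_label_text(num_classes, seed=0):
    """Map each class index to the text of a different class."""
    rng = np.random.default_rng(seed)
    identity = np.arange(num_classes)
    perm = rng.permutation(num_classes)
    # Assumption: the control should never pair a class with its own text.
    while np.any(perm == identity):
        perm = rng.permutation(num_classes)
    return perm


# Illustrative shapes only: 64 patch tokens, width 384, text width 512.
rng = np.random.default_rng(0)
B, N, T, D, Dt, d = 2, 64, 4, 384, 512, 64
pooled = film(rng.normal(size=(B, D)), rng.normal(size=(B, Dt)),
              0.02 * rng.normal(size=(Dt, D)), 0.02 * rng.normal(size=(Dt, D)))
out, attn = cross_attend(rng.normal(size=(B, N, D)), rng.normal(size=(B, T, Dt)),
                         0.02 * rng.normal(size=(D, d)),
                         0.02 * rng.normal(size=(Dt, d)),
                         0.02 * rng.normal(size=(Dt, D)))
wrong_text_map = permute_label_text(100)  # CIFAR-100 has 100 classes
print(pooled.shape, out.shape, attn.shape)  # (2, 384) (2, 64, 384) (2, 64, 4)
```

The `wrong_text` control keeps the `xattn` architecture unchanged and only swaps which caption each class sees, so any gap between the two isolates the effect of the text content from the effect of the extra parameters.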

Backbone: ViT-Small/16 at 128×128. Text tower: OpenCLIP ViT-B/32 (frozen). Dataset: CIFAR-100.
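As a worked example of what this configuration implies, the sketch below derives the patch grid from the image and patch sizes. The class name `BackboneShape` is hypothetical, and the widths (384 for ViT-Small, 512 for the OpenCLIP ViT-B/32 text embedding) are the standard values for those architectures, assumed rather than read from this repository. Whether the backbone adds a class token or other extra tokens is not stated here, so only image patches are counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackboneShape:
    """Hypothetical container for the shapes implied by the setup above."""
    image_size: int = 128  # input resolution stated above
    patch_size: int = 16   # the "/16" in ViT-Small/16
    embed_dim: int = 384   # standard ViT-Small width (assumed)
    text_dim: int = 512    # standard OpenCLIP ViT-B/32 text width (assumed)

    @property
    def grid(self) -> int:
        """Number of patches along one side of the image."""
        if self.image_size % self.patch_size:
            raise ValueError("image_size must be divisible by patch_size")
        return self.image_size // self.patch_size

    @property
    def num_patches(self) -> int:
        """Total number of image patch tokens."""
        return self.grid ** 2


shape = BackboneShape()
print(shape.grid, shape.num_patches)  # 8 64
```

CIFAR-100 images are natively 32×32, so training at 128×128 implies a 4× upsampling step before patchification.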

Reading order

  1. comparison.md: results, figures, and answers to the four research questions.
  2. tclejepa_src/modules.py: TCLeJEPAModel, predictor variants, SIGReg.
  3. tclejepa_src/train.py: training loop (same loss for every variant).
  4. tclejepa_src/evaluate.py: linear probe, SIGReg↔acc correlation, t-SNE steering.
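For readers unfamiliar with linear probing, the idea is to freeze the backbone, extract features, and fit only a linear classifier on top. The sketch below uses closed-form ridge regression onto one-hot targets as a simple stand-in; it is not the procedure in `tclejepa_src/evaluate.py`, whose optimizer and settings are not described here. The function names and the synthetic features are hypothetical.

```python
import numpy as np


def _with_bias(feats):
    """Append a constant column so the probe can learn an offset."""
    return np.hstack([feats, np.ones((len(feats), 1))])


def fit_linear_probe(feats, labels, num_classes, l2=1e-3):
    """Fit a linear map from frozen features to one-hot labels (ridge regression)."""
    x = _with_bias(feats)
    y = np.eye(num_classes)[labels]  # one-hot targets, shape (n, num_classes)
    gram = x.T @ x + l2 * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ y)


def probe_accuracy(weights, feats, labels):
    """Top-1 accuracy of the linear probe on held-out features."""
    preds = (_with_bias(feats) @ weights).argmax(axis=1)
    return float((preds == labels).mean())


# Synthetic, well-separated features stand in for frozen backbone outputs.
rng = np.random.default_rng(0)
labels = np.arange(300) % 3
feats = 5.0 * np.eye(3)[labels] + 0.1 * rng.normal(size=(300, 3))
w = fit_linear_probe(feats[:200], labels[:200], num_classes=3)
acc = probe_accuracy(w, feats[200:], labels[200:])
```

Since the backbone stays frozen, probe accuracy measures how linearly separable the learned representation is, which is what makes it comparable across the four variants.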

Artifacts

Checkpoints, figures, logs, and comparison.{md,json} live at https://huggingface.co/adipanda/lejepa.

Running

set -a && source .env && set +a          # loads WANDB_API_KEY and HF_TOKEN
uv sync
EPOCHS=30 BS=512 WORKERS=12 ./run_all.sh  # runs all 4 variants sequentially
uv run python -m tclejepa_src.evaluate    # produces comparison.{json,md} + figures