---
license: cc-by-nd-4.0
language:
- en
base_model:
- lakeAGI/PersonViTReID
---

# PersonViT — Re‑ID Ablation on PRW

This repository contains an evaluation and fine‑tuning pipeline for **PersonViT** (TransReID backbone) on the **PRW (Person Re‑Identification in the Wild)** dataset. The notebook is designed for Kaggle and includes:

- pretrained checkpoint evaluation
- a structured multi‑phase ablation study (strategy → loss → ViT‑Small)
- automatic on‑disk checkpointing of results after every run, to survive disconnects

---

## What the notebook does

The notebook first evaluates pretrained baselines, then runs a three‑phase ablation study:

1. **Pretrained evaluation (ViT‑Base)**
   Evaluates multiple pretrained PersonViT ViT‑Base checkpoints on PRW and selects the best baseline (highest mAP).

2. **Phase 1 — Strategy comparison (loss fixed = ArcFace)**
   Compares three fine‑tuning strategies:
   - **full**: unfreeze everything, very small LR
   - **partial**: freeze backbone, train head only
   - **freeze**: freeze backbone and reset head, retrain head

3. **Phase 2 — Loss comparison (strategy fixed = best from Phase 1)**
   With the best strategy fixed (full), compares metric learning losses:
   - Triplet (with hard mining)
   - ArcFace (reuses the Phase 1 ArcFace run for the winning strategy)
   - Angular (configured to avoid "loss = 0" issues from overly restrictive mining)

4. **Phase 3 — ViT‑Small (best strategy + best loss)**
   Fine‑tunes **ViT‑Small** using the best strategy + loss from Phases 1/2, then adds it to the final comparison.

Finally, it generates comparison plots and exports a summary CSV.
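The three strategies differ only in which parameter groups stay trainable. A minimal sketch of how they could be applied, assuming a model with `backbone` and `head` submodules (hypothetical names — the notebook's actual module layout may differ):

```python
import torch.nn as nn

def apply_strategy(model: nn.Module, strategy: str) -> None:
    """Set requires_grad per fine-tuning strategy (illustrative sketch)."""
    if strategy == "full":
        # Unfreeze everything; pair with a very small learning rate.
        for p in model.parameters():
            p.requires_grad = True
    elif strategy in ("partial", "freeze"):
        # Freeze the backbone; keep only the head trainable.
        for p in model.backbone.parameters():
            p.requires_grad = False
        for p in model.head.parameters():
            p.requires_grad = True
        if strategy == "freeze":
            # Additionally re-initialize the head so it is retrained from scratch.
            for m in model.head.modules():
                if isinstance(m, nn.Linear):
                    m.reset_parameters()
    else:
        raise ValueError(f"unknown strategy: {strategy}")
```

The number of parameters left with `requires_grad=True` is what the notebook records as `trainable_params` for each run.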
---

## Requirements

### Hardware

- Single GPU recommended (tested design target: **1× NVIDIA T4 16GB**)

### Software

- Python 3.x
- PyTorch 2.x
- Kaggle notebook environment (or equivalent)

### Python packages (installed by the notebook)

- `albumentations`
- `opencv-python-headless`
- `scipy`
- `torchmetrics`
- `timm`
- `einops`
- `yacs`
- `pytorch-metric-learning`
- `thop`

---

## Dataset and checkpoints

### PRW dataset

Configure the PRW root directory in the notebook `Config`:

- `cfg.dataset_root = '/kaggle/input/datasets/edoardomerli/prw-person-re-identification-in-the-wild'`

The notebook expects the standard PRW structure (frames, annotations, query_box, split mats, etc.).

### Pretrained checkpoints

The notebook evaluates multiple pretrained **ViT‑Base** checkpoints and uses the best one for fine‑tuning. It also contains a Phase 3 configuration for **ViT‑Small** (Market‑1501 pretrained). Make sure the checkpoint paths in `Config` match your Kaggle inputs.

---

## How to run (recommended order)

1. **Install & Imports**
2. **Config**
3. **Datasets & DataLoaders** (this step takes time because the PRW annotations are parsed)
4. **(Optional) Resume**: run `load_results()` to restore `RESULTS` from disk
5. Run the experiment blocks in order:
   - Pretrained evaluation
   - Phase 1 runs + Phase 1 plots (selects best strategy)
   - Phase 2 runs + Phase 2 plots (selects best loss)
   - Phase 3 (ViT‑Small)
   - Final comparison + CSV export

The fine‑tuning helper automatically **skips runs already present** in the saved results, so you can safely re-run cells after a disconnect.
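The resume-and-skip behavior amounts to a small persistence helper around the `RESULTS` dict and the JSON checkpoint file. A sketch under those assumptions (`run_if_missing` and the exact function bodies here are illustrative, not necessarily the notebook's implementation):

```python
import json
import os

RESULTS_PATH = "evaluation_results/all_results.json"
RESULTS: dict = {}

def save_results() -> None:
    """Write RESULTS to disk after every update (anti-disconnect)."""
    os.makedirs(os.path.dirname(RESULTS_PATH), exist_ok=True)
    with open(RESULTS_PATH, "w") as f:
        json.dump(RESULTS, f, indent=2)

def load_results() -> None:
    """Restore RESULTS from disk after a session restart."""
    if os.path.exists(RESULTS_PATH):
        with open(RESULTS_PATH) as f:
            RESULTS.update(json.load(f))

def run_if_missing(run_key: str, run_fn) -> dict:
    """Skip runs whose metrics were already saved, so cells are safely re-runnable."""
    if run_key not in RESULTS:
        RESULTS[run_key] = run_fn()  # expensive training/eval happens only once
        save_results()
    return RESULTS[run_key]
```

With this pattern, re-running an experiment cell after a disconnect is a no-op for every `run_key` already present in `all_results.json`.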
---

## Output files

### Progressive results checkpoint (anti-disconnect)

Saved every time `RESULTS` is updated:

- `evaluation_results/all_results.json`

### Final summary table

Saved at the end:

- `evaluation_results/all_results.csv`

### Plots

Saved into:

- `evaluation_results/plots/`

Typical outputs include:

- bar chart (mAP & Rank‑1)
- heatmap (mAP across strategy × loss, when applicable)
- radar chart with 4 metrics: **mAP, Rank‑1, Rank‑5, Rank‑10**
- per-run training curves:
  - `/kaggle/working/personvit_finetuning/curves_.png`

### Fine‑tuning checkpoints

Best checkpoint per run:

- `/kaggle/working/personvit_finetuning/best_.pth`

---

## Key concepts and run keys

### Fine‑tuning strategies

- `full`: unfreeze all parameters (low LR, avoids catastrophic forgetting)
- `partial`: freeze backbone, train only head
- `freeze`: freeze backbone, reset head modules, retrain head from scratch

### Losses

- `triplet`: TripletMarginLoss + hard mining
- `arcface`: ArcFace loss (with its own internal parameters)
- `angular`: Angular loss (configured to avoid empty mining → loss = 0)

### Run key naming

- Phase 1: `full_arcface`, `partial_arcface`, `freeze_arcface`
- Phase 2 (for best strategy `S`): `S_triplet`, `S_arcface` (reused), `S_angular`
- Phase 3: `vit_small__`

---

## What is saved in RESULTS

`RESULTS[run_key]` stores:

- Metrics: `mAP`, `rank1`, `rank5`, `rank10`, `num_valid_queries`
- Profiling: `total_params`, `flops_giga`, `inference_ms`, `throughput`
- Fine‑tuning only:
  - `history` (training curves data)
  - `trainable_params` (**number of trainable parameters**), to quantify how much of the model was actually fine‑tuned
- Metadata: `strategy`, `loss`, plus `display_name` for pretrained baselines

---

## AMP / Mixed precision note

The notebook uses native PyTorch AMP to speed up training and reduce VRAM usage on T4-class GPUs.
If you update PyTorch and see deprecation warnings, switch to the newer API:

- `torch.amp.GradScaler('cuda', ...)`
- `torch.amp.autocast(device_type='cuda', ...)`

If you disable AMP (`cfg.use_amp = False`), training runs in FP32 (more stable, but slower and with higher VRAM usage).

---

## Credits

- The PersonViT codebase and pretrained weights belong to their respective authors.
- This repository provides an ablation and fine‑tuning workflow tailored for PRW on a single GPU, with robust persistence.