---
license: cc-by-nd-4.0
language:
- en
base_model:
- lakeAGI/PersonViTReID
---
# PersonViT — Re‑ID Ablation on PRW
This repository contains an evaluation and fine‑tuning pipeline for **PersonViT** (TransReID backbone) on the **PRW (Person Re‑Identification in the Wild)** dataset.
The notebook is designed for Kaggle and includes:
- pretrained checkpoint evaluation
- a structured multi‑phase ablation study (strategy → loss → ViT‑Small)
- automatic on‑disk checkpointing of results after every run to survive disconnects
---
## What the notebook does
The notebook runs a pretrained baseline evaluation followed by a three‑phase ablation study:
1. **Pretrained evaluation (ViT‑Base)**
Evaluates multiple pretrained PersonViT ViT‑Base checkpoints on PRW and selects the best baseline (highest mAP).
2. **Phase 1 — Strategy comparison (loss fixed = ArcFace)**
Compares three fine‑tuning strategies:
- **full**: unfreeze everything, very small LR
- **partial**: freeze backbone, train head only
- **freeze**: freeze backbone and reset head, retrain head
3. **Phase 2 — Loss comparison (strategy fixed = best Phase 1)**
With the best strategy fixed (full), compares metric learning losses:
- Triplet (with hard mining)
- ArcFace (reuses the Phase 1 ArcFace run for the winning strategy)
- Angular (configured to avoid “loss = 0” issues from overly restrictive mining)
4. **Phase 3 — ViT‑Small (best strategy + best loss)**
Fine‑tunes **ViT‑Small** using the best strategy+loss from Phase 1/2, then adds it to the final comparison.
Finally, it generates comparison plots and exports a summary CSV.
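The "select the best" step between phases can be sketched as follows. All metric values below are made up; only the run keys and metric names come from this README, and `best_run` is a hypothetical helper, not the notebook's actual function:

```python
# Made-up metric values illustrating how the best Phase 1 run could be
# selected; only the run keys and metric names follow this README.
RESULTS = {
    "full_arcface":    {"mAP": 0.41, "rank1": 0.72},
    "partial_arcface": {"mAP": 0.35, "rank1": 0.66},
    "freeze_arcface":  {"mAP": 0.30, "rank1": 0.61},
}

def best_run(results, suffix):
    """Return the run key with the highest mAP among keys ending in `suffix`."""
    candidates = {k: v for k, v in results.items() if k.endswith(suffix)}
    return max(candidates, key=lambda k: candidates[k]["mAP"])

best_strategy = best_run(RESULTS, "_arcface").removesuffix("_arcface")
print(best_strategy)  # -> full (with these made-up numbers)
```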
---
## Requirements
### Hardware
- Single GPU recommended (designed and tested for **1× NVIDIA T4, 16 GB**)
### Software
- Python 3.x
- PyTorch 2.x
- Kaggle notebook environment (or equivalent)
### Python packages (installed by the notebook)
- `albumentations`
- `opencv-python-headless`
- `scipy`
- `torchmetrics`
- `timm`
- `einops`
- `yacs`
- `pytorch-metric-learning`
- `thop`
---
## Dataset and checkpoints
### PRW dataset
Configure the PRW root directory in the notebook `Config`:
- `cfg.dataset_root = '/kaggle/input/datasets/edoardomerli/prw-person-re-identification-in-the-wild'`
The notebook expects the standard PRW layout (`frames/`, `annotations/`, `query_box/`, the split `.mat` files, etc.).
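A quick sanity check on the dataset path can save a failed run later. In this sketch, only `dataset_root` is the documented setting; the directory names checked are the standard PRW entries mentioned above:

```python
from pathlib import Path

class Config:
    # The documented setting; adjust to match your attached Kaggle input.
    dataset_root = "/kaggle/input/datasets/edoardomerli/prw-person-re-identification-in-the-wild"

cfg = Config()

def missing_prw_parts(root):
    """List the expected top-level PRW entries absent under `root`."""
    expected = ["frames", "annotations", "query_box"]
    return [name for name in expected if not (Path(root) / name).exists()]

# Outside Kaggle this simply reports everything as missing.
print(missing_prw_parts(cfg.dataset_root))
```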
### Pretrained checkpoints
The notebook evaluates multiple pretrained **ViT‑Base** checkpoints and uses the best one for fine‑tuning.
It also contains a Phase 3 configuration for **ViT‑Small** (Market‑1501 pretrained).
Make sure the checkpoint paths in `Config` match your Kaggle inputs.
---
## How to run (recommended order)
1. **Install & Imports**
2. **Config**
3. **Datasets & DataLoaders** (this step takes time because PRW annotations are parsed)
4. **(Optional) Resume**: run `load_results()` to restore `RESULTS` from disk
5. Run the experiment blocks in order:
- Pretrained evaluation
- Phase 1 runs + Phase 1 plots (selects best strategy)
- Phase 2 runs + Phase 2 plots (selects best loss)
- Phase 3 (ViT‑Small)
- Final comparison + CSV export
The fine‑tuning helper automatically **skips runs already present** in the saved results, so you can safely re-run cells after a disconnect.
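The skip-and-checkpoint behaviour could be implemented roughly as below. `run_if_needed` is a hypothetical name standing in for the notebook's fine‑tuning helper; only the `evaluation_results/all_results.json` path comes from this README:

```python
import json
from pathlib import Path

RESULTS_PATH = Path("evaluation_results/all_results.json")

def load_results():
    """Restore RESULTS from the on-disk checkpoint, if one exists."""
    if RESULTS_PATH.exists():
        return json.loads(RESULTS_PATH.read_text())
    return {}

RESULTS = load_results()

def run_if_needed(run_key, train_fn, **kwargs):
    """Run `train_fn` only if `run_key` has no saved result, then checkpoint."""
    if run_key in RESULTS:
        print(f"[skip] {run_key} already completed")
        return RESULTS[run_key]
    RESULTS[run_key] = train_fn(**kwargs)
    RESULTS_PATH.parent.mkdir(parents=True, exist_ok=True)
    RESULTS_PATH.write_text(json.dumps(RESULTS, indent=2))  # survives disconnects
    return RESULTS[run_key]
```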
---
## Output files
### Progressive results checkpoint (anti-disconnect)
Saved every time `RESULTS` is updated:
- `evaluation_results/all_results.json`
### Final summary table
Saved at the end:
- `evaluation_results/all_results.csv`
### Plots
Saved into:
- `evaluation_results/plots/`
Typical outputs include:
- bar chart (mAP & Rank‑1)
- heatmap (mAP across strategy × loss when applicable)
- radar chart with 4 metrics: **mAP, Rank‑1, Rank‑5, Rank‑10**
- per-run training curves:
- `/kaggle/working/personvit_finetuning/curves_<run_key>.png`
### Fine‑tuning checkpoints
Best checkpoint per run:
- `/kaggle/working/personvit_finetuning/best_<run_key>.pth`
---
## Key concepts and run keys
### Fine‑tuning strategies
- `full`: unfreeze all parameters (low LR, avoids catastrophic forgetting)
- `partial`: freeze backbone, train only head
- `freeze`: freeze backbone, reset head modules, retrain head from scratch
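A minimal sketch of the three strategies, assuming the model exposes `backbone` and `head` submodules (the `TinyReID` stand-in below is purely illustrative, not the PersonViT architecture):

```python
import torch.nn as nn

def apply_strategy(model, strategy):
    """Hypothetical helper: set requires_grad per the strategies above."""
    for p in model.parameters():
        p.requires_grad = True          # "full": everything trainable
    if strategy in ("partial", "freeze"):
        for p in model.backbone.parameters():
            p.requires_grad = False     # backbone stays frozen
    if strategy == "freeze":
        for m in model.head.modules():  # re-init head so it retrains from scratch
            if isinstance(m, nn.Linear):
                nn.init.trunc_normal_(m.weight, std=0.02)
                nn.init.zeros_(m.bias)
    return model

class TinyReID(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)  # stand-in for the ViT backbone
        self.head = nn.Linear(8, 4)      # stand-in for the embedding head

model = apply_strategy(TinyReID(), "partial")
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # -> 36: only the 8x4 head weight (+4 bias) stays trainable
```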
### Losses
- `triplet`: TripletMarginLoss + hard mining
- `arcface`: ArcFace loss (with its own internal parameters)
- `angular`: Angular loss (configured to avoid empty mining → loss=0)
### Run key naming
- Phase 1:
- `full_arcface`, `partial_arcface`, `freeze_arcface`
- Phase 2 (for best strategy `S`):
- `S_triplet`, `S_arcface` (reused), `S_angular`
- Phase 3:
- `vit_small_<best_strategy>_<best_loss>`
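The naming scheme above can be expressed as a tiny helper (illustrative, not the notebook's code):

```python
def run_key(strategy, loss, model="vit_base"):
    """Build run keys as listed above; ViT-Base keys carry no model prefix."""
    key = f"{strategy}_{loss}"
    return key if model == "vit_base" else f"{model}_{key}"

print(run_key("full", "arcface"))                     # -> full_arcface
print(run_key("full", "triplet", model="vit_small"))  # -> vit_small_full_triplet
```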
---
## What is saved in RESULTS
`RESULTS[run_key]` stores:
- Metrics: `mAP`, `rank1`, `rank5`, `rank10`, `num_valid_queries`
- Profiling: `total_params`, `flops_giga`, `inference_ms`, `throughput`
- Fine‑tuning only:
- `history` (training curves data)
- `trainable_params`: number of trainable parameters, quantifying how much of the model was actually fine‑tuned
- Metadata: `strategy`, `loss`, plus `display_name` for pretrained baselines
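Put together, one fine‑tuned entry has roughly this shape. Every number below is invented; only the field names follow this README:

```python
# Illustrative RESULTS entry; all values are made up.
example_entry = {
    # Retrieval metrics
    "mAP": 0.412, "rank1": 0.731, "rank5": 0.884, "rank10": 0.921,
    "num_valid_queries": 2000,
    # Profiling
    "total_params": 86_000_000, "flops_giga": 17.6,
    "inference_ms": 9.4, "throughput": 106.0,
    # Fine-tuning only
    "history": {"epoch": [1, 2], "train_loss": [1.82, 1.10]},
    "trainable_params": 86_000_000,
    # Metadata
    "strategy": "full", "loss": "arcface",
}
RESULTS = {"full_arcface": example_entry}
```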
---
## AMP / Mixed precision note
The notebook uses native PyTorch AMP to speed up training and reduce VRAM usage on T4-class GPUs.
If you update PyTorch and see deprecation warnings, switch to the newer API:
- `torch.amp.GradScaler('cuda', ...)`
- `torch.amp.autocast(device_type='cuda', ...)`
If you disable AMP (`cfg.use_amp = False`), training will run in FP32 (more stable, slower, higher VRAM usage).
---
## Credits
- PersonViT codebase and pretrained weights belong to their respective authors.
- This repository provides an ablation and fine‑tuning workflow tailored for PRW on a single GPU with robust persistence.