---
license: cc-by-nd-4.0
language:
- en
base_model:
- lakeAGI/PersonViTReID
---
# PersonViT — Re‑ID Ablation on PRW
This repository contains an evaluation and fine‑tuning pipeline for PersonViT (TransReID backbone) on the PRW (Person Re‑Identification in the Wild) dataset.
The notebook is designed for Kaggle and includes:
- pretrained checkpoint evaluation
- a structured multi‑phase ablation study (strategy -> loss -> ViT‑Small)
- automatic on‑disk checkpointing of results after every run to survive disconnects
## What the notebook does
The notebook runs an ablation study in three phases:
### Pretrained evaluation (ViT‑Base)
Evaluates multiple pretrained PersonViT ViT‑Base checkpoints on PRW and selects the best baseline (highest mAP).

### Phase 1 — Strategy comparison (loss fixed = ArcFace)
Compares three fine‑tuning strategies:
- `full`: unfreeze everything, very small LR
- `partial`: freeze backbone, train the head only
- `freeze`: freeze backbone, reset the head, retrain the head
### Phase 2 — Loss comparison (strategy fixed = best Phase 1)
With the best strategy fixed (`full`), compares metric‑learning losses:
- Triplet (with hard mining)
- ArcFace (reuses the Phase 1 ArcFace run for the winning strategy)
- Angular (configured to avoid “loss = 0” issues caused by overly restrictive mining)
### Phase 3 — ViT‑Small (best strategy + best loss)
Fine‑tunes ViT‑Small using the best strategy and loss from Phases 1–2, then adds it to the final comparison.
Finally, it generates comparison plots and exports a summary CSV.
## Requirements

### Hardware
- Single GPU recommended (tested design target: 1× NVIDIA T4 16 GB)

### Software
- Python 3.x
- PyTorch 2.x
- Kaggle notebook environment (or equivalent)

### Python packages (installed by the notebook)
`albumentations`, `opencv-python-headless`, `scipy`, `torchmetrics`, `timm`, `einops`, `yacs`, `pytorch-metric-learning`, `thop`
## Dataset and checkpoints

### PRW dataset
Configure the PRW root directory in the notebook `Config`:

`cfg.dataset_root = '/kaggle/input/datasets/edoardomerli/prw-person-re-identification-in-the-wild'`
The notebook expects the standard PRW structure (frames, annotations, query_box, split mats, etc.).
### Pretrained checkpoints
The notebook evaluates multiple pretrained ViT‑Base checkpoints and uses the best one for fine‑tuning.
It also contains a Phase 3 configuration for ViT‑Small (Market‑1501 pretrained).
Make sure the checkpoint paths in Config match your Kaggle inputs.
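A minimal sketch of the relevant `Config` fields: `dataset_root` is taken from this README, while the checkpoint fields and the `<your-dataset>` paths are hypothetical placeholders you must replace with your actual Kaggle inputs.

```python
from dataclasses import dataclass, field

@dataclass
class Config:
    # Real path from this README (PRW root on Kaggle).
    dataset_root: str = '/kaggle/input/datasets/edoardomerli/prw-person-re-identification-in-the-wild'
    # Illustrative placeholders -- point these at your own Kaggle inputs.
    pretrained_checkpoints: dict = field(default_factory=lambda: {
        'vit_base_example': '/kaggle/input/<your-dataset>/<vit_base_checkpoint>.pth',
    })
    vit_small_checkpoint: str = '/kaggle/input/<your-dataset>/<vit_small_market1501>.pth'

cfg = Config()
```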
## How to run (recommended order)
- Install & Imports
- Config
- Datasets & DataLoaders (this step takes time because the PRW annotations are parsed)
- (Optional) Resume: run `load_results()` to restore `RESULTS` from disk
- Run the experiment blocks in order:
  - Pretrained evaluation
  - Phase 1 runs + Phase 1 plots (selects the best strategy)
  - Phase 2 runs + Phase 2 plots (selects the best loss)
  - Phase 3 (ViT‑Small)
  - Final comparison + CSV export
The fine‑tuning helper automatically skips runs already present in the saved results, so you can safely re-run cells after a disconnect.
## Output files

### Progressive results checkpoint (anti-disconnect)
Saved every time `RESULTS` is updated:

`evaluation_results/all_results.json`

### Final summary table
Saved at the end:

`evaluation_results/all_results.csv`

### Plots
Saved into:

`evaluation_results/plots/`

Typical outputs include:
- bar chart (mAP & Rank‑1)
- heatmap (mAP across strategy × loss, when applicable)
- radar chart with 4 metrics: mAP, Rank‑1, Rank‑5, Rank‑10
- per-run training curves: `/kaggle/working/personvit_finetuning/curves_<run_key>.png`

### Fine‑tuning checkpoints
Best checkpoint per run:

`/kaggle/working/personvit_finetuning/best_<run_key>.pth`
## Key concepts and run keys

### Fine‑tuning strategies
- `full`: unfreeze all parameters (low LR to avoid catastrophic forgetting)
- `partial`: freeze backbone, train only the head
- `freeze`: freeze backbone, reset the head modules, retrain the head from scratch

### Losses
- `triplet`: TripletMarginLoss + hard mining
- `arcface`: ArcFace loss (with its own internal parameters)
- `angular`: Angular loss (configured to avoid empty mining → loss = 0)
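The notebook uses pytorch-metric-learning for these losses. As a rough illustration of what “hard mining” means, here is a minimal batch-hard triplet loss in plain PyTorch; the function name and margin value are illustrative, not the notebook's actual configuration.

```python
import torch

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Triplet loss with batch-hard mining: for each anchor, pick the
    hardest positive (farthest same-ID sample) and hardest negative
    (closest other-ID sample) within the batch.
    Assumes every identity appears at least twice in the batch."""
    dist = torch.cdist(embeddings, embeddings)           # pairwise L2 distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # same-identity mask
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    pos_mask = same & ~eye
    neg_mask = ~same
    # Hardest positive: maximum distance among same-ID pairs.
    hardest_pos = (dist * pos_mask).max(dim=1).values
    # Hardest negative: minimum distance among other-ID pairs.
    hardest_neg = dist.masked_fill(~neg_mask, float('inf')).min(dim=1).values
    return torch.relu(hardest_pos - hardest_neg + margin).mean()
```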
### Run key naming
- Phase 1: `full_arcface`, `partial_arcface`, `freeze_arcface`
- Phase 2 (for best strategy `S`): `S_triplet`, `S_arcface` (reused), `S_angular`
- Phase 3: `vit_small_<best_strategy>_<best_loss>`
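A small helper reproducing this naming scheme, e.g. for looking entries up in `all_results.json`; the function itself is illustrative and the notebook may build its keys differently.

```python
def make_run_key(strategy, loss, backbone='vit_base'):
    """Phase 1/2 ViT-Base runs use '<strategy>_<loss>';
    Phase 3 prefixes the backbone name (e.g. 'vit_small')."""
    if backbone == 'vit_base':
        return f'{strategy}_{loss}'
    return f'{backbone}_{strategy}_{loss}'
```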
## What is saved in RESULTS

`RESULTS[run_key]` stores:
- Metrics: `mAP`, `rank1`, `rank5`, `rank10`, `num_valid_queries`
- Profiling: `total_params`, `flops_giga`, `inference_ms`, `throughput`
- Fine‑tuning only: `history` (training‑curve data) and `trainable_params` (the number of trainable parameters, to quantify how much of the model was actually fine‑tuned)
- Metadata: `strategy`, `loss`, plus `display_name` for pretrained baselines
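For scripts that consume `all_results.json`, a fine‑tuning entry has roughly this shape. All values here are dummy placeholders, not real measurements, and the inner structure of `history` is a guess.

```python
# Dummy placeholder values for illustration only -- not real results.
example_entry = {
    # Metrics
    'mAP': 0.0, 'rank1': 0.0, 'rank5': 0.0, 'rank10': 0.0,
    'num_valid_queries': 0,
    # Profiling
    'total_params': 0, 'flops_giga': 0.0, 'inference_ms': 0.0, 'throughput': 0.0,
    # Fine-tuning only ('history' structure is a guess)
    'history': {'train_loss': []}, 'trainable_params': 0,
    # Metadata
    'strategy': 'full', 'loss': 'arcface',
}
```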
## AMP / Mixed precision note
The notebook uses native PyTorch AMP to speed up training and reduce VRAM usage on T4-class GPUs. If you update PyTorch and see deprecation warnings, switch to the newer API: `torch.amp.GradScaler('cuda', ...)` and `torch.amp.autocast(device_type='cuda', ...)`.
If you disable AMP (`cfg.use_amp = False`), training runs in FP32 (more stable, but slower and with higher VRAM usage).
## Credits
- PersonViT codebase and pretrained weights belong to their respective authors.
- This repository provides an ablation and fine‑tuning workflow tailored for PRW on a single GPU with robust persistence.