---
license: cc-by-nd-4.0
language:
- en
base_model:
- lakeAGI/PersonViTReID
---
# PersonViT — Re‑ID Ablation on PRW
This repository contains an evaluation and fine‑tuning pipeline for **PersonViT** (TransReID backbone) on the **PRW (Person Re‑Identification in the Wild)** dataset.
The notebook is designed for Kaggle and includes:
- pretrained checkpoint evaluation
- a structured multi‑phase ablation study (strategy → loss → ViT‑Small)
- automatic on‑disk checkpointing of results after every run to survive disconnects
---
## What the notebook does
The notebook runs a pretrained baseline evaluation followed by a three‑phase ablation study:
1. **Pretrained evaluation (ViT‑Base)**
Evaluates multiple pretrained PersonViT ViT‑Base checkpoints on PRW and selects the best baseline (highest mAP).
2. **Phase 1 — Strategy comparison (loss fixed = ArcFace)**
Compares three fine‑tuning strategies:
- **full**: unfreeze everything, very small LR
- **partial**: freeze backbone, train head only
- **freeze**: freeze backbone and reset head, retrain head
3. **Phase 2 — Loss comparison (strategy fixed = best Phase 1)**
With the best strategy fixed (full), compares metric learning losses:
- Triplet (with hard mining)
- ArcFace (reuses the Phase 1 ArcFace run for the winning strategy)
- Angular (configured to avoid “loss = 0” issues from overly restrictive mining)
4. **Phase 3 — ViT‑Small (best strategy + best loss)**
Fine‑tunes **ViT‑Small** using the best strategy+loss from Phase 1/2, then adds it to the final comparison.
Finally, it generates comparison plots and exports a summary CSV.
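The "select the best" step between phases can be sketched as follows. All metric values below are made up; only the run keys and metric names come from this README, and `best_run` is a hypothetical helper, not the notebook's actual function:

```python
# Made-up metric values illustrating how the best Phase 1 run could be
# selected; only the run keys and metric names follow this README.
RESULTS = {
    "full_arcface":    {"mAP": 0.41, "rank1": 0.72},
    "partial_arcface": {"mAP": 0.35, "rank1": 0.66},
    "freeze_arcface":  {"mAP": 0.30, "rank1": 0.61},
}

def best_run(results, suffix):
    """Return the run key with the highest mAP among keys ending in `suffix`."""
    candidates = {k: v for k, v in results.items() if k.endswith(suffix)}
    return max(candidates, key=lambda k: candidates[k]["mAP"])

best_strategy = best_run(RESULTS, "_arcface").removesuffix("_arcface")
print(best_strategy)  # -> full (with these made-up numbers)
```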
---
## Requirements
### Hardware
- Single GPU recommended (designed and tested for **1× NVIDIA T4, 16 GB**)
### Software
- Python 3.x
- PyTorch 2.x
- Kaggle notebook environment (or equivalent)
### Python packages (installed by the notebook)
- `albumentations`
- `opencv-python-headless`
- `scipy`
- `torchmetrics`
- `timm`
- `einops`
- `yacs`
- `pytorch-metric-learning`
- `thop`
---
## Dataset and checkpoints
### PRW dataset
Configure the PRW root directory in the notebook `Config`:
- `cfg.dataset_root = '/kaggle/input/datasets/edoardomerli/prw-person-re-identification-in-the-wild'`
The notebook expects the standard PRW layout (`frames/`, `annotations/`, `query_box/`, the split `.mat` files, etc.).
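A quick sanity check on the dataset path can save a failed run later. In this sketch, only `dataset_root` is the documented setting; the directory names checked are the standard PRW entries mentioned above:

```python
from pathlib import Path

class Config:
    # The documented setting; adjust to match your attached Kaggle input.
    dataset_root = "/kaggle/input/datasets/edoardomerli/prw-person-re-identification-in-the-wild"

cfg = Config()

def missing_prw_parts(root):
    """List the expected top-level PRW entries absent under `root`."""
    expected = ["frames", "annotations", "query_box"]
    return [name for name in expected if not (Path(root) / name).exists()]

# Outside Kaggle this simply reports everything as missing.
print(missing_prw_parts(cfg.dataset_root))
```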
### Pretrained checkpoints
The notebook evaluates multiple pretrained **ViT‑Base** checkpoints and uses the best one for fine‑tuning.
It also contains a Phase 3 configuration for **ViT‑Small** (Market‑1501 pretrained).
Make sure the checkpoint paths in `Config` match your Kaggle inputs.
---
## How to run (recommended order)
1. **Install & Imports**
2. **Config**
3. **Datasets & DataLoaders** (this step takes time because PRW annotations are parsed)
4. **(Optional) Resume**: run `load_results()` to restore `RESULTS` from disk
5. Run the experiment blocks in order:
- Pretrained evaluation
- Phase 1 runs + Phase 1 plots (selects best strategy)
- Phase 2 runs + Phase 2 plots (selects best loss)
- Phase 3 (ViT‑Small)
- Final comparison + CSV export
The fine‑tuning helper automatically **skips runs already present** in the saved results, so you can safely re-run cells after a disconnect.
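The skip-and-checkpoint behaviour could be implemented roughly as below. `run_if_needed` is a hypothetical name standing in for the notebook's fine‑tuning helper; only the `evaluation_results/all_results.json` path comes from this README:

```python
import json
from pathlib import Path

RESULTS_PATH = Path("evaluation_results/all_results.json")

def load_results():
    """Restore RESULTS from the on-disk checkpoint, if one exists."""
    if RESULTS_PATH.exists():
        return json.loads(RESULTS_PATH.read_text())
    return {}

RESULTS = load_results()

def run_if_needed(run_key, train_fn, **kwargs):
    """Run `train_fn` only if `run_key` has no saved result, then checkpoint."""
    if run_key in RESULTS:
        print(f"[skip] {run_key} already completed")
        return RESULTS[run_key]
    RESULTS[run_key] = train_fn(**kwargs)
    RESULTS_PATH.parent.mkdir(parents=True, exist_ok=True)
    RESULTS_PATH.write_text(json.dumps(RESULTS, indent=2))  # survives disconnects
    return RESULTS[run_key]
```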
---
## Output files
### Progressive results checkpoint (anti-disconnect)
Saved every time `RESULTS` is updated:
- `evaluation_results/all_results.json`
### Final summary table
Saved at the end:
- `evaluation_results/all_results.csv`
### Plots
Saved into:
- `evaluation_results/plots/`
Typical outputs include:
- bar chart (mAP & Rank‑1)
- heatmap (mAP across strategy × loss when applicable)
- radar chart with 4 metrics: **mAP, Rank‑1, Rank‑5, Rank‑10**
- per-run training curves:
- `/kaggle/working/personvit_finetuning/curves_<run_key>.png`
### Fine‑tuning checkpoints
Best checkpoint per run:
- `/kaggle/working/personvit_finetuning/best_<run_key>.pth`
---
## Key concepts and run keys
### Fine‑tuning strategies
- `full`: unfreeze all parameters (low LR, avoids catastrophic forgetting)
- `partial`: freeze backbone, train only head
- `freeze`: freeze backbone, reset head modules, retrain head from scratch
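A minimal sketch of the three strategies, assuming the model exposes `backbone` and `head` submodules (the `TinyReID` stand-in below is purely illustrative, not the PersonViT architecture):

```python
import torch.nn as nn

def apply_strategy(model, strategy):
    """Hypothetical helper: set requires_grad per the strategies above."""
    for p in model.parameters():
        p.requires_grad = True          # "full": everything trainable
    if strategy in ("partial", "freeze"):
        for p in model.backbone.parameters():
            p.requires_grad = False     # backbone stays frozen
    if strategy == "freeze":
        for m in model.head.modules():  # re-init head so it retrains from scratch
            if isinstance(m, nn.Linear):
                nn.init.trunc_normal_(m.weight, std=0.02)
                nn.init.zeros_(m.bias)
    return model

class TinyReID(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)  # stand-in for the ViT backbone
        self.head = nn.Linear(8, 4)      # stand-in for the embedding head

model = apply_strategy(TinyReID(), "partial")
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # -> 36: only the 8x4 head weight (+4 bias) stays trainable
```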
### Losses
- `triplet`: TripletMarginLoss + hard mining
- `arcface`: ArcFace loss (with its own internal parameters)
- `angular`: Angular loss (configured to avoid empty mining → loss=0)
### Run key naming
- Phase 1:
- `full_arcface`, `partial_arcface`, `freeze_arcface`
- Phase 2 (for best strategy `S`):
- `S_triplet`, `S_arcface` (reused), `S_angular`
- Phase 3:
- `vit_small_<best_strategy>_<best_loss>`
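The naming scheme above can be expressed as a tiny helper (illustrative, not the notebook's code):

```python
def run_key(strategy, loss, model="vit_base"):
    """Build run keys as listed above; ViT-Base keys carry no model prefix."""
    key = f"{strategy}_{loss}"
    return key if model == "vit_base" else f"{model}_{key}"

print(run_key("full", "arcface"))                     # -> full_arcface
print(run_key("full", "triplet", model="vit_small"))  # -> vit_small_full_triplet
```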
---
## What is saved in RESULTS
`RESULTS[run_key]` stores:
- Metrics: `mAP`, `rank1`, `rank5`, `rank10`, `num_valid_queries`
- Profiling: `total_params`, `flops_giga`, `inference_ms`, `throughput`
- Fine‑tuning only:
- `history` (training curves data)
- `trainable_params`: number of trainable parameters, quantifying how much of the model was actually fine‑tuned
- Metadata: `strategy`, `loss`, plus `display_name` for pretrained baselines
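Put together, one fine‑tuned entry has roughly this shape. Every number below is invented; only the field names follow this README:

```python
# Illustrative RESULTS entry; all values are made up.
example_entry = {
    # Retrieval metrics
    "mAP": 0.412, "rank1": 0.731, "rank5": 0.884, "rank10": 0.921,
    "num_valid_queries": 2000,
    # Profiling
    "total_params": 86_000_000, "flops_giga": 17.6,
    "inference_ms": 9.4, "throughput": 106.0,
    # Fine-tuning only
    "history": {"epoch": [1, 2], "train_loss": [1.82, 1.10]},
    "trainable_params": 86_000_000,
    # Metadata
    "strategy": "full", "loss": "arcface",
}
RESULTS = {"full_arcface": example_entry}
```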
---
## AMP / Mixed precision note
The notebook uses native PyTorch AMP to speed up training and reduce VRAM usage on T4-class GPUs.
If you update PyTorch and see deprecation warnings, switch to the newer API:
- `torch.amp.GradScaler('cuda', ...)`
- `torch.amp.autocast(device_type='cuda', ...)`
If you disable AMP (`cfg.use_amp = False`), training will run in FP32 (more stable, slower, higher VRAM usage).
---
## Credits
- PersonViT codebase and pretrained weights belong to their respective authors.
- This repository provides an ablation and fine‑tuning workflow tailored for PRW on a single GPU with robust persistence.