
	---
	license: cc-by-nc-4.0
	---

	This repository contains the model weights and configuration files for the [Sphere Encoder](https://github.com) project.

	> [!Note]
	> These model weights have been reproduced with the released code and yield slightly different evaluation results compared to those reported in the original paper.

	# Model Card

	| dataset | 🤗 hf model repo | params |
	|:--:|:--:|:--:|
	| Animal-Faces | [`sphere-l-af`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-af) | 642M |
	| Oxford-Flowers | [`sphere-l-of`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-of) | 948M |
	| ImageNet | [`sphere-l-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-imagenet) | 950M |
	| ImageNet | [`sphere-xl-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-xl-imagenet) | 1.3B |

	Download model checkpoints and put them in `./workspace/experiments`.
	The directory tree should look like this:

	```bash
	./workspace/experiments/
	├── sphere-l-af
	│   ├── ckpt/ep0999.pth
	│   └── config.json
	├── sphere-l-of
	├── sphere-l-imagenet
	└── sphere-xl-imagenet
	```
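
One way to fetch the checkpoints, assuming the `huggingface_hub` command-line tool is installed, is `huggingface-cli download kaiyuyue/sphere-encoder-models --local-dir ./workspace/experiments`. The sketch below is a hypothetical helper (the name `check_layout` is not part of the project) that compares the downloaded tree with the layout above. It assumes every model folder mirrors `sphere-l-af`, with a `config.json` and at least one `.pth` file under `ckpt/`; the tree only shows the contents of `sphere-l-af`, so the check on the other folders is an assumption.

```shell
# Hypothetical helper: report which expected files are missing.
# Assumption: all four model folders share the sphere-l-af layout.
check_layout() {
  local root="${1:-./workspace/experiments}"
  local status=0
  local model
  for model in sphere-l-af sphere-l-of sphere-l-imagenet sphere-xl-imagenet; do
    if [ ! -f "$root/$model/config.json" ]; then
      echo "missing: $root/$model/config.json"
      status=1
    fi
    # ls exits non-zero when the glob matches no checkpoint file.
    if ! ls "$root/$model"/ckpt/*.pth >/dev/null 2>&1; then
      echo "missing: $root/$model/ckpt/*.pth"
      status=1
    fi
  done
  return "$status"
}
```

Called without an argument, `check_layout` inspects `./workspace/experiments`, prints one line per missing file, and returns a non-zero status if anything is missing.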

	<br>

	# Evaluation Results

	Evaluate ImageNet models with `CFG = 1.4`:

	```bash
	# --job_dir can be
	# sphere-l-imagenet, or sphere-xl-imagenet

	./run.sh eval.py \
	--job_dir sphere-xl-imagenet \
	--forward_steps 1 4 \
	--report_fid rfid gfid \
	--use_cfg True \
	--cfg_min 1.4 \
	--cfg_max 1.4 \
	--cfg_position combo \
	--rm_folder_after_eval True
	```

	The evaluation results will be saved in `./workspace/experiments/sphere-xl-imagenet/eval/`:

	| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
	|:--:|:--|:--:|:--:|:--:|:--:|
	| ImageNet 256x256 | Sphere-L | 1 | 0.62 | 15.69 | 274.5 |
	| | Sphere-L | 4 | - | 4.78 | 259.1 |
	| | Sphere-XL | 1 | 0.62 | 14.52 | 299.3 |
	| | Sphere-XL | 4 | - | 4.05 | 266.0 |
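
Since `--job_dir` accepts either ImageNet model, both can be evaluated in one pass. The following is a minimal sketch that reuses the flags from the command above; the function name `eval_imagenet_models` and the `DRY_RUN` switch are illustrative assumptions, not part of `run.sh` or `eval.py`.

```shell
# Evaluate each released ImageNet checkpoint in turn.
# DRY_RUN is a hypothetical convenience switch: when it is non-empty,
# the commands are printed instead of executed.
eval_imagenet_models() {
  local job
  for job in sphere-l-imagenet sphere-xl-imagenet; do
    ${DRY_RUN:+echo} ./run.sh eval.py \
      --job_dir "$job" \
      --forward_steps 1 4 \
      --report_fid rfid gfid \
      --use_cfg True \
      --cfg_min 1.4 \
      --cfg_max 1.4 \
      --cfg_position combo \
      --rm_folder_after_eval True
  done
}
```

Running `DRY_RUN=1 eval_imagenet_models` prints the two commands for inspection; without `DRY_RUN`, each evaluation runs and should write its results under the matching `eval/` folder.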

	Evaluate the unconditional Animal-Faces model:

	```bash
	./run.sh eval.py \
	--job_dir sphere-l-af \
	--forward_steps 1 4 \
	--report_fid gfid \
	--rm_folder_after_eval True
	```

	| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
	|:--:|:--|:--:|:--:|:--:|:--:|
	| Animal-Faces 256x256 | Sphere-L | 1 | - | 21.56 | 8.3 |
	| | Sphere-L | 4 | - | 18.73 | 9.8 |

	Evaluate the Oxford-Flowers model with `CFG = 1.6`:

	```bash
	./run.sh eval.py \
	--job_dir sphere-l-of \
	--forward_steps 1 4 \
	--report_fid gfid \
	--use_cfg True \
	--cfg_min 1.6 \
	--cfg_max 1.6 \
	--cfg_position combo \
	--num_eval_samples 51000 \
	--rm_folder_after_eval True \
	--cache_sampling_noise False
	```

	`--num_eval_samples 51000` is chosen for the 102 classes so that each class has 500 samples (102 × 500 = 51000) when evaluating on 8 GPUs.
	Adjust it accordingly if you have a different number of GPUs or want to evaluate on a different number of samples.
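
The arithmetic behind that value can be sketched as follows. Both function names are hypothetical, and the even split across GPUs is an assumption about how the evaluation distributes work, not something confirmed by the project.

```shell
# Hypothetical helper: total sample count for a class-balanced evaluation.
num_eval_samples() {
  local num_classes="$1" samples_per_class="$2"
  echo $(( num_classes * samples_per_class ))
}

# Hypothetical helper: samples handled by each GPU, assuming an even split.
samples_per_gpu() {
  local total="$1" num_gpus="$2"
  if (( total % num_gpus != 0 )); then
    echo "error: $total samples do not split evenly across $num_gpus GPUs" >&2
    return 1
  fi
  echo $(( total / num_gpus ))
}

total=$(num_eval_samples 102 500)   # 102 classes x 500 samples = 51000
samples_per_gpu "$total" 8          # prints 6375
```

With a different GPU count, pick a per-class sample count whose total passes the `samples_per_gpu` check, then pass that total to `--num_eval_samples`.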

	| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
	|:--:|:--|:--:|:--:|:--:|:--:|
	| Oxford-Flowers 256x256 | Sphere-L | 1 | - | 25.10 | 3.4 |
	| | Sphere-L | 4 | - | 11.27 | 3.2 |