This repository contains the model weights and configuration files for the Sphere Encoder project.

These model weights have been reproduced with the released code and yield slightly different evaluation results compared to those reported in the original paper.

Model Card

| dataset | 🤗 hf model repo | params |
|---|---|---|
| Animal-Faces | sphere-l-af | 642M |
| Oxford-Flowers | sphere-l-of | 948M |
| ImageNet | sphere-l-imagenet | 950M |
| ImageNet | sphere-xl-imagenet | 1.3B |

Download model checkpoints and put them in ./workspace/experiments. The directory tree should look like this:

./workspace/experiments/
├── sphere-l-af
│   ├── ckpt/ep0999.pth
│   └── config.json
├── sphere-l-of
├── sphere-l-imagenet
└── sphere-xl-imagenet
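Before launching an evaluation, it can help to confirm that the downloaded files match this layout. The following is a minimal sketch, not part of the released code: `find_missing_files` is a hypothetical helper, and it assumes that every model folder contains the same two files shown for sphere-l-af (`ckpt/ep0999.pth` and `config.json`), which the tree above does not state explicitly for the other models.

```python
from pathlib import Path

# Model folder names taken from the directory tree above.
EXPECTED_MODELS = [
    "sphere-l-af",
    "sphere-l-of",
    "sphere-l-imagenet",
    "sphere-xl-imagenet",
]

# Assumption: every model folder mirrors sphere-l-af, the only folder
# whose contents the tree spells out.
EXPECTED_FILES = ["ckpt/ep0999.pth", "config.json"]


def find_missing_files(root, models=EXPECTED_MODELS, files=EXPECTED_FILES):
    """Return the expected paths that do not exist as files under `root`."""
    root = Path(root)
    return [
        root / model / rel
        for model in models
        for rel in files
        if not (root / model / rel).is_file()
    ]


if __name__ == "__main__":
    missing = find_missing_files("./workspace/experiments")
    for path in missing:
        print(f"missing: {path}")
    if not missing:
        print("all expected files found")
```

If you only downloaded some of the checkpoints, pass a shorter list via `models=` so the others are not reported as missing.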

Evaluation Results

Evaluate ImageNet models with CFG = 1.4:

# --job_dir can be
#   sphere-l-imagenet, or sphere-xl-imagenet

./run.sh eval.py \
  --job_dir sphere-xl-imagenet \
  --forward_steps 1 4 \
  --report_fid rfid gfid \
  --use_cfg True \
  --cfg_min 1.4 \
  --cfg_max 1.4 \
  --cfg_position combo \
  --rm_folder_after_eval True

The evaluation results will be saved in ./workspace/experiments/sphere-xl-imagenet/eval/:

| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|---|---|---|---|---|---|
| ImageNet 256x256 | Sphere-L | 1 | 0.62 | 15.69 | 274.5 |
| | Sphere-L | 4 | - | 4.78 | 259.1 |
| | Sphere-XL | 1 | 0.62 | 14.52 | 299.3 |
| | Sphere-XL | 4 | - | 4.05 | 266.0 |
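To evaluate both ImageNet checkpoints with identical settings, the command above can be assembled once per `--job_dir`. This is only a sketch: `build_eval_command` is a hypothetical helper, it uses only the flags shown in the command above, and it prints the commands rather than running them.

```python
import shlex

# Flags copied from the ImageNet evaluation command above.
IMAGENET_EVAL_FLAGS = [
    "--forward_steps", "1", "4",
    "--report_fid", "rfid", "gfid",
    "--use_cfg", "True",
    "--cfg_min", "1.4",
    "--cfg_max", "1.4",
    "--cfg_position", "combo",
    "--rm_folder_after_eval", "True",
]


def build_eval_command(job_dir):
    """Return the argument list for evaluating one ImageNet checkpoint."""
    return ["./run.sh", "eval.py", "--job_dir", job_dir, *IMAGENET_EVAL_FLAGS]


for job_dir in ("sphere-l-imagenet", "sphere-xl-imagenet"):
    # Printed, not executed. To launch the evaluation, pass the list to
    # subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(build_eval_command(job_dir)))
```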

Evaluate the unconditional Animal-Faces model:

./run.sh eval.py \
  --job_dir sphere-l-af \
  --forward_steps 1 4 \
  --report_fid gfid \
  --rm_folder_after_eval True

| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|---|---|---|---|---|---|
| Animal-Faces 256x256 | Sphere-L | 1 | - | 21.56 | 8.3 |
| | Sphere-L | 4 | - | 18.73 | 9.8 |

Evaluate the Oxford-Flowers model with CFG = 1.6:

./run.sh eval.py \
  --job_dir sphere-l-of \
  --forward_steps 1 4 \
  --report_fid gfid \
  --use_cfg True \
  --cfg_min 1.6 \
  --cfg_max 1.6 \
  --cfg_position combo \
  --num_eval_samples 51000 \
  --rm_folder_after_eval True \
  --cache_sampling_noise False

--num_eval_samples is set to 51000 so that each of the 102 classes gets 500 samples (102 × 500 = 51,000) when evaluating on 8 GPUs. Adjust it if you use a different number of GPUs or want to evaluate on a different number of samples.
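The arithmetic behind that value can be sketched as follows. `eval_sample_count` is a hypothetical helper, not part of the released code, and the even-split check is an assumption: the note above implies the total should divide across GPUs, but does not say how eval.py handles a remainder.

```python
def eval_sample_count(num_classes, samples_per_class, num_gpus):
    """Return (total samples, samples per GPU) for a class-balanced evaluation."""
    total = num_classes * samples_per_class
    # Assumption: the total should split evenly across GPUs. The README
    # implies this but does not describe how eval.py treats a remainder.
    if total % num_gpus != 0:
        raise ValueError(
            f"{total} samples do not divide evenly across {num_gpus} GPUs"
        )
    return total, total // num_gpus


total, per_gpu = eval_sample_count(num_classes=102, samples_per_class=500, num_gpus=8)
print(total, per_gpu)  # 51000 6375
```

With a different GPU count, pick a per-class sample count whose total still divides evenly, then pass that total to `--num_eval_samples`.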

| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|---|---|---|---|---|---|
| Oxford-Flowers 256x256 | Sphere-L | 1 | - | 25.10 | 3.4 |
| | Sphere-L | 4 | - | 11.27 | 3.2 |