	---
	library_name: lucid
	license: apache-2.0
	tags:
	- image-classification
	- cvt
	- lucid
	datasets:
	- imagenet-1k
	pipeline_tag: image-classification
	model-index:
	- name: cvt-21
	  results:
	  - task: { type: image-classification }
	    dataset: { name: ImageNet-1k, type: imagenet-1k }
	    metrics:
	    - { type: acc@1, value: 82.5 }
	---

	# CvT-21

	> Wu et al., 2021 — CvT: Introducing Convolutions to Vision Transformers (arXiv:2103.15808)

	[Lucid](https://github.com/ChanLumerico/lucid) port of `transformers/microsoft/cvt-21`,
	converted to Lucid-native safetensors.

	## Available weights

	| Tag | acc@1 (%) | acc@5 (%) | Params | GFLOPs | Size | Source |
	|---|---|---|---|---|---|---|
	| `IN1K` (default) | 82.5 | — | 31.6M | — | 120.87 MB | transformers |

	## Usage

	```python
	import lucid.models as models
	from lucid.models.weights import CvT21Weights

	# default tag
	model = models.cvt_21_cls(pretrained=True)

	# explicit tag (enum or string)
	model = models.cvt_21_cls(weights=CvT21Weights.IN1K)
	model = models.cvt_21_cls(pretrained="IN1K")

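	# preprocessing travels with the weights
	# (`image` is not defined in this snippet; load your own input image first)
	weights = CvT21Weights.IN1K
	preprocess = weights.transforms()
	logits = model(preprocess(image)[None]).logits  # [None] adds a batch dimension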
	```
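
To turn the logits into ranked predictions, one option is a softmax followed by a top-k selection. The sketch below is a NumPy-only illustration: `top_k_predictions` is a hypothetical helper, not part of Lucid, and it assumes the logits are already available as a NumPy array of shape `(batch, num_classes)`. Random stand-in logits are used so the snippet runs without the model.

```python
import numpy as np


def top_k_predictions(logits: np.ndarray, k: int = 5):
    """Return (indices, probabilities) of the k most likely classes per row.

    Hypothetical helper for illustration only; not part of Lucid.
    """
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Sort class indices by descending probability and keep the first k.
    top_idx = np.argsort(-probs, axis=-1)[:, :k]
    top_prob = np.take_along_axis(probs, top_idx, axis=-1)
    return top_idx, top_prob


# Stand-in logits for one image over the 1000 ImageNet-1k classes.
rng = np.random.default_rng(0)
fake_logits = rng.normal(size=(1, 1000))
indices, probabilities = top_k_predictions(fake_logits, k=5)
```

The returned indices are positions in the ImageNet-1k label list; mapping them to class names requires a label file, which this card does not specify.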

	## Conversion

	Converted from `transformers/microsoft/cvt-21` via
	`python -m tools.convert_weights cvt_21 --tag IN1K`.
	The key mapping and numerical parity were verified against the source.
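
A numerical parity check of this kind can be sketched as a comparison of the two models' outputs on the same input. The example below is only an illustration of the idea, not the check used by `tools.convert_weights`: `check_parity` is a hypothetical helper, the `1e-5` tolerance is an assumed value, and random arrays stand in for the source and converted logits.

```python
import numpy as np


def check_parity(reference: np.ndarray, converted: np.ndarray, atol: float = 1e-5) -> float:
    """Return the max absolute difference; raise if it exceeds `atol`.

    Hypothetical helper; the tolerance is an assumption, not a documented value.
    """
    if reference.shape != converted.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {converted.shape}")
    max_diff = float(np.max(np.abs(reference - converted)))
    if max_diff > atol:
        raise AssertionError(f"max abs diff {max_diff:.3e} exceeds atol {atol:.1e}")
    return max_diff


# Stand-in outputs: in practice these would be logits from the source model
# and from the converted model, computed on the same preprocessed batch.
rng = np.random.default_rng(0)
reference_logits = rng.normal(size=(2, 1000)).astype(np.float32)
converted_logits = reference_logits + np.float32(1e-7)  # simulated rounding noise
max_diff = check_parity(reference_logits, converted_logits)
```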

	## License

	`apache-2.0` — inherited from the original weights.

	## Citation

	```
	@inproceedings{wu2021cvt,
	  title={CvT: Introducing Convolutions to Vision Transformers},
	  author={Wu, Haiping and Xiao, Bin and Codella, Noel and Liu, Mengchen and Dai, Xiyang and Yuan, Lu and Zhang, Lei},
	  booktitle={ICCV},
	  year={2021}
	}
	```