---
library_name: keras-hub
---
### Model Overview
Vision Transformer (ViT) model trained using the DINOv2 method.

**Reference**

- [Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193)
- [Vision Transformers Need Registers](https://arxiv.org/abs/2309.16588)

DINOv2 offers a powerful, generalist visual backbone learned entirely from
unlabeled images, as described in
[DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193).

## Links

* DINOv2 Quickstart Notebook (coming soon)
* DINOv2 API Documentation (coming soon)
* DINOv2 Beginner Guide (coming soon)
* [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)

## Installation

Keras and KerasHub can be installed with:

```
pip install -U -q keras-hub
pip install -U -q keras
```

JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the [Keras Getting Started](https://keras.io/getting_started/) page.

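
Keras reads the `KERAS_BACKEND` environment variable when it is first imported, so the backend has to be chosen before `import keras`. The snippet below is a minimal sketch of that step. The helper name `select_backend` and the list of accepted names are illustrative assumptions limited to the backends mentioned in this card; they are not part of Keras.

```python
import os

# Backends named in this card; Keras itself may accept others.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_backend(name="jax"):
    """Set KERAS_BACKEND. Call this before the first `import keras`."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("jax")

# After this point, importing Keras should pick up the selected backend:
# import keras
# print(keras.config.backend())
```

Changing the variable after Keras has already been imported has no effect on the running process, so in a notebook this cell should come first.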
## Presets

The following model checkpoints are provided by the Keras team. Weights have been ported from [Hugging Face](https://huggingface.co). Full code examples for each are available below.

| Preset name                 | Parameters | Description                                                                   |
|-----------------------------|------------|-------------------------------------------------------------------------------|
| dinov2_small                | 22.58M     | Vision Transformer (small-sized model) trained using DINOv2.                  |
| dinov2_base                 | 87.63M     | Vision Transformer (base-sized model) trained using DINOv2.                   |
| dinov2_large                | 305.77M    | Vision Transformer (large-sized model) trained using DINOv2.                  |
| dinov2_giant                | 1.13B      | Vision Transformer (giant-sized model) trained using DINOv2.                  |
| dinov2_with_registers_small | 22.58M     | Vision Transformer (small-sized model) trained using DINOv2, with registers.  |
| dinov2_with_registers_base  | 87.63M     | Vision Transformer (base-sized model) trained using DINOv2, with registers.   |
| dinov2_with_registers_large | 305.77M    | Vision Transformer (large-sized model) trained using DINOv2, with registers.  |
| dinov2_with_registers_giant | 1.13B      | Vision Transformer (giant-sized model) trained using DINOv2, with registers.  |

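
The preset names in the table follow a regular pattern: `dinov2_<size>` or `dinov2_with_registers_<size>`. As a hedged sketch, the helpers below build a preset name and pick the largest preset that fits a parameter budget, using the parameter counts listed in the table. The helper names (`preset_name`, `largest_preset_within`, `load_backbone`) are hypothetical. The only KerasHub call is `keras_hub.models.Backbone.from_preset`, which is imported lazily because it needs `keras-hub` installed and downloads weights on first use.

```python
# Approximate parameter counts, copied from the presets table.
PRESET_PARAMS = {
    "small": 22.58e6,
    "base": 87.63e6,
    "large": 305.77e6,
    "giant": 1.13e9,
}


def preset_name(size, with_registers=False):
    """Build a preset name such as 'dinov2_with_registers_base'."""
    if size not in PRESET_PARAMS:
        raise ValueError(f"Unknown size {size!r}; expected one of {list(PRESET_PARAMS)}")
    prefix = "dinov2_with_registers" if with_registers else "dinov2"
    return f"{prefix}_{size}"


def largest_preset_within(max_params, with_registers=False):
    """Return the largest preset whose parameter count is <= max_params."""
    fitting = [size for size, n in PRESET_PARAMS.items() if n <= max_params]
    if not fitting:
        raise ValueError(f"No preset fits within {max_params:.0f} parameters")
    best = max(fitting, key=PRESET_PARAMS.get)
    return preset_name(best, with_registers)


def load_backbone(name):
    """Load a backbone by preset name (requires keras-hub and network access)."""
    import keras_hub  # imported lazily so the helpers above work without it

    return keras_hub.models.Backbone.from_preset(name)


print(preset_name("base", with_registers=True))  # dinov2_with_registers_base
print(largest_preset_within(100e6))              # dinov2_base
```

With `keras-hub` installed, `load_backbone(largest_preset_within(100e6))` would then fetch the `dinov2_base` checkpoint.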