---
library_name: keras-hub
---
This is a CLIP model uploaded with the KerasHub library; it can be used with the JAX, TensorFlow, and PyTorch backends.
Model config:
- name: clip_backbone
- trainable: True
- vision_encoder: `CLIPVisionEncoder` (`keras_hub.src.models.clip.clip_vision_encoder`)
  - name: clip_vision_encoder
  - trainable: True
  - patch_size: 16
  - hidden_dim: 768
  - num_layers: 12
  - num_heads: 12
  - intermediate_dim: 3072
  - intermediate_activation: quick_gelu
  - intermediate_output_index: None
  - image_shape: [224, 224, 3]
- text_encoder: `CLIPTextEncoder` (`keras_hub.src.models.clip.clip_text_encoder`)
  - name: clip_text_encoder
  - trainable: True
  - vocabulary_size: 49408
  - embedding_dim: 512
  - hidden_dim: 512
  - num_layers: 12
  - num_heads: 8
  - intermediate_dim: 2048
  - intermediate_activation: quick_gelu
  - intermediate_output_index: None
  - max_sequence_length: 77
- projection_dim: 512
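The config above projects both encoders into a shared 512-dimensional space (`projection_dim`). As a minimal sketch of how that space is used, the snippet below computes image–text matching scores the way CLIP does: L2-normalize each embedding, take dot products as cosine similarities, then softmax over candidate captions. The random vectors are placeholders for real encoder outputs (which in KerasHub would come from the backbone, e.g. loaded via `keras_hub.models.CLIPBackbone.from_preset(...)` with an appropriate preset name), and CLIP's learned temperature (logit scale) is omitted here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
projection_dim = 512  # matches the model config above

# Placeholder embeddings: 2 images, 3 candidate captions.
# In practice these would be the projected outputs of the vision
# and text encoders described in the config.
image_embeds = rng.normal(size=(2, projection_dim))
text_embeds = rng.normal(size=(3, projection_dim))

# L2-normalize so the dot product equals cosine similarity.
image_embeds /= np.linalg.norm(image_embeds, axis=-1, keepdims=True)
text_embeds /= np.linalg.norm(text_embeds, axis=-1, keepdims=True)

# Pairwise cosine similarities, shape (num_images, num_captions).
logits = image_embeds @ text_embeds.T

# Softmax over captions gives per-image matching probabilities.
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
```

Each row of `probs` sums to 1 and ranks the candidate captions for one image; a real CLIP model additionally scales `logits` by a learned temperature before the softmax.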
This model card has been generated automatically and should be completed by the model author. See the Model Cards documentation for more information.