Zero-Shot Image Classification
Transformers
PyTorch
Chinese
altclip
bilingual
en
English
How to use BAAI/AltCLIP with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("zero-shot-image-classification", model="BAAI/AltCLIP")
    pipe(
        "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
        candidate_labels=["animals", "humans", "landscape"],
    )

    # Load model directly
    from transformers import AutoProcessor, AutoModelForZeroShotImageClassification

    processor = AutoProcessor.from_pretrained("BAAI/AltCLIP")
    model = AutoModelForZeroShotImageClassification.from_pretrained("BAAI/AltCLIP")
Update config.json
config.json CHANGED (+5 -5)
@@ -1,12 +1,12 @@
 {
-  "_name_or_path": "
+  "_name_or_path": "",
   "architectures": [
-    "
+    "AltCLIPModel"
   ],
   "direct_kd": false,
   "initializer_factor": 1.0,
   "logit_scale_init_value": 2.6592,
-  "model_type": "
+  "model_type": "altclip",
   "num_layers": 3,
   "projection_dim": 768,
   "text_config": {
@@ -52,7 +52,7 @@
     "max_length": 20,
     "max_position_embeddings": 514,
     "min_length": 0,
-    "model_type": "
+    "model_type": "altclip_text_model",
     "no_repeat_ngram_size": 0,
     "num_attention_heads": 16,
     "num_beam_groups": 1,
@@ -141,7 +141,7 @@
     "length_penalty": 1.0,
     "max_length": 20,
     "min_length": 0,
-    "model_type": "
+    "model_type": "altclip_vision_model",
     "no_repeat_ngram_size": 0,
     "num_attention_heads": 16,
     "num_beam_groups": 1,
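The effect of this change can be sanity-checked with a few lines of standard-library Python. The sketch below is illustrative only: it parses a small JSON fragment that mirrors the added values in the diff above and confirms the three `model_type` fields. The helper name `check_model_types` is hypothetical, and the `vision_config` key is an assumption, since the diff shows the `altclip_vision_model` line without its enclosing key.

```python
import json

# Fragment mirroring only the keys changed in the diff; all other keys are omitted.
# ASSUMPTION: the third hunk sits under a "vision_config" block (not shown in the diff).
CONFIG_FRAGMENT = """
{
  "_name_or_path": "",
  "architectures": ["AltCLIPModel"],
  "model_type": "altclip",
  "text_config": {"model_type": "altclip_text_model"},
  "vision_config": {"model_type": "altclip_vision_model"}
}
"""

# Expected values, taken from the added ("+") lines of the diff.
EXPECTED = {
    "model_type": "altclip",
    "text_config.model_type": "altclip_text_model",
    "vision_config.model_type": "altclip_vision_model",
}


def check_model_types(config: dict) -> list:
    """Return the dotted keys whose value differs from EXPECTED (hypothetical helper)."""
    mismatches = []
    for dotted_key, expected in EXPECTED.items():
        node = config
        # Walk the nested dicts one path component at a time.
        for part in dotted_key.split("."):
            node = node.get(part) if isinstance(node, dict) else None
        if node != expected:
            mismatches.append(dotted_key)
    return mismatches


config = json.loads(CONFIG_FRAGMENT)
print(check_model_types(config))  # prints []
```

To check a real checkout instead, replace `json.loads(CONFIG_FRAGMENT)` with `json.load(open("config.json"))` on a locally downloaded copy of the file; an empty list means all three fields carry the updated values.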