Updated links

README.md · CHANGED
@@ -2,7 +2,7 @@
library_name: transformers
license: mit
datasets:
- Armaggheddon/lego_brick_captions
language:
- en
base_model:
@@ -14,7 +14,7 @@ pipeline_tag: zero-shot-classification

## Model Details

This model is a finetuned version of the `openai/clip-vit-base-patch32` CLIP (Contrastive Language-Image Pretraining) model on the [`lego_brick_captions`](https://huggingface.co/datasets/Armaggheddon97/lego_brick_captions) dataset, specialized in matching images of Lego bricks with their corresponding textual descriptions.

> [!NOTE]
> If you are interested in the code used, refer to the finetuning script on my [GitHub](https://github.com/Armaggheddon/BricksFinder/blob/main/model_finetuning/src/finetune.py).
@@ -32,7 +32,7 @@ Perfect for LEGO enthusiasts, builders, or anyone who loves a good ol’ treasure

## Model Description

- **Developed by:** The base model was developed by OpenAI; the finetuned model was developed by me, [Armaggheddon](https://huggingface.co/Armaggheddon).
- **Model type:** The model is a CLIP (Contrastive Language-Image Pretraining) model.
- **Language:** The model expects English text as input.
- **License:** The model is licensed under the MIT license.
@@ -46,21 +46,21 @@ Perfect for LEGO enthusiasts, builders, or anyone who loves a good ol’ treasure
```python
device = "cuda" if torch.cuda.is_available() else "cpu"

model = CLIPModel.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick").to(device)
processor = CLIPProcessor.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick")
```
- Using `Auto` classes:
```python
from transformers import AutoModelForZeroShotImageClassification, AutoProcessor

model = AutoModelForZeroShotImageClassification.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick")
processor = AutoProcessor.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick")
```
- Using with `pipeline`:
```python
from transformers import pipeline

model = "Armaggheddon/clip-vit-base-patch32_lego-brick"
clip_classifier = pipeline("zero-shot-image-classification", model=model)
```
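For a quick end-to-end check, the pipeline can be called directly on an image with a few candidate captions. A minimal sketch, where `lego.jpg` and the candidate labels are made-up placeholders:

```python
from transformers import pipeline

clip_classifier = pipeline(
    "zero-shot-image-classification",
    model="Armaggheddon/clip-vit-base-patch32_lego-brick",
)

# "lego.jpg" and the candidate labels below are placeholders for illustration
results = clip_classifier(
    "lego.jpg",
    candidate_labels=[
        "a photo of a red 2x4 lego brick",
        "a photo of a blue lego minifigure head",
    ],
)
print(results)  # list of {"label", "score"} dicts, highest score first
```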
@@ -70,8 +70,8 @@ The provided model is in float32 precision. To load the model in float16 precision
```python
import torch
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick", torch_dtype=torch.float16)
processor = CLIPProcessor.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick")
```

or alternatively using `torch` directly with:

@@ -79,7 +79,7 @@ or alternatively using `torch` directly with:
```python
import torch
from transformers import CLIPModel

model = CLIPModel.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick")
model_fp16 = model.to(torch.float16)
```
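Either way, the resulting precision can be verified by inspecting a parameter dtype; a one-line sanity check for the cast model:

```python
print(next(model_fp16.parameters()).dtype)  # expected: torch.float16
```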
@@ -93,8 +93,8 @@ model_fp16 = model.to(torch.float16)
```python
device = "cuda" if torch.cuda.is_available() else "cpu"

model = CLIPModel.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick").to(device)
tokenizer = CLIPTokenizerFast.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick")

text = ["a photo of a lego brick"]
tokens = tokenizer(text, return_tensors="pt", padding=True).to(device)
```
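The hunk ends here; the text embedding itself would typically come from `get_text_features`. A minimal sketch of that next step, reusing `model` and `tokens` from above:

```python
import torch

# No gradients needed for inference-time embedding extraction
with torch.no_grad():
    text_features = model.get_text_features(**tokens)
print(text_features.shape)  # torch.Size([1, 512]) for the ViT-B/32 text projection
```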
@@ -108,8 +108,8 @@ model_fp16 = model.to(torch.float16)
```python
device = "cuda" if torch.cuda.is_available() else "cpu"

model = CLIPModel.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick").to(device)
processor = CLIPProcessor.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick")

image = Image.open("path_to_image.jpg")
inputs = processor(images=image, return_tensors="pt").to(device)
```
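As above, the hunk stops before the embedding call; the image embedding would typically be computed with `get_image_features`. A minimal sketch, reusing `model` and `inputs`:

```python
import torch

# No gradients needed for inference-time embedding extraction
with torch.no_grad():
    image_features = model.get_image_features(**inputs)
print(image_features.shape)  # torch.Size([1, 512]) for the ViT-B/32 image projection
```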
@@ -125,10 +125,10 @@ from datasets import load_dataset
```python
device = "cuda" if torch.cuda.is_available() else "cpu"

model = CLIPModel.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick").to(device)
processor = CLIPProcessor.from_pretrained("Armaggheddon/clip-vit-base-patch32_lego-brick")

dataset = load_dataset("Armaggheddon/lego_brick_captions", split="test")

captions = [
    "a photo of a lego brick with a 2x2 plate",
```