breezedeus/coin-clip-vit-base-patch32

@@ -2,8 +2,11 @@
 tags:
 - vision
 - coin
 - coin-retrieval
 - coin-recognition
 widget:
 - src: >-
     https://huggingface.co/datasets/mishig/sample_images/resolve/main/cat-dog-music.png
@@ -13,13 +16,23 @@ license: apache-2.0
 library_name: transformers
 ---
-# Model Card: CLIP
 ## Model Details / 模型细节
-This model is fine-tuned on a coin dataset using **contrastive learning** techniques, based on OpenAI's CLIP (ViT-B/32). It aims to enhance the feature extraction capabilities for **Coin** images, thus achieving more accurate image-based search functionalities. The model combines the powerful features of the Vision Transformer (ViT) with the multimodal learning capabilities of CLIP, specifically optimized for coin imagery.
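-This model was obtained by fine-tuning OpenAI's CLIP (ViT-B/32) on a coin dataset using contrastive learning. It aims to improve feature extraction for coin images, enabling more accurate search-by-image. The model combines the strengths of the Vision Transformer (ViT) with CLIP's multimodal learning capabilities and is optimized specifically for coin images.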
 ## Comparison: Coin-CLIP vs. CLIP / 效果对比
@@ -57,6 +70,7 @@ More examples can be found: [breezedeus/Coin-CLIP: Coin CLIP](https://github.com
 ## Model Use / 模型使用
 ```python3
 from PIL import Image
 import requests
@@ -74,6 +88,32 @@ img_features = model.get_image_features(**inputs)
 img_features = F.normalize(img_features, dim=1)
 ```
 ## Training Data / 训练数据

 tags:
 - vision
 - coin
+- clip
 - coin-retrieval
 - coin-recognition
+- coin-search-engine
+- multi-modal learning
 widget:
 - src: >-
     https://huggingface.co/datasets/mishig/sample_images/resolve/main/cat-dog-music.png
 library_name: transformers
 ---
+# Coin-CLIP 🪙 : Enhancing Coin Image Retrieval with CLIP
 ## Model Details / 模型细节
+This model (**Coin-CLIP**) is built upon
+OpenAI's **[CLIP](https://huggingface.co/openai/clip-vit-base-patch32) (ViT-B/32)** model and fine-tuned on
+a dataset of more than `340,000` coin images using contrastive learning. The model is designed to improve feature extraction for coin images, which in turn gives more accurate image-based search. Coin-CLIP combines the Vision Transformer (ViT) with CLIP's multimodal learning capabilities, tailored specifically to the numismatic domain.
+**Key Features:**
+- State-of-the-art coin image retrieval;
+- Enhanced feature extraction for numismatic images;
+- Seamless integration with CLIP's multimodal learning.
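+This model (**Coin-CLIP**)
+was obtained by fine-tuning OpenAI's **[CLIP](https://huggingface.co/openai/clip-vit-base-patch32) (ViT-B/32)** model on more than `340,000` coin images using contrastive learning.
+**Coin-CLIP** aims to improve the model's feature extraction for coin images, enabling more accurate search-by-image. The model combines the strengths of the Vision Transformer (ViT) with CLIP's multimodal learning capabilities and is optimized specifically for coin images.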
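The card says the model was fine-tuned with contrastive learning but does not give the loss or how pairs were formed. As a rough illustration only, the sketch below implements the generic CLIP-style symmetric contrastive (InfoNCE) objective in plain Python. The function names, the pairing scheme (`feats_a[i]` matched with `feats_b[i]`), and the temperature value are assumptions for illustration, not the author's training code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def contrastive_loss(feats_a, feats_b, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    feats_a[i] and feats_b[i] are treated as a positive pair (an assumption
    made for this sketch); every other combination in the batch is a negative.
    """
    a = [l2_normalize(v) for v in feats_a]
    b = [l2_normalize(v) for v in feats_b]
    n = len(a)
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Cross-entropy in the a -> b direction (rows), target is the diagonal.
    loss_a = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Cross-entropy in the b -> a direction (columns), target is the diagonal.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_b = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return (loss_a + loss_b) / 2


# Toy batch: correctly paired embeddings give a much lower loss
# than the same embeddings paired with the wrong partner.
batch = [[1.0, 0.0], [0.0, 1.0]]
matched = contrastive_loss(batch, batch)
mismatched = contrastive_loss(batch, [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing this loss pulls paired embeddings together and pushes unpaired ones apart, which is the property an image-retrieval model relies on.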
 ## Comparison: Coin-CLIP vs. CLIP / 效果对比
 ## Model Use / 模型使用
+### Transformers
 ```python3
 from PIL import Image
 import requests
 img_features = F.normalize(img_features, dim=1)
 ```
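The final `F.normalize(img_features, dim=1)` step scales each feature vector to unit length, so a plain dot product between two normalized vectors equals their cosine similarity. A minimal dependency-free sketch of that identity follows; the vectors are made-up toy values, not real Coin-CLIP features.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(u, v):
    """Cosine similarity computed directly from unnormalized vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Toy 3-d vectors standing in for 512-d image features.
query = [3.0, 4.0, 0.0]
candidate = [4.0, 3.0, 0.0]

# After normalization, the dot product is the cosine similarity.
query_n = l2_normalize(query)
candidate_n = l2_normalize(candidate)
dot_of_normalized = sum(x * y for x, y in zip(query_n, candidate_n))
```

Normalizing once at indexing time means each later comparison needs only a dot product.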
+### Tool / 工具
+To further simplify the use of the **Coin-CLIP** model, we provide a simple Python library [breezedeus/Coin-CLIP: Coin CLIP](https://github.com/breezedeus/Coin-CLIP) for quickly building a coin image retrieval engine.
+To make the **Coin-CLIP** model even easier to use, we provide a simple Python library, [breezedeus/Coin-CLIP: Coin CLIP](https://github.com/breezedeus/Coin-CLIP), for quickly building a coin image retrieval engine.
+#### Install
+```bash
+pip install coin_clip
+```
+#### Extract Feature Vectors
+```python
+from coin_clip import CoinClip
+# Automatically download the model from Hugging Face
+model = CoinClip(model_name='breezedeus/coin-clip-vit-base-patch32')
+images = ['examples/10_back.jpg', 'examples/16_back.jpg']
+img_feats, success_ids = model.get_image_features(images)
+print(img_feats.shape)  # --> (2, 512)
+```
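Once feature vectors are extracted, a retrieval engine mostly comes down to ranking stored vectors by similarity to a query vector. The sketch below shows that ranking step in plain Python under stated assumptions: `build_index` and `search` are hypothetical helper names, not part of the `coin_clip` package, and the 2-d vectors are toy stand-ins for the 512-d features shown above.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def build_index(features, ids):
    """Pair each item id with its unit-length feature vector (hypothetical helper)."""
    return [(item_id, l2_normalize(vec)) for item_id, vec in zip(ids, features)]


def search(index, query_vec, top_k=3):
    """Return the top_k (item_id, cosine_score) pairs, best match first."""
    q = l2_normalize(query_vec)
    scored = [
        (sum(x * y for x, y in zip(q, vec)), item_id) for item_id, vec in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(item_id, score) for score, item_id in scored[:top_k]]


# Toy 2-d features standing in for real image embeddings.
index = build_index(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    ["coin_a.jpg", "coin_b.jpg", "coin_c.jpg"],
)
results = search(index, [0.9, 0.1], top_k=2)
```

For a large collection, the same ranking is usually done as a single matrix product or with an approximate nearest-neighbor index rather than a Python loop.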
+More tools can be found at [breezedeus/Coin-CLIP: Coin CLIP](https://github.com/breezedeus/Coin-CLIP).
 ## Training Data / 训练数据