---
license: cc-by-4.0
tags:
  - vision
  - image-text-retrieval
  - clip
  - pytorch
  - vision-transformer
library_name: pytorch
pipeline_tag: zero-shot-image-classification
language:
  - en
---

# Custom CLIP (ViT-B/16) - Optimized

This model is a from-scratch, heavily optimized implementation of the CLIP architecture, developed as part of an academic research project.

It achieves 2.46x faster inference (21 ms vs. 52 ms latency) than the standard OpenAI CLIP model on consumer hardware (RTX 3050 Ti), while maintaining 97.7% zero-shot accuracy.
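For context, CLIP's zero-shot classification works by encoding the image and a set of candidate text prompts, then taking a softmax over temperature-scaled cosine similarities. Below is a minimal, model-free sketch of that scoring step; the toy vectors stand in for real embeddings, and the temperature of 100 is an assumption mirroring CLIP's typical learned logit scale:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def zero_shot_scores(image_emb, text_embs, temperature=100.0):
    """Softmax over temperature-scaled cosine similarities (CLIP-style)."""
    img = normalize(image_emb)
    sims = [temperature * sum(a * b for a, b in zip(img, normalize(t)))
            for t in text_embs]
    m = max(sims)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embeddings standing in for real CLIP image/text encoder outputs
image = [0.9, 0.1, 0.0]
prompts = [[1.0, 0.0, 0.0],   # e.g. "a photo of a cat"
           [0.0, 1.0, 0.0]]   # e.g. "a photo of a dog"
probs = zero_shot_scores(image, prompts)
```

The highest probability goes to the prompt whose embedding is most aligned with the image embedding; in a real pipeline the embeddings come from the model's image and text encoders.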

## 🔗 Source Code & Usage

The full source code, training details, and inference scripts are available on GitHub: 👉 GitHub Repository: custom-clip-vit-b-coco


## 🚀 Performance Benchmark

| Model       | Optimization   | Latency  | Speedup | Accuracy |
|-------------|----------------|----------|---------|----------|
| OpenAI CLIP | FP32           | 52.22 ms | 1.0x    | 99.88%   |
| Custom CLIP | FP16 + Compile | 21.20 ms | 2.46x   | 97.71%   |
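Latency figures like these depend on measurement methodology, so they are usually reported as the median over many runs after a warm-up phase (important when `torch.compile` is involved, since the first calls pay compilation cost). A generic, pure-Python sketch of that procedure; the workload here is a stand-in for a model forward pass, and on GPU you would additionally need device synchronization (e.g. `torch.cuda.synchronize()`) before reading the clock:

```python
import statistics
import time

def benchmark_ms(fn, warmup=10, runs=100):
    """Median wall-clock latency of fn() in milliseconds, after warm-up."""
    for _ in range(warmup):       # warm-up absorbs compile/caching overhead
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

# Stand-in workload; replace with a real forward pass such as model(image)
latency_ms = benchmark_ms(lambda: sum(i * i for i in range(10_000)))

# Speedup as reported in the table: baseline latency / optimized latency
speedup = 52.22 / 21.20
```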

## ⚠️ License & Citation

This model is licensed under CC-BY 4.0. You are free to use it for academic or commercial purposes, provided you attribute the author:

Author: Muhammed Köse
Project: Custom CLIP Optimization