---
tags:
- clip
- vision-language
- image-text
- pytorch
license: apache-2.0
---

# CLIP Model

This is a fine-tuned CLIP model for vision-language tasks.

## Model Description

This model was fine-tuned from a base CLIP model and includes a custom temperature-scaling parameter.

## Usage

```python
from transformers import CLIPModel, CLIPProcessor
import torch

# Load model and processor
model = CLIPModel.from_pretrained("aprendesc/CLIP_model_v0")
processor = CLIPProcessor.from_pretrained("aprendesc/CLIP_model_v0")

# Load the temperature parameter, if one is bundled with this repo
try:
    from huggingface_hub import hf_hub_download

    temperature_path = hf_hub_download(
        repo_id="aprendesc/CLIP_model_v0", filename="temperature.pth"
    )
    temperature = torch.load(temperature_path, map_location="cpu")
    print(f"Temperature parameter: {temperature}")
except Exception:
    print("No temperature parameter found")

# Use the model for inference
# ... your inference code here ...
```

## Training Details

- Base model: CLIP
- Custom temperature scaling included
- Fine-tuned for specific vision-language tasks
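## Temperature Scaling

To illustrate what the temperature parameter does, here is a minimal sketch using made-up similarity logits and an assumed temperature value (the real value would come from `temperature.pth` above). Temperature scaling divides the image-text similarity logits by a scalar before the softmax: temperatures below 1 sharpen the probability distribution, temperatures above 1 flatten it.

```python
import torch
import torch.nn.functional as F

# Hypothetical image-text similarity logits for one image against three captions
logits = torch.tensor([[0.30, 0.25, 0.10]])

# Assumed temperature value for illustration; load the real one from the repo
temperature = 0.5

# Divide logits by the temperature before softmax; T < 1 sharpens the distribution
probs = F.softmax(logits / temperature, dim=-1)
baseline = F.softmax(logits, dim=-1)

print(probs)
```

The scaled distribution keeps the same ranking of captions but concentrates more probability mass on the top match than the unscaled softmax would.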