---
license: mit
tags:
- clip
- feature-extraction
- remote-sensing
---

# Remote-CLIP-ViT-L-14

This model is a mirror/redistribution of the original [RemoteCLIP](https://huggingface.co/chendelong/RemoteCLIP) model.

## Original Repository and Links

- **Original Hugging Face Model**: [chendelong/RemoteCLIP](https://huggingface.co/chendelong/RemoteCLIP)
- **Official GitHub Repository**: [ChenDelong1999/RemoteCLIP](https://github.com/ChenDelong1999/RemoteCLIP)

## Description

RemoteCLIP is a vision-language foundation model for remote sensing, trained on a large-scale dataset of remote sensing image-text pairs. It is based on the CLIP architecture and is designed to handle the unique characteristics of remote sensing imagery.

## Citation

If you use this model in your research, please cite the original work:

```bibtex
@article{remoteclip,
  author  = {Fan Liu and Delong Chen and Zhangqingyun Guan and Xiaocong Zhou and Jiale Zhu and Qiaolin Ye and Liyong Fu and Jun Zhou},
  title   = {RemoteCLIP: {A} Vision Language Foundation Model for Remote Sensing},
  journal = {{IEEE} Transactions on Geoscience and Remote Sensing},
  volume  = {62},
  pages   = {1--16},
  year    = {2024},
  url     = {https://doi.org/10.1109/TGRS.2024.3390838},
  doi     = {10.1109/TGRS.2024.3390838},
}
```