---
tags:
- Image quality assessment
- GRMP-IQA
license: mit
metrics:
- PLCC
- SRCC
language:
- en
---

# GRMP-IQA Model Card

### Installation

```bash
pip install torch==1.12.0 torchvision==0.13.0
pip install -r requirements.txt
```
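
After installing, it can help to confirm that the pinned versions are the ones actually present in the environment. The following is a minimal sketch using only the standard library; the `check_versions` helper and the `PINNED` table are illustrative names, not part of this repository, and the pins are simply copied from the `pip` command above.

```python
from importlib import metadata

# Version pins copied from the pip command in the Installation section.
PINNED = {"torch": "1.12.0", "torchvision": "0.13.0"}


def check_versions(pinned):
    """Return {package: status}; status is 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Local builds may report e.g. '1.12.0+cu113'; compare the public part only.
        public = found.split("+")[0]
        report[name] = "ok" if public == wanted else f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for package, status in check_versions(PINNED).items():
        print(f"{package}: {status}")
```

A `mismatch` entry does not necessarily mean the code will fail; it only flags that the environment differs from the versions listed here.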

### Quick Start

#### 1. Meta-Learning Pre-training
```bash
python pretrain.py
```

#### 2. Few-shot Fine-tuning
```bash
# 50-shot fine-tuning on CLIVE
python finetune.py --dataset clive --num_image 50 --lda 5.0

# 50-shot fine-tuning on KonIQ-10k
python finetune.py --dataset koniq --num_image 50 --lda 5.0

# 50-shot fine-tuning on CLIVE, using the pre-trained model
python finetune.py --dataset clive --num_image 50 --pretrained --lda 5.0
```
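
To run the same fine-tuning configuration over several datasets, the commands above can be assembled programmatically. This is a sketch only: `build_finetune_cmd` is a hypothetical helper, and it uses just the flags that appear in the commands above (`--dataset`, `--num_image`, `--lda`, `--pretrained`). It prints the commands rather than launching them.

```python
import shlex


def build_finetune_cmd(dataset, num_image=50, lda=5.0, pretrained=False):
    """Assemble the argument list for one finetune.py run."""
    cmd = [
        "python", "finetune.py",
        "--dataset", dataset,
        "--num_image", str(num_image),
        "--lda", str(lda),
    ]
    if pretrained:
        cmd.append("--pretrained")
    return cmd


if __name__ == "__main__":
    for dataset in ("clive", "koniq"):
        cmd = build_finetune_cmd(dataset, pretrained=True)
        # Print the shell-quoted command; to actually launch it from the
        # repository root, pass `cmd` to subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting issues if a dataset name or path ever contains spaces.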

#### 3. Python API Usage

```python
import torch
from CLIP import clip
from finetune import CustomCLIP, load_clip_to_cpu

# Build the model from antonym prompt pairs and a CLIP backbone
classnames = [['good', 'bad'], ['clear', 'unclear'], ['high quality', 'low quality']]
clip_model = load_clip_to_cpu('ViT-B/16').float()
model = CustomCLIP(classnames, clip_model)

# Load checkpoint (map to CPU so it also loads on machines without a GPU)
checkpoint = torch.load('path/to/checkpoint.pt', map_location='cpu')
model.load_state_dict(checkpoint, strict=False)

# Inference
model.eval()
with torch.no_grad():
    # image: preprocessed batch, torch.Tensor of shape [B, 3, 224, 224]
    logits = model(image)
    # Softmax over the first two logits; the first entry is taken as the quality score
    quality_score = torch.softmax(logits[:, :2], dim=-1)[:, 0]
```
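
The last line of the snippet above applies a softmax to two logits and keeps the first probability. For a pair of logits this equals a sigmoid of their difference, so the score always lies in (0, 1) and grows as the first logit exceeds the second. The pure-Python sketch below illustrates that identity with made-up numbers. That the first two logits correspond to the `'good'`/`'bad'` prompt pair is an assumption based on the order of `classnames`, and the helper names are illustrative.

```python
import math


def pair_softmax_score(logit_pos, logit_neg):
    """Softmax over a (positive, negative) logit pair; returns the positive probability."""
    m = max(logit_pos, logit_neg)  # subtract the max for numerical stability
    e_pos = math.exp(logit_pos - m)
    e_neg = math.exp(logit_neg - m)
    return e_pos / (e_pos + e_neg)


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    # Hypothetical logits for one image, chosen only for illustration.
    good, bad = 2.0, 0.5
    print(pair_softmax_score(good, bad))  # two-way softmax
    print(sigmoid(good - bad))            # same value up to floating-point rounding
```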

## Hugging Face Model Hub

### Available Resources

Our model and associated resources are available on the Hugging Face Model Hub:

**Repository**: [GRMP-IQA](https://huggingface.co/zzhowe/GRMP-IQA)

### Usage Example with Hugging Face

```python
from huggingface_hub import hf_hub_download
import torch
import scipy.io as sio

# Download pre-trained model weights
model_path = hf_hub_download(
    repo_id="zzhowe/GRMP-IQA",
    filename="clive_50_prompt_lda_5.0.pt"
)

# Download dataset file
dataset_path = hf_hub_download(
    repo_id="zzhowe/GRMP-IQA",
    filename="LIVE_224.mat"
)

# Load the downloaded checkpoint onto the CPU
model = torch.load(model_path, map_location='cpu')

# Load the dataset (.mat file) as a dict of arrays
dataset = sio.loadmat(dataset_path)
```
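
This model card lists PLCC (Pearson linear correlation coefficient) and SRCC (Spearman rank correlation coefficient) as its metrics. The sketch below shows one way to compute both between predicted quality scores and ground-truth mean opinion scores (MOS) using only the standard library. The function names and the sample numbers are illustrative, not taken from this repository. Many IQA papers fit a nonlinear logistic mapping before computing PLCC; this sketch omits that step, and whether this repository applies it is not stated here.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical predictions and MOS values, for illustration only.
    predicted = [0.12, 0.40, 0.35, 0.81, 0.66]
    mos = [15.0, 48.0, 30.0, 90.0, 72.0]
    print("PLCC:", pearson(predicted, mos))
    print("SRCC:", spearman(predicted, mos))
```

SRCC depends only on the ordering of the scores, so it is unaffected by any monotonic rescaling of the predictions, while PLCC is sensitive to how linearly the predictions track MOS.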

## Citation

If you use this model in your research, please cite:

```bibtex
@article{li2024boosting,
  title={Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models},
  author={Li, Xudong and Huang, Zihao and Hu, Runze and Zhang, Yan and Cao, Liujuan and Ji, Rongrong},
  journal={arXiv preprint arXiv:2409.05381},
  year={2024}
}
```

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Contact

For questions or issues, please contact:
- 📧 Email: [lxd761050753@gmail.com](mailto:lxd761050753@gmail.com)
- 📧 Email: [huangzihhhh@gmail.com](mailto:huangzihhhh@gmail.com)

## Acknowledgments

- CLIP model from OpenAI
- PyTorch team for the deep learning framework
- All contributors to the IQA datasets used in training