root committed on
Commit · 4a2b9ca
1 Parent(s): a44108d
Initial model upload
- .gitattributes +2 -0
- README.md +81 -0
- agent_tools/checkpoints/DiffLL/model.pth.tar +3 -0
- agent_tools/checkpoints/ESRGAN/RealESRGAN_x4plus.pth +3 -0
- agent_tools/checkpoints/HVICIDNet/generalization.pth +3 -0
- agent_tools/checkpoints/IDT/epoch100.pth.tar +3 -0
- agent_tools/checkpoints/Img2img_turbo/rainy2day.pkl +3 -0
- agent_tools/checkpoints/Img2img_turbo/snow2day.pkl +3 -0
- agent_tools/checkpoints/KANet/trained_model_epoch1.pk +3 -0
- agent_tools/checkpoints/LightenDiffusion/stage2_weight.pth.tar +3 -0
- agent_tools/checkpoints/RIDCP/pretrained_HQPs.pth +3 -0
- agent_tools/checkpoints/RIDCP/pretrained_RIDCP.pth +3 -0
- agent_tools/checkpoints/RIDCP/weight_for_matching_dehazing_Flickr.pth +3 -0
- agent_tools/checkpoints/Retinexformer/FiveK.pth +3 -0
- agent_tools/checkpoints/S2Former/udrs2former_demo.pth +3 -0
- agent_tools/checkpoints/S2Former/udrs2former_raindrop_real.pth +3 -0
- agent_tools/checkpoints/S2Former/udrs2former_raindrop_syn.pth +3 -0
- agent_tools/checkpoints/SCUNet/scunet_color_real_gan.pth +3 -0
- agent_tools/checkpoints/SnowMaster/checkpoint_0318.pth +3 -0
- degradation_synthesis/rainy/GuidedDisent/weights/pretrained.pth +3 -0
- degradation_synthesis/snow/checkpoints/day2snow.pkl +3 -0
- config.json → pretrained/mrrhf/config.json +0 -0
- preprocessor_config.json → pretrained/mrrhf/preprocessor_config.json +0 -0
- pytorch_model.bin → pretrained/mrrhf/pytorch_model.bin +0 -0
- special_tokens_map.json → pretrained/mrrhf/special_tokens_map.json +0 -0
- tokenizer.json → pretrained/mrrhf/tokenizer.json +0 -0
- tokenizer_config.json → pretrained/mrrhf/tokenizer_config.json +0 -0
.gitattributes
CHANGED
@@ -34,3 +34,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 *.json filter=lfs diff=lfs merge=lfs -text
+*.pth.tar filter=lfs diff=lfs merge=lfs -text
+*.pk filter=lfs diff=lfs merge=lfs -text
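The two added patterns route `.pth.tar` and `.pk` checkpoints through Git LFS. A rough way to see which files they cover is sketched below. It assumes plain basename glob matching, which is only an approximation of the full gitattributes rules, and the helper name `matches_new_lfs_patterns` is made up for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# The two patterns this commit appends to .gitattributes.
NEW_LFS_PATTERNS = ["*.pth.tar", "*.pk"]

def matches_new_lfs_patterns(path: str) -> bool:
    """Approximate gitattributes matching: a slash-free pattern is
    compared against the file's basename only."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in NEW_LFS_PATTERNS)

# Three paths taken from this commit's file list.
paths = [
    "agent_tools/checkpoints/DiffLL/model.pth.tar",
    "agent_tools/checkpoints/KANet/trained_model_epoch1.pk",
    "agent_tools/checkpoints/Retinexformer/FiveK.pth",
]
newly_covered = [p for p in paths if matches_new_lfs_patterns(p)]
```

Here `FiveK.pth` is not matched by either new pattern. Since it still appears as an LFS pointer in this commit, it is presumably covered by a pattern outside the hunk shown.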
README.md
ADDED
@@ -0,0 +1,81 @@
+---
+license: apache-2.0
+tags:
+- cvpr25
+- JarvisIR
+- weights
+description: |
+  This repository contains the official weights for the CVPR 2025 paper "JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration".
+---
+
+# JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration
+
+## Model Description
+
+JarvisIR is a novel system that leverages a Vision-Language Model (VLM) to intelligently restore images for autonomous driving perception in adverse weather. It acts as a central controller, dynamically coordinating multiple expert restoration models to tackle complex degradations such as rain, fog, low-light, and snow.
+
+## Key Features
+
+- **VLM Controller**: The first framework to employ a Vision-Language Model for orchestrating image restoration workflows.
+- **Multi-Expert Coordination**: Dynamically schedules specialized restoration models for tasks like denoising, super-resolution, and deraining.
+- **Adaptive Restoration**: Effectively handles a wide range of adverse weather conditions, including night/low-light, rain, fog, and snow.
+- **Advanced Training Strategy**: Utilizes a two-stage process of Supervised Fine-Tuning (SFT) followed by alignment with Mixed-Rank Reward-based Human Feedback (MRRHF).
+
+## Model Architecture
+
+The system comprises three core components:
+
+1. **VLM Controller**: A LLaVA-v1.5-7B model serves as the core for task planning and expert model selection.
+2. **Expert Models**: A suite of specialized networks, each tailored for a specific restoration task (e.g., deraining, defogging).
+3. **Reward Models**: A set of Image Quality Assessment (IQA) models that provide feedback for quality assessment and alignment during training.
+
+## Training Data
+
+JarvisIR was trained on a large-scale, comprehensive dataset:
+
+- **CleanBench-Synthetic**: A dataset of 150,000 synthetically degraded images with corresponding annotations.
+- **CleanBench-Real**: A collection of 80,000 real-world images captured in adverse weather, used for alignment training.
+- **Comprehensive Coverage**: The data covers four primary weather scenarios (night, rain, fog, snow) with various combinations of degradations.
+
+## Performance
+
+- Achieves a **50% average improvement** in perception metrics on the CleanBench-Real dataset compared to state-of-the-art all-in-one methods.
+- Demonstrates superior performance across all tested weather conditions.
+- Exhibits enhanced robustness and generalization capabilities in real-world driving scenarios.
+
+## Intended Use
+
+**Primary Use Cases:**
+- Enhancing perception systems in autonomous vehicles.
+- Building robust, multi-weather image restoration pipelines.
+- Advancing research into the applications of Vision-Language Models in image processing.
+
+## Model Checkpoints
+
+This repository provides the following model weights:
+- `pretrained/`: The complete model after both Supervised Fine-Tuning and MRRHF alignment stages.
+- `agent_tools/`: The weights for each individual expert restoration model.
+
+## Citation
+
+If you find JarvisIR useful in your research, please cite our paper:
+
+```bibtex
+@inproceedings{lin2025jarvisir,
+title={Jarvisir: Elevating autonomous driving perception with intelligent image restoration},
+author={Lin, Yunlong and Lin, Zixu and Chen, Haoyu and Pan, Panwang and Li, Chenxin and Chen, Sixiang and Wen, Kairun and Jin, Yeying and Li, Wenbo and Ding, Xinghao},
+booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
+pages={22369--22380},
+year={2025}
+}
+```
+
+## Related Resources
+
+- **Project Page**: https://cvpr2025-jarvisir.github.io/
+- **Code Repository**: https://github.com/LYL1015/JarvisIR
+- **Paper**: https://arxiv.org/pdf/2504.04158
+
+## Acknowledgments
+
+This work contributes to the advancement of intelligent image restoration by integrating Vision-Language Models with expert system coordination.
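The controller-plus-experts design that the README describes can be illustrated with a toy dispatcher. This is only a sketch and not JarvisIR's code: the planner is a hard-coded lookup standing in for the VLM, the task names and the `EXPERTS` registry are hypothetical, and the "images" are plain strings so the example runs without any weights.

```python
from typing import Callable, Dict, List

Image = str  # placeholder type; a real pipeline would pass tensors or arrays

def make_stub_expert(task: str) -> Callable[[Image], Image]:
    """Return a fake expert that only records which task was applied."""
    def expert(image: Image) -> Image:
        return f"{task}({image})"
    return expert

# Hypothetical registry: task name -> expert restoration model.
EXPERTS: Dict[str, Callable[[Image], Image]] = {
    task: make_stub_expert(task)
    for task in ("derain", "dehaze", "lowlight", "desnow", "denoise", "superres")
}

def plan_restoration(degradations: List[str]) -> List[str]:
    """Stub for the VLM controller: map detected degradations to an ordered task list."""
    task_for = {"rain": "derain", "fog": "dehaze", "night": "lowlight", "snow": "desnow"}
    return [task_for[d] for d in degradations if d in task_for]

def restore(image: Image, degradations: List[str]) -> Image:
    """Apply the planned experts one after another."""
    for task in plan_restoration(degradations):
        image = EXPERTS[task](image)
    return image

result = restore("frame", ["night", "rain"])
```

In the system the README describes, the plan would come from the VLM rather than a lookup table, which is what would let it choose and order experts for combined degradations.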
agent_tools/checkpoints/DiffLL/model.pth.tar
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:acc3b4a41b7a0a12dbd24bbd904d44ddc070cffdcc638dbb038acc3c50715c9d
+size 353865641
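Each checkpoint in this diff is committed as a Git LFS pointer: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. Below is a minimal sketch of parsing one and checking downloaded bytes against it. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative and not part of any library.

```python
import hashlib
from typing import Dict

def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def matches_pointer(data: bytes, pointer: Dict[str, str]) -> bool:
    """True if data has the size and SHA-256 digest recorded in the pointer."""
    algo, _, digest = pointer["oid"].partition(":")
    return (
        algo == "sha256"
        and len(data) == int(pointer["size"])
        and hashlib.sha256(data).hexdigest() == digest
    )

# The pointer committed for agent_tools/checkpoints/DiffLL/model.pth.tar.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:acc3b4a41b7a0a12dbd24bbd904d44ddc070cffdcc638dbb038acc3c50715c9d\n"
    "size 353865641\n"
)
```

For a checkpoint of this size (roughly 350 MB) you would likely hash the file in chunks instead of reading it into memory in one piece.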
agent_tools/checkpoints/ESRGAN/RealESRGAN_x4plus.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4fa0d38905f75ac06eb49a7951b426670021be3018265fd191d2125df9d682f1
+size 67040989
agent_tools/checkpoints/HVICIDNet/generalization.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:143e88e8a92d1bc21f05550f415f43ceae02d1c83360ec062428ebd6f8d06914
+size 7971269
agent_tools/checkpoints/IDT/epoch100.pth.tar
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6a7291b98969ce4b0f5ce867ca2fc63369340d347258b078539da46eea056189
+size 197713581
agent_tools/checkpoints/Img2img_turbo/rainy2day.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:09811f30d7cacfe33e618a7dcfaab3913a66b127972641ccedfffbe9a560d796
+size 1229932962
agent_tools/checkpoints/Img2img_turbo/snow2day.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4c72af2be175e89b33aeac852f0e66c3d7054ce789dc95f1407533cda69d1c3d
+size 1200345822
agent_tools/checkpoints/KANet/trained_model_epoch1.pk
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:adca80cf604f41b6bca3da236c2473b728e75698dfa8b626540d365f0f7bbf81
+size 223509311
agent_tools/checkpoints/LightenDiffusion/stage2_weight.pth.tar
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4bde72decbd99c4bd23bb1748c521e52c3a2b41299b355d5ba0d12fda1c4e014
+size 111512513
agent_tools/checkpoints/RIDCP/pretrained_HQPs.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2b63ca2a3cb0e65a7614f5f2377ef4c070a36ecf41749e7dddbec37e1f1288d6
+size 25118706
agent_tools/checkpoints/RIDCP/pretrained_RIDCP.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ea9f4344d2e46eb07d95a5a67b2c80cecb78d26ad0e3250c624031178c68271
+size 122065395
agent_tools/checkpoints/RIDCP/weight_for_matching_dehazing_Flickr.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:050ce30a772299c4b6d30754235b910a0087761b8e8144c67a10ff02b230b4fc
+size 8939
agent_tools/checkpoints/Retinexformer/FiveK.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:800f6a9281fe8d95daca3108f2b826d5a2adead09031e0e998d30a615286d9c1
+size 6478393
agent_tools/checkpoints/S2Former/udrs2former_demo.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:89223310c5102e9fd7253fa3224947696bf2ca395b94ae81b02561abfaa98165
+size 35369131
agent_tools/checkpoints/S2Former/udrs2former_raindrop_real.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4517e7f724cc0e3e2f094070b424221cf0b25851d857505b18e438659b4e7907
+size 35379671
agent_tools/checkpoints/S2Former/udrs2former_raindrop_syn.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3fe58f8b7687cfac612bc13e04dd4bd13a8993adf95aff6e966baef65e0358ab
+size 35378507
agent_tools/checkpoints/SCUNet/scunet_color_real_gan.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:892c83f812c59173273b74f4f34a14ecaf57a2fdb68df056664589beb55c966e
+size 71982835
agent_tools/checkpoints/SnowMaster/checkpoint_0318.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:18342bc16e8fe1bfbcddf9f419103dccf4cd83598668b34e702e17ad9abb3899
+size 274874370
degradation_synthesis/rainy/GuidedDisent/weights/pretrained.pth
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b083eb487709d3e6193cef7ce99ffa13f4b1aa57340e006c442da607fa73d594
+size 120316019
degradation_synthesis/snow/checkpoints/day2snow.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd694b6d2d200f62ef9a8dbd1257c055f4a7fdb16e3e660766d48356fc6b95bb
+size 1200347142
config.json → pretrained/mrrhf/config.json
RENAMED
File without changes
preprocessor_config.json → pretrained/mrrhf/preprocessor_config.json
RENAMED
File without changes
pytorch_model.bin → pretrained/mrrhf/pytorch_model.bin
RENAMED
File without changes
special_tokens_map.json → pretrained/mrrhf/special_tokens_map.json
RENAMED
File without changes
tokenizer.json → pretrained/mrrhf/tokenizer.json
RENAMED
File without changes
tokenizer_config.json → pretrained/mrrhf/tokenizer_config.json
RENAMED
File without changes