Create README.md
README.md
ADDED
@@ -0,0 +1,53 @@
## Model Details

This model is a port of the ViTMatte models, which are trained and tested on the Composition-1k and Distinctions-646 datasets. The port aims to preserve the performance and accuracy of the original models.

Note: The port is provided for convenience of use, and to help promote and learn from this excellent open-source project.
## Usage

This model performs image matting: given an image and a trimap that marks known foreground, known background, and unknown regions, it predicts an alpha matte, i.e. the per-pixel opacity of the foreground.
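
The alpha matte that a matting model such as ViTMatte predicts is typically consumed through the standard compositing equation `I = alpha * F + (1 - alpha) * B`. The sketch below illustrates that step only. The `composite` helper and the toy arrays are hypothetical and not part of this port, and NumPy is assumed to be available.

```python
import numpy as np


def composite(foreground, background, alpha):
    """Blend a foreground over a background with a per-pixel alpha matte.

    foreground, background: float arrays of shape (H, W, 3) in [0, 1].
    alpha: float array of shape (H, W) in [0, 1]; 1 means pure foreground.
    """
    # Add a trailing channel axis so alpha broadcasts over the RGB channels.
    alpha = np.clip(alpha, 0.0, 1.0)[..., None]
    return alpha * foreground + (1.0 - alpha) * background


# Toy 1x3 image (hypothetical data): red foreground, blue background,
# alpha going from fully opaque to fully transparent.
fg = np.tile(np.array([1.0, 0.0, 0.0]), (1, 3, 1))
bg = np.tile(np.array([0.0, 0.0, 1.0]), (1, 3, 1))
alpha = np.array([[1.0, 0.5, 0.0]])
print(composite(fg, bg, alpha))
```

In practice `alpha` would be the matte returned by the model for a real image, and `background` any replacement scene of the same size.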
## Training Data

The model is trained and validated on two datasets:
- Composition-1k: used for training and testing; includes 1000 samples.
- Distinctions-646: used for model validation; includes 646 samples.
## Training Procedure

The model is trained with gradient descent and evaluated using the following four metrics:
- SAD (Sum of Absolute Differences)
- MSE (Mean Squared Error)
- Grad (Gradient)
- Conn (Connectivity)
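
The first two metrics are simple enough to sketch. The block below is a minimal NumPy illustration, not the evaluation code of the original project. It assumes alpha mattes stored as float arrays in [0, 1] and scores every pixel unless a hypothetical boolean `mask` is passed. Matting benchmarks commonly restrict the error to the unknown region of the trimap and rescale the values for reporting; those conventions are not reproduced here. Grad and Conn are more involved and are omitted.

```python
import numpy as np


def sad(pred, target, mask=None):
    """Sum of Absolute Differences between two alpha mattes."""
    diff = np.abs(pred - target)
    if mask is not None:
        diff = diff[mask]  # keep only the pixels selected by the mask
    return float(diff.sum())


def mse(pred, target, mask=None):
    """Mean Squared Error between two alpha mattes."""
    sq = (pred - target) ** 2
    if mask is not None:
        sq = sq[mask]  # keep only the pixels selected by the mask
    return float(sq.mean())


# Toy 2x2 mattes (hypothetical data): two pixels match, two are off by 0.5.
pred = np.array([[0.0, 0.5], [1.0, 0.75]])
target = np.array([[0.0, 1.0], [1.0, 0.25]])
print(sad(pred, target))  # 1.0
print(mse(pred, target))  # 0.125
```

Both values are errors, so lower is better.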
## Performance

The models have shown the following performance on the two datasets:

On the Composition-1k dataset:

| Model | SAD | MSE | Grad | Conn |
|------|----|----|-----|-----|
| ViTMatte-S | 21.46 | 3.3 | 7.24 | 16.21 |
| ViTMatte-B | 20.33 | 3.0 | 6.74 | 14.78 |

On the Distinctions-646 dataset:

| Model | SAD | MSE | Grad | Conn |
|------|----|----|-----|-----|
| ViTMatte-S | 21.22 | 2.1 | 8.78 | 17.55 |
| ViTMatte-B | 17.05 | 1.5 | 7.03 | 12.95 |

Both models perform well on these datasets, and ViTMatte-B outperforms ViTMatte-S on every reported metric (lower is better for all four).
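
To make the gap between the two variants concrete, the following sketch computes the relative error reduction of ViTMatte-B over ViTMatte-S from the numbers in the tables above. The `relative_reduction` helper is hypothetical and exists only for this illustration.

```python
# Values copied from the tables above; lower is better for every metric.
results = {
    "Composition-1k": {
        "ViTMatte-S": {"SAD": 21.46, "MSE": 3.3, "Grad": 7.24, "Conn": 16.21},
        "ViTMatte-B": {"SAD": 20.33, "MSE": 3.0, "Grad": 6.74, "Conn": 14.78},
    },
    "Distinctions-646": {
        "ViTMatte-S": {"SAD": 21.22, "MSE": 2.1, "Grad": 8.78, "Conn": 17.55},
        "ViTMatte-B": {"SAD": 17.05, "MSE": 1.5, "Grad": 7.03, "Conn": 12.95},
    },
}


def relative_reduction(small, base):
    """Percentage by which `base` lowers each error relative to `small`."""
    return {m: 100.0 * (small[m] - base[m]) / small[m] for m in small}


for dataset, models in results.items():
    gains = relative_reduction(models["ViTMatte-S"], models["ViTMatte-B"])
    print(dataset, {m: f"{g:.1f}%" for m, g in gains.items()})
```

A positive percentage means ViTMatte-B has the lower error on that metric.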
## Disclaimer

This model is ported from [lufficc's ViTMatte](https://github.com/hustvl/ViTMatte) project. All original rights belong to [lufficc](https://github.com/lufficc).
## Citation

If you use these models, please cite the original author and project: https://github.com/hustvl/ViTMatte

Thank you for using these models. If you run into any issues or have feedback, please raise them on the original project's GitHub page.