| --- |
| license: mit |
| datasets: |
| - deepghs/chafen_arknights |
| - deepghs/monochrome_danbooru |
| metrics: |
| - accuracy |
| --- |
| |
| # imgutils-models |
|
|
| This repository includes all the models in [deepghs/imgutils](https://github.com/deepghs/imgutils). |
|
|
| ## LPIPS |
|
|
| This model is used for clustering anime difference images (variants of one illustration, called `差分` in Chinese). It is based on [richzhang/PerceptualSimilarity](https://github.com/richzhang/PerceptualSimilarity) and trained on the dataset [deepghs/chafen_arknights(private)](https://huggingface.co/datasets/deepghs/chafen_arknights). |
|
|
| With a threshold of `0.45`, the [adjusted Rand score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.adjusted_rand_score.html) can reach `0.995`. |
|
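To make the threshold and the score concrete, here is a minimal pure-Python sketch. It assumes a pairwise distance matrix has already been computed (the values in `dist` are invented for illustration, not model output) and groups images by linking every pair whose distance falls below the threshold. This linking rule is an assumption chosen for simplicity; the clustering used in imgutils may differ. The helper names `cluster_by_threshold` and `adjusted_rand` are hypothetical.

```python
from collections import Counter
from math import comb

THRESHOLD = 0.45  # the threshold quoted above


def cluster_by_threshold(dist, threshold=THRESHOLD):
    """Link every pair with distance < threshold and return one label per item."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                parent[find(i)] = find(j)

    # Relabel the union-find roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


def adjusted_rand(truth, pred):
    """Adjusted Rand index from the contingency table (needs at least 2 items)."""
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(len(truth), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. everything in one cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical distances for five images: 0-2 are one group, 3-4 another.
dist = [
    [0.00, 0.10, 0.30, 0.80, 0.90],
    [0.10, 0.00, 0.20, 0.70, 0.85],
    [0.30, 0.20, 0.00, 0.75, 0.80],
    [0.80, 0.70, 0.75, 0.00, 0.15],
    [0.90, 0.85, 0.80, 0.15, 0.00],
]
truth = [7, 7, 7, 9, 9]  # hand-assigned group ids; only the grouping matters

labels = cluster_by_threshold(dist)
print(labels, adjusted_rand(truth, labels))
```

A score of `1.0` means the predicted grouping matches the reference grouping exactly, regardless of how the groups are numbered.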
|
| Files: |
| * `lpips_diff.onnx`: computes the difference between features. |
| * `lpips_feature.onnx`: extracts features. |
|
|
| ## Monochrome |
|
|
| These models are used for monochrome image classification. They are based on CNNs and Transformers and trained on the dataset [deepghs/monochrome_danbooru(private)](https://huggingface.co/datasets/deepghs/monochrome_danbooru). |
|
|
| The following are the checkpoints that have been formally put into use, all based on the Caformer architecture: |
|
|
| | Checkpoint | Algorithm | Safe Level | Accuracy | False Negative | False Positive | |
| |:----------------------------:|:---------:|:----------:|:----------:|:--------------:|:--------------:| |
| | monochrome-caformer-40 | caformer | 0 | 96.41% | 2.69% | 0.89% | |
| | **monochrome-caformer-110** | caformer | 0 | **96.97%** | 1.57% | 1.46% | |
| | monochrome-caformer_safe2-80 | caformer | 2 | 94.84% | **1.12%** | 4.03% | |
| | monochrome-caformer_safe4-70 | caformer | 4 | 94.28% | **0.67%** | 5.04% | |
|
|
| **`monochrome-caformer-110` has the best overall accuracy** among them. However, this model is often used to screen out monochrome images, |
| and we want to catch as many of them as possible, so we also provide weighted models (`safe2` and `safe4`). |
| Their overall accuracy is slightly lower, but their False Negative rate (a monochrome image misidentified as a colored one) is also lower, |
| making them more suitable for batch screening. |
|
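The trade-off can be made concrete with a small sketch that picks a checkpoint by a weighted error. The rates are copied from the table above; the weight `fn_weight` and the helpers `weighted_error` and `pick_checkpoint` are hypothetical and only illustrate the reasoning. They are not how the `safe2` and `safe4` models were trained.

```python
# False Negative / False Positive rates from the table above, in percent.
CHECKPOINTS = {
    "monochrome-caformer-40": {"fn": 2.69, "fp": 0.89},
    "monochrome-caformer-110": {"fn": 1.57, "fp": 1.46},
    "monochrome-caformer_safe2-80": {"fn": 1.12, "fp": 4.03},
    "monochrome-caformer_safe4-70": {"fn": 0.67, "fp": 5.04},
}


def weighted_error(rates, fn_weight):
    """Total error where each missed monochrome image costs `fn_weight` times a false alarm."""
    return fn_weight * rates["fn"] + rates["fp"]


def pick_checkpoint(fn_weight):
    """Return the checkpoint name with the lowest weighted error."""
    return min(CHECKPOINTS, key=lambda name: weighted_error(CHECKPOINTS[name], fn_weight))


# Equal costs favour overall accuracy; a heavy penalty on misses favours a safe model.
for weight in (1.0, 10.0):
    print(weight, pick_checkpoint(weight))
```

With equal costs this reduces to choosing the most accurate checkpoint. Raising the cost of a missed monochrome image shifts the choice toward the `safe` checkpoints.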
|
| ## Deepdanbooru |
|
|
| `deepdanbooru` is a model for tagging anime images. Here we provide a tag classification table, `deepdanbooru_tags.csv`, |
| as well as an ONNX model (from [chinoll/deepdanbooru](https://huggingface.co/spaces/SmilingWolf/wd-v1-4-tags)). |
|
|
| Note that because the deepdanbooru model itself is of poor quality and its dataset is relatively old, |
| it is provided for testing purposes only and is not recommended as the main classification model. We recommend using the `wd14` model instead, see: |
|
|
| * https://huggingface.co/spaces/SmilingWolf/wd-v1-4-tags |
|
|
|
|