	---
	license: mit
	datasets:
	- deepghs/chafen_arknights
	- deepghs/monochrome_danbooru
	metrics:
	- accuracy
	---

	# imgutils-models

	This repository includes all the models in [deepghs/imgutils](https://github.com/deepghs/imgutils).

	## LPIPS

	This model is used for clustering anime images (a task known as `差分` in Chinese). It is based on [richzhang/PerceptualSimilarity](https://github.com/richzhang/PerceptualSimilarity) and was trained on the dataset [deepghs/chafen_arknights (private)](https://huggingface.co/datasets/deepghs/chafen_arknights).

	With a threshold of `0.45`, the [adjusted Rand score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.adjusted_rand_score.html) can reach `0.995`.
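For readers unfamiliar with the metric, here is a minimal pure-Python sketch of the adjusted Rand index, which compares a predicted clustering against reference labels and is invariant to how the clusters are named. The function name `adjusted_rand_index` is ours for illustration; the figure quoted above was presumably computed with scikit-learn's `adjusted_rand_score`, which this sketch only approximates in spirit.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_true)

    # Contingency counts: items sharing a (true cluster, predicted cluster) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs that fall together under each grouping.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree about
    expected = sum_true * sum_pred / total_pairs
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (sum_pairs - expected) / (max_index - expected)


# Same grouping with swapped cluster names scores a perfect 1.0.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

A score of `1.0` means the two groupings agree exactly, while values near `0` indicate agreement no better than chance.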

	Files:
	* `lpips_diff.onnx`: computes the difference between features.
	* `lpips_feature.onnx`: extracts features from an image.
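To illustrate how a threshold such as `0.45` can turn pairwise differences into clusters, here is a small sketch. It rests on several assumptions: the distance values are made up (in practice they would come from running the two ONNX models, whose input and output tensor names are not documented here), the helper `cluster_by_threshold` is hypothetical, and grouping by connected components with a strict `<` comparison is a simplification that may differ from what `imgutils` actually does.

```python
def cluster_by_threshold(distances, threshold=0.45):
    """Group items whose pairwise distance is below `threshold`.

    `distances` is a symmetric square matrix (list of lists). Items linked
    directly or through a chain of close neighbours share a label.
    """
    n = len(distances)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if distances[i][j] < threshold:
                parent[find(j)] = find(i)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for i in range(n):
        root = find(i)
        labels.append(seen.setdefault(root, len(seen)))
    return labels


# Made-up distances for four images: 0 and 1 are close, 2 and 3 are close.
example = [
    [0.00, 0.12, 0.80, 0.75],
    [0.12, 0.00, 0.78, 0.82],
    [0.80, 0.78, 0.00, 0.30],
    [0.75, 0.82, 0.30, 0.00],
]
print(cluster_by_threshold(example))  # [0, 0, 1, 1]
```

Raising the threshold merges more images into the same cluster; lowering it splits clusters apart.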

	## Monochrome

	These models are used for monochrome image classification. They are based on CNNs and Transformers and were trained on the dataset [deepghs/monochrome_danbooru (private)](https://huggingface.co/datasets/deepghs/monochrome_danbooru).

	The following are the checkpoints that have been formally put into use, all based on the Caformer architecture:

	| Checkpoint                   | Algorithm | Safe Level | Accuracy | False Negative | False Positive |
	|:----------------------------:|:---------:|:----------:|:--------:|:--------------:|:--------------:|
	| monochrome-caformer-40       | caformer  | 0          | 96.41%   | 2.69%          | 0.89%          |
	| monochrome-caformer-110      | caformer  | 0          | 96.97%   | 1.57%          | 1.46%          |
	| monochrome-caformer_safe2-80 | caformer  | 2          | 94.84%   | 1.12%          | 4.03%          |
	| monochrome-caformer_safe4-70 | caformer  | 4          | 94.28%   | 0.67%          | 5.04%          |

	`monochrome-caformer-110` has the best overall accuracy of the four. However, this model is often used to screen out monochrome images,
	and we want to catch as many of them as possible without omission, so we have also introduced weighted models (`safe2` and `safe4`).
	Their overall accuracy is slightly lower, but their False Negative rate (a monochrome image misidentified as a colored one) is lower as well,
	making them more suitable for batch screening.
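The trade-off can be checked directly from the numbers in the table above. The sketch below copies the four rows into a dictionary and picks a checkpoint by a chosen metric; the names `CHECKPOINTS` and `pick` are ours for illustration. It also prints the sum of the three percentages per row, which comes to roughly 100% each time. That suggests, though the source does not state it, that the False Negative and False Positive columns are expressed as shares of all evaluated images.

```python
# Rows copied from the checkpoint table above (all values in percent).
CHECKPOINTS = {
    "monochrome-caformer-40": {
        "safe_level": 0, "accuracy": 96.41,
        "false_negative": 2.69, "false_positive": 0.89,
    },
    "monochrome-caformer-110": {
        "safe_level": 0, "accuracy": 96.97,
        "false_negative": 1.57, "false_positive": 1.46,
    },
    "monochrome-caformer_safe2-80": {
        "safe_level": 2, "accuracy": 94.84,
        "false_negative": 1.12, "false_positive": 4.03,
    },
    "monochrome-caformer_safe4-70": {
        "safe_level": 4, "accuracy": 94.28,
        "false_negative": 0.67, "false_positive": 5.04,
    },
}


def pick(metric, lowest=False):
    """Return the checkpoint name with the highest (or lowest) `metric`."""
    chooser = min if lowest else max
    return chooser(CHECKPOINTS, key=lambda name: CHECKPOINTS[name][metric])


print(pick("accuracy"))                     # monochrome-caformer-110
print(pick("false_negative", lowest=True))  # monochrome-caformer_safe4-70

# Accuracy + false negatives + false positives per row, in percent.
for name, row in CHECKPOINTS.items():
    total = row["accuracy"] + row["false_negative"] + row["false_positive"]
    print(f"{name}: {total:.2f}")
```

For a screening job where a missed monochrome image is costlier than a wrongly flagged colored one, selecting by lowest False Negative rate is the relevant criterion.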

	## Deepdanbooru

	`deepdanbooru` is a model used to tag anime images. Here, we provide a table for tag classification called `deepdanbooru_tags.csv`,
	as well as an ONNX model (from [chinoll/deepdanbooru](https://huggingface.co/spaces/SmilingWolf/wd-v1-4-tags)).

	Note that because the deepdanbooru model itself is of poor quality and its dataset is relatively old,
	it is provided for testing purposes only and is not recommended as the main classification model. We recommend using the `wd14` model instead; see:

	* https://huggingface.co/spaces/SmilingWolf/wd-v1-4-tags