---
language:
- "en"
- "zh"
tags:
- mlx
- DeepDanbooru
- danbooru
- Image-Clip
- image-interrogate
- image-to-text
- captioning
license: "mit"
base_model: "hazhu/mlx-DeepDanbooru"
---

# mlx-DeepDanbooru

Pure MLX implementation of the DeepDanbooru neural network for __Apple Silicon__ chips: M1, M2, M3, M4. `mlx-DeepDanbooru` runs on MacBook Pro / Air, Mac mini, and iMac.

## Usage

Image-to-text, captioning, and CLIP-style image interrogation using the [DeepDanbooru model](https://github.com/KichangKim/DeepDanbooru) on Apple devices.

## MLX DeepDanbooru Model

This `mlx-DeepDanbooru` implementation is inspired by the PyTorch port [AUTOMATIC1111/TorchDeepDanbooru](https://github.com/AUTOMATIC1111/TorchDeepDanbooru).

## Installation

```
conda create -n mlx026 python=3.12
conda activate mlx026

pip install numpy
pip install pillow
```

MLX is available on [PyPI](https://pypi.org/project/mlx/). To install the Python API, run:

```
pip install mlx
```

`mlx-DeepDanbooru` is based on `mlx` version `0.26.1`.

## Inference

```
python infer.py
```

Image interrogation:

```python
import time

import numpy as np
from PIL import Image

# using Apple Silicon's MLX, not PyTorch
import mlx.core as mx
from mlxDeepDanBooru.mlx_deep_danbooru_model import mlxDeepDanBooruModel


model_path = "models/model-resnet_custom_v3_mlx.npz"
tags_path = "models/tags-resnet_custom_v3_mlx.npy"

mlx_dan = mlxDeepDanBooruModel()
mlx_dan.load_weights(model_path)
mx.eval(mlx_dan.parameters())


model_tags = np.load(tags_path)
print(f"total tags: {len(model_tags)}")

def danbooru_tags(fpath, threshold=0.5):
    # threshold can be tuned between 0.0 and 1.0:
    # lower it for more tags, raise it for fewer, higher-confidence tags
    tags = []
    pic = Image.open(fpath).convert("RGB").resize((512, 512))
    a = np.expand_dims(np.array(pic, dtype=np.float32), 0) / 255

    x = mx.array(a)
    y = mlx_dan(x)[0]

    for i, p in enumerate(y):
        if p >= threshold:
            tags.append(model_tags[i].item())

    return tags

image_count = 0
def image_infer(fpath):
    global image_count
    tags = danbooru_tags(fpath)
    image_count += 1
    return tags


t1 = time.time()
tags_1 = image_infer("example/1.png")
tags_2 = image_infer("example/2.png")
t2 = time.time()

print(tags_1)
# will show tags: ['1girl', 'beach', 'black_hair', 'blurry', 'blurry_background', 'blurry_foreground', 'building', 'bush', 'christmas_tree', 'day', 'depth_of_field', 'field', 'grass', 'lake', 'looking_at_viewer', 'mountain', 'nature', 'outdoors', 'palm_leaf', 'palm_tree', 'park', 'park_bench', 'path', 'photo_background', 'plant', 'river', 'road', 'skirt', 'sky', 'smile', 'striped', 'striped_dress', 'striped_shirt', 'tree', 'vertical-striped_shirt', 'vertical_stripes', 'rating:safe']

print(tags_2)
# will show tags: ['1girl', '3d', 'blurry', 'blurry_background', 'blurry_foreground', 'brown_eyes', 'brown_hair', 'bush', 'christmas_tree', 'cosplay_photo', 'day', 'depth_of_field', 'field', 'floral_print', 'foliage', 'forest', 'garden', 'grass', 'jungle', 'lips', 'long_hair', 'long_sleeves', 'looking_at_viewer', 'nature', 'on_grass', 'outdoors', 'palm_tree', 'park', 'path', 'plant', 'potted_plant', 'realistic', 'smile', 'solo', 'tree', 'upper_body', 'white_dress', 'rating:safe']

print("-----------")
print(f"infer speed (with MLX): {(t2 - t1) / image_count} seconds per image")
```
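The 0.5 cutoff in `danbooru_tags` is just a confidence threshold over the per-tag probabilities. A minimal NumPy-only sketch of that filtering step, using made-up tags and probabilities purely for illustration:

```python
import numpy as np

# hypothetical tag names and per-tag probabilities, standing in for
# model_tags and the model output y in danbooru_tags above
tags = np.array(["1girl", "sky", "tree", "rating:safe"])
probs = np.array([0.92, 0.10, 0.61, 0.55])

threshold = 0.5  # raise for fewer, higher-confidence tags; lower for more
selected = tags[probs >= threshold].tolist()
print(selected)  # ['1girl', 'tree', 'rating:safe']
```

Vectorized boolean masking like this is also a drop-in replacement for the Python-level `for i, p in enumerate(y)` loop once `y` is converted to a NumPy array.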

## Performance

For the 1024x1024 images in the `example` folder, `mlx-DeepDanbooru` inference speed on a Mac mini M4:

```
1.7 seconds per image
```


On a Mac mini M4, __MPS + PyTorch__ inference speed: `0.8 seconds per image`

On a Mac mini M4, CPU + PyTorch inference speed: `2.5 seconds per image`
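
From the three Mac mini M4 numbers above, the relative speeds work out as follows:

```python
# per-image times quoted above (Mac mini M4)
mlx_s, mps_s, cpu_s = 1.7, 0.8, 2.5

print(f"MPS+PyTorch vs MLX: {mlx_s / mps_s:.2f}x faster")  # roughly 2.1x
print(f"MLX vs CPU+PyTorch: {cpu_s / mlx_s:.2f}x faster")  # roughly 1.5x
```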
| |
|
| | ## CURRENTLY |
| |
|
| | the speed of __MPS + Pytorch__ > MLX. |
| |
|
| |  |
| |
|
| | ## Bench: 351 images, 720x1280 and 540x720: |
| |
|
| | In Windows 11, Nvidia RTX 4070 Ti, CUDA+Pytorch: |
| |
|
| | ``` |
| | SPEED: 0.3 seconds per image |
| | Power Consumption: 260 ~ 300 Watt |
| | ``` |
| |
|
| | In Mac mini M4, `mlx-DeepDanBooru`: |
| |
|
| | ``` |
| | SPEED: 1.68 seconds per image |
| | Power Consumption: 8 ~ 12 Watt |
| | ``` |
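
The raw speed numbers hide efficiency. A rough back-of-the-envelope estimate of energy per image, taking the midpoint of each quoted power range:

```python
# approximate energy-per-image from the benchmark figures above,
# using the midpoint of each quoted power range
rtx_j = 0.3 * 280   # RTX 4070 Ti: 0.3 s/image at ~280 W
m4_j = 1.68 * 10    # Mac mini M4: 1.68 s/image at ~10 W

print(f"RTX 4070 Ti: ~{rtx_j:.0f} J/image")
print(f"Mac mini M4: ~{m4_j:.1f} J/image")
```

By this estimate the Mac mini M4 uses roughly five times less energy per image, despite being slower per image.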

On a Mac mini M4, `mlx-DeepDanbooru` with multiprocessing (i.e., run `infer_multiprocessing.py`):

```
SPEED: 0.42 seconds per image
```
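
`infer_multiprocessing.py` ships with the repo and is not reproduced here. A minimal sketch of the general approach it names, fanning the image list out across worker processes, with a placeholder worker standing in for `danbooru_tags`:

```python
from multiprocessing import Pool

def tag_image(fpath):
    # placeholder: a real worker would load its own model copy once
    # (e.g. via a Pool initializer) and call danbooru_tags(fpath)
    return (fpath, ["rating:safe"])

def run(paths, workers=4):
    # each process pulls file paths from the pool's shared work queue,
    # so per-image latency overlaps across cores
    with Pool(processes=workers) as pool:
        return pool.map(tag_image, paths)

if __name__ == "__main__":
    paths = [f"example/{i}.png" for i in range(1, 9)]
    for fpath, tags in run(paths):
        print(fpath, tags)
```

Note that each worker pays the model-load cost once, so the speedup only shows up over a batch of images, which matches the 351-image benchmark above.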