---
language:
- en
- ru
tags:
- vision
- image-classification
- style-recognition
- anime
- danbooru
- artist-identification
- few-shot
- aniworldai
- onnx
license: apache-2.0
base_model: facebook/convnext-tiny-224
library_name: onnxruntime
pipeline_tag: image-classification
---

# 🎨 Author_ID: Anime Artist Style Recognition

<div align="center">
<a href="https://aniworldai.org/">
<img src="https://img.shields.io/badge/AniWorldAI-Official-blue?style=for-the-badge&logo=web" alt="AniWorldAI Website">
</a>
<a href="https://t.me/aniworldai">
<img src="https://img.shields.io/badge/Telegram-Channel-2CA5E0?style=for-the-badge&logo=telegram" alt="Telegram Channel">
</a>
<a href="https://t.me/aniworld_bot">
<img src="https://img.shields.io/badge/🔥_Full_3000_Authors-Try_in_Bot-orange?style=for-the-badge&logo=telegram" alt="Try Full Version">
</a>
</div>

<br>

## 🇬🇧 English Description

**Author_ID** is an AI model that recognizes the **artistic style** of anime illustrations and identifies the most likely artist from the **Danbooru** database.

Think of it as **"Shazam for anime art"**: upload an illustration and find out who drew it or whose style it most resembles.

### 🧠 Architecture: Face ID for Art

The model applies the same enroll-and-match principle as **Apple Face ID**: encode the input into an embedding, then compare it against stored references.

| Face ID | Author_ID |
|---------|-----------|
| Encodes facial features into an embedding | Encodes artistic style into an embedding |
| Compares with a stored face template | Compares with artist style centroids |
| Works with a single-photo enrollment | Works with few-shot artist samples |

The model generates a **512-dimensional style embedding** and compares it against precomputed artist centroids using cosine similarity.

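
The comparison step can be sketched in a few lines of NumPy. This is an illustration only: in the released file the centroids are stored inside the ONNX graph, so you never call this yourself. The function name `top_k_artists`, the artist names, and the random toy data are all hypothetical.

```python
import numpy as np

def top_k_artists(embedding, centroids, names, k=5):
    """Rank artists by cosine similarity between one embedding and all centroids."""
    # L2-normalize both sides so a dot product equals cosine similarity.
    emb = embedding / np.linalg.norm(embedding)
    cents = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    scores = cents @ emb                      # shape: (num_artists,)
    order = np.argsort(scores)[::-1][:k]      # highest similarity first
    return [(names[i], float(scores[i])) for i in order]

# Toy data (hypothetical): 3 artists in a 512-dimensional embedding space.
rng = np.random.default_rng(0)
centroids = rng.normal(size=(3, 512))
names = ["artist_a", "artist_b", "artist_c"]
# A query close to artist_b's centroid, plus a little noise.
query = centroids[1] + 0.1 * rng.normal(size=512)

ranking = top_k_artists(query, centroids, names, k=2)
print(ranking)
```
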
### ⚡ Few-Shot Learning

Unlike traditional classifiers, which require thousands of samples per class, Author_ID uses a **metric learning** approach:

- **No retraining needed** to add new artists
- Compute a centroid from just **3-5 sample images**
- The new artist is immediately searchable in the embedding space

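
As a rough sketch of what adding an artist could look like, the snippet below averages a few sample embeddings into a centroid and appends it to a gallery. It assumes you have access to raw 512-dimensional embeddings, which the demo ONNX file in this repo does not expose directly. The helpers `build_centroid` and `add_artist` and the random data are hypothetical.

```python
import numpy as np

def build_centroid(sample_embeddings):
    """Average L2-normalized sample embeddings into one unit-length centroid."""
    samples = np.asarray(sample_embeddings, dtype=np.float32)
    samples = samples / np.linalg.norm(samples, axis=1, keepdims=True)
    centroid = samples.mean(axis=0)
    return centroid / np.linalg.norm(centroid)

def add_artist(centroids, names, new_name, sample_embeddings):
    """Return an extended gallery; the existing centroids are left untouched."""
    new_centroid = build_centroid(sample_embeddings)
    return np.vstack([centroids, new_centroid]), names + [new_name]

# Toy data (hypothetical): a gallery of 2 artists and 4 samples of a new one.
rng = np.random.default_rng(1)
centroids = rng.normal(size=(2, 512)).astype(np.float32)
names = ["artist_a", "artist_b"]
new_samples = rng.normal(size=(4, 512))

centroids, names = add_artist(centroids, names, "artist_new", new_samples)
print(centroids.shape, names)
```

No weights change in this step, which is why no retraining is involved: the encoder stays fixed and only the list of reference vectors grows.
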
### 📦 Model Versions

| Version | Authors | Availability |
|---------|---------|--------------|
| **Demo (this repo)** | 500 | Free download |
| **Full** | 3000+ | [Telegram Bot](https://t.me/aniworld_bot) |

### 🏷️ Output Format

Returns the top-5 most similar artists with their similarity scores:

```
(artist:hiten:0.87), (artist:saitom:0.72), (artist:anmi:0.68), ...
```

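
If you need to consume these tags in another script, a small parser along the following lines may help. It is a sketch that assumes the tags are joined with `", "` and that artist names contain no comma followed by a space. The helper `parse_tag` is hypothetical and not part of this repo. It splits on the last colon, so names that contain parentheses or colons stay intact.

```python
def parse_tag(tag):
    """Split '(artist:name:score)' into (name, score)."""
    inner = tag.strip()
    if not (inner.startswith("(artist:") and inner.endswith(")")):
        raise ValueError(f"unexpected tag format: {tag!r}")
    inner = inner[len("(artist:"):-1]
    # Split from the right so names containing ':' or parentheses stay intact.
    name, score = inner.rsplit(":", 1)
    return name, float(score)

line = "(artist:hiten_(hitenkei):0.87), (artist:saitom:0.72), (artist:anmi:0.68)"
parsed = [parse_tag(t) for t in line.split(", ")]
print(parsed)
```
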
---

## 🚀 How to Use

### Installation

```bash
pip install onnxruntime onnx pillow numpy huggingface_hub
# or for GPU:
pip install onnxruntime-gpu onnx pillow numpy huggingface_hub
```

### Inference

```python
import onnxruntime as ort
import onnx
import numpy as np
from PIL import Image
import json
from huggingface_hub import hf_hub_download

# Download the model from Hugging Face (cached automatically)
MODEL_PATH = hf_hub_download(
    repo_id="AugustLabs/Author_ID",
    filename="style_predictor_500.onnx"
)

class AuthorID:
    """
    Author_ID: Anime Artist Style Recognition
    Single ONNX file contains: model + centroids + author names
    """

    def __init__(self, onnx_path):
        # Load metadata (author names are embedded in the ONNX file)
        model_onnx = onnx.load(onnx_path)
        self.names = []
        self.input_size = 384

        for prop in model_onnx.metadata_props:
            if prop.key == "author_names":
                self.names = json.loads(prop.value)
            elif prop.key == "input_size":
                self.input_size = int(prop.value)

        # CUDA is tried first; ONNX Runtime falls back to CPU if it is unavailable
        providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
        self.session = ort.InferenceSession(onnx_path, providers=providers)

        # ImageNet mean/std, shaped to broadcast over an NCHW tensor
        self.mean = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(1, 3, 1, 1)
        self.std = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(1, 3, 1, 1)

    def preprocess(self, image_path):
        img = Image.open(image_path)

        # Handle transparency: composite onto a white background
        if img.mode in ('RGBA', 'LA') or (img.mode == 'P' and 'transparency' in img.info):
            bg = Image.new('RGB', img.size, (255, 255, 255))
            img = img.convert('RGBA')
            bg.paste(img, mask=img.split()[3])
            img = bg
        else:
            img = img.convert('RGB')

        img = img.resize((self.input_size, self.input_size), Image.BILINEAR)

        # Scale to [0, 1], convert HWC -> NCHW, then normalize
        img_np = np.array(img, dtype=np.float32) / 255.0
        img_np = img_np.transpose(2, 0, 1)[np.newaxis, ...]
        img_np = (img_np - self.mean) / self.std

        return img_np

    def predict(self, image_path, top_k=5):
        """Returns list of (author_name, similarity_score)"""
        img_np = self.preprocess(image_path)
        top_indices, top_scores = self.session.run(None, {'image': img_np})

        results = []
        for idx, score in zip(top_indices[0][:top_k], top_scores[0][:top_k]):
            results.append((self.names[idx], float(score)))

        return results

    def predict_tags(self, image_path, top_k=5):
        """Returns formatted tags: (artist:name:score)"""
        results = self.predict(image_path, top_k)
        return [f"(artist:{name}:{score:.2f})" for name, score in results]


# === Example Usage ===
if __name__ == "__main__":
    # Initialize once; the model downloads automatically
    model = AuthorID(MODEL_PATH)

    # Predict
    results = model.predict("your_image.jpg", top_k=5)

    print("🎨 Detected artist styles:")
    for author, score in results:
        print(f"  {author}: {score:.1%}")

    # Or get formatted tags
    tags = model.predict_tags("your_image.jpg")
    print("\n📝 Tags:", ", ".join(tags))
```

### Expected Output

```
🎨 Detected artist styles:
  hiten_(hitenkei): 87.3%
  saitom: 71.8%
  anmi: 68.2%
  kantoku: 65.1%
  mishima_kurone: 62.4%

📝 Tags: (artist:hiten_(hitenkei):0.87), (artist:saitom:0.72), (artist:anmi:0.68), (artist:kantoku:0.65), (artist:mishima_kurone:0.62)
```

---

## 📊 Technical Details

| Parameter | Value |
|-----------|-------|
| Backbone | ConvNeXt-Tiny |
| Embedding dim | 512 |
| Input size | 384×384 |
| Training data | Danbooru (filtered) |
| Metric | Cosine similarity |
| Format | ONNX (opset 17) |

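
The input size in the table, together with the ImageNet normalization used in the inference code, implies a `1×3×384×384` float32 input tensor. The sketch below checks that shape on a synthetic image, so it runs without downloading the model. The helper name `normalize` and the all-white test image are illustrative assumptions.

```python
import numpy as np

# ImageNet statistics, matching the values in the AuthorID class.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(1, 3, 1, 1)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(1, 3, 1, 1)

def normalize(rgb_uint8):
    """HWC uint8 image -> NCHW float32 tensor, ImageNet-normalized."""
    x = rgb_uint8.astype(np.float32) / 255.0       # scale to [0, 1]
    x = x.transpose(2, 0, 1)[np.newaxis, ...]      # HWC -> 1 x C x H x W
    return (x - MEAN) / STD

# A synthetic all-white 384x384 image stands in for a real illustration.
white = np.full((384, 384, 3), 255, dtype=np.uint8)
tensor = normalize(white)
print(tensor.shape, tensor.dtype)
```
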
---

## ⚠️ Limitations

- Works best on **anime/manga-style** illustrations
- May confuse artists with very similar styles
- Confidence drops on **heavily cropped** or **low-quality** images
- The demo version is limited to **500 authors**

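
One way to guard against these failure modes downstream is to inspect the top score and the gap between the two best matches before trusting a result. The sketch below does this on the `(name, score)` list returned by `predict`. The function `assess` is hypothetical, and the thresholds `min_score=0.5` and `min_margin=0.05` are illustrative values, not tuned ones.

```python
def assess(results, min_score=0.5, min_margin=0.05):
    """Label a ranked result list as 'unknown', 'ambiguous', or 'confident'.

    results: list of (artist_name, similarity) sorted by similarity, descending.
    min_score and min_margin are illustrative thresholds, not tuned values.
    """
    if not results or results[0][1] < min_score:
        return "unknown"        # nothing in the gallery is similar enough
    if len(results) > 1 and results[0][1] - results[1][1] < min_margin:
        return "ambiguous"      # the top two artists are nearly tied
    return "confident"

print(assess([("hiten_(hitenkei)", 0.873), ("saitom", 0.718)]))
print(assess([("anmi", 0.68), ("kantoku", 0.66)]))
print(assess([("anmi", 0.31)]))
```
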
---

<div align="center">

## 🔥 Want the full 3000+ artist version?

<a href="https://t.me/aniworld_bot">
<img src="https://img.shields.io/badge/Try_Full_Version-Telegram_Bot-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white" alt="Telegram Bot">
</a>

<br><br>

**More AI Models & News:**

<a href="https://aniworldai.org/">
<img src="https://img.shields.io/badge/🌐_AniWorldAI.org-Website-blue?style=flat&logo=google-chrome" alt="Website">
</a>
<a href="https://t.me/aniworldai">
<img src="https://img.shields.io/badge/📢_Subscribe-Telegram_Channel-2CA5E0?style=flat&logo=telegram" alt="Channel">
</a>
<a href="https://huggingface.co/AniWorldAI">
<img src="https://img.shields.io/badge/🤗_More_Models-HuggingFace-yellow?style=flat" alt="HuggingFace">
</a>

</div>