Add library_name: oppaioracle for HF download-stats registration

Files changed (1)

README.md +7 -1

@@ -1,5 +1,6 @@
 ---
 license: apache-2.0
 pipeline_tag: image-classification
 language:
 - en
@@ -19,6 +20,12 @@ tags:
 ---
 ## TL;DR
 A multi-label anime tagger trained from scratch on a \~5.9M image dataset that received a targeted cleaning and vocabulary-expansion pass before training. The corrections touched roughly **1.3M tags** — large in absolute terms, but only on the order of **\~3% of all tags** in the corpus, so this is best described as a *targeted* cleaning rather than a heavy one. The pass was deliberately weighted toward **low-frequency tags**, which is where mislabels and missing labels hurt a tagger the most. On my evaluation set the model achieves the best precision-equals-recall point and a good mAP relative to comparable open tagger checkpoints, but the underlying training data still contains category-level noise that no amount of training would have erased. **All predictions should be human-reviewed before they are trusted.**
@@ -225,7 +232,6 @@ V1 ships with the noise it ships with. V2 is where I plan to do something about
 - **SmilingWolf** for the ViT v3 tagger, which made the initial cleaning pass tractable. None of this would have been feasible without an existing strong tagger to use as a second opinion.
 - The broader anime-tagger open-source community for the public tag corpora and prior model checkpoints I compared against.
-- deepghs/danbooru2024 dataset
 ---

 ---
 license: apache-2.0
+library_name: oppaioracle
 pipeline_tag: image-classification
 language:
 - en
 ---
+# OppaiOracle — Hugging Face Release Draft
+> Draft release notes / model card for the first public OppaiOracle checkpoint. Intended audience: people considering using this model for anime/illustration tagging. Tone: direct about what works, direct about what doesn't.
+---
 ## TL;DR
 A multi-label anime tagger trained from scratch on a \~5.9M image dataset that received a targeted cleaning and vocabulary-expansion pass before training. The corrections touched roughly **1.3M tags** — large in absolute terms, but only on the order of **\~3% of all tags** in the corpus, so this is best described as a *targeted* cleaning rather than a heavy one. The pass was deliberately weighted toward **low-frequency tags**, which is where mislabels and missing labels hurt a tagger the most. On my evaluation set the model achieves the best precision-equals-recall point and a good mAP relative to comparable open tagger checkpoints, but the underlying training data still contains category-level noise that no amount of training would have erased. **All predictions should be human-reviewed before they are trusted.**
 - **SmilingWolf** for the ViT v3 tagger, which made the initial cleaning pass tractable. None of this would have been feasible without an existing strong tagger to use as a second opinion.
 - The broader anime-tagger open-source community for the public tag corpora and prior model checkpoints I compared against.
 ---
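
The TL;DR reports a "precision-equals-recall point" without saying how it is computed. A minimal sketch of one common reading follows, assuming a micro-averaged break-even over flattened (score, label) pairs; the function name `break_even_point` and the toy data are hypothetical and are not taken from the OppaiOracle code. It relies on the fact that if the top-k scored pairs are predicted positive, precision is TP/k and recall is TP/P, so the two coincide when k equals the number of true positives P.

```python
def break_even_point(scores, labels):
    """Return (threshold, value) at which micro precision equals micro recall.

    scores: flat list of predicted probabilities, one per (image, tag) pair.
    labels: flat list of 0/1 ground-truth values, aligned with scores.
    Assumption: ties in scores are ignored, so the result is approximate
    when many pairs share the same score.
    """
    # Rank every (image, tag) pair from most to least confident.
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_pos = sum(label for _, label in pairs)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Predict exactly as many positives as there are true positives:
    # precision = TP / n_pos and recall = TP / n_pos, so they are equal.
    top = pairs[:n_pos]
    true_positives = sum(label for _, label in top)
    threshold = top[-1][0]  # lowest score still predicted positive
    return threshold, true_positives / n_pos


# Toy example with made-up numbers (not results from the model card).
threshold, value = break_even_point(
    [0.9, 0.8, 0.7, 0.4, 0.2],
    [1, 0, 1, 1, 0],
)
```

The model card may use a per-tag (macro) variant or a threshold sweep instead; this sketch only illustrates the general idea.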