Progress for next model
I have gotten the 2026 data and images. The meta I am using is a little dated, but I will be syncing the old keywords to the new keywords.
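As a rough illustration of the keyword sync, here is a minimal sketch that assumes the old-to-new mapping is available as a simple lookup table. The alias entries and the function name `sync_keywords` are hypothetical placeholders, not the actual mapping or tooling used here.

```python
# Hypothetical alias table: old keyword -> new keyword.
# The real mapping would come from the updated keyword data.
KEYWORD_ALIASES = {
    "old_tag_a": "new_tag_a",
    "old_tag_b": "new_tag_b",
}


def sync_keywords(tags, aliases):
    """Rename old keywords to their new names, keeping order and dropping duplicates."""
    seen = set()
    synced = []
    for tag in tags:
        new_tag = aliases.get(tag, tag)  # unknown tags pass through unchanged
        if new_tag not in seen:
            seen.add(new_tag)
            synced.append(new_tag)
    return synced


print(sync_keywords(["old_tag_a", "new_tag_a", "other"], KEYWORD_ALIASES))
```

Dropping duplicates matters because an image may already carry both the old and the new name of the same keyword.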
I will now be breaking the tags up into sections, such as general: and meta:.
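One way to picture the sectioned layout is the sketch below. It assumes tags are stored as plain strings with an optional `section:` prefix and that unprefixed tags fall under general; both the storage format and the function name `split_into_sections` are assumptions for illustration only.

```python
from collections import defaultdict


def split_into_sections(tags, default_section="general"):
    """Group tags by their 'section:' prefix; unprefixed tags go to the default section."""
    sections = defaultdict(list)
    for tag in tags:
        section, sep, name = tag.partition(":")
        if sep and section and name:
            sections[section].append(name)
        else:
            # No usable prefix (assumed case): treat the whole string as a general tag.
            sections[default_section].append(tag)
    return dict(sections)


print(split_into_sections(["general:sky", "meta:highres", "tree"]))
```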
I am about 20% done with my AI-reviewed and human-verified pass over the general tags.
I have generated about 12,000 images to help de-noise high-issue tags. These images will all need manual review and high-accuracy human tagging, as they will be used in a final fine-tuning pass to help shift out noise.
Current corrections stand at 500K. These are mostly added tags, since the removals from the first one already cleaned out most of the low-hanging fruit.
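For anyone helping with corrections, the sketch below shows what a single correction might look like when applied to one image's tag list. It assumes a correction is just a set of tags to add and a set to remove; the function name `apply_corrections` and this record shape are hypothetical, not the actual correction format.

```python
def apply_corrections(tags, add=(), remove=()):
    """Return a new tag list with 'remove' tags dropped and missing 'add' tags appended."""
    removed = set(remove)
    result = [tag for tag in tags if tag not in removed]
    present = set(result)
    for tag in add:
        if tag not in present:  # avoid duplicating a tag that is already there
            result.append(tag)
            present.add(tag)
    return result


print(apply_corrections(["a", "b"], add=["c", "a"], remove=["b"]))
```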
Overall, the cleaned data sits at approximately 1.8M tags and rising.
Help is always appreciated.
For my next model, I will also be reducing the parameter count and increasing the dataset size.
As with this one, I will likely hit a wall before training is complete, even with the reduced parameter count.