---
license: other
license_name: pony-license
license_link: LICENSE
---

# Pony V7

> This repo is a clone of the [original](https://huggingface.co/purplesmartai/pony-v7-base) with the files not needed for the Diffusers format removed.

![Pony V7](V7.webp)

Pony V7 is a versatile character generation model based on the AuraFlow architecture. It supports a wide range of styles and species types (humanoid, anthro, feral, and more) and handles character interactions through natural language prompts.

## Fictional

First, let me introduce [Fictional](https://fictional.ai) - our multimodal platform where AI Characters come alive through text, images, voice, and (soon) video. Powered by Pony V7, V6, Chroma, Seedream 4, and other advanced models, Fictional lets you discover, create, and interact with characters who live their own lives and share their own stories.

Fictional is also what enables the development of models like V7, so if you're excited about the future of multimodal AI characters, please download Fictional on iOS or Android and help shape our future!

- **iOS**: https://apps.apple.com/us/app/fictional/id6739802573
- **Android**: https://play.google.com/store/apps/details?id=ai.fictional.app

### Get in touch with us

Please join [our Discord server](https://discord.gg/pYsdjMfu3q) if you have questions about Fictional and the Pony models.

## Important model information

Please check [this article](https://civitai.com/articles/19986) to learn more about why it took so long for us to ship V7, and about upcoming model releases.

## Model prompting

This model supports a wide array of styles and aesthetics but provides an opinionated default prompt template:

```
special tags, factual description of image, stylistic description of image, additional content tags
```

### Special Tags

`score_X`, `style_cluster_X`, `source_X`. Warning: V7 special-tag prompting may be inconsistent; please see the article above, as we are working on V7.1 to address this.

### Factual description of image

A description of what is portrayed in the image, without any stylistic indicators. Two recommendations:

1. Start with a single phrase describing what you want in the image before going into details.

2. When referring to characters, use the pattern `<species> <gender> <name> from <source>`, for example "Anthro bunny female Lola Bunny from Space Jam".

This model is capable of recognizing many popular and obscure characters and series.
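
The template and character pattern described above can be assembled mechanically. Here is a minimal sketch; the `build_prompt` helper is purely illustrative and not part of any official Pony tooling:

```python
def build_prompt(special_tags=(), factual="", stylistic="", extra_tags=()):
    """Assemble a Pony V7 prompt following the recommended template:
    special tags, factual description, stylistic description, additional tags."""
    parts = [", ".join(special_tags), factual, stylistic, ", ".join(extra_tags)]
    return ", ".join(p for p in parts if p)  # skip any empty sections

# The character reference follows the <species> <gender> <name> from <source> pattern.
prompt = build_prompt(
    special_tags=["score_9", "source_cartoon"],
    factual="Anthro bunny female Lola Bunny from Space Jam playing basketball",
    stylistic="digital painting, dramatic lighting, wide shot",
    extra_tags=["basketball court", "motion blur"],
)
```

Empty sections are simply dropped, so a purely factual prompt works too.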

### Stylistic description of image

Any information about the image medium, shot type, lighting, etc. (more info to come with the captioning Colab).

### Tags

V7 is trained on a combination of natural language prompts and tags and understands both, so describing the intended result in normal language works in most cases, although you can add some tags after the main prompt to boost them.

### Captioning Colab

To give a better understanding of V7 prompting, we are releasing a [captioning Colab](https://colab.research.google.com/drive/19PG-0ltob8EynxUZSwOdjMFmqyJ7ZOCB) with all the models used for V7 captioning.

## Supported inference settings

V7 supports resolutions in the range of 768px to 1536px. We recommend using higher resolutions and at least 30 steps during inference.
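
A small helper can keep requested sizes inside the supported range. This is an illustrative sketch: only the 768-1536px range and the 30-step minimum come from this card; the rounding to a multiple of 64 is our assumption for latent-model friendliness, not a documented V7 requirement:

```python
def clamp_resolution(width, height, lo=768, hi=1536, multiple=64):
    """Clamp a requested size into the supported 768-1536px range,
    rounding down to a multiple of 64 (assumption, see note above)."""
    def clamp(v):
        v = max(lo, min(hi, v))
        return (v // multiple) * multiple
    return clamp(width), clamp(height)

width, height = clamp_resolution(2000, 700)  # -> (1536, 768)
```

If you run the Diffusers-format weights, values like these would typically be passed as the pipeline's `width`, `height`, and `num_inference_steps` (at least 30) arguments.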

## Highlights compared to V6

- Much stronger prompt understanding, especially for spatial information and multiple characters
- Much stronger background support, both in generating backgrounds and in combining backgrounds with characters
- Much stronger realism support out of the box
- Ability to generate very dark and very light images
- Resolution up to 1536x1536 pixels
- Expanded character recognition (some V6 characters may be recognized less well, but overall we extended the model's knowledge by a lot)

## Special thanks

- Iceman for helping to procure the necessary training resources
- [Simo Ryu](https://x.com/cloneofsimo) and the rest of the FAL.ai team for creating AuraFlow, and for the emotional support
- [Runpod](https://runpod.io/?utm_source=purplesmartai) for providing captioning compute
- [Piclumen](https://www.piclumen.com/) for being our partners
- [City96](https://github.com/city96) for help with GGUF support
- The [diffusers](https://huggingface.co/docs/diffusers/en/index) team for supporting the AuraFlow integration work
- PSAI Server Subscribers for supporting the project costs
- PSAI Server Moderators for being vigilant and managing the community
- The many supporters who chose to remain anonymous; their help has been critical to getting V7 done

## Technical details

The model has been trained on ~10M images aesthetically ranked and selected from a superset of over 30M images, with a roughly 1:1 ratio between the anime/cartoon/furry/pony datasets and a 1:1 ratio between safe/questionable/explicit ratings. 100% of the images have been tagged and captioned with high-quality, detailed captions.

All images have been used in training with both captions and tags. Artists' names have been removed, and the source data has been filtered based on our Opt-in/Opt-out program. Any inappropriate explicit content has been filtered out.

## Limitations

- This model does not reliably render text in images and has degraded text generation capabilities compared to base AuraFlow
- Special tags (including quality tags) perform much more weakly than in V6, meaning `score_9` will not necessarily yield better results on some prompts. We are working on a V7.1 follow-up to improve this
- Small details, and especially faces, may degrade significantly depending on the art style; this is a combination of an outdated VAE and insufficient training, which we are trying to improve in V7.1

## LoRA training

We recommend using SimpleTuner for LoRA training, following [this guide](https://github.com/bghira/SimpleTuner/blob/main/documentation/quickstart/AURAFLOW.md).

## Commercial API

We provide a [commercial API](https://fal.ai/models/fal-ai/pony-v7) via our exclusive partner FAL.ai.

## License

This model is licensed under the Pony License.

In short, you can use this model and its outputs commercially unless you provide an inference service or application, have a company with over 1M in revenue, or use it in professional video production. These limitations do not apply if you use first-party commercial APIs.

If you want to use this model commercially, please reach out to us at contact@purplesmart.ai.

Explicit permission for commercial inference has been granted to CivitAI and Hugging Face.