---
license: other
license_name: pony-license
license_link: LICENSE
---

# Pony V7

> This repo is a clone of the [original](https://huggingface.co/purplesmartai/pony-v7-base) with the files not needed for the Diffusers format removed.

![Pony V7](V7.webp)

Pony V7 is a versatile character generation model based on the AuraFlow architecture. It supports a wide range of styles and species types (humanoid, anthro, feral, and more) and handles character interactions through natural language prompts.

## Fictional

First, let me introduce [Fictional](https://fictional.ai) - our multimodal platform where AI Characters come alive through text, images, voice, and (soon) video. Powered by Pony V7, V6, Chroma, Seedream 4, and other advanced models, Fictional lets you discover, create, and interact with characters who live their own lives and share their own stories.

Fictional is also what enables the development of models like V7, so if you're excited about the future of multimodal AI characters, please download Fictional on iOS or Android and help shape our future!

- **iOS**: https://apps.apple.com/us/app/fictional/id6739802573
- **Android**: https://play.google.com/store/apps/details?id=ai.fictional.app

### Get in touch with us

Please join [our Discord server](https://discord.gg/pYsdjMfu3q) if you have questions about Fictional and the Pony models.

## Important model information

Please check [this article](https://civitai.com/articles/19986) to learn more about why it took so long for us to ship V7, and about upcoming model releases.

## Model prompting

This model supports a wide array of styles and aesthetics but provides an opinionated default prompt template:

```
special tags, factual description of image, stylistic description of image, additional content tags
```

### Special Tags

`score_X`, `style_cluster_X`, `source_X`. Warning: V7 special-tag prompting may be inconsistent; please see the article above, as we are working on V7.1 to address this.

### Factual description of image

A description of what is portrayed in the image, without any stylistic indicators. Two recommendations:

1. Start with a single phrase describing what you want in the image before going into details.

2. When referring to characters, use the pattern `<species> <gender> <name> from <source>`, for example "Anthro bunny female Lola Bunny from Space Jam".

This model is capable of recognizing many popular and obscure characters and series.
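
The template and character pattern described above can be assembled mechanically. Here is a minimal sketch; the `build_prompt` helper is purely illustrative and not part of any official Pony tooling:

```python
def build_prompt(special_tags=(), factual="", stylistic="", extra_tags=()):
    """Assemble a Pony V7 prompt following the recommended template:
    special tags, factual description, stylistic description, additional tags."""
    parts = [", ".join(special_tags), factual, stylistic, ", ".join(extra_tags)]
    return ", ".join(p for p in parts if p)  # skip any empty sections

# The character reference follows the <species> <gender> <name> from <source> pattern.
prompt = build_prompt(
    special_tags=["score_9", "source_cartoon"],
    factual="Anthro bunny female Lola Bunny from Space Jam playing basketball",
    stylistic="digital painting, dramatic lighting, wide shot",
    extra_tags=["basketball court", "motion blur"],
)
```

Empty sections are simply dropped, so a purely factual prompt works too.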

### Stylistic description of image

Any information about the image medium, shot type, lighting, etc. (more info to come with the captioning Colab).

### Tags

V7 is trained on a combination of natural language prompts and tags and understands both, so describing the intended result in normal language works in most cases, although you can add some tags after the main prompt to boost them.

### Captioning Colab

To give a better understanding of V7 prompting, we are releasing a [captioning Colab](https://colab.research.google.com/drive/19PG-0ltob8EynxUZSwOdjMFmqyJ7ZOCB) with all the models used for V7 captioning.

## Supported inference settings

V7 supports resolutions in the range of 768px to 1536px. We recommend using higher resolutions and at least 30 steps during inference.
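
A small helper can keep requested sizes inside the supported range. This is an illustrative sketch: only the 768-1536px range and the 30-step minimum come from this card; the rounding to a multiple of 64 is our assumption for latent-model friendliness, not a documented V7 requirement:

```python
def clamp_resolution(width, height, lo=768, hi=1536, multiple=64):
    """Clamp a requested size into the supported 768-1536px range,
    rounding down to a multiple of 64 (assumption, see note above)."""
    def clamp(v):
        v = max(lo, min(hi, v))
        return (v // multiple) * multiple
    return clamp(width), clamp(height)

width, height = clamp_resolution(2000, 700)  # -> (1536, 768)
```

If you run the Diffusers-format weights, values like these would typically be passed as the pipeline's `width`, `height`, and `num_inference_steps` (at least 30) arguments.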

## Highlights compared to V6

- Much stronger prompt understanding, especially for spatial information and multiple characters
- Much stronger background support, both in generating backgrounds and in combining backgrounds with characters
- Much stronger realism support out of the box
- Ability to generate very dark and very light images
- Resolution up to 1536x1536 pixels
- Expanded character recognition (some V6 characters may be recognized less well, but overall we extended the model's knowledge by a lot)

## Special thanks

- Iceman for helping to procure the necessary training resources
- [Simo Ryu](https://x.com/cloneofsimo) and the rest of the FAL.ai team for creating AuraFlow, and for the emotional support
- [Runpod](https://runpod.io/?utm_source=purplesmartai) for providing captioning compute
- [Piclumen](https://www.piclumen.com/) for being our partners
- [City96](https://github.com/city96) for help with GGUF support
- The [diffusers](https://huggingface.co/docs/diffusers/en/index) team for supporting the AuraFlow integration work
- PSAI Server Subscribers for supporting the project costs
- PSAI Server Moderators for being vigilant and managing the community
- The many supporters who chose to remain anonymous; their help has been critical to getting V7 done

## Technical details

The model has been trained on ~10M images aesthetically ranked and selected from a superset of over 30M images, with a roughly 1:1 ratio between the anime/cartoon/furry/pony datasets and a 1:1 ratio between safe/questionable/explicit ratings. 100% of the images have been tagged and captioned with high-quality, detailed captions.

All images have been used in training with both captions and tags. Artists' names have been removed, and the source data has been filtered based on our Opt-in/Opt-out program. Any inappropriate explicit content has been filtered out.

## Limitations

- This model does not reliably render text in images and has degraded text generation capabilities compared to base AuraFlow
- Special tags (including quality tags) perform much more weakly than in V6, meaning `score_9` will not necessarily yield better results on some prompts. We are working on a V7.1 follow-up to improve this
- Small details, and especially faces, may degrade significantly depending on the art style; this is a combination of an outdated VAE and insufficient training, which we are trying to improve in V7.1

## LoRA training

We recommend using SimpleTuner for LoRA training, following [this guide](https://github.com/bghira/SimpleTuner/blob/main/documentation/quickstart/AURAFLOW.md).

## Commercial API

We provide a [commercial API](https://fal.ai/models/fal-ai/pony-v7) via our exclusive partner FAL.ai.

## License

This model is licensed under the Pony License.

In short, you can use this model and its outputs commercially unless you provide an inference service or application, have a company with over 1M in revenue, or use it in professional video production. These limitations do not apply if you use first-party commercial APIs.

If you want to use this model commercially, please reach out to us at contact@purplesmart.ai.

Explicit permission for commercial inference has been granted to CivitAI and Hugging Face.