Improvement suggestion: Better multi-character LoRA support (reduced identity bleed / cross-contamination)
Anima Base v1 is excellent for single-character and style consistency, but combining multiple character LoRAs in one scene is currently quite hit or miss due to strong identity bleed, feature blending, or one LoRA overpowering the other.
What I've tried:
- Natural language structured prompts with clear positional descriptions ("On the left: Character A, blue hair... On the right: Character B...")
- Reduced LoRA weights (0.55–0.8)
- Subject count tags (2girls/2boys, 1girl 1boy, etc.)
- Various CFG / steps / samplers
It works reasonably well for side-by-side, non-interacting characters but breaks down quickly once there is physical contact or interaction.
For future versions (v1.1 / finetunes / Anima-Turbo, etc.), any architectural or training improvements that reduce cross-contamination between multiple character LoRAs while keeping the strong single-subject coherence would be extremely valuable!
There isn't a single model that solves this; it's a flaw of LoRA technology itself. Solving it would require actual research, beyond the scope of a finetune.
Maybe try implementing FreeFuse (https://github.com/yaoliliu/FreeFuse) support for Anima. I think this is the solution you are asking for.
Or maybe this: https://blog.comfy.org/p/masking-and-scheduling-lora-and-model-weights
If there are only a few characters in your training data and you happen to have several standing poses of each from different angles, you can try combining the characters on a white background (segment each character out first).
If you have characters A, B, C, and D, you need at least A+B, C+D, and A+B+C (remember to add 2girls/3girls or similar tags). The more combinations you can add, the better.
If you happen to have images in which some of A/B/C/D interact with each other, even better.
No natural-language captions are needed in the training data; booru-style tags are enough.
Each character's solo images should be in their own subset (different costumes count as different characters), and each combination pattern (like 2girls/3girls) should be its own subset.
Raise the repeat numbers so that (repeat num * number of solo images of a single character) is similar to (repeat num * number of images in each combination subset).
If there are a lot of characters (and costumes), the number of combination patterns grows very quickly. Don't try to cover every pattern. The most I have tried is 9 characters with 27 costumes.
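To make the repeat-balancing arithmetic and the growth in combination patterns concrete, here is a rough sketch in plain Python. The subset names and image counts are made up for illustration, and `balanced_repeats` and `multi_character_combinations` are hypothetical helpers, not part of any trainer. How the resulting repeat values are entered depends on the training tool you use.

```python
from math import comb


def balanced_repeats(subset_sizes, target=None):
    """Pick an integer repeat count per subset so repeats * images is roughly equal.

    subset_sizes maps a subset name to its image count (each count must be > 0).
    target is the desired number of samples per subset per epoch; by default it
    is the size of the largest subset, so that subset gets repeats = 1.
    """
    if target is None:
        target = max(subset_sizes.values())
    return {
        name: max(1, round(target / count))
        for name, count in subset_sizes.items()
    }


def multi_character_combinations(num_characters):
    """Count every group of 2 or more characters that could share an image."""
    return sum(comb(num_characters, k) for k in range(2, num_characters + 1))


# Hypothetical image counts, for illustration only.
subsets = {"A_solo": 40, "B_solo": 25, "A+B": 10, "A+B+C": 5}
repeats = balanced_repeats(subsets)
for name, count in subsets.items():
    print(f"{name}: {count} images * {repeats[name]} repeats = {count * repeats[name]}")

# Number of possible multi-character groupings for 4 and for 9 characters.
print(multi_character_combinations(4))
print(multi_character_combinations(9))
```

With these example counts every subset ends up contributing about 40 to 50 samples per epoch. The combination count covers characters only; treating each costume as a separate character, as suggested above, makes the number grow even faster, which is why covering every pattern is not practical.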