Details on the code used to produce and use this model are available at:
https://github.com/schrum2/MarioDiffusion
That repo has instructions to check out this model and apply it to the generation of Super Mario Bros. level scenes. There is also an interactive GUI for constructing complete levels out of model-generated scenes.
This model uses https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1 as its text embedding model, providing the text conditioning for a diffusion model that generates Mario levels. Mario scene captions consist of multiple period-separated phrases, and this model embeds each phrase as its own sentence embedding vector for the diffusion model to use as text conditioning. It also uses negative guidance during diffusion training. Unfortunately, its performance is not great, and it is made available mainly for full transparency.
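The per-phrase embedding described above can be illustrated with a minimal sketch. This is not the code from the MarioDiffusion repo: the helper names (`split_caption`, `embed_phrases`, `load_minilm_encoder`) and the example caption are hypothetical, and the splitting rule (split on periods, drop empty pieces) is an assumption about how captions are tokenized into phrases. The only library call used is `SentenceTransformer(...).encode` from the `sentence-transformers` package.

```python
from typing import Any, Callable, List, Tuple


def split_caption(caption: str) -> List[str]:
    """Split a period-separated caption into stripped, non-empty phrases.

    Assumption: phrases are delimited by '.' only, as the caption
    format described above suggests.
    """
    return [part.strip() for part in caption.split(".") if part.strip()]


def embed_phrases(
    caption: str, encode: Callable[[List[str]], Any]
) -> Tuple[List[str], Any]:
    """Embed each phrase of a caption separately.

    `encode` is any callable mapping a list of strings to one vector per
    string, so the result holds one embedding per phrase rather than a
    single embedding for the whole caption.
    """
    phrases = split_caption(caption)
    return phrases, encode(phrases)


def load_minilm_encoder() -> Callable[[List[str]], Any]:
    """Return the encode method of the sentence embedding model named above.

    Imported lazily so the helpers above work without the
    sentence-transformers package installed.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(
        "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
    )
    return model.encode
```

A possible usage, with a made-up caption: `phrases, vectors = embed_phrases("full floor. one enemy. two pipes.", load_minilm_encoder())` should yield three phrases and an array with one row per phrase. How the diffusion model consumes those rows (ordering, padding, maximum phrase count) is defined in the repo and is not shown here.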
A better-performing model that uses sentence-transformers/multi-qa-MiniLM-L6-cos-v1 with regular captions and multiple sentence embeddings is https://huggingface.co/schrum2/MarioDiffusion-MiniLM-multiple-regular0.
For a model that uses the same text embedding model, but embeds the entire caption as a single vector and also uses negative guidance, see https://huggingface.co/schrum2/MarioDiffusion-MiniLM-single-negative0.
For a model that uses a simple token-based transformer for text embedding with negative guidance, see https://huggingface.co/schrum2/MarioDiffusion-MLM-negative0.