---
license: cc-by-sa-4.0
datasets:
- sankalpsinha77/MARVEL-40M
language:
- en
base_model:
- stabilityai/stable-diffusion-3.5-large
tags:
- text-to-image
- image
---
# MARVEL-FX3D

Sankalp Sinha👨‍💻 · Mohammad Sadil Khan👨‍💻 · Muhammad Usama · Shino Sam · Didier Stricker · Sk Aziz Ali · Muhammad Zeshan Afzal

👨‍💻 Equally contributing first authors

[![Paper](https://img.shields.io/badge/📄_Paper-ArXiv-b31b1b?style=for-the-badge)](https://openaccess.thecvf.com/content/CVPR2025/papers/Sinha_MARVEL-40M_Multi-Level_Visual_Elaboration_for_High-Fidelity_Text-to-3D_Content_Creation_CVPR_2025_paper.pdf) [![Project Page](https://img.shields.io/badge/🌐_Project-Website-2ecc71?style=for-the-badge)](https://sankalpsinha-cmos.github.io/MARVEL/) [![Dataset](https://img.shields.io/badge/🤗_Dataset-HuggingFace-yellow?style=for-the-badge)](https://huggingface.co/datasets/sankalpsinha77/MARVEL-40M) [![Explorer](https://img.shields.io/badge/🔍_Explorer-Demo-blue?style=for-the-badge)](https://sadilkhan.github.io/Marvel-Explorer/) [![Code](https://img.shields.io/badge/⚡_MARVEL--FX3D-Pipeline-purple?style=for-the-badge)](https://github.com/SadilKhan/MARVEL-FX3D)

---

This repository contains LoRA weights for Stable Diffusion 3.5 Large fine-tuned on the [MARVEL-40M+](https://sadilkhan.github.io/Marvel-Explorer/) dataset. Given a text prompt, the model generates an image suitable as input to a pretrained image-to-3D model such as Sam3D, Trellis, or Stable Fast 3D.

# Inference

```python
# Generate an image from a text prompt
import torch
from diffusers import StableDiffusion3Pipeline

model_id = "stabilityai/stable-diffusion-3.5-large"
lora_path = "SadilKhan/MARVEL_FX3D"  # or a local path

# Load the base model in half precision; device placement is handled by pipe.to("cuda") below
pipe = StableDiffusion3Pipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
)

# Load the LoRA weights, then move the whole pipeline to the GPU
pipe.load_lora_weights(lora_path)
pipe.to("cuda")

prompt = "An old, moss-covered wishing well. Rough stones, aged wood, rusty chains, mushrooms, fallen leaves, and twigs create an enchanting, ancient, and rustic atmosphere."

image = pipe(
    prompt=prompt,
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]

image.save("output.png")
```

# Citation

If you find MARVEL-FX3D useful, please cite:

```
@inproceedings{sinha2025marvel,
  title     = {MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation},
  author    = {Sinha, Sankalp and Khan, Mohammad Sadil and Usama, Muhammad and Sam, Shino and Stricker, Didier and Ali, Sk Aziz and Afzal, Muhammad Zeshan},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages     = {8105--8116},
  year      = {2025}
}
```
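
# Batch generation with fixed seeds

When choosing an input image for an image-to-3D model, it can help to render several candidates per prompt and keep the seed in the file name so a good result can be reproduced. The following is a minimal sketch, not part of the released pipeline: `output_path`, `generate_batch`, the `outputs` directory, and the file-naming scheme are all hypothetical, and `pipe` is assumed to be the `StableDiffusion3Pipeline` built in the Inference section.

```python
import re
from pathlib import Path


def output_path(prompt: str, seed: int, out_dir: str = "outputs") -> Path:
    """Build a filesystem-safe file name from the prompt and seed (hypothetical scheme)."""
    # Collapse every run of non-alphanumeric characters into a single hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    # Keep names short; long prompts are truncated to 48 characters
    slug = slug[:48].rstrip("-")
    return Path(out_dir) / f"{slug}_seed{seed}.png"


def generate_batch(pipe, prompts, seeds=(0,), out_dir="outputs",
                   steps=28, guidance=7.0):
    """Render each prompt once per seed and save the images to out_dir."""
    import torch  # imported lazily so output_path works without a GPU stack

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    saved = []
    for prompt in prompts:
        for seed in seeds:
            # A seeded generator makes the sampled noise reproducible
            generator = torch.Generator(device="cpu").manual_seed(seed)
            image = pipe(
                prompt=prompt,
                num_inference_steps=steps,
                guidance_scale=guidance,
                generator=generator,
            ).images[0]
            path = output_path(prompt, seed, out_dir)
            image.save(path)
            saved.append(path)
    return saved
```

With the pipeline from the Inference section loaded, `generate_batch(pipe, [prompt], seeds=(0, 1, 2))` would write three candidates for the same prompt. The step count and guidance scale default to the values used in the snippet above.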