| This model is part of the research work described in "FeatureFusion: Merging Diffusion Models Through Representation Correlations" by Murdock Aubry and James Bona-Landry. | |
| <h1>Model Description</h1> | |
| <h2>Overview</h2> | |
| This model is an electronics specialist: Stable Diffusion 1.4 with its UNet fine-tuned on an electronics data shard. | |
| <br> | |
| <h2>Model Details</h2> | |
| Base Model: CompVis/stable-diffusion-v1-4 | |
| Type: Specialist | |
| Specialization: Electronics | |
| Training Data: Electronics shard | |
| Model Architecture: UNet-based diffusion model | |
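As a minimal usage sketch, a Stable Diffusion 1.4 checkpoint can be loaded with the `diffusers` library roughly as follows. This card does not state the specialist's repository ID, so `MODEL_ID` below uses the base model as a stand-in and should be replaced. The helper name `generate`, the step count, and the example prompt are illustrative assumptions, not settings from the authors.

```python
# Stand-in ID: the base model named in this card. Replace it with this
# specialist's repository ID, which this card does not state.
MODEL_ID = "CompVis/stable-diffusion-v1-4"


def generate(prompt, model_id=MODEL_ID, num_inference_steps=30):
    """Load a Stable Diffusion pipeline and return one image for `prompt`."""
    # Imported lazily so this module can be imported without GPU libraries.
    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)
    pipe.enable_attention_slicing()  # lowers peak memory at some speed cost

    result = pipe(prompt, num_inference_steps=num_inference_steps)
    return result.images[0]


# Example call (downloads the weights on first use):
# image = generate("a close-up photo of a circuit board with capacitors")
# image.save("electronics_sample.png")
```

Prompts about electronics are the natural fit here, since the card notes that performance is best when prompts relate to the model's specialization.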
| <h2>Limitations</h2> | |
| The model inherits the limitations of the base Stable Diffusion 1.4 model. | |
| It performs best when prompts relate to its specialization (electronics). | |
| It may produce unexpected results for concepts outside its training distribution. | |
| <h1>Training</h1> | |
| <h2>Training Procedure</h2> | |
| Training Data: Pick-a-Pic v1 | |
| Training Method: Fine-tuning of the UNet only; the text encoder and VAE are kept frozen | |
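The freezing scheme described above can be sketched as follows. This is an illustration only, not the authors' training script: the helper name `freeze_for_unet_finetuning` is hypothetical, and the three arguments are assumed to be `torch.nn.Module` objects such as the `unet`, `text_encoder`, and `vae` components of a `diffusers` Stable Diffusion pipeline.

```python
def freeze_for_unet_finetuning(unet, text_encoder, vae):
    """Leave only the UNet trainable; freeze the text encoder and VAE.

    Hypothetical helper: the arguments are assumed to behave like
    torch.nn.Module instances.
    """
    for frozen in (text_encoder, vae):
        frozen.requires_grad_(False)  # no gradients are computed for these weights
        frozen.eval()                 # turn off dropout and other train-time behavior

    unet.requires_grad_(True)
    unet.train()

    # Only these parameters should be handed to the optimizer.
    return [p for p in unet.parameters() if p.requires_grad]
```

Passing only the returned list to the optimizer means no optimizer state is allocated for the frozen components.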
| <h2>Hyperparameters</h2> | |
| Optimizer: AdamW | |
| Learning rate: 1e-6 | |
| Schedule: Cosine with warmup | |
| Training steps: 5 epochs over 1,000 training samples | |
| Memory optimization: Gradient accumulation (4 steps), attention slicing, VAE slicing, gradient checkpointing | |
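To make the schedule concrete, here is a small sketch of how the values above translate into optimizer steps and a cosine-with-warmup learning rate. The peak rate (1e-6), epoch count, sample count, and accumulation factor come from this card. The per-device batch size of 1, the 10% linear warmup, and the decay to zero are assumptions, since the card does not state them; the function names are hypothetical.

```python
import math


def optimizer_steps(num_samples, epochs, batch_size, grad_accum):
    """Number of optimizer updates when gradients are accumulated."""
    micro_batches = math.ceil(num_samples / batch_size)
    return epochs * math.ceil(micro_batches / grad_accum)


def cosine_lr(step, total_steps, warmup_steps, peak_lr=1e-6):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


# Values from the card: 1000 samples, 5 epochs, 4 accumulation steps.
# Assumed (not stated in the card): batch size 1 and 10% warmup.
total = optimizer_steps(num_samples=1000, epochs=5, batch_size=1, grad_accum=4)
warmup = total // 10

for step in (0, warmup, total // 2, total):
    print(f"step {step:4d}: lr = {cosine_lr(step, total, warmup):.3e}")
```

Under these assumptions the run has 1250 optimizer updates; a larger batch size would reduce that count proportionally.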
| <h1>Citation</h1> | |
| If you use this model in your research, please cite:<br> | |
| @article{aubry2024featurefusion,<br> | |
| title={FeatureFusion: Merging Diffusion Models Through Representation Correlations},<br> | |
| author={Aubry, Murdock and Bona-Landry, James},<br> | |
| journal={},<br> | |
| year={2025}<br> | |
| } | |
| --- | |
| license: mit | |
| language: | |
| - en | |
| base_model: | |
| - CompVis/stable-diffusion-v1-4 | |
| pipeline_tag: text-to-image | |
| --- |