@@ -1,19 +1,163 @@
 # SarcasmDiffusion — SDXL Fused Meme Generator
-Fine-tuning of **Stable Diffusion XL Base 1.0** using **LoRA** to learn the visual style of sarcastic memes.
-## Quick start
 ```python
 from diffusers import AutoPipelineForText2Image
 import torch
-pipe = AutoPipelineForText2Image.from_pretrained("Ricardouchub/SarcasmDiffusion", torch_dtype=torch.float16).to("cuda")
-img = pipe(
-    "sarcastic meme about running out of GPU VRAM at 3am, high contrast, stock photo style",
-    negative_prompt="nsfw, text overlay, low quality",
-    num_inference_steps=20, guidance_scale=6.5
-).images[0]
-img.show()

+---
+license: mit
+base_model:
+- stabilityai/stable-diffusion-xl-base-1.0
+pipeline_tag: text-to-image
+---
 # SarcasmDiffusion — SDXL Fused Meme Generator
+**Model type:** Stable Diffusion XL (Base 1.0) fine‑tuned via **LoRA** (merged/fused) to learn the *visual* style of sarcastic/ironic memes.
+**Author:** Ricardo Urdaneta (github.com/Ricardouchub)
+**Repository:** SarcasmDiffusion
+---
+## Overview
+SarcasmDiffusion is a diffusion-based generative model focused on producing **clean meme-style photographs** that are suitable for **caption overlays** (text is added *after* generation). The model was LoRA‑fine‑tuned on a filtered and enriched subset of the *Hateful Memes* dataset to capture stylistic cues of humorous/ironic memes while **avoiding offensive content**.
+- **Base:** `stabilityai/stable-diffusion-xl-base-1.0`
+- **Fine‑tuning:** LoRA on the **UNet** only; **VAE** and **text encoders** are frozen.
+- **Exported artifact:** **Fused SDXL** (no external LoRA required at inference).
+> This model focuses on **learning meme aesthetics** (composition, lighting, “stock-photo vibe”), *not* on rendering text inside images. Add titles/subtitles with your own overlay function or editor.
+---
+## Intended Use
+- Generating **meme-ready images** with space at the top/bottom for captions.
+- Creative exploration of humorous/ironic visual setups controlled by prompts.
+- Educational/portfolio use for **LoRA fine‑tuning workflows** with SDXL.
+### Out of Scope / Limitations
+- **No text rendering inside the image** (explicitly discouraged via negative prompts).
+- May produce **stock-like** aesthetics by design.
+- Not suitable for generating or amplifying **harmful, hateful, or NSFW** content.
+- As with all text-to-image systems, prompts with ambiguous semantics can yield unpredictable outputs.
+---
+## Training Summary
+- **Base model:** SDXL Base 1.0
+- **LoRA rank / alpha / dropout:** `r=8`, `alpha=16`, `dropout=0.05`
+- **Resolution:** 1024 (training); common inference at 768–896 for speed
+- **Batch:** 1 (gradient accumulation = 4)
+- **Steps:** ~6k (≈0.7 epoch on ~8.5k images)
+- **Precision:** fp16 (LoRA params kept in fp32 during training)
+- **Optimizer:** AdamW
+- **LR scheduler:** cosine with warmup (recommended)
+- **Frozen:** VAE, text_encoder, text_encoder_2
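
The batch, step, and epoch figures above can be cross-checked with a little arithmetic. The sketch below assumes that "~6k steps" counts per-sample iterations at batch size 1 rather than optimizer updates. That reading is an interpretation, not something the card states, but it is the one that reproduces the quoted ≈0.7 epoch. It also computes the usual LoRA scaling factor `alpha / r`.

```python
# Sanity-check the training figures quoted in the summary.
# Assumption: "~6k steps" counts micro-batches (batch size 1), not optimizer updates.
steps = 6_000
batch_size = 1
grad_accum = 4
num_images = 8_500

samples_seen = steps * batch_size            # images processed in total
epochs = samples_seen / num_images           # fraction of the dataset covered
optimizer_updates = steps // grad_accum      # weight updates under this reading
effective_batch = batch_size * grad_accum    # samples per weight update

# Standard LoRA scaling: the low-rank update is multiplied by alpha / r.
lora_r, lora_alpha = 8, 16
lora_scale = lora_alpha / lora_r

print(f"epochs ~ {epochs:.2f}")
print(f"optimizer updates ~ {optimizer_updates}, effective batch = {effective_batch}")
print(f"LoRA scale = {lora_scale}")
```

If the 6k figure instead counted optimizer updates, the run would see 24,000 samples, or about 2.8 epochs, which does not match the ≈0.7 quoted above.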
+### Data
+- Source: *Hateful Memes* (Facebook AI).
+- We **excluded** samples labeled as hateful and applied **NLP enrichment**:
+  - Emotion scoring (GoEmotions distilled) and irony scoring (RoBERTa‑irony).
+  - Heuristics and percentile thresholds map each sample to a tone: `humor`, `irony`, or `neutral`.
+- Final training CSV: prompts balanced by tone; **negative prompts** to avoid text overlays, low‑quality artifacts, watermarks/logos, and unsafe content.
+> The dataset is **not** included here. Please obtain *Hateful Memes* under its original terms and reproduce the preprocessing if needed.
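
The percentile-based tone assignment described above can be pictured as thresholding per-caption scores at dataset-level percentiles. The sketch below is illustrative only: the score keys (`humor`, `irony`), the 80th-percentile cut-off, the tie-breaking rule, and the sample values are all assumptions, not the preprocessing actually used for this model.

```python
from statistics import quantiles

def percentile_cut(values, pct):
    """Return the pct-th percentile (1-99) of values using inclusive quantiles."""
    return quantiles(values, n=100, method="inclusive")[pct - 1]

def assign_tones(rows, pct=80):
    """Label each row 'humor', 'irony', or 'neutral' from hypothetical scores."""
    humor_cut = percentile_cut([r["humor"] for r in rows], pct)
    irony_cut = percentile_cut([r["irony"] for r in rows], pct)
    tones = []
    for r in rows:
        high_humor = r["humor"] >= humor_cut
        high_irony = r["irony"] >= irony_cut
        # Irony wins when it is high and at least as strong as humor.
        if high_irony and (not high_humor or r["irony"] >= r["humor"]):
            tones.append("irony")
        elif high_humor:
            tones.append("humor")
        else:
            tones.append("neutral")
    return tones

# Made-up scores standing in for classifier outputs.
rows = [
    {"humor": 0.90, "irony": 0.10},
    {"humor": 0.20, "irony": 0.85},
    {"humor": 0.10, "irony": 0.05},
    {"humor": 0.30, "irony": 0.20},
    {"humor": 0.15, "irony": 0.10},
]
tones = assign_tones(rows)
print(tones)
```

Deriving the thresholds from the score distribution rather than hard-coding them keeps the share of each tone roughly stable if the underlying classifiers are swapped.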
+---
+## Safety, Ethics & Mitigations
+- We filtered out samples labeled as hateful and used **negative prompts** to avoid NSFW/hate/text overlays.
+- Despite mitigations, **misuse is possible**. Users are responsible for **prompting responsibly** and complying with local laws and platform policies.
+- Do not use the model to create defamatory, harassing, discriminatory, or otherwise harmful imagery.
+**Known risks:** dataset biases may remain; aesthetic biases (stock-photo look); occasional failure to respect negative prompts.
+---
+## How to Use
 ```python
 from diffusers import AutoPipelineForText2Image
 import torch
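+# fp16 is intended for GPU; fall back to fp32 when running on CPU
+device = "cuda" if torch.cuda.is_available() else "cpu"
+dtype = torch.float16 if device == "cuda" else torch.float32
+pipe = AutoPipelineForText2Image.from_pretrained(
+    "Ricardouchub/SarcasmDiffusion",
+    torch_dtype=dtype
+).to(device)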
+prompt = (
+    "sarcastic meme about checking the fridge for the third time, "
+    "centered subject, plain background, high-contrast photo, stock photo style"
+)
+negative = "nsfw, hate speech, slur, watermark, logo, low quality, blurry, busy background, text overlay"
+g = torch.Generator(device=pipe.device).manual_seed(123)
+image = pipe(prompt,
+             negative_prompt=negative,
+             num_inference_steps=22,
+             guidance_scale=6.3,
+             width=896, height=896,
+             generator=g).images[0]
+image.save("sample.png")
+```
+### Prompting Tips
+- Add **layout hints**: “centered subject”, “plain background”, “space at top and bottom”.
+- Keep **negative prompts** to avoid logos/text/NSFW.
+- Fix the seed for reproducibility. Reasonable ranges: `num_inference_steps` 18–28, `guidance_scale` 5.5–7.5, `width`/`height` 768–1024.
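
The tips above can be folded into a small helper that assembles a prompt and keeps sampler settings inside the suggested ranges. This is a sketch: `build_prompt`, `clamp_settings`, and the default hint strings are hypothetical names chosen for illustration and are not part of this repository or of Diffusers.

```python
LAYOUT_HINTS = ("centered subject", "plain background", "space at top and bottom")
NEGATIVE = "nsfw, watermark, logo, low quality, blurry, text overlay"

def build_prompt(subject, hints=LAYOUT_HINTS, style="stock photo style"):
    """Join the subject, layout hints, and style tag into one prompt string."""
    return ", ".join([f"sarcastic meme about {subject}", *hints, style])

def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))

def clamp_settings(steps=22, guidance=6.3, size=896):
    """Keep settings within the ranges suggested in the tips above."""
    size = clamp(size, 768, 1024)
    size -= size % 8  # the SDXL pipeline requires dimensions divisible by 8
    return {
        "num_inference_steps": clamp(steps, 18, 28),
        "guidance_scale": clamp(guidance, 5.5, 7.5),
        "width": size,
        "height": size,
    }

prompt = build_prompt("checking the fridge for the third time")
settings = clamp_settings(steps=40, guidance=9.0, size=1100)
print(prompt)
print(settings)
```

The returned dictionary uses the pipeline's own keyword names, so it can be passed straight through, for example `pipe(prompt, negative_prompt=NEGATIVE, **settings)`.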
+---
+## Files
+This repository should contain the standard **Diffusers** layout:
+```
+model_index.json
+unet/
+vae/
+text_encoder/
+text_encoder_2/
+scheduler/
+tokenizer/
+...
+```
+Since this is a **fused** export, you **don’t** need an external LoRA weight file.
+---
+## License
+- **Code:** MIT (project-level).
+- **Model weights:** follow the base model’s license (Stability AI / SDXL Base 1.0).
+- **Data:** Users must obtain *Hateful Memes* from its source and agree to its terms.
+> By using this model, you agree not to generate content that is illegal, harmful, or violates rights of others.
+---
+## Evaluation
+Evaluation so far is qualitative, using fixed prompt sheets for each tone (humor, irony, neutral). Automatic metrics suggested for future work: CLIP score against the caption, aesthetic predictors, and human preference studies.
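
A fixed prompt sheet of the kind mentioned above can be as simple as a deterministic grid of tone, prompt, and seed. The prompts and seeds below are placeholders, not the sheet used to assess this model, and `prompt_sheet` is a hypothetical helper name.

```python
from itertools import product

# Placeholder prompts per tone; replace with the prompts you want to track.
SHEET = {
    "humor": ["sarcastic meme about a cat ignoring its owner"],
    "irony": ["ironic meme about a meeting that could have been an email"],
    "neutral": ["person looking at a laptop, plain background"],
}
SEEDS = (123, 456)

def prompt_sheet(sheet=SHEET, seeds=SEEDS):
    """Return one job per prompt/seed pair, in a stable order."""
    jobs = []
    for tone, prompts in sheet.items():
        for prompt, seed in product(prompts, seeds):
            jobs.append({"tone": tone, "prompt": prompt, "seed": seed})
    return jobs

jobs = prompt_sheet()
for job in jobs:
    print(job["tone"], job["seed"], job["prompt"])
```

Rendering each job with a generator seeded from `job["seed"]` keeps the sheet comparable across checkpoints, since only the weights change between runs.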
+---
+## Acknowledgments
+- Stability AI — SDXL Base 1.0
+- Hugging Face — Diffusers, Accelerate, PEFT
+- Facebook AI — Hateful Memes dataset
+---
+## Citation
+If you use this model in your research or portfolio, please cite:
+```
+@software{sarcasmdiffusion_sdxl_fused_2025,
+  author  = {Urdaneta, Ricardo},
+  title   = {SarcasmDiffusion — SDXL Fused Meme Generator},
+  year    = {2025},
+  url     = {https://huggingface.co/Ricardouchub/SarcasmDiffusion-SDXL-Fused}
+}
+```