Thatguy099/colab

Thatguy099 committed on May 19, 2025

Commit

47ccaa4

1 Parent(s): aeec00e

Update DiffuseCraft.ipynb

Files changed (1)

DiffuseCraft.ipynb +59 -62

DiffuseCraft.ipynb CHANGED

@@ -4,18 +4,13 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "# DiffuseCraft: Text-to-Image Generation on T4 Colab\n",
         "\n",
-        "This script uses a custom Stable Diffusion model from Hugging Face for text-to-image generation, optimized for T4 GPU with low RAM usage.\n",
         "\n",
-        "**Requirements**:\n",
-        "- T4 GPU runtime in Colab\n",
-        "- Hugging Face account and token (for gated models)\n",
         "\n",
-        "**Features**:\n",
-        "- Uses `diffusers` library with FP16 precision\n",
-        "- Enables model CPU offloading for low RAM\n",
-        "- Supports custom prompts and negative prompts\n"
       ]
     },
     {
@@ -24,10 +19,14 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# Install required libraries\n",
-        "!pip install -q diffusers==0.21.4 transformers==4.33.0 accelerate==0.22.0\n",
-        "!pip install -q torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118\n",
-        "!pip install -q xformers==0.0.22\n"
       ]
     },
     {
@@ -36,15 +35,24 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# Import libraries\n",
         "import torch\n",
         "from diffusers import StableDiffusionPipeline\n",
-        "from huggingface_hub import login\n",
-        "import os\n",
         "\n",
-        "# Set Hugging Face token (replace with your token)\n",
-        "os.environ['HUGGINGFACE_TOKEN'] = 'your_hf_token_here'\n",
-        "login(os.environ['HUGGINGFACE_TOKEN'])\n"
       ]
     },
     {
@@ -53,25 +61,24 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# Initialize the pipeline with optimizations\n",
-        "model_id = 'runwayml/stable-diffusion-v1-5'  # Replace with your custom HF model ID\n",
         "\n",
-        "pipe = StableDiffusionPipeline.from_pretrained(\n",
-        "    model_id,\n",
-        "    torch_dtype=torch.float16,\n",
-        "    use_auth_token=True\n",
-        ")\n",
-        "\n",
-        "# Enable optimizations for T4\n",
-        "pipe = pipe.to('cuda')\n",
-        "pipe.enable_attention_slicing()  # Reduces memory usage\n",
-        "pipe.enable_model_cpu_offload()  # Offloads model to CPU when not in use\n",
         "\n",
-        "# Optional: Enable xformers for faster inference\n",
-        "try:\n",
-        "    pipe.enable_xformers_memory_efficient_attention()\n",
-        "except:\n",
-        "    print('xformers not supported, proceeding without it.')\n"
       ]
     },
     {
@@ -80,37 +87,27 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# Define generation parameters\n",
-        "prompt = 'A serene mountain landscape at sunset, vibrant colors, highly detailed'\n",
-        "negative_prompt = 'blurry, low quality, artifacts, text, watermark'\n",
-        "num_inference_steps = 30  # Lower steps for faster generation\n",
-        "guidance_scale = 7.5\n",
-        "\n",
-        "# Generate image\n",
-        "image = pipe(\n",
-        "    prompt,\n",
-        "    negative_prompt=negative_prompt,\n",
-        "    num_inference_steps=num_inference_steps,\n",
-        "    guidance_scale=guidance_scale,\n",
-        "    height=512,\n",
-        "    width=512\n",
-        ").images[0]\n",
-        "\n",
-        "# Save and display image\n",
-        "image.save('generated_image.png')\n",
-        "image\n"
       ]
     },
     {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "## Notes\n",
-        "- Replace `'your_hf_token_here'` with your Hugging Face token.\n",
-        "- Replace `'runwayml/stable-diffusion-v1-5'` with your custom model ID from Hugging Face.\n",
-        "- Adjust `prompt`, `negative_prompt`, `num_inference_steps`, and `guidance_scale` as needed.\n",
-        "- The script uses FP16 and attention slicing to minimize RAM usage.\n",
-        "- Model CPU offloading reduces VRAM requirements, ideal for T4 GPUs.\n"
       ]
     }
   ],
@@ -130,7 +127,7 @@
       "name": "python",
       "nbconvert_exporter": "python",
       "pygments_lexer": "ipython3",
-      "version": "3.8.10"
     }
   },
   "nbformat": 4,

       "cell_type": "markdown",
       "metadata": {},
       "source": [
+        "# DiffuseCraft: Text-to-Image Generation with Custom Model\n",
         "\n",
+        "This notebook uses a custom text-to-image model from Hugging Face to generate images from text prompts. It is optimized for use with a T4 GPU in Google Colab, with a focus on minimizing RAM usage.\n",
         "\n",
+        "## Setup\n",
         "\n",
+        "Run the following cell to install the required libraries:"
       ]
     },
     {
       "metadata": {},
       "outputs": [],
       "source": [
+        "!pip install --no-cache-dir diffusers transformers torch"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Then, load the model by running the next cell. Make sure to replace `\"username/efficient-text-to-image\"` with the actual model ID from Hugging Face."
       ]
     },
     {
       "metadata": {},
       "outputs": [],
       "source": [
         "import torch\n",
         "from diffusers import StableDiffusionPipeline\n",
         "\n",
+        "model_id = \"username/efficient-text-to-image\"  # Replace with actual model ID\n",
+        "pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)\n",
+        "pipe = pipe.to(\"cuda\")\n",
+        "pipe.enable_attention_slicing()"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Generate Image\n",
+        "\n",
+        "Enter your text prompt in the `prompt` variable below. You can also adjust `height`, `width`, and `num_inference_steps` to trade image quality against resource usage: a smaller `height` and `width` use less GPU memory, and fewer inference steps run faster, but both tend to lower image quality.\n",
+        "\n",
+        "Run the cell to generate and display the image."
       ]
     },
     {
       "metadata": {},
       "outputs": [],
       "source": [
+        "prompt = \"A beautiful landscape with mountains and a river\"\n",
+        "height = 256\n",
+        "width = 256\n",
+        "num_inference_steps = 20\n",
         "\n",
+        "with torch.inference_mode():\n",
+        "    image = pipe(prompt, height=height, width=width, num_inference_steps=num_inference_steps).images[0]\n",
+        "from IPython.display import display\n",
+        "display(image)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Clean Up\n",
         "\n",
+        "After generating an image, you can run the following cell to release cached GPU memory that is no longer in use, which can help when generating several images in a row. The loaded pipeline itself stays on the GPU."
       ]
     },
     {
       "metadata": {},
       "outputs": [],
       "source": [
+        "import gc\n",
+        "gc.collect()\n",
+        "torch.cuda.empty_cache()"
       ]
     },
     {
       "cell_type": "markdown",
       "metadata": {},
       "source": [
+        "## Save Image\n",
+        "\n",
+        "If you want to save the generated image, run the following cell:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "image.save(\"generated_image.png\")"
       ]
     }
   ],
       "name": "python",
       "nbconvert_exporter": "python",
       "pygments_lexer": "ipython3",
+      "version": "3.11.0"
     }
   },
   "nbformat": 4,