title: Mosaic Generator
emoji: 🧩
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
pinned: false

Lab 5 – Mosaic Generator

A fully refactored and optimized version of the Lab 1 mosaic pipeline. This release adds strict vectorization, caching, profiling evidence, and a polished Gradio front end.

1. Installation

# 1. Create and activate a Python 3.10+ virtual environment
python3 -m venv .venv
source .venv/bin/activate

# 2. Install the project dependencies
pip install --upgrade pip
pip install -r requirements.txt

Optional extras:

  • pip install line_profiler if you want to re-run the profiling notebook.
  • pip install jupyterlab if you prefer to explore the notebooks interactively.

2. Usage

Run the Gradio App Locally

cd lab-5
python app.py

Visit http://localhost:7860 to upload an image, tweak grid/tile settings, and view the generated mosaic, quality metrics, and timing stats live.

Programmatic Pipeline Example

from pathlib import Path
from PIL import Image
from src.config import Config
from src.pipeline import MosaicPipeline

cfg = Config(
    grid=32,
    tile_size=32,
    out_w=768,
    out_h=768,
    tiles_cache_dir="tile_cache"
)
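pipeline = MosaicPipeline(cfg)
image = Image.open(Path("test_images/copley.png")).convert("RGB")
results = pipeline.run_full_pipeline(image)

# Create the output folder first; the save call fails if the directory is missing.
Path("outputs").mkdir(parents=True, exist_ok=True)
results["outputs"]["mosaic"].save("outputs/mosaic.png")
print(results["timing"], results["metrics"])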

Profiling Notebook

Open profiling_analysis.ipynb to reproduce the cProfile / line_profiler runs, before-vs-after timings, and plots used in the assessment.
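
For readers who want a quick profile outside the notebook, the following is a minimal sketch of a cProfile run using only the standard library. The function slow_cell_sums is a hypothetical stand-in for a loop-heavy step, not code from this repository; swap in the pipeline call you want to inspect.

```python
import cProfile
import io
import pstats


def slow_cell_sums(rows: int, cols: int) -> int:
    """Hypothetical loop-heavy workload standing in for an unoptimized step."""
    total = 0
    for i in range(rows):
        for j in range(cols):
            total += (i * j) % 7
    return total


def profile_call(func, *args, top: int = 5) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_call(slow_cell_sums, 200, 200)
print(report)
```

Sorting by cumulative time puts the functions that dominate the wall-clock total at the top, which is usually the first place to look for a bottleneck.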

3. Performance Benchmarks (vs Lab 1)

Benchmarks compare the original Lab 1 implementation (“Legacy”) with this optimized Lab 5 pipeline on the same MacBook Pro (M3 Pro, Python 3.11). Each entry averages three runs with cached tiles.

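| Image Size | Grid  | Legacy Time (s) | Lab 5 Time (s) | Speedup |
|------------|-------|-----------------|----------------|---------|
| 256×256    | 16×16 | 0.063           | 0.038          | 1.6×    |
| 512×512    | 32×32 | 0.149           | 0.140          | 1.1×    |
| 1024×1024  | 64×64 | 0.576           | 0.542          | 1.1×    |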
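
The averaging described above can be reproduced with a small timing harness. This is a sketch, not the notebook's actual benchmark code: average_runtime, speedup, and the stand-in workload are illustrative names, and the real measurements would call the Lab 1 and Lab 5 pipelines instead.

```python
import time
from statistics import mean


def average_runtime(func, *args, runs: int = 3) -> float:
    """Call func(*args) `runs` times and return the mean wall-clock seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def speedup(legacy_seconds: float, optimized_seconds: float) -> float:
    """Ratio of legacy time to optimized time; values above 1 mean faster."""
    return legacy_seconds / optimized_seconds


def stand_in_workload(n: int) -> int:
    """Hypothetical workload used only to exercise the harness."""
    return sum(i * i for i in range(n))


elapsed = average_runtime(stand_in_workload, 10_000)
print(f"mean of 3 runs: {elapsed:.6f} s")
print(f"512x512 row speedup: {speedup(0.149, 0.140):.2f}")
```

Speedups recomputed from the rounded times in the table can differ in the last digit from figures derived from unrounded measurements.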

Key optimizations that produced the gains:

  1. Vectorized grid analysis – replaces nested loops with strided block views (built on numpy.lib.stride_tricks) and weighted reductions, eliminating thousands of Python iterations per image.
  2. Vectorized tile matching – stacks the tile bank once, computes LAB/RGB distances with NumPy, and gathers tiles in bulk.
  3. Tile caching – persists Hugging Face tiles to disk (tile_cache/) and reuses them across runs, avoiding repeated dataset downloads and resizing.
  4. Configurable quantization – optional uniform or k-means quantization reduces the color-space variance before tiling.
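
The first two optimizations can be illustrated with a short NumPy sketch. It is written under simplifying assumptions and is not the code in src/: it uses a plain reshape instead of stride tricks, an unweighted mean instead of weighted reductions, and squared RGB distance only (no LAB). The function names cell_means, match_tiles, and assemble are hypothetical.

```python
import numpy as np


def cell_means(image: np.ndarray, grid: int) -> np.ndarray:
    """Mean colour of each grid cell, shape (grid, grid, channels), no Python loops."""
    h, w, c = image.shape
    ch, cw = h // grid, w // grid
    # Crop so the image divides evenly, then give each cell its own pair of axes.
    blocks = image[: ch * grid, : cw * grid].reshape(grid, ch, grid, cw, c)
    return blocks.mean(axis=(1, 3))


def match_tiles(means: np.ndarray, tile_colors: np.ndarray) -> np.ndarray:
    """Index of the nearest tile colour (squared Euclidean distance) per cell."""
    flat = means.reshape(-1, 1, means.shape[-1])                   # (cells, 1, c)
    dists = ((flat - tile_colors[None, :, :]) ** 2).sum(axis=-1)   # (cells, tiles)
    return dists.argmin(axis=1).reshape(means.shape[:2])


def assemble(tiles: np.ndarray, indices: np.ndarray) -> np.ndarray:
    """Gather the chosen tiles in bulk and stitch them into one mosaic array."""
    grid_h, grid_w = indices.shape
    _, th, tw, c = tiles.shape
    picked = tiles[indices]                                        # (gh, gw, th, tw, c)
    return picked.transpose(0, 2, 1, 3, 4).reshape(grid_h * th, grid_w * tw, c)


# Synthetic data so the sketch runs without the tile dataset.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
tiles = rng.integers(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)

# Stack the tile bank once: one mean colour per tile.
tile_colors = tiles.reshape(len(tiles), -1, 3).mean(axis=1)

means = cell_means(image, grid=4)
indices = match_tiles(means, tile_colors)
mosaic = assemble(tiles, indices)
print(mosaic.shape)
```

The distance matrix has one row per cell and one column per tile, so memory grows with grid² × tile count; for very large tile banks, matching in chunks may be preferable.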

Refer to the notebook for raw profiler dumps, bottleneck analysis, and charts illustrating how the optimized pipeline scales more gracefully as grids grow.

4. Deployed Demo

A live Gradio demo is hosted on Hugging Face Spaces:
👉 https://huggingface.co/spaces/Teoman21/Lab-5

The hosted build runs the same app.py entry point, with tiles cached in the Space's storage. Use it for quick testing or to share results without cloning the repo.

5. Repository Map

  • app.py – launches the Gradio interface.
  • src/ – reusable package (mosaic.py, tiles.py, pipeline.py, metrics.py, gradio_interface.py, etc.).
  • tile_cache/ – on-disk cache of Hugging Face tiles (populated at runtime).
  • test_images/ – sample photos for local testing.
  • profiling_analysis.ipynb – notebook covering profiling, benchmarks, and plots.
  • helpers/download_tiles.py – utility to pre-download HF dataset tiles.

6. Support & Notes

  • First run may take longer while tiles download from Hugging Face. Subsequent runs use the cache.
  • If you see dataset download errors, set HF_HOME or edit Config.hf_cache_dir to point at a writable cache folder.
  • The project targets Python 3.10+ and macOS/Linux; Windows should work but has not been profiled extensively.
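
The cache behaviour and the HF_HOME workaround in the notes above can be sketched as follows. This is an illustration under assumptions, not the project's implementation: load_tiles_cached, the tiles.npy file name, and fake_download are hypothetical, and the real cache layout in tile_cache/ may differ. The environment variable is set before any Hugging Face library would be imported, which is the safe order.

```python
import os
import tempfile
from pathlib import Path

import numpy as np

# Point the Hugging Face cache at a writable folder (path chosen for illustration).
os.environ.setdefault("HF_HOME", str(Path(tempfile.gettempdir()) / "hf_home"))


def load_tiles_cached(cache_dir: Path, build_tiles) -> np.ndarray:
    """Return tiles from disk if present; otherwise build them once and persist."""
    cache_file = cache_dir / "tiles.npy"  # hypothetical file name
    if cache_file.exists():
        return np.load(cache_file)
    tiles = build_tiles()
    cache_dir.mkdir(parents=True, exist_ok=True)
    np.save(cache_file, tiles)
    return tiles


calls = []


def fake_download() -> np.ndarray:
    """Stand-in for the slow dataset download and resize step."""
    calls.append(1)
    return np.zeros((4, 8, 8, 3), dtype=np.uint8)


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "tile_cache"
    first = load_tiles_cached(cache_dir, fake_download)   # builds and saves
    second = load_tiles_cached(cache_dir, fake_download)  # served from disk

print(f"download calls: {len(calls)}")
```

Only the first call pays the download cost; the second reads the saved array, which mirrors why subsequent runs start faster.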