---
title: Mosaic Generator
emoji: 🧩
colorFrom: indigo
colorTo: purple
sdk: gradio
app_file: app.py
pinned: false
---

# Lab 5 – Mosaic Generator

A fully refactored and optimized version of the Lab 1 mosaic pipeline. This release adds strict vectorization, caching, profiling evidence, and a polished Gradio front end.

## 1. Installation

```bash
# 1. Create and activate a Python 3.10+ virtual environment
python3 -m venv .venv
source .venv/bin/activate

# 2. Install the project dependencies
pip install --upgrade pip
pip install -r requirements.txt
```

Optional extras:

- `pip install line_profiler` if you want to re-run the profiling notebook.
- `pip install jupyterlab` if you prefer to explore the notebooks interactively.

## 2. Usage

### Run the Gradio App Locally

```bash
cd lab-5
python app.py
```

Visit http://localhost:7860 to upload an image, tweak grid/tile settings, and view the generated mosaic, quality metrics, and timing stats live.

### Programmatic Pipeline Example

```python
from pathlib import Path

from PIL import Image

from src.config import Config
from src.pipeline import MosaicPipeline

# Configure the grid resolution, tile size, output size, and tile cache location.
cfg = Config(
    grid=32,
    tile_size=32,
    out_w=768,
    out_h=768,
    tiles_cache_dir="tile_cache",
)
pipeline = MosaicPipeline(cfg)

# Load the source image as RGB and run the full pipeline.
image = Image.open(Path("test_images/copley.png")).convert("RGB")
results = pipeline.run_full_pipeline(image)

# Create the output folder if it is missing, then save the mosaic.
Path("outputs").mkdir(exist_ok=True)
results["outputs"]["mosaic"].save("outputs/mosaic.png")
print(results["timing"], results["metrics"])
```

### Profiling Notebook

Open `profiling_analysis.ipynb` to reproduce the cProfile / line_profiler runs, before-vs-after timings, and plots used in the assessment.

## 3. Performance Benchmarks (vs Lab 1)

Benchmarks compare the original Lab 1 implementation (“Legacy”) with this optimized Lab 5 pipeline on the same MacBook Pro (M3 Pro, Python 3.11). Each entry averages three runs with cached tiles.

| Image Size | Grid  | Legacy Time (s) | Lab 5 Time (s) | Speedup |
|------------|-------|-----------------|----------------|---------|
| 256×256    | 16×16 | 0.063           | 0.038          | 1.6×    |
| 512×512    | 32×32 | 0.149           | 0.140          | 1.1×    |
| 1024×1024  | 64×64 | 0.576           | 0.542          | 1.1×    |

Key optimizations that produced the gains:

1. **Vectorized grid analysis** – replaces nested loops with a strided block view (`numpy.lib.stride_tricks`) and weighted reductions, eliminating thousands of Python iterations per frame.
2. **Vectorized tile matching** – stacks the tile bank once, computes LAB/RGB distances with NumPy, and gathers tiles in bulk.
3. **Tile caching** – persists Hugging Face tiles to disk (`tile_cache/`) and reuses them across runs, avoiding repeated dataset downloads and resizing.
4. **Configurable quantization** – optional uniform or k-means quantization reduces the color-space variance before tiling.

Refer to the notebook for raw profiler dumps, bottleneck analysis, and charts illustrating how the optimized pipeline scales more gracefully as grids grow.

## 4. Deployed Demo

A live Gradio demo is hosted on Hugging Face Spaces:

👉 https://huggingface.co/spaces/Teoman21/Lab-5

The hosted build runs the same `app.py` entry point, with tiles cached in the Space storage. Use it for quick testing or to share results without cloning the repo.

## 5. Repository Map

- `app.py` – launches the Gradio interface.
- `src/` – reusable package (`mosaic.py`, `tiles.py`, `pipeline.py`, `metrics.py`, `gradio_interface.py`, etc.).
- `tile_cache/` – on-disk cache of Hugging Face tiles (populated at runtime).
- `test_images/` – sample photos for local testing.
- `profiling_analysis.ipynb` – notebook covering profiling, benchmarks, and plots.
- `helpers/download_tiles.py` – utility to pre-download HF dataset tiles.

## 6. Support & Notes

- The first run may take longer while tiles download from Hugging Face. Subsequent runs use the cache.
- If you see dataset download errors, set `HF_HOME` or edit `Config.hf_cache_dir` to point at a writable cache folder.
- The project targets Python 3.10+ and macOS/Linux; Windows should work but has not been profiled extensively.
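
For readers who want a concrete picture of the vectorized grid analysis and tile matching listed under Section 3, the following is a minimal sketch and not the code in `src/`. The function names `cell_means`, `match_tiles`, and `assemble` are hypothetical. The sketch assumes plain RGB arrays, an unweighted mean per cell, and squared Euclidean distance, and it uses a `reshape` in place of a stride-tricks view; the project itself may use weighted reductions and LAB distances.

```python
import numpy as np


def cell_means(image: np.ndarray, grid: int) -> np.ndarray:
    """Average color of each grid cell, computed without Python loops."""
    h, w, c = image.shape
    ch, cw = h // grid, w // grid                # cell height and width in pixels
    cropped = image[: ch * grid, : cw * grid]    # drop any remainder rows/columns
    # Split rows into (grid, ch) and columns into (grid, cw) so each cell has its own axes.
    blocks = cropped.reshape(grid, ch, grid, cw, c)
    return blocks.mean(axis=(1, 3))              # shape: (grid, grid, c)


def match_tiles(cells: np.ndarray, tile_means: np.ndarray) -> np.ndarray:
    """Index of the nearest tile (squared Euclidean distance) for every cell."""
    flat = cells.reshape(-1, 1, cells.shape[-1])                  # (grid*grid, 1, c)
    dists = ((flat - tile_means[None, :, :]) ** 2).sum(axis=-1)   # (grid*grid, n_tiles)
    return dists.argmin(axis=1).reshape(cells.shape[:2])


def assemble(tiles: np.ndarray, indices: np.ndarray) -> np.ndarray:
    """Gather the chosen tiles in bulk and lay them out as one mosaic image."""
    g_h, g_w = indices.shape
    t_h, t_w, c = tiles.shape[1:]
    picked = tiles[indices]                      # (g_h, g_w, t_h, t_w, c)
    # Interleave grid and tile axes, then merge them into image rows and columns.
    return picked.transpose(0, 2, 1, 3, 4).reshape(g_h * t_h, g_w * t_w, c)


# Usage with random data standing in for a photo and a bank of ten 8x8 tiles.
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
tiles = rng.random((10, 8, 8, 3))
cells = cell_means(image, grid=4)
indices = match_tiles(cells, tiles.mean(axis=(1, 2)))
mosaic = assemble(tiles, indices)
print(mosaic.shape)  # (32, 32, 3)
```

The distance matrix in `match_tiles` has one row per cell and one column per tile, so its memory use grows with grid size times tile-bank size; for very large banks, processing the cells in chunks would be one way to bound that.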