# Compilation Guide: Gemma 3 270M-it for WebGPU & Mobile

This document details the step-by-step process used to compile `google/gemma-3-270m-it` for WebGPU and mobile platforms using MLC-LLM.

## Prerequisites

* **Operating System**: macOS (Apple Silicon recommended for performance) or Linux.
* **Python**: 3.11 or 3.12 (Managed via `venv`).
* **MLC-LLM**: Nightly build (necessary for Gemma 3 support).
* **Emscripten**: Required for WebGPU (WASM) compilation.
* **Vulkan SDK**: Required for Android compilation testing (optional for build).

## 1. Environment Setup

We utilized a Python virtual environment and installed the specific nightly wheels for macOS.

```bash
# Create and activate environment
python3 -m venv venv
source venv/bin/activate

# Install MLC-LLM Nightly (Verify latest instructions on mlc.ai)
pip install --pre --force-reinstall mlc-llm-nightly-cpu mlc-ai-nightly-cpu \
  -f https://mlc.ai/wheels
```

## 2. Model Download

We used a custom script (`setup_gemma.py`) to download the model from Hugging Face.

* **Source**: `google/gemma-3-270m-it`
* **Authentication**: Requires `HF_TOKEN` environment variable.

## 3. Configuration Generation

Standard generation fails because Gemma 3 IT is multimodal. We generated a **text-only** configuration by manually stripping vision-related fields from the config.

**Command used:**

```bash
python -m mlc_llm gen_config ./models/gemma-3-270m-it \
  --quantization q4f16_1 \
  --conv-template gemma_instruction \
  --output ./dist/gemma-3-270m-it-mlc
```

**Modifications:**

* Ensured `is_text_model: true` in `mlc-chat-config.json`.
* Removed `vision_config` and image processing parameters.

## 4. WebGPU Compilation (The Hard Part)

Compiling for WebGPU requires **Emscripten** and building the TVM runtime from source, as the pip packages do not contain the necessary bitcode libraries (`wasm_runtime.bc`).

### Step 4a: Install Emscripten

We automated this with `install_emscripten.sh`.
```bash
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest
./emsdk activate latest
source ./emsdk_env.sh
```

### Step 4b: Build Runtime Libraries

We cloned the `mlc-llm` source and built the required `.bc` files in a temporary workspace to avoid problems with spaces in the project path.

* **Artifacts Built**: `wasm_runtime.bc`, `tvmjs_support.bc`, `webgpu_runtime.bc`, `mlc_wasm_runtime.bc`.
* **Destination**: Installed into `venv/lib/python3.12/site-packages/tvm/` and into `dist/wasm/` in the source tree.

### Step 4c: Compile Model

Using `compile_webgpu.py`:

```bash
python -m mlc_llm compile ./dist/gemma-3-270m-it-mlc/mlc-chat-config.json \
  --device webgpu \
  --opt O3 \
  --output ./dist/libs/gemma-3-270m-it-webgpu.wasm
```

## 5. Mobile Compilation

For iOS and Android, we used the standard `mlc_llm compile` command with the respective targets.

* **iOS**: `--device iphone` -> `gemma-3-270m-ios.tar`
* **Android**: `--device android` -> `gemma-3-270m-android.tar`

## Troubleshooting

### Emscripten & SSL on macOS

**Issue**: `curl` failed during `emsdk install` with SSL errors.

**Fix**: Unset `SSL_CERT_FILE` before running the installation.

```bash
unset SSL_CERT_FILE
```

### Missing Runtime Libraries (`Cannot find library: ...bc`)

**Issue**: The default pip install is minimal and lacks the WASM support libraries.

**Fix**: Build the `mlc-llm` runtime from source (using `build_webgpu_runtime.sh`) and install the resulting `.bc` files into the installed Python package. This assumes the installed package layout matches the one the build script expects.

### Path Spaces

**Issue**: Projects in folders with spaces ("Gemma 3 270m") break `make` and `clang`.

**Fix**: Build scripts were updated to move sources to `/tmp` for the compilation phase.

## File Structure

The final release package structure:

* `gemma-3-270m-it-mlc/`: The generic configuration and weights.
* `libs/`: Platform-specific compiled binaries/WASM.
* `README.md`: Documentation.
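
The layout above can be checked before publishing. The following is a minimal sketch rather than part of the original tooling: the function name `missing_release_entries` is hypothetical, the release root is whatever directory you pass in, and the presence of `mlc-chat-config.json` inside `gemma-3-270m-it-mlc/` is inferred from the path used in the compile command.

```python
from pathlib import Path


def missing_release_entries(root):
    """Return the expected release entries that are absent under `root`.

    Hypothetical helper: the expected entries mirror the file structure
    listed in this guide; the release root itself is an assumption.
    """
    root = Path(root)
    model_dir = root / "gemma-3-270m-it-mlc"
    # Map each expected entry to whether it is present on disk.
    expected = {
        "gemma-3-270m-it-mlc/": model_dir.is_dir(),
        "gemma-3-270m-it-mlc/mlc-chat-config.json": (
            model_dir / "mlc-chat-config.json"
        ).is_file(),
        "libs/": (root / "libs").is_dir(),
        "README.md": (root / "README.md").is_file(),
    }
    # Dicts preserve insertion order, so results follow the listing order.
    return [name for name, present in expected.items() if not present]
```

Calling `missing_release_entries("path/to/release")` returns an empty list when every listed entry exists. It does not inspect the contents of `libs/`, since the exact set of platform binaries depends on which targets were compiled.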