# Compilation Guide: Gemma 3 270M-it for WebGPU & Mobile

This document details the step-by-step process used to compile `google/gemma-3-270m-it` for WebGPU and mobile platforms using MLC-LLM.

## Prerequisites

* **Operating System**: macOS (Apple Silicon recommended for performance) or Linux.
* **Python**: 3.11 or 3.12 (managed via `venv`).
* **MLC-LLM**: Nightly build (necessary for Gemma 3 support).
* **Emscripten**: Required for WebGPU (WASM) compilation.
* **Vulkan SDK**: Needed only for testing Android builds; optional for the compilation itself.

## 1. Environment Setup

We used a Python virtual environment and installed the nightly wheels for macOS.

```bash
# Create and activate environment
python3 -m venv venv
source venv/bin/activate

# Install MLC-LLM Nightly (verify the latest instructions on mlc.ai)
pip install --pre --force-reinstall mlc-llm-nightly-cpu mlc-ai-nightly-cpu \
    -f https://mlc.ai/wheels
```

## 2. Model Download

We used a custom script (`setup_gemma.py`) to download the model from Hugging Face.

* **Source**: `google/gemma-3-270m-it`
* **Authentication**: Requires the `HF_TOKEN` environment variable.

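The original `setup_gemma.py` is not reproduced in this guide. As a rough sketch of what such a script might look like, the following assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function. The function names `resolve_token` and `download_model` are hypothetical, and the local directory is an assumption chosen to match the path passed to `gen_config` in the next section.

```python
"""Hypothetical sketch of a download script; not the original setup_gemma.py."""
import os
from pathlib import Path

REPO_ID = "google/gemma-3-270m-it"
# Assumed local directory, chosen to match the path passed to gen_config.
LOCAL_DIR = Path("./models/gemma-3-270m-it")


def resolve_token(env=None):
    """Return the Hugging Face token from HF_TOKEN, or raise if it is unset."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it before downloading.")
    return token


def download_model(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Download a full snapshot of the model repository into local_dir."""
    # Imported lazily so the token check works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        local_dir=str(local_dir),
        token=resolve_token(),
    )
```

Calling `download_model()` with `HF_TOKEN` exported should return the local snapshot path. Checking the token first gives a clear error before any network request is made.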
## 3. Configuration Generation

Standard config generation fails because Gemma 3 IT is multimodal. We produced a **text-only** configuration by manually stripping the vision-related fields from the config.

**Command used:**
```bash
python -m mlc_llm gen_config ./models/gemma-3-270m-it \
    --quantization q4f16_1 \
    --conv-template gemma_instruction \
    --output ./dist/gemma-3-270m-it-mlc
```

**Modifications:**
* Ensured `is_text_model: true` in `mlc-chat-config.json`.
* Removed `vision_config` and image processing parameters.

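The manual edits listed above could also be scripted. The sketch below is one possible way to do it, not the procedure actually used: `strip_vision_fields` and `rewrite_config_file` are hypothetical helpers, and because the guide does not name the image processing parameters, they are passed in through a hypothetical `extra_keys` argument. The helper applies both changes to whatever dictionary it receives; which file needs which change follows the list above.

```python
"""Hypothetical helper for producing a text-only config; key names beyond
`vision_config` and `is_text_model` must be supplied by the caller."""
import copy
import json
from pathlib import Path


def strip_vision_fields(config, extra_keys=()):
    """Return a copy of config without vision fields and with is_text_model set."""
    cleaned = copy.deepcopy(config)
    for key in ("vision_config", *extra_keys):
        cleaned.pop(key, None)  # Missing keys are ignored.
    cleaned["is_text_model"] = True
    return cleaned


def rewrite_config_file(path, extra_keys=()):
    """Apply strip_vision_fields to a JSON file in place."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))
    path.write_text(
        json.dumps(strip_vision_fields(config, extra_keys), indent=2) + "\n",
        encoding="utf-8",
    )
```

Working on a deep copy leaves the input dictionary untouched, so the original values remain available for comparison if the result needs debugging.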
## 4. WebGPU Compilation (The Hard Part)

Compiling for WebGPU requires **Emscripten** and a TVM runtime built from source, because the pip packages do not contain the necessary bitcode libraries (such as `wasm_runtime.bc`).

### Step 4a: Install Emscripten
We automated this with `install_emscripten.sh`.
```bash
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest
./emsdk activate latest
source ./emsdk_env.sh
```

### Step 4b: Build Runtime Libraries
We cloned the `mlc-llm` source and built the required `.bc` files in a temporary workspace to work around spaces in the project path.
* **Artifacts Built**: `wasm_runtime.bc`, `tvmjs_support.bc`, `webgpu_runtime.bc`, `mlc_wasm_runtime.bc`.
* **Destination**: Installed into `venv/lib/python3.12/site-packages/tvm/` and the `dist/wasm/` directory of the source checkout.

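The install step amounts to copying the four artifacts into each destination. The sketch below illustrates that copy under stated assumptions: `install_bitcode` is a hypothetical helper, not part of MLC-LLM or of the build script used here, and the caller supplies the build and destination directories.

```python
"""Hypothetical sketch of the copy step; not the original build script."""
import shutil
from pathlib import Path

# Artifact names as listed in Step 4b.
BITCODE_FILES = (
    "wasm_runtime.bc",
    "tvmjs_support.bc",
    "webgpu_runtime.bc",
    "mlc_wasm_runtime.bc",
)


def install_bitcode(build_dir, dest_dirs):
    """Copy every expected .bc file from build_dir into each destination.

    Raises FileNotFoundError before copying anything if an artifact is missing.
    """
    build_dir = Path(build_dir)
    missing = [name for name in BITCODE_FILES if not (build_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing build artifacts: {', '.join(missing)}")

    copied = []
    for dest in map(Path, dest_dirs):
        dest.mkdir(parents=True, exist_ok=True)
        for name in BITCODE_FILES:
            copied.append(Path(shutil.copy2(build_dir / name, dest / name)))
    return copied
```

Checking for all four files up front avoids a partial install, which would otherwise surface later as a `Cannot find library` error at compile time.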
### Step 4c: Compile Model
Using `compile_webgpu.py`:
```bash
python -m mlc_llm compile ./dist/gemma-3-270m-it-mlc/mlc-chat-config.json \
    --device webgpu \
    --opt O3 \
    --output ./dist/libs/gemma-3-270m-it-webgpu.wasm
```

## 5. Mobile Compilation

For iOS and Android, we used the standard `mlc_llm compile` command with the respective targets.

* **iOS**: `--device iphone`, producing `gemma-3-270m-ios.tar`
* **Android**: `--device android`, producing `gemma-3-270m-android.tar`

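The exact mobile command lines are not recorded in this guide. As an illustration only, the sketch below builds one `mlc_llm compile` invocation per mobile target, assuming they reuse the config path and `--opt O3` from Step 4c and write into `./dist/libs`. Both of those are assumptions, and `build_compile_command` and `mobile_commands` are hypothetical helpers.

```python
"""Hypothetical sketch; the output directory and --opt level are assumptions."""
import shlex
import sys

CONFIG = "./dist/gemma-3-270m-it-mlc/mlc-chat-config.json"

# Device flag -> output file name, as listed above.
MOBILE_TARGETS = {
    "iphone": "gemma-3-270m-ios.tar",
    "android": "gemma-3-270m-android.tar",
}


def build_compile_command(device, output, config=CONFIG, opt="O3"):
    """Return the argv list for one `mlc_llm compile` invocation."""
    return [
        sys.executable, "-m", "mlc_llm", "compile", config,
        "--device", device,
        "--opt", opt,
        "--output", output,
    ]


def mobile_commands(out_dir="./dist/libs"):
    """Build one command per mobile target without running anything."""
    return {
        device: build_compile_command(device, f"{out_dir}/{name}")
        for device, name in MOBILE_TARGETS.items()
    }


# Print the commands for review instead of executing them.
for device, argv in mobile_commands().items():
    print(device, "->", shlex.join(argv))
```

To run a build, pass one of the argument lists to `subprocess.run(argv, check=True)`. Using `sys.executable` keeps the invocation inside the active virtual environment.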
## Troubleshooting

### Emscripten & SSL on macOS
**Issue**: `curl` failed during `emsdk install` with SSL errors.

**Fix**: Unset `SSL_CERT_FILE` before running the installation.
```bash
unset SSL_CERT_FILE
```

### Missing Runtime Libraries (`Cannot find library: ...bc`)
**Issue**: The default pip install is minimal and lacks the WASM support libraries.

**Fix**: You must build the `mlc-llm` runtime from source (using `build_webgpu_runtime.sh`) and install the resulting `.bc` files as described in Step 4b. This relies on the installed Python package layout matching the one the source build expects.

### Path Spaces
**Issue**: Projects in folders whose names contain spaces (e.g. "Gemma 3 270m") break `make` and `clang`.

**Fix**: Build scripts were updated to move sources to `/tmp` for the compilation phase.

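The build scripts themselves are not shown here. A minimal sketch of the same idea follows, assuming a copy rather than a move so the original tree stays in place; `stage_without_spaces` is a hypothetical helper and the `mlc_build_` prefix is arbitrary.

```python
"""Hypothetical sketch of the staging step; not the original build scripts."""
import shutil
import tempfile
from pathlib import Path


def stage_without_spaces(src, tmp_root="/tmp"):
    """Copy src into a fresh, space-free directory under tmp_root and return it."""
    src = Path(src).resolve()
    # mkdtemp appends only random letters, digits, and underscores to the prefix.
    workspace = Path(tempfile.mkdtemp(prefix="mlc_build_", dir=tmp_root))
    staged = workspace / src.name.replace(" ", "_")
    if " " in str(staged):
        raise ValueError(f"staged path would still contain a space: {staged}")
    shutil.copytree(src, staged)
    return staged
```

Running `make` from the returned directory keeps every path handed to `clang` free of spaces. The caller is responsible for copying artifacts back and removing the workspace afterwards.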
## File Structure

The final release package has the following structure:
* `gemma-3-270m-it-mlc/`: The generic configuration and weights.
* `libs/`: Platform-specific compiled binaries/WASM.
* `README.md`: Documentation.