Compilation Guide: Gemma 3 270M-it for WebGPU & Mobile
This document details the step-by-step process used to compile google/gemma-3-270m-it for WebGPU and mobile platforms using MLC-LLM.
Prerequisites
- Operating System: macOS (Apple Silicon recommended for performance) or Linux.
- Python: 3.11 or 3.12 (managed via venv).
- MLC-LLM: Nightly build (necessary for Gemma 3 support).
- Emscripten: Required for WebGPU (WASM) compilation.
- Vulkan SDK: Required for Android compilation testing (optional for build).
1. Environment Setup
We utilized a Python virtual environment and installed the specific nightly wheels for macOS.
# Create and activate environment
python3 -m venv venv
source venv/bin/activate
# Install MLC-LLM Nightly (Verify latest instructions on mlc.ai)
pip install --pre --force-reinstall mlc-llm-nightly-cpu mlc-ai-nightly-cpu \
-f https://mlc.ai/wheels
2. Model Download
We used a custom script (setup_gemma.py) to download the model from Hugging Face.
- Source: google/gemma-3-270m-it
- Authentication: Requires the HF_TOKEN environment variable.
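The download step can be sketched with `huggingface_hub` (the actual setup_gemma.py is not reproduced in this guide, so the structure below is an assumption; the repo ID and local path match the commands used elsewhere in this document):

```python
import os

REPO_ID = "google/gemma-3-270m-it"
LOCAL_DIR = "./models/gemma-3-270m-it"  # matches the path passed to gen_config

def download_kwargs() -> dict:
    """Build the arguments for snapshot_download; fail fast if HF_TOKEN is unset."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("Set HF_TOKEN to a Hugging Face token with Gemma access")
    return {"repo_id": REPO_ID, "local_dir": LOCAL_DIR, "token": token}

if __name__ == "__main__":
    # Imported lazily so the helper above can be used without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    snapshot_download(**download_kwargs())
```

Gemma is a gated model, so the token must belong to an account that has accepted the license on Hugging Face.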
3. Configuration Generation
Standard generation fails because Gemma 3 IT is multimodal. We generated a text-only configuration by manually stripping vision-related fields from the config.
Command used:
python -m mlc_llm gen_config ./models/gemma-3-270m-it \
--quantization q4f16_1 \
--conv-template gemma_instruction \
--output ./dist/gemma-3-270m-it-mlc
Modifications:
- Ensured is_text_model: true in mlc-chat-config.json.
- Removed vision_config and image-processing parameters.
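The manual stripping can be expressed as a small script. Only `vision_config` and `is_text_model` come from this guide; the other key names in `VISION_KEYS` are assumed examples of image-processing parameters:

```python
import json
from pathlib import Path

# vision_config is named in this guide; the image_* keys are assumed examples
# of related multimodal parameters that a text-only build does not need.
VISION_KEYS = {"vision_config", "image_size", "image_token_index"}

def make_text_only(config: dict) -> dict:
    """Return a copy of the config with vision fields removed and text mode forced."""
    cleaned = {k: v for k, v in config.items() if k not in VISION_KEYS}
    cleaned["is_text_model"] = True
    return cleaned

def rewrite(path: str) -> None:
    """Rewrite an mlc-chat-config.json in place as text-only."""
    p = Path(path)
    p.write_text(json.dumps(make_text_only(json.loads(p.read_text())), indent=2))
```

Running `rewrite("./dist/gemma-3-270m-it-mlc/mlc-chat-config.json")` after `gen_config` applies both modifications in one step.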
4. WebGPU Compilation (The Hard Part)
Compiling for WebGPU requires Emscripten and building the TVM runtime from source, as the pip packages do not contain the necessary bitcode libraries (wasm_runtime.bc).
Step 4a: Install Emscripten
We automated this with install_emscripten.sh.
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest
./emsdk activate latest
source ./emsdk_env.sh
Step 4b: Build Runtime Libraries
We cloned mlc-llm source and built the required .bc files using a temporary workspace to handle path spacing issues.
- Artifacts built: wasm_runtime.bc, tvmjs_support.bc, webgpu_runtime.bc, mlc_wasm_runtime.bc.
- Destination: Installed into venv/lib/python3.12/site-packages/tvm/ and the source tree's dist/wasm/.
Step 4c: Compile Model
Using compile_webgpu.py:
python -m mlc_llm compile ./dist/gemma-3-270m-it-mlc/mlc-chat-config.json \
--device webgpu \
--opt O3 \
--output ./dist/libs/gemma-3-270m-it-webgpu.wasm
5. Mobile Compilation
For iOS and Android, we used the standard mlc_llm compile command with respective targets.
- iOS: --device iphone -> gemma-3-270m-ios.tar
- Android: --device android -> gemma-3-270m-android.tar
Troubleshooting
Emscripten & SSL on macOS
Issue: curl failed during emsdk install with SSL errors.
Fix: Unset SSL_CERT_FILE before running installation.
unset SSL_CERT_FILE
Missing Runtime Libraries (Cannot find library: ...bc)
Issue: The default pip install is minimal and lacks WASM support libraries.
Fix: You MUST build the mlc-llm runtime from source (using build_webgpu_runtime.sh) and install the resulting .bc files into the Python package directory; verify the installed package layout matches what the build script expects before copying.
Path Spaces
Issue: Projects in folders with spaces ("Gemma 3 270m") break make and clang.
Fix: Build scripts were updated to move sources to /tmp for the compilation phase.
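The workaround the build scripts apply can be sketched as copying the sources into a fresh, space-free temp directory before invoking make (the helper name here is hypothetical):

```python
import shutil
import tempfile
from pathlib import Path

def space_free_workspace(src: str) -> Path:
    """Copy src into a fresh temp directory and return the space-free copy.

    tempfile.mkdtemp places the workspace under the system temp root
    (/tmp on macOS and Linux), and the leaf name is sanitized so a
    project folder like "Gemma 3 270m" no longer breaks make or clang.
    """
    work = Path(tempfile.mkdtemp(prefix="mlc_build_"))
    dest = work / Path(src).name.replace(" ", "_")
    shutil.copytree(src, dest)
    return dest
```

Build outputs then need to be copied back (or the output path pointed at the original project), since the temp workspace is disposable.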
File Structure
The final release package structure:
- gemma-3-270m-it-mlc/: The generic configuration and weights.
- libs/: Platform-specific compiled binaries/WASM.
- README.md: Documentation.