# Compilation Guide: Gemma 3 270M-it for WebGPU & Mobile

This document details the step-by-step process used to compile `google/gemma-3-270m-it` for WebGPU and mobile platforms using MLC-LLM.

## Prerequisites

*   **Operating System**: macOS (Apple Silicon recommended for performance) or Linux.
*   **Python**: 3.11 or 3.12 (managed via `venv`).
*   **MLC-LLM**: Nightly build (necessary for Gemma 3 support).
*   **Emscripten**: Required for WebGPU (WASM) compilation.
*   **Vulkan SDK**: Needed only to test Android builds locally; not required for the compilation itself.

## 1. Environment Setup

We used a Python virtual environment and installed the MLC-LLM nightly wheels for macOS.

```bash
# Create and activate environment
python3 -m venv venv
source venv/bin/activate

# Install MLC-LLM Nightly (Verify latest instructions on mlc.ai)
pip install --pre --force-reinstall mlc-llm-nightly-cpu mlc-ai-nightly-cpu \
    -f https://mlc.ai/wheels
```
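
After installation, a quick sanity check can confirm that the interpreter version matches the prerequisites and that the packages are importable. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the wheels expose the `mlc_llm` and `tvm` modules referenced elsewhere in this guide.

```python
import importlib.util
import sys

# Python versions listed under Prerequisites.
SUPPORTED = {(3, 11), (3, 12)}


def python_supported(version=None):
    """Return True if (major, minor) is one of the versions this guide targets."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in SUPPORTED


def missing_modules(names=("mlc_llm", "tvm")):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("python ok:", python_supported())
    print("missing modules:", missing_modules() or "none")
```

Run it inside the activated `venv`; an empty "missing modules" list suggests the nightly wheels were installed into the environment you expect.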

## 2. Model Download

We used a custom script (`setup_gemma.py`) to download the model from Hugging Face.

*   **Source**: `google/gemma-3-270m-it`
*   **Authentication**: Requires the `HF_TOKEN` environment variable, because the model repository is gated.
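
`setup_gemma.py` itself is not reproduced here. A minimal sketch of what such a script might look like, assuming the `huggingface_hub` package and its `snapshot_download` function, is shown below. The function names and the local directory (chosen to match the path passed to `gen_config` in the next step) are our assumptions, not the original script.

```python
import os

REPO_ID = "google/gemma-3-270m-it"      # source repository named above
LOCAL_DIR = "./models/gemma-3-270m-it"  # assumption: path later passed to gen_config


def read_token(env=os.environ):
    """Return the Hugging Face token, or raise if HF_TOKEN is unset or empty."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; the Gemma repository is gated.")
    return token


def download(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Download a snapshot of the model repository into local_dir."""
    # Imported lazily so the token check works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir, token=read_token())
```

Calling `download()` with `HF_TOKEN` exported should leave the checkpoint under `./models/gemma-3-270m-it`. Failing early on a missing token avoids a less obvious authorization error from the Hub.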

## 3. Configuration Generation

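Running `gen_config` with default settings failed in our setup, which we attributed to the multimodal (vision) fields in the Gemma 3 IT configuration. We worked around this by producing a **text-only** configuration, manually stripping the vision-related fields from the config.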

**Command used:**
```bash
python -m mlc_llm gen_config ./models/gemma-3-270m-it \
    --quantization q4f16_1 \
    --conv-template gemma_instruction \
    --output ./dist/gemma-3-270m-it-mlc
```

**Modifications:**
*   Ensured `is_text_model: true` in `mlc-chat-config.json`.
*   Removed `vision_config` and image processing parameters.
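
The manual edits can also be scripted. The following sketch is hypothetical: only `vision_config` and `is_text_model` come from this guide, the helper names are ours, and any further image-processing keys must be added to `VISION_KEYS` after inspecting your own config file. It works on any JSON config, since the guide does not pin down every affected file.

```python
import json
from pathlib import Path

# Only `vision_config` is named in this guide; extend the tuple with whatever
# image-processing keys your config actually contains.
VISION_KEYS = ("vision_config",)


def to_text_only(config, vision_keys=VISION_KEYS):
    """Return a copy of config without the vision keys and with is_text_model set."""
    cleaned = {key: value for key, value in config.items() if key not in vision_keys}
    cleaned["is_text_model"] = True
    return cleaned


def rewrite_config(path, vision_keys=VISION_KEYS):
    """Apply to_text_only to a JSON config file in place."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))
    cleaned = to_text_only(config, vision_keys)
    path.write_text(json.dumps(cleaned, indent=2) + "\n", encoding="utf-8")
```

For example, `rewrite_config("./dist/gemma-3-270m-it-mlc/mlc-chat-config.json")` would apply both modifications to the generated chat config. Keeping the pure function separate from the file I/O makes the transformation easy to check before touching the file.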

## 4. WebGPU Compilation (The Hard Part)

Compiling for WebGPU requires **Emscripten** and building the TVM runtime from source, as the pip packages do not contain the necessary bitcode libraries (`wasm_runtime.bc`).

### Step 4a: Install Emscripten
We automated this with `install_emscripten.sh`.
```bash
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest
./emsdk activate latest
source ./emsdk_env.sh
```
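
Because `emsdk_env.sh` only affects the shell that sourced it, it is worth confirming that the Emscripten tools are visible before starting a long build. A small illustrative check follows; the helper name is hypothetical, and `emcc` and `emar` are the standard Emscripten compiler and archiver executables.

```python
import shutil


def missing_tools(tools=("emcc", "emar")):
    """Return the tools from `tools` that are not on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("not on PATH:", ", ".join(missing), "- did you source emsdk_env.sh?")
    else:
        print("Emscripten tools found")
```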

### Step 4b: Build Runtime Libraries
We cloned the `mlc-llm` source and built the required `.bc` files in a temporary workspace, because spaces in the project path break the build (see Path Spaces under Troubleshooting).
*   **Artifacts Built**: `wasm_runtime.bc`, `tvmjs_support.bc`, `webgpu_runtime.bc`, `mlc_wasm_runtime.bc`.
*   **Destination**: Copied into `venv/lib/python3.12/site-packages/tvm/` and into `dist/wasm/` in the source tree.
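
Before compiling the model, it can help to verify that all four artifacts actually landed in the destination. The sketch below is a hypothetical helper, not part of our build scripts; the file names are taken from the list above.

```python
from pathlib import Path

# Artifact names as listed in Step 4b.
RUNTIME_ARTIFACTS = (
    "wasm_runtime.bc",
    "tvmjs_support.bc",
    "webgpu_runtime.bc",
    "mlc_wasm_runtime.bc",
)


def missing_artifacts(directory, names=RUNTIME_ARTIFACTS):
    """Return the artifact names that are not regular files inside directory."""
    root = Path(directory)
    return [name for name in names if not (root / name).is_file()]
```

For instance, `missing_artifacts("venv/lib/python3.12/site-packages/tvm")` should return an empty list once the runtime has been built and copied. Adjust the path if your environment uses a different Python version.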

### Step 4c: Compile Model
We ran the compile step through `compile_webgpu.py`, which issues the following command:
```bash
python -m mlc_llm compile ./dist/gemma-3-270m-it-mlc/mlc-chat-config.json \
    --device webgpu \
    --opt O3 \
    --output ./dist/libs/gemma-3-270m-it-webgpu.wasm
```
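
`compile_webgpu.py` is not reproduced in this guide. A minimal sketch of such a wrapper, using only the flags shown above, might look like the following. The function names are hypothetical, and the real script may do more (for example, checking the runtime libraries first).

```python
import subprocess
import sys

CONFIG = "./dist/gemma-3-270m-it-mlc/mlc-chat-config.json"
OUTPUT = "./dist/libs/gemma-3-270m-it-webgpu.wasm"


def compile_command(config=CONFIG, output=OUTPUT, device="webgpu", opt="O3"):
    """Build the argv list for `python -m mlc_llm compile`."""
    return [
        sys.executable, "-m", "mlc_llm", "compile", config,
        "--device", device,
        "--opt", opt,
        "--output", output,
    ]


def run_compile(**kwargs):
    """Run the compile step; raises CalledProcessError if it fails."""
    subprocess.run(compile_command(**kwargs), check=True)
```

Using `sys.executable` keeps the call inside the active virtual environment, so the `mlc_llm` that gets invoked is the one with the runtime libraries installed.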

## 5. Mobile Compilation

For iOS and Android, we used the standard `mlc_llm compile` command with the respective `--device` targets.

*   **iOS**: `--device iphone` -> `gemma-3-270m-ios.tar`
*   **Android**: `--device android` -> `gemma-3-270m-android.tar`
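
The exact mobile commands are not listed above. The sketch below generates one plausible command per target from the device flags and archive names given in the list. It assumes the same config path as the WebGPU build and that the archives are written to `./dist/libs`; both are our assumptions, as is the helper name.

```python
import shlex

CONFIG = "./dist/gemma-3-270m-it-mlc/mlc-chat-config.json"
LIBS_DIR = "./dist/libs"  # assumption: same output directory as the WebGPU build

# Device flag -> archive name, as listed above.
MOBILE_TARGETS = {
    "iphone": "gemma-3-270m-ios.tar",
    "android": "gemma-3-270m-android.tar",
}


def mobile_commands(config=CONFIG, libs_dir=LIBS_DIR):
    """Return one shell-quoted compile command per mobile target."""
    return {
        device: shlex.join([
            "python", "-m", "mlc_llm", "compile", config,
            "--device", device,
            "--output", f"{libs_dir}/{archive}",
        ])
        for device, archive in MOBILE_TARGETS.items()
    }


if __name__ == "__main__":
    for command in mobile_commands().values():
        print(command)
```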

## Troubleshooting

### Emscripten & SSL on macOS
**Issue**: `curl` failed during `emsdk install` with SSL errors.
**Fix**: Unset `SSL_CERT_FILE` before running installation.
```bash
unset SSL_CERT_FILE
```
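
If the installation is driven from a Python script rather than an interactive shell, the same fix can be scoped to the child process so the rest of the environment is left untouched. This is a sketch under that assumption; the helper names and the `emsdk` directory default are hypothetical.

```python
import os
import subprocess


def env_without(*names, base=None):
    """Return a copy of the environment with the given variables removed."""
    env = dict(os.environ if base is None else base)
    for name in names:
        env.pop(name, None)
    return env


def install_emsdk(emsdk_dir="emsdk"):
    """Run `./emsdk install latest` with SSL_CERT_FILE removed for that process only."""
    subprocess.run(
        ["./emsdk", "install", "latest"],
        cwd=emsdk_dir,
        env=env_without("SSL_CERT_FILE"),
        check=True,
    )
```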

### Missing Runtime Libraries (`Cannot find library: ...bc`)
**Issue**: The default pip install is minimal and lacks WASM support libraries.
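**Fix**: You MUST build the `mlc-llm` runtime from source (using `build_webgpu_runtime.sh`) and copy the resulting `.bc` files into the installed `tvm` package. This assumes the layout of the installed Python package matches what the script expects, so verify the destination path for your Python version.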

### Path Spaces
**Issue**: Project paths that contain spaces (e.g. `Gemma 3 270m`) break `make` and `clang`.
**Fix**: The build scripts were updated to move the sources to `/tmp` for the compilation phase.
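
The staging idea can be sketched as follows. This is not our actual build script: the helper names are hypothetical, it copies rather than moves the sources, and it relies on the system temporary directory (typically `/tmp`) having no spaces in its path.

```python
import shutil
import tempfile
from pathlib import Path


def has_spaces(path):
    """Return True if the resolved path contains a space anywhere."""
    return " " in str(Path(path).resolve())


def stage_sources(source_dir, prefix="mlc_build_"):
    """Copy source_dir into a fresh temporary directory and return the copy's path."""
    staging_root = Path(tempfile.mkdtemp(prefix=prefix))
    staged = staging_root / "src"
    shutil.copytree(source_dir, staged)
    return staged
```

A build script would call `stage_sources(project_dir)` only when `has_spaces(project_dir)` is true, run `make` inside the returned directory, and then copy the artifacts back.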

## File Structure

The final release package is structured as follows:
*   `gemma-3-270m-it-mlc/`: Platform-independent configuration and weights.
*   `libs/`: Platform-specific compiled binaries and WASM.
*   `README.md`: Documentation.