# FunctionGemma Fine-tuned Model for WebLLM

This model can be used with [WebLLM](https://github.com/mlc-ai/web-llm).

## Model Information

- Base Model: google/functiongemma-270m-it
- LoRA Adapter: 2796gauravc/functiongemma-physics-game-lora
- Quantization: q4f16_1

## Usage with WebLLM

Since compiling a model library to WASM requires building from source, you can use this model with the pre-compiled Gemma WASM library from WebLLM:

```javascript
import * as webllm from "@mlc-ai/web-llm";

const appConfig = {
  model_list: [
    {
      model: "https://huggingface.co/2796gauravc/functiongemma-mlc",
      model_id: "functiongemma-physics",
      // Use the official Gemma WASM library (compatible with this model)
      model_lib: "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/main/gemma-2b-it-q4f16_1-ctx4k_cs1k-webgpu.wasm",
    },
  ],
};

const engine = await webllm.CreateMLCEngine("functiongemma-physics", {
  appConfig,
});

const response = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```

## Alternative: Use Ollama for Local Testing

For local CPU/GPU inference outside the browser, convert the original Hugging Face checkpoint (with the LoRA adapter merged in; the MLC weight shards in this repo are not directly convertible) to GGUF:

```bash
# Clone llama.cpp, which ships the GGUF conversion script
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt
# Convert a merged Hugging Face checkpoint (path is a placeholder),
# then use the resulting .gguf file with Ollama or llama.cpp
python llama.cpp/convert_hf_to_gguf.py path/to/merged-model --outfile functiongemma.gguf
```

A Modelfile sketch for loading the result into Ollama is included at the end of this card.

## Files in This Repo

- `params_shard_*.bin`: Model weights in MLC format
- `mlc-chat-config.json`: Model configuration
- `tokenizer.json`: Tokenizer
- `tokenizer_config.json`: Tokenizer configuration

## Note on WASM Compilation

Compiling a custom WASM library requires building MLC-LLM from source with Emscripten, which takes 1-2 hours. For most use cases, the official Gemma WASM library is sufficient and fully compatible.
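
## Example: Function Calling with WebLLM

Since the adapter was fine-tuned for function calling in a physics game, a tool-calling request is sketched below, reusing the `engine` from the snippet above. This is a minimal sketch under assumptions: the `set_gravity` tool schema and its parameters are hypothetical placeholders, and it assumes WebLLM's OpenAI-compatible `tools` field works with this model.

```javascript
// Hypothetical tool schema; the name and parameters are illustrative placeholders.
const tools = [
  {
    type: "function",
    function: {
      name: "set_gravity",
      description: "Set the gravity strength in the physics simulation",
      parameters: {
        type: "object",
        properties: {
          strength: { type: "number", description: "Gravity in m/s^2" },
        },
        required: ["strength"],
      },
    },
  },
];

// Assumes this model honors WebLLM's OpenAI-compatible `tools` field.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Make gravity twice as strong." }],
  tools,
});

// Any tool calls arrive on the first choice's message.
console.log(reply.choices[0].message.tool_calls);
```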
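
## Example: Loading the GGUF into Ollama

As a follow-up to the Ollama route above, the sketch below registers a converted GGUF file with Ollama. The `functiongemma.gguf` filename is a placeholder carried over from the conversion step.

```bash
# Write a minimal Modelfile pointing at the converted GGUF (filename is a placeholder)
cat > Modelfile <<'EOF'
FROM ./functiongemma.gguf
EOF

# Register the model with Ollama and run a quick local test
ollama create functiongemma -f Modelfile
ollama run functiongemma "Hello!"
```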