---
license: apache-2.0
language:
- en
library_name: transformers.js
tags:
- code
- python
- maincoder
- code-generation
- reinforcement-learning
- mcpo
- onnx
pipeline_tag: text-generation
base_model: Maincode/Maincoder-1B
---

# Maincoder 1B — ONNX (Quantized, WebGPU)

This is a **quantized ONNX** version of [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B), optimized for in-browser inference with [Transformers.js](https://huggingface.co/docs/transformers.js) and WebGPU.

## Quantization

- **Format:** ONNX with int4 (MatMulNBits) quantization
- **Original model size:** ~5 GB (fp32)
- **Quantized model size:** ~1.5 GB (q4)
- **Quantization method:** `MatMulNBitsQuantizer` from `onnxruntime` with `block_size=32`, symmetric quantization

All tensor data is embedded in a single `.onnx` file (no external data files) for browser compatibility.

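To make the settings above concrete, here is an illustrative sketch of block-wise symmetric int4 quantization. This is not the onnxruntime `MatMulNBitsQuantizer` kernel or its packed data layout, just the arithmetic it applies: each block of 32 weights shares one scale, and values are rounded into the signed 4-bit range.

```javascript
// Illustrative sketch only: the arithmetic of block-wise symmetric
// int4 quantization (not the actual onnxruntime kernel or data layout).
const BLOCK_SIZE = 32;
const INT4_MAX = 7; // symmetric: the block's abs-max maps to +/-7

// Quantize one block: a shared fp scale plus int4 values in [-8, 7].
function quantizeBlock(block) {
  const absMax = Math.max(...block.map(Math.abs), 0);
  const scale = absMax / INT4_MAX || 1; // guard all-zero blocks
  const q = block.map((x) => Math.max(-8, Math.min(7, Math.round(x / scale))));
  return { q, scale };
}

// Reverse the mapping; the per-value error is at most scale / 2.
function dequantizeBlock({ q, scale }) {
  return q.map((v) => v * scale);
}

// Quantize a flat weight array block by block.
function quantize(weights) {
  const blocks = [];
  for (let i = 0; i < weights.length; i += BLOCK_SIZE) {
    blocks.push(quantizeBlock(weights.slice(i, i + BLOCK_SIZE)));
  }
  return blocks;
}
```

With `block_size=32`, each weight costs 4 bits plus its share of one block scale, which is why the on-disk size drops well below the fp32 original.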
## Usage with Transformers.js

```javascript
import { AutoModelForCausalLM, AutoTokenizer } from "@huggingface/transformers";

const model = await AutoModelForCausalLM.from_pretrained(
  "shreyask/Maincoder-1B-ONNX-web",
  { dtype: "q4", device: "webgpu" }
);

const tokenizer = await AutoTokenizer.from_pretrained(
  "shreyask/Maincoder-1B-ONNX-web"
);

const messages = [
  { role: "system", content: "You are Maincoder, an expert code generation assistant." },
  { role: "user", content: "Write a binary search function in Python" },
];

const input = tokenizer.apply_chat_template(messages, {
  add_generation_prompt: true,
  return_dict: true,
});

const output = await model.generate({
  ...input,
  max_new_tokens: 1024,
  eos_token_id: [151643, 151645],
});

// Decode the generated tokens back to text.
const text = tokenizer.batch_decode(output, { skip_special_tokens: true })[0];
console.log(text);
```
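WebGPU is not yet available in every browser, so a small guard avoids a hard failure on unsupported ones. This is a sketch; the `"wasm"` fallback device name is an assumption to verify against the Transformers.js documentation for your version.

```javascript
// Hedged sketch: pick "webgpu" when the browser exposes the WebGPU API,
// otherwise fall back to "wasm" (CPU). Verify supported device names
// against the Transformers.js docs for your version.
function pickDevice() {
  const hasWebGPU = typeof navigator !== "undefined" && "gpu" in navigator;
  return hasWebGPU ? "webgpu" : "wasm";
}

// Use it in place of the hard-coded device:
//   { dtype: "q4", device: pickDevice() }
```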

## Base Model

This is a quantized conversion of [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B). See the base model card for training details, benchmarks, and intended use.