shreyask/Maincoder-1B-ONNX-web



shreyask committed on Feb 13

Commit 314c10a · verified · 1 Parent(s): dd73600

Upload README.md with huggingface_hub

Files changed (1)

README.md +64 -0

README.md ADDED

	@@ -0,0 +1,64 @@

+---
+license: apache-2.0
+language:
+- en
+library_name: transformers.js
+tags:
+- code
+- python
+- maincoder
+- code-generation
+- reinforcement-learning
+- mcpo
+- onnx
+pipeline_tag: text-generation
+base_model: Maincode/Maincoder-1B
+---
+# Maincoder 1B — ONNX (Quantized, WebGPU)
+This is a **quantized ONNX** version of [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B), optimized for in-browser inference with [Transformers.js](https://huggingface.co/docs/transformers.js) and WebGPU.
+## Quantization
+- **Format:** ONNX with int4 (MatMulNBits) quantization
+- **Original model size:** ~5 GB (fp32)
+- **Quantized model size:** ~1.5 GB (q4)
+- **Quantization method:** `MatMulNBitsQuantizer` from `onnxruntime`, with `block_size=32` and symmetric quantization
+All tensor data is embedded in a single `.onnx` file (no external data files) for browser compatibility.
+## Usage with Transformers.js
+```javascript
+import { AutoModelForCausalLM, AutoTokenizer } from "@huggingface/transformers";
+const model = await AutoModelForCausalLM.from_pretrained(
+  "shreyask/Maincoder-1B-ONNX-web",
+  { dtype: "q4", device: "webgpu" }
+);
+const tokenizer = await AutoTokenizer.from_pretrained(
+  "shreyask/Maincoder-1B-ONNX-web"
+);
+const messages = [
+  { role: "system", content: "You are Maincoder, an expert code generation assistant." },
+  { role: "user", content: "Write a binary search function in Python" },
+];
+const input = tokenizer.apply_chat_template(messages, {
+  add_generation_prompt: true,
+  return_dict: true,
+});
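+// Generate a completion; stop on either end-of-sequence token id.
+const output = await model.generate({
+  ...input,
+  max_new_tokens: 1024,
+  eos_token_id: [151643, 151645],
+});
+// Decode only the newly generated tokens, skipping the prompt.
+const promptLength = input.input_ids.dims.at(-1);
+const text = tokenizer.batch_decode(
+  output.slice(null, [promptLength, null]),
+  { skip_special_tokens: true }
+);
+console.log(text[0]);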
+```
+## Base Model
+This is a quantized conversion of [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B). See the base model card for training details, benchmarks, and intended use.
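
The usage snippet in the README requests `device: "webgpu"` unconditionally. A page that may run in browsers without WebGPU could pick the device at runtime instead. The sketch below is not part of the commit: `pickDevice` is a hypothetical helper, and falling back to `"wasm"` assumes this q4 export also loads on the Transformers.js WASM backend, which the README does not state. It only checks that `navigator.gpu` exists; a stricter check would also confirm that `navigator.gpu.requestAdapter()` returns an adapter.

```javascript
// Hypothetical helper (not from the README or from Transformers.js):
// choose a `device` value based on whether the WebGPU entry point exists.
function pickDevice(nav) {
  // `navigator.gpu` is only defined in environments that expose WebGPU.
  const hasWebGPU = Boolean(nav?.gpu);
  // Assumption: "wasm" is an acceptable fallback for this model.
  return hasWebGPU ? "webgpu" : "wasm";
}

// In a browser this reads the real navigator; elsewhere it falls back to "wasm".
const device = pickDevice(globalThis.navigator);
console.log(`Selected device: ${device}`);
```

The result could then be passed as `{ dtype: "q4", device }` to `from_pretrained` in place of the hard-coded `"webgpu"`.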