---
license: mit
language:
- en
tags:
- mlx
- phi-2
- html
- css
- web-development
- code-generation
- fine-tuned
- apple-silicon
base_model: microsoft/phi-2
pipeline_tag: text-generation
library_name: mlx
model-index:
- name: WebICoder-v3-MLX-8bit
  results: []
---

# ⚡ WebICoder v3: HTML Code Generation (MLX 8-bit)

**WebICoder v3** is a fine-tuned version of [Microsoft Phi-2](https://huggingface.co/microsoft/phi-2) (2.7B parameters) specialized in generating **complete, production-ready HTML/CSS websites** from natural language descriptions.

Optimized for **Apple Silicon** via [MLX](https://github.com/ml-explore/mlx).

## Model Details

| Property | Value |
|---|---|
| **Base Model** | Microsoft Phi-2 (2.7B parameters) |
| **Architecture** | PhiForCausalLM (32 layers, 2560 hidden) |
| **Format** | MLX (Apple Silicon optimized) |
| **Quantization** | 8-bit (8.503 bits/weight, affine) |
| **Size** | ~2.9 GB |
| **Context Length** | 4096 tokens |
| **Task** | HTML/CSS Code Generation |
| **Speed** | ~12–20 tok/s on M-series Mac |

## Also Available

| Variant | Link | Size |
|---|---|---|
| **8-bit** (higher quality) | `YOUR_USERNAME/WebICoder-v3-MLX-8bit` | ~2.9 GB |


---

## ⚠️ MANDATORY: Read Before Using

> **If you skip these steps, the model will produce broken, repeated, or low-quality output.**
> Follow ALL 5 rules below to get the best results.

### Rule 1: Use the correct prompt format

The model was trained with an **Alpaca-style format**. You MUST wrap your prompt like this:

```
### Instruction:
{your website description here}

### Response:
```

❌ **DO NOT** send raw text like `"Create a website"`; without the template, the model will not interpret it correctly.

✅ **DO** use the format above, or call `tokenizer.apply_chat_template()`, which applies it automatically.

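As a minimal sketch of the format above, the helper below wraps a description in the Alpaca-style template. The name `build_prompt` is hypothetical and not part of the model's tooling; the trailing newline after `### Response:` mirrors the prompt used in the Quick Start example.

```python
def build_prompt(description: str) -> str:
    """Wrap a website description in the Alpaca-style template (illustrative helper)."""
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")
    return f"### Instruction:\n{description}\n\n### Response:\n"


prompt = build_prompt("Create a landing page for a coffee shop")
print(prompt)
```

The resulting string can be passed as `prompt=` to `stream_generate()`.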
### Rule 2: ALWAYS stop at `</html>`

The model does not always emit an EOS token after finishing the HTML. You **MUST** check for `</html>` in the output and stop generation when you see it.

```python
# ✅ Correct: stop at </html>
full_text = ""
for response in stream_generate(model, tokenizer, prompt=prompt, max_tokens=4096, sampler=sampler):
    full_text += response.text
    if "</html>" in full_text:
        break
```

❌ Without this check, the model can **repeat the entire page** in a loop.

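The stop check can be separated from MLX so that it is easy to test. This sketch assumes a hypothetical helper, `collect_until_html_end`, that consumes any iterable of text chunks (for example, the `response.text` values yielded by `stream_generate`) and also trims anything emitted after the closing tag within the final chunk.

```python
import re
from typing import Iterable


def collect_until_html_end(chunks: Iterable[str], stop: str = "</html>") -> str:
    """Concatenate chunks and stop as soon as the closing tag has been seen."""
    pattern = re.compile(re.escape(stop), re.IGNORECASE)  # also matches </HTML>
    full_text = ""
    for chunk in chunks:
        full_text += chunk
        match = pattern.search(full_text)
        if match:
            # Drop anything that followed the tag inside the last chunk.
            return full_text[: match.end()]
    return full_text  # the stream ended without a closing tag


# Simulated stream: the tag is split across chunks and followed by a repeat.
fake_stream = ["<html><body>Hi</body></ht", "ml>\n<html>", "<body>again"]
print(collect_until_html_end(fake_stream))
```

Searching the accumulated text rather than each chunk matters because the tag can be split across two chunks, as in the simulated stream.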
### Rule 3: Use repetition penalty

A repetition penalty is **essential** to prevent the model from generating duplicate sections (e.g., the same footer twice, identical testimonials).

```python
from mlx_lm.sample_utils import make_logits_processors

logits_processors = make_logits_processors(repetition_penalty=1.2, repetition_context_size=256)
```

Then pass `logits_processors=logits_processors` to `stream_generate()`.

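To show what the penalty does, here is a framework-free sketch of a commonly used rule: for each token that appears in the recent context, a positive logit is divided by the penalty and a negative logit is multiplied by it. This is illustrative only and assumes that rule; it is not the `mlx_lm` implementation, and `apply_repetition_penalty` is a hypothetical name.

```python
def apply_repetition_penalty(logits, recent_tokens, penalty=1.2, context_size=256):
    """Return a copy of `logits` with recently generated token ids made less likely."""
    adjusted = list(logits)
    # Only the last `context_size` tokens are considered.
    for token_id in set(recent_tokens[-context_size:]):
        value = adjusted[token_id]
        # Dividing a positive logit or multiplying a negative one both lower it.
        adjusted[token_id] = value / penalty if value > 0 else value * penalty
    return adjusted


logits = [2.4, -1.0, 0.6, 3.0]  # toy vocabulary of four tokens
recent = [0, 1, 0]              # token ids 0 and 1 were generated recently
print(apply_repetition_penalty(logits, recent))
```

Tokens 0 and 1 end up with lower scores, while tokens 2 and 3, which have not been generated recently, are left unchanged.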
### Rule 4: Use low temperature (0.3–0.5)

High temperature (> 0.7) produces incoherent, broken HTML. **Always use 0.3–0.5**.

```python
from mlx_lm.sample_utils import make_sampler

sampler = make_sampler(temp=0.4)  # ✅ Recommended
```

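The effect of temperature can be seen on a toy distribution. The sketch below uses plain Python and made-up logits (not values from this model): dividing the logits by a temperature below 1 concentrates probability on the most likely token, which is why lower values tend to give more consistent markup.

```python
import math


def softmax_with_temperature(logits, temp):
    """Convert logits to probabilities after scaling them by 1 / temp."""
    scaled = [x / temp for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for temp in (0.4, 1.0):
    probs = softmax_with_temperature(toy_logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

At `temp=0.4` the top token receives a noticeably larger share of the probability than at `temp=1.0`.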
### Rule 5: Post-process the output

The model may occasionally prepend training artifacts (such as the system prompt) before the HTML. **Always clean the output:**

```python
import re

def clean_html(text: str) -> str:
    """Extract clean HTML from model output."""
    # Remove leaked system prompts
    text = re.sub(r"You are (?:Deep|Web[iI])coder.*?production-ready code\.\n*", "", text, flags=re.DOTALL)
    # Remove a leaked instruction block up to its response marker. The match is
    # non-greedy so the HTML that follows "### Response:" is not deleted.
    text = re.sub(r"### Instruction:.*?### Response:\s*", "", text, flags=re.DOTALL)
    text = re.sub(r"### Response:\s*", "", text)

    # Extract the HTML document
    match = re.search(r"(<(?:!DOCTYPE\s+html|html)[\s\S]*?</html>)", text, re.IGNORECASE)
    if match:
        return match.group(1).strip()

    # Fallback: the output is truncated or lacks the outer <html> element
    start = re.search(r"<(?:!DOCTYPE|html|head|body)", text, re.IGNORECASE)
    if start:
        html = text[start.start():].strip()
        lowered = html.lower()
        # Add only the wrapper parts that are actually missing
        if not lowered.startswith(("<!doctype", "<html")):
            html = "<html>\n" + html
        if not lowered.startswith("<!doctype"):
            html = "<!DOCTYPE html>\n" + html
        if "</html>" not in lowered:
            html += "\n</html>"
        return html

    return text.strip()
```

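After cleaning, a quick structural check can catch truncated pages before they are saved. This sketch uses the standard library's `html.parser`; the `TagCollector` class and `looks_complete` function are hypothetical names, and the check only confirms that the key tags were opened and that `</html>` was seen, not that the page is valid HTML.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record which tags were opened and which were closed."""

    def __init__(self):
        super().__init__()
        self.opened = set()
        self.closed = set()

    def handle_starttag(self, tag, attrs):
        self.opened.add(tag)

    def handle_endtag(self, tag):
        self.closed.add(tag)


def looks_complete(html: str) -> bool:
    """Heuristic: the page opens html/head/body and closes html."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return {"html", "head", "body"} <= collector.opened and "html" in collector.closed


print(looks_complete("<!DOCTYPE html><html><head></head><body></body></html>"))
print(looks_complete("<html><head></head><body><p>cut off"))
```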

---

## Quick Start: Complete Working Example

Copy and paste the following script; it applies all five rules:

```python
from mlx_lm import load, stream_generate
from mlx_lm.sample_utils import make_sampler, make_logits_processors
import re

# 1. Load model
model, tokenizer = load("YOUR_USERNAME/WebICoder-v3-MLX-8bit")

# 2. Format prompt (MANDATORY)
user_prompt = "Create a modern portfolio website with a hero, project cards, and a contact form"

prompt = f"""### Instruction:
{user_prompt}

### Response:
"""

# 3. Configure sampler + repetition penalty (MANDATORY)
sampler = make_sampler(temp=0.4)
logits_processors = make_logits_processors(repetition_penalty=1.2, repetition_context_size=256)

# 4. Generate with stop at </html> (MANDATORY)
full_text = ""
for response in stream_generate(
    model, tokenizer,
    prompt=prompt,
    max_tokens=4096,
    sampler=sampler,
    logits_processors=logits_processors,
):
    full_text += response.text
    print(response.text, end="", flush=True)

    if "</html>" in full_text or response.finish_reason:
        break

# 5. Clean output (MANDATORY)
def clean_html(text):
    text = re.sub(r"You are (?:Deep|Web[iI])coder.*?production-ready code\.\n*", "", text, flags=re.DOTALL)
    match = re.search(r"(<(?:!DOCTYPE\s+html|html)[\s\S]*?</html>)", text, re.IGNORECASE)
    return match.group(1).strip() if match else text.strip()

html = clean_html(full_text)

# Save to file (explicit UTF-8 so non-ASCII characters are written consistently)
with open("output.html", "w", encoding="utf-8") as f:
    f.write(html)
print(f"\n\nSaved to output.html ({len(html)} chars)")
```


---

## Recommended Parameters Summary

| Parameter | Value | Mandatory? |
|---|---|:---:|
| **Prompt format** | `### Instruction:` / `### Response:` | ✅ YES |
| **Temperature** | 0.3–0.5 | ✅ YES |
| **Repetition Penalty** | 1.2 | ✅ YES |
| **Repetition Context** | 256 | ✅ YES |
| **Max Tokens** | 4096 | ✅ YES |
| **Stop at `</html>`** | Check output and break | ✅ YES |
| **Post-processing** | `clean_html()` function | ✅ YES |
| **Top-p** | 0.9 | Recommended |
| **Top-k** | 50 | Optional |

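Top-k and top-p appear in the table but not in the snippets above, so here is a framework-free sketch of what the two filters do to a candidate list. It assumes one common ordering (top-k first, then top-p on the unrenormalized probabilities); actual samplers may differ in detail, and `filter_candidates` is a hypothetical name, not an `mlx_lm` function.

```python
def filter_candidates(probs, top_k=50, top_p=0.9):
    """Return the token ids that survive top-k and then top-p (nucleus) filtering."""
    # Rank token ids from most to least likely and keep at most top_k of them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += probs[token_id]
        # Stop once the kept tokens cover at least top_p of the probability mass.
        if cumulative >= top_p:
            break
    return kept


toy_probs = [0.5, 0.3, 0.15, 0.04, 0.01]  # made-up distribution over five tokens
print(filter_candidates(toy_probs, top_k=50, top_p=0.9))
```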

---

## Using the Chat Template

The tokenizer includes a built-in chat template that handles prompt formatting automatically:

```python
messages = [
    {"role": "user", "content": "Create a dark-themed portfolio website with project cards"}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# This automatically wraps it in ### Instruction: / ### Response: format
```

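If the same code has to run with tokenizers that may not ship a chat template, a small fallback keeps the prompt format consistent. This is a sketch under assumptions: `format_prompt` is a hypothetical helper, it relies on the tokenizer exposing a `chat_template` attribute (as Hugging Face tokenizers do), and `PlainTokenizer` is a stand-in object used only to demonstrate the fallback path.

```python
def format_prompt(tokenizer, description: str) -> str:
    """Use the tokenizer's chat template when present, else the manual format."""
    messages = [{"role": "user", "content": description}]
    if getattr(tokenizer, "chat_template", None):
        return tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
    # Manual Alpaca-style format, identical to the one in Rule 1.
    return f"### Instruction:\n{description}\n\n### Response:\n"


class PlainTokenizer:
    """Stand-in with no chat template, used only for this demonstration."""
    chat_template = None


print(format_prompt(PlainTokenizer(), "Create a pricing page with 3 tiers"))
```

With the real tokenizer returned by `load()`, the first branch is taken and the built-in template does the wrapping.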
## Using the Example Script

```bash
# Single prompt
python example.py "Create a landing page for a coffee shop"

# Interactive mode
python example.py --interactive
```


---

## Example Outputs

| Prompt | What You Get |
|---|---|
| "Create a portfolio with a hero and project cards" | Nav, animated hero, glassmorphism cards, contact form, footer |
| "Create a landing page for a fitness app" | Hero gradient, feature cards, testimonials, CTA, footer |
| "Create a pricing page with 3 tiers" | Toggle monthly/yearly, feature lists, highlighted plan |
| "Create a login page with split layout" | Gradient left, form right, social login buttons |


---

## What the Model Generates

When properly configured, WebICoder v3 produces:

- ✅ Complete `<!DOCTYPE html>` with `<head>`, `<meta>`, `<title>`
- ✅ **Vanilla CSS**: custom properties, gradients, glassmorphism, `backdrop-filter`
- ✅ **Responsive design**: `@media` queries, `clamp()`, CSS Grid `auto-fit`
- ✅ **Animations**: `fade-in` with `IntersectionObserver`, hover transitions
- ✅ **Modern design**: gradient text, blur effects, rounded corners, shadows
- ✅ **Complete pages**: nav, hero, content sections, footer


---

## Limitations

- Optimized for **single-page HTML** with embedded CSS/JS
- Context window: **4096 tokens**; very complex multi-section pages may still be truncated
- Based on Phi-2 (2.7B); larger models will produce more sophisticated output
- English prompts work best

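Because the prompt and the generated page share the 4096-token window, it can help to compute how many tokens remain for the response. A minimal sketch, assuming the prompt's token count is already known (for example from `len(tokenizer.encode(prompt))`); `remaining_budget` is a hypothetical helper.

```python
CONTEXT_LENGTH = 4096  # context window stated in this model card


def remaining_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Number of tokens left for the response once the prompt is accounted for."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


# Example: a 60-token prompt leaves 4036 tokens for the generated page.
print(remaining_budget(60))
```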

---

## Training Details

| Property | Value |
|---|---|
| **Base Model** | microsoft/phi-2 |
| **Fine-tuning** | Full fine-tuning on HTML/CSS code pairs |
| **Training Format** | Alpaca-style (Instruction / Response) |
| **Training Context** | 4096 tokens |
| **Precision** | float16 |
| **Quantization** | Post-training 8-bit (MLX affine, group_size=64) |

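The quantization figures can be roughly reproduced with a little arithmetic. The sketch below assumes each group of 64 weights stores one 16-bit scale and one 16-bit bias alongside its 8-bit values, and it uses the rounded 2.7B parameter count; both are assumptions for illustration, and any layers left unquantized may account for the small gap to the reported 8.503 bits/weight.

```python
GROUP_SIZE = 64   # group_size from the quantization settings
WEIGHT_BITS = 8   # bits per quantized weight
SCALE_BITS = 16   # assumption: one float16 scale per group
BIAS_BITS = 16    # assumption: one float16 bias per group
PARAMS = 2.7e9    # rounded parameter count

# Per-group overhead is spread across the 64 weights in the group.
bits_per_weight = WEIGHT_BITS + (SCALE_BITS + BIAS_BITS) / GROUP_SIZE
size_gb = PARAMS * bits_per_weight / 8 / 1e9

print(f"{bits_per_weight:.3f} bits/weight, about {size_gb:.2f} GB")
```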

---

## Files Included

| File | Description |
|---|---|
| `model.safetensors` | Quantized model weights |
| `config.json` | Model architecture configuration |
| `tokenizer.json` | Tokenizer vocabulary |
| `tokenizer_config.json` | Tokenizer settings with chat template |
| `generation_config.json` | Recommended generation parameters |
| `example.py` | Ready-to-use example script with all mandatory rules |
| `LICENSE` | MIT License |


---

## Citation

```bibtex
@misc{webicoder-v3,
  title={WebICoder v3: Fine-tuned Phi-2 for HTML Code Generation},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/YOUR_USERNAME/WebICoder-v3-MLX-8bit}
}
```

## License

MIT License. See [LICENSE](LICENSE) for details.