---
language:
- en
license: apache-2.0
tags:
- zig
- code
- programming
- lora
- qwen2.5-coder
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
model_type: qwen2.5
library_name: transformers
---
|
|
|
|
|
# ZigNet Qwen2.5-Coder-7B |
|
|
|
|
|
**Fine-tuned Qwen2.5-Coder-7B for Zig programming language analysis and assistance** |
|
|
|
|
|
This model is part of the [ZigNet](https://github.com/fulgidus/zignet) project - an MCP (Model Context Protocol) server that provides intelligent Zig code analysis for Claude and other LLMs. |
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Base Model**: [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |
|
|
- **Fine-tuning Method**: QLoRA (4-bit quantization) |
|
|
- **Training Data**: 13,756 Zig code examples from official documentation (v0.13-0.15) |
|
|
- **Supported Zig Versions**: 0.13.x, 0.14.x, 0.15.x |
|
|
- **Training Hardware**: NVIDIA RTX 3090 (24GB VRAM) |
|
|
- **Adapter Size**: ~155 MB (LoRA adapters only)
|
|
|
|
|
## Training Configuration |
|
|
|
|
|
```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
)

training_args = TrainingArguments(
    output_dir="zignet-qwen2.5-coder-7b",  # local checkpoint dir, not part of the published config
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-4,
    warmup_steps=100,
    fp16=True,
)
```
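
The 4-bit half of QLoRA applies to how the base model is loaded rather than to the LoRA parameters above. A minimal sketch of such a load with `BitsAndBytesConfig`, assuming the common NF4 settings (the exact quantization values used for this run are not published):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed NF4 settings; typical for QLoRA but not confirmed for this training run
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```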
|
|
|
|
|
## Dataset |
|
|
|
|
|
The model was trained on a curated dataset of Zig examples including: |
|
|
- Official Zig documentation examples (v0.13, v0.14, v0.15) |
|
|
- Advanced features: comptime, generics, error handling, async |
|
|
- Real-world code patterns from popular Zig projects |
|
|
|
|
|
**Dataset**: [fulgidus/zignet-training-dataset](https://huggingface.co/datasets/fulgidus/zignet-training-dataset) |
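
The dataset can be pulled straight from the Hub with the `datasets` library. A quick inspection sketch (split and column names are whatever the dataset card defines):

```python
from datasets import load_dataset

# Download the ZigNet training dataset from the Hugging Face Hub
ds = load_dataset("fulgidus/zignet-training-dataset")

print(ds)              # lists the available splits and column names
print(ds["train"][0])  # peek at one example (assumes a "train" split)
```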
|
|
|
|
|
## Intended Use |
|
|
|
|
|
This model is designed to: |
|
|
- 📖 Provide Zig documentation context |
|
|
- 💡 Suggest intelligent code fixes for Zig errors |
|
|
- 🔍 Explain Zig-specific idioms and patterns |
|
|
- ⚡ Generate idiomatic Zig code |
|
|
|
|
|
**Note**: This model is NOT used for parsing or validation (handled by deterministic compiler-based tools). It focuses on documentation lookup and intelligent suggestions. |
|
|
|
|
|
## Performance |
|
|
|
|
|
- **Quality**: ⭐⭐⭐⭐⭐ Best-in-class for Zig syntax and idioms |
|
|
- **Benchmarks**: 100% pass rate on the project's Zig validation tests
|
|
- **Response Time**: ~15-20 s per response (after GGUF quantization)
|
|
|
|
|
## Usage |
|
|
|
|
|
### With Transformers |
|
|
|
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in half precision
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load the ZigNet LoRA adapters on top of it
model = PeftModel.from_pretrained(base_model, "fulgidus/zignet-qwen2.5-coder-7b")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")

# Generate; the Instruct base model expects its chat template
messages = [{"role": "user", "content": "Explain Zig's comptime feature with an example"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
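
To reproduce the GGUF build mentioned under Performance, the adapters must first be merged into the base weights. Continuing from the snippet above, a sketch (the project's actual conversion pipeline may differ; the output directory name is arbitrary):

```python
# Merge the LoRA adapters into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("zignet-qwen2.5-coder-7b-merged")
tokenizer.save_pretrained("zignet-qwen2.5-coder-7b-merged")

# The merged folder can then be converted to GGUF, e.g. with llama.cpp's
# convert_hf_to_gguf.py script, and quantized for faster local inference.
```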
|
|
|
|
|
### With ZigNet MCP Server |
|
|
|
|
|
This model is integrated into ZigNet for use with Claude: |
|
|
|
|
|
```bash
# Install ZigNet
git clone https://github.com/fulgidus/zignet
cd zignet
pnpm install
pnpm run build
```

Then register the server with your MCP client. For Claude Desktop on macOS, add the following to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "zignet": {
      "command": "node",
      "args": ["/path/to/zignet/dist/mcp-server.js"]
    }
  }
}
```
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Focused on Zig 0.13-0.15 (may have limited accuracy on very old or very new syntax) |
|
|
- LoRA adapters only (requires base model for inference) |
|
|
- Optimized for English documentation and comments |
|
|
- Not suitable for real-time parsing (use ZigNet's AST parser for that) |
|
|
|
|
|
## Citation |
|
|
|
|
|
```bibtex |
|
|
@software{zignet2025, |
|
|
author = {fulgidus}, |
|
|
title = {ZigNet: Intelligent Zig Code Analysis via MCP}, |
|
|
year = {2025}, |
|
|
url = {https://github.com/fulgidus/zignet} |
|
|
} |
|
|
``` |
|
|
|
|
|
## License |
|
|
|
|
|
Apache-2.0 (same as base model) |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
- **Base Model**: [Qwen2.5-Coder](https://github.com/QwenLM/Qwen2.5-Coder) by Alibaba Cloud |
|
|
- **Zig Language**: [ziglang.org](https://ziglang.org) |
|
|
- **Training Framework**: HuggingFace Transformers + PEFT |
|
|
|
|
|
--- |
|
|
|
|
|
**Project**: [github.com/fulgidus/zignet](https://github.com/fulgidus/zignet) |
|
|
**Author**: fulgidus |
|
|
**Date**: October 2025 |
|
|
|