---
language:
- en
license: mit
library_name: transformers
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
tags:
- code
- code-editing
- merge
- fastedit
- qwen2
pipeline_tag: text-generation
---

# FastEdit 1.7B

A fine-tuned **Qwen2.5-Coder-1.5B-Instruct** for merging code edit snippets into source files. Given an original code chunk (~35 lines) and a compact edit snippet with context markers, the model produces the merged result.

This model is designed to be used with the [FastEdit](https://github.com/parcadei/fastedit) toolkit, which handles AST scoping, deterministic edits, and post-processing. **Using the model directly requires the exact prompt format described below.**

## Model variants

All variants are in this repo under subfolders:

| Subfolder | Format | Size | Use case |
|-----------|--------|------|----------|
| `bf16/` | BF16 safetensors | 3.2 GB | Fine-tuning, reference, GPU serving via vLLM/TGI |
| `mlx-8bit/` | MLX 8-bit | 1.7 GB | Apple Silicon (recommended for local use) |
| `gguf/` | GGUF Q8_0 | 1.7 GB | llama.cpp, LM Studio, Ollama |

## Prompt format

The model expects a specific 2-message chat format. **Using a different prompt will produce poor results.**

### System message

```
You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated. /no_think
```

The `/no_think` suffix disables Qwen's thinking mode; without it, the model may emit thousands of reasoning tokens before producing output.

### User message

```
Merge all changes from the snippet into the below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within and tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

{original_code}

{update_snippet}

Provide the complete updated code.
```

### Expected output

The model outputs the merged code wrapped in `` tags:

```
def process(data):
    try:
        result = transform(data)
        return result
    except Error as e:
        return {"error": str(e)}
```

### Complete example

**Original code** (what tree-sitter extracts for the target function):

```python
def process(data):
    result = transform(data)
    return result
```

**Edit snippet** (what the user/agent writes):

```python
def process(data):
    try:
        # ... existing code ...
    except Error as e:
        return {"error": str(e)}
```

**Model output:**

```python
def process(data):
    try:
        result = transform(data)
        return result
    except Error as e:
        return {"error": str(e)}
```

The model understands `# ... existing code ...` markers (and language-specific variants like `// ... existing code ...`) as instructions to preserve the original lines in that region.

## How it fits into FastEdit

In production, the model is the **fallback**, not the primary path:

1. **AST scoping**: tree-sitter finds the target function by name (~35 lines), so the model never sees the whole file
2. **Deterministic text-match**: 74% of edits are resolved by matching context lines and splicing in new lines (0 tokens, <1ms)
3. **Model merge**: the remaining 26% of edits (structural changes like wrapping in try/catch, full rewrites) go to this model

The model only ever processes ~35-line chunks. It was trained on function-scoped edits, not whole files. Feeding it large inputs will degrade quality.

## Using without FastEdit

If you want to use the model directly (without the toolkit), you need to:

1. **Scope the input yourself**: extract only the target function/class, not the whole file
2. **Use the exact prompt format** above; different prompts will produce different (worse) results
3. **Parse the output**: extract text between `` and `` tags
4. **Handle edge cases**: the model may emit `` blocks (strip them), use variant tag names (``, ``), or truncate output on long functions

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# BF16 (GPU / fine-tuning)
model = AutoModelForCausalLM.from_pretrained("continuous-lab/FastEdit", subfolder="bf16", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("continuous-lab/FastEdit", subfolder="bf16")

messages = [
    {"role": "system", "content": "You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated. /no_think"},
    {"role": "user", "content": """Merge all changes from the snippet into the below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within and tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

def process(data):
    result = transform(data)
    return result

def process(data):
    try:
        # ... existing code ...
    except Error as e:
        return {"error": str(e)}

Provide the complete updated code."""}
]

# Render the two messages with the model's chat template
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Greedy decoding: do_sample=False replaces temperature=0, which only affects sampling
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
result = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
# Parse: extract text between and
```

## Training

- **Base model**: Qwen2.5-Coder-1.5B-Instruct
- **Task**: Code edit merging across 13 languages

## Evaluation

Tested on 22 structurally distinct edit patterns (73 cases) across 13 languages:

| Path | Accuracy | Avg tokens | Avg latency |
|------|----------|------------|-------------|
| Deterministic (74% of edits) | 100% | 0 | <1ms |
| Model (26% of edits) | 92% | ~40 | ~500ms |
| **Combined** | **~98%** | **~10** | **~130ms** |

Per-language model accuracy (156-example benchmark):

| Language | Accuracy |
|----------|----------|
| Python, Java, Kotlin, C, PHP | 92% |
| JavaScript, TypeScript, Rust, Swift | 85% |
| Go, C++, Ruby | 77% |

## Limitations

- Performance degrades on inputs longer than ~100 lines.
- Does not handle whole-file edits well; use the FastEdit toolkit's AST scoping.
- The edit snippet must use `# ... existing code ...` markers (or language-equivalent) for context preservation. Without markers, the model treats the entire snippet as a replacement.
- Languages not in the training set may work but are untested.

## License

MIT
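
## Appendix: output parsing sketch

The parsing and edge-case steps listed under "Using without FastEdit" could be handled along the following lines. This is a minimal sketch, not the FastEdit toolkit's own post-processing: the function name `extract_merged_code` is hypothetical, and the tag names are passed in by the caller because this sketch makes no assumption about which tags the model actually emits. It strips an optional reasoning block, tries each candidate tag name in order, and returns whatever was generated if the closing tag is missing (a likely sign of truncation).

```python
import re
from typing import Optional, Sequence


def extract_merged_code(
    raw: str,
    tags: Sequence[str],
    reasoning_tag: Optional[str] = None,
) -> Optional[str]:
    """Return the text inside the first matching <tag>...</tag> pair.

    `tags` and `reasoning_tag` are caller-supplied names (hypothetical here);
    returns None when none of the opening tags is present.
    """
    if reasoning_tag is not None:
        # Remove any complete reasoning block before searching for the result.
        name = re.escape(reasoning_tag)
        raw = re.sub(rf"<{name}>.*?</{name}>", "", raw, flags=re.DOTALL)

    for tag in tags:
        opening, closing = f"<{tag}>", f"</{tag}>"
        start = raw.find(opening)
        if start == -1:
            continue  # Try the next candidate tag name.
        start += len(opening)
        end = raw.find(closing, start)
        # No closing tag: treat the output as truncated and keep the remainder.
        body = raw[start:] if end == -1 else raw[start:end]
        # Trim only newlines next to the tags so code indentation is preserved.
        return body.strip("\n")
    return None
```

A caller would pass the decoded model output together with the primary tag name and any variants, for example `extract_merged_code(result, tags=[primary, variant])`, and treat a `None` return as a failed merge. Checking for truncation separately (for instance by comparing the generated length against `max_new_tokens`) is left to the caller.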