---
base_model: unsloth/qwen2.5-coder-1.5b-instruct-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
- fast-apply
- instant-apply
---

# QuantFactory/FastApply-1.5B-v1.0-GGUF
This is a quantized version of [Kortix/FastApply-1.5B-v1.0](https://huggingface.co/Kortix/FastApply-1.5B-v1.0), created using llama.cpp.

# Original Model Card

# FastApply-1.5B-v1.0

- [GitHub: kortix-ai/fast-apply](https://github.com/kortix-ai/fast-apply)
- [Dataset: Kortix/FastApply-dataset-v1.0](https://huggingface.co/datasets/Kortix/FastApply-dataset-v1.0)
- [Try it now on 👉 Google Colab](https://colab.research.google.com/drive/1BNCab4oK-xBqwFQD4kCcjKc7BPKivkm1?usp=sharing)

## Model Details

### Basic Information

- **Developed by:** Kortix
- **License:** apache-2.0
- **Finetuned from model:** [unsloth/Qwen2.5-Coder-1.5B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Qwen2.5-Coder-1.5B-Instruct-bnb-4bit)

### Model Description

FastApply-1.5B-v1.0 is a 1.5B-parameter model designed for instant code application: it takes an existing file and an update snippet and produces the full edited file. It powers [SoftGen AI](https://softgen.ai/).
It is part of the Fast Apply pipeline for data generation and fine-tuning of Qwen2.5 Coder models.

When deployed on a fast provider such as Fireworks, the model reaches approximately 340 tokens/second while maintaining high edit accuracy.

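Because the model emits the complete file rather than a patch, generation time grows with the length of the output. The following is a rough back-of-the-envelope sketch, assuming the approximate 340 tokens/second figure quoted above and ignoring prompt processing and network overhead; the function name and the token counts are illustrative, not measurements.

```python
def estimate_generation_seconds(output_tokens: int, tokens_per_second: float = 340.0) -> float:
    """Estimate decode time for a full-file rewrite at a given throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Illustrative output sizes (in tokens) for small, medium, and large files.
for n in (500, 2000, 8000):
    print(f"{n} output tokens -> ~{estimate_generation_seconds(n):.1f} s")
```

Actual latency depends on the provider, the hardware, and the prompt length, so treat these numbers as an order-of-magnitude guide only.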
## Intended Use

FastApply-1.5B-v1.0 is intended for use in AI-powered code editors and tools that require fast, accurate code modifications. It is particularly well-suited for:

- Instant code application tasks
- Full file edits
- Integration with AI-powered code editors like Aider and PearAI
- Local tools to reduce the cost of frontier model output

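Since the model returns a whole file, a tool that integrates it will often want to show the user what actually changed before writing the result to disk. The sketch below uses Python's standard `difflib` for that; the helper name `render_diff` and the `path` label are hypothetical and not part of the Fast Apply project.

```python
import difflib


def render_diff(original_code: str, updated_code: str, path: str = "file") -> str:
    """Return a unified diff between the original file and the model's full-file output."""
    diff = difflib.unified_diff(
        original_code.splitlines(keepends=True),
        updated_code.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Example: review a one-line change before accepting it.
print(render_diff("a = 1\nb = 2\n", "a = 1\nb = 3\n", "example.py"))
```

An empty string means the model returned the file unchanged, which can serve as a cheap signal that the update was not applied.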
## Inference Template

FastApply-1.5B-v1.0 is based on the Qwen2.5 Coder architecture and is fine-tuned for code editing tasks. It expects the following prompt structure at inference time, where `{original_code}` and `{update_snippet}` are placeholders for the file contents and the edit to merge:

```
<|im_start|>system
You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated.<|im_end|>
<|im_start|>user
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.<|im_end|>
<|im_start|>assistant
```

The model's output is structured as:

```
<updated-code>[Complete updated file]</updated-code>
```

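The template and output format above can be wrapped in two small helpers, one that fills in the prompt and one that pulls the code out of the response. This is a minimal sketch: the names `build_prompt` and `extract_updated_code` are hypothetical, and returning `None` when the tags are missing is a design choice made here, not behavior defined by the model card.

```python
import re
from typing import Optional

SYSTEM_PROMPT = (
    "You are a coding assistant that helps merge code updates, "
    "ensuring every modification is fully integrated."
)

USER_TEMPLATE = """Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code."""

# Non-greedy match so only the first <updated-code> block is captured.
_UPDATED_CODE_RE = re.compile(r"<updated-code>(.*?)</updated-code>", re.DOTALL)


def build_prompt(original_code: str, update_snippet: str) -> str:
    """Fill the inference template with the original file and the update snippet."""
    user = USER_TEMPLATE.format(
        original_code=original_code, update_snippet=update_snippet
    )
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def extract_updated_code(response: str) -> Optional[str]:
    """Return the text between the <updated-code> tags, or None if they are absent."""
    match = _UPDATED_CODE_RE.search(response)
    return match.group(1) if match else None
```

Checking for `None` lets the caller fall back to the original file instead of failing when the model's output is truncated or malformed.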
## Additional Information

For more details on the Fast Apply pipeline, data generation process, and deployment instructions, please refer to the [GitHub repository](https://github.com/kortix-ai/fast-apply).

## How to Use

To use the original (non-quantized) model, load it with the Hugging Face Transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Kortix/FastApply-1.5B-v1.0", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Kortix/FastApply-1.5B-v1.0")

# Example inputs (placeholders): replace with your file contents and the edit to apply
original_code = "def greet(name):\n    print('Hello ' + name)\n"
update_snippet = "def greet(name):\n    print(f'Hello, {name}!')\n"

# Prepare your input following the prompt structure mentioned above
input_text = """<|im_start|>system
You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated.<|im_end|>
<|im_start|>user
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.<|im_end|>
<|im_start|>assistant
"""

input_text = input_text.format(
    original_code=original_code,
    update_snippet=update_snippet,
).strip()

# Generate the response; move the inputs to the model's device first
input_ids = tokenizer.encode(input_text, return_tensors="pt").to(model.device)
# max_length caps the prompt plus the generated tokens
output = model.generate(input_ids, max_length=8192)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output[0][input_ids.shape[1]:])
print(response)

# Extract the updated code from the response
if "<updated-code>" not in response:
    raise ValueError("Model output did not contain an <updated-code> block")
updated_code = response.split("<updated-code>")[1].split("</updated-code>")[0]
```

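The Transformers snippet loads the original Kortix weights. For the GGUF files in this repository, one option is the `llama-cpp-python` bindings. The following is a hedged sketch rather than an officially documented workflow: `gguf_path` must point to a GGUF file you have downloaded yourself, and the context size, greedy decoding (`temperature=0.0`), and `<|im_end|>` stop string are assumptions chosen for illustration. The helper names are hypothetical.

```python
from typing import Any, Dict


def completion_text(result: Dict[str, Any]) -> str:
    """Return the generated text from a llama-cpp-python completion result."""
    return result["choices"][0]["text"]


def generate_with_gguf(prompt: str, gguf_path: str, max_tokens: int = 4096) -> str:
    """Run a prompt in the inference-template format through a local GGUF file."""
    # Imported lazily so completion_text can be used without llama-cpp-python installed.
    from llama_cpp import Llama

    # n_ctx=8192 is an assumption; size it to fit your prompt plus the full output file.
    llm = Llama(model_path=gguf_path, n_ctx=8192, verbose=False)
    result = llm.create_completion(
        prompt,
        max_tokens=max_tokens,
        temperature=0.0,      # assumption: deterministic output suits code merging
        stop=["<|im_end|>"],  # assumption: stop at the end of the assistant turn
    )
    return completion_text(result)
```

The returned string should contain the `<updated-code>` block, which can be extracted with the same split logic as in the Transformers example.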
## Evaluation

![Evaluation results](https://cdn-uploads.huggingface.co/production/uploads/650d7ecb23e8028a8970a203/_E6WVzuVABKB58QMx6c1c.png)