# Sweep Next-Edit 1.5B (GGUF)
A 1.5B parameter model for next-edit autocomplete, quantized to Q8_0 GGUF format.
## Model Description
Sweep Next-Edit predicts your next code edit before you make it. It runs locally on a laptop in under 500 ms (with speculative decoding) and outperforms models over 4x its size on next-edit benchmarks. More details are in the blog post linked below.
## Usage
Download run_model.py and the model file, then:
```shell
uv pip install llama-cpp-python huggingface_hub
python run_model.py
```
## Model Details
- Format: GGUF (Q8_0 quantization)
- Parameters: 1.5B
- Context Length: 8192 tokens
- Base Model: Qwen2.5-Coder
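If you want to load the GGUF directly rather than through `run_model.py`, a minimal sketch with `llama-cpp-python` looks like the following. The model path is an assumption (pass wherever you downloaded the `.gguf` file); only `n_ctx=8192` comes from the details above, and the actual loading logic lives in `run_model.py`.

```python
# Sketch of loading the Q8_0 GGUF with llama-cpp-python.
# N_CTX matches the model's documented context length.
N_CTX = 8192

def load_model(model_path: str):
    """Load the GGUF model; the llama_cpp import is deferred so this
    module stays importable without the dependency installed."""
    from llama_cpp import Llama  # from the llama-cpp-python package
    return Llama(model_path=model_path, n_ctx=N_CTX)

# Usage (path is hypothetical):
# llm = load_model("sweep-next-edit-1.5b.Q8_0.gguf")
# out = llm(prompt, max_tokens=256)
```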
## Example
The model uses a specific prompt format with file context, recent diffs, and current state to predict the next edit. See run_model.py for a complete example.
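To make the three ingredients concrete, here is an illustrative prompt-assembly sketch. The section markers below are invented for illustration; the model's real prompt template is defined in `run_model.py` and should be used instead.

```python
# Illustrative only: shows how file context, a recent diff, and the
# current state might be combined into one prompt string.
def build_prompt(file_context: str, recent_diff: str, current_state: str) -> str:
    """Assemble the three inputs the model card describes. The tag
    names here are hypothetical, not the model's actual format."""
    return (
        "<file_context>\n" + file_context + "\n</file_context>\n"
        "<recent_diff>\n" + recent_diff + "\n</recent_diff>\n"
        "<current_state>\n" + current_state + "\n</current_state>\n"
    )

prompt = build_prompt(
    "def add(a, b):\n    return a + b",   # surrounding file contents
    "-    return a + b\n+    return a - b",  # edit the user just made
    "def add(a, b):\n    return a - b",   # file as it stands now
)
```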
## Links
- Blog Post - Technical details and benchmarks
- JetBrains Plugin - Sweep AI JetBrains Plugin
- HN Thread - Discussion of support for VS Code, Neovim & Emacs
- Twitter Post - Ask us any other questions
## License
Apache 2.0