|
|
--- |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
library_name: peft |
|
|
tags: |
|
|
- base_model:adapter:distilgpt2 |
|
|
- lora |
|
|
- transformers |
|
|
base_model: distilgpt2 |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# NextStep-Coder-MoE |
|
|
|
|
|
A high-performance tiny coding LLM with **Interleaved Thinking** capability for advanced reasoning and agentic workflows. |
|
|
|
|
|
## Model Description |
|
|
|
|
|
NextStep-Coder-MoE is a LoRA fine-tuned model based on Qwen2.5-Coder-1.5B, optimized for: |
|
|
- Chain-of-Thought reasoning with `<think>...</think>` tags |
|
|
- Multi-step coding tasks |
|
|
- Agentic workflows (plan → act → reflect) |
|
|
|
|
|
## Usage |
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
from peft import PeftModel |
|
|
|
|
|
# Load model |
|
|
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-1.5B") |
|
|
model = PeftModel.from_pretrained(base_model, "OsamaBinLikhon/NextStep-Coder-MoE") |
|
|
tokenizer = AutoTokenizer.from_pretrained("OsamaBinLikhon/NextStep-Coder-MoE") |
|
|
|
|
|
# Generate |
|
|
prompt = "Write a Python function to check if a number is prime.\n<think>" |
|
|
inputs = tokenizer(prompt, return_tensors="pt") |
|
|
outputs = model.generate(**inputs, max_new_tokens=256) |
|
|
print(tokenizer.decode(outputs[0])) |
|
|
``` |
|
|
|
|
|
## Training Details |
|
|
|
|
|
| Parameter | Value | |
|
|
|-----------|-------| |
|
|
| Base Model | Qwen/Qwen2.5-Coder-1.5B | |
|
|
| Method | LoRA (r=16, alpha=32) | |
|
|
| Trainable Params | 18.4M (1.18%) | |
|
|
| Precision | bf16 | |
|
|
| Framework | Transformers + PEFT | |
|
|
|
|
|
## Interleaved Thinking Format |
|
|
|
|
|
The model uses `<think>` tags to show reasoning: |
|
|
|
|
|
``` |
|
|
User: Write a binary search function. |
|
|
|
|
|
Model: <think> |
|
|
I need to implement binary search on a sorted array. |
|
|
Key steps: find middle, compare, narrow search space. |
|
|
Edge case: empty array returns -1. |
|
|
</think> |
|
|
|
|
|
def binary_search(arr, target): |
|
|
left, right = 0, len(arr) - 1 |
|
|
while left <= right: |
|
|
mid = (left + right) // 2 |
|
|
if arr[mid] == target: |
|
|
return mid |
|
|
elif arr[mid] < target: |
|
|
left = mid + 1 |
|
|
else: |
|
|
right = mid - 1 |
|
|
return -1 |
|
|
``` |
|
|
|
|
|
## Intended Use |
|
|
|
|
|
- Code generation with step-by-step reasoning |
|
|
- Debugging and code review |
|
|
- Algorithm design and explanation |
|
|
- Educational coding assistance |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Best for Python, may vary for other languages |
|
|
- Requires `<think>` tag retention in conversation history |
|
|
- 2048 token context limit |
|
|
|
|
|
## Author |
|
|
|
|
|
**OsamaBinLikhon** |
|
|
|
|
|
## License |
|
|
|
|
|
Apache 2.0 |
|
|
### Framework versions |
|
|
|
|
|
- PEFT 0.18.0 |