# NL to Bash: Qwen2.5-Coder-0.5B Fine-tuned
A fine-tuned version of Qwen2.5-Coder-0.5B-Instruct trained on 40,639 natural language → Bash command pairs from the NL2SH-ALFA dataset.
Try it live: Gradio Demo
## Results
| Metric | Score |
|---|---|
| Exact Match | 13.67% |
| Semantic Match (cosine ≥ 0.8) | 60.33% |
| Avg Similarity | 0.776 |
Evaluated on 300 held-out test examples from NL2SH-ALFA. Semantic similarity is computed using `all-MiniLM-L6-v2` embeddings and is a better indicator of real-world quality than exact match alone, since multiple Bash commands can be functionally equivalent.
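The semantic-match criterion reduces to a cosine-similarity threshold on the two embedding vectors. A minimal sketch of that check, with toy 3-d vectors standing in for real `all-MiniLM-L6-v2` embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_match(emb_pred, emb_ref, threshold=0.8):
    """A prediction counts as a semantic match when similarity >= threshold."""
    return cosine_similarity(emb_pred, emb_ref) >= threshold

# Toy vectors for illustration only (real embeddings are 384-dimensional).
print(semantic_match([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]))  # True: nearly parallel
print(semantic_match([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # False: orthogonal
```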
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("dhwanichande29/nl-to-bash")
tokenizer = AutoTokenizer.from_pretrained("dhwanichande29/nl-to-bash")

system_prompt = (
    "Your task is to translate a natural language instruction to a Bash command. "
    "You will receive an instruction in English and output a Bash command that "
    "can be run in a Linux terminal."
)

def translate(instruction):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": instruction},
    ]
    formatted = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(formatted, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    response = outputs[0][inputs.input_ids.shape[-1]:]
    return tokenizer.decode(response, skip_special_tokens=True).strip()

print(translate("list all files in current directory"))
# find . -type f
```
## Example Outputs
| Natural Language | Generated Bash |
|---|---|
| list all files in current directory | find . -type f |
| find all python files | find . -name "*.py" |
| count lines in a text file | wc -l path/to/file |
| remove all .tmp files | find . -name "*.tmp" -exec rm {} \; |
| show disk usage | du -h / |
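Outputs like these show why exact match under-counts: quoting style and spacing vary without changing behavior. A hypothetical normalization (not the evaluation used for the scores above) that compares commands token-by-token after shell-style tokenization with `shlex`:

```python
import shlex

def tokens_equal(cmd_a, cmd_b):
    """Compare two shell commands after shell-style tokenization.

    Removes superficial differences (quote style, repeated spaces) while
    still treating genuinely different commands as distinct.
    """
    return shlex.split(cmd_a) == shlex.split(cmd_b)

print(tokens_equal('find . -name "*.py"', "find .  -name '*.py'"))  # True
print(tokens_equal("wc -l file.txt", "cat file.txt | wc -l"))       # False
```

This still misses deeper equivalences (e.g. `wc -l file.txt` vs. `cat file.txt | wc -l`), which is what the embedding-based semantic match is meant to capture.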
## Training Details
- Base model: Qwen/Qwen2.5-Coder-0.5B-Instruct (494M parameters)
- Dataset: westenfelder/NL2SH-ALFA
- Train split: 40,639 examples
- Test split: 300 examples
- Epochs: 10
- Batch size: 15 per device (effective: 75 with gradient accumulation steps of 5)
- Precision: bfloat16
- Max token length: 150
- Hardware: NVIDIA A100-SXM4-80GB
- Training time: ~2.09 hours
- Experiment tracking: Weights & Biases (`nl2sh` project)
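For reference, the effective batch size and step counts implied by the numbers above (assuming a single GPU and a partial final batch each epoch):

```python
import math

train_examples = 40_639
per_device_batch = 15
grad_accum_steps = 5
epochs = 10

# Effective batch size = per-device batch x gradient accumulation steps.
effective_batch = per_device_batch * grad_accum_steps
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch)   # 75
print(steps_per_epoch)   # 542
print(total_steps)       # 5420
```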
## Dataset
`westenfelder/NL2SH-ALFA`: a dataset of natural language instructions paired with corresponding Bash commands.
## GitHub
Full training code, evaluation notebooks, and FastAPI deployment: github.com/Dhwani-Chande/Natural-Language-to-Bash-Translation-using-LLMs