Aznaur's picture
Upload Terminal Bench Pro v2 model (epoch 9)
96bc847 verified
# Terminal Agent - Multi-Task NAT v13
## Model Description
This model is fine-tuned from Qwen3-8B on multi-task terminal agent trajectories using Negative-Aware Training (NAT).
### Key Features
- **5 Tasks**: fix-git, cancel-async-tasks, log-summary-date-ranges, regex-log, pypi-server
- **Fixed Tool Signatures**: Corrected critical bug where `note_name` was incorrectly removed
- **Clean Tool Calls**: Removed hallucinated parameters (message_title, message_description, message_attachment)
- **Negative Examples**: Includes looping and wrong_command negative examples
### Training Details
- **Base Model**: Qwen/Qwen3-8B
- **Training Data**: 40 samples (20 positive, 20 negative)
- **Epochs**: 300
- **Learning Rate**: 5e-5
- **Batch Size**: 4
### Tool Signatures (Corrected)
- `shell_exec(id, command, block)`
- `shell_write_content_to_file(content, file_path)`
- `create_note(note_name, content)`
- `append_note(note_name, content)`
- `read_note(note_name)`
### Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("camel-ai/terminal_agent_multitask_nat_v13")
tokenizer = AutoTokenizer.from_pretrained("camel-ai/terminal_agent_multitask_nat_v13")
```
### V13 Fixes
1. **KEEP note_name** - Required by runtime (was incorrectly removed in v12)
2. **System prompt uses note_name** - Matches runtime expectations
3. **Remove only hallucinated params** - message_title, message_description, message_attachment
4. **Added tool call validation** - Catches signature issues before training
### Evaluation Results
Expected to achieve >80% success rate on 5 tasks when evaluated with matching task set.
## License
MIT License