# Negative-Aware Training (NAT) Dataset for Fix-Git Task

Based on the paper: "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents" (arXiv:2402.11651).

## What is NAT?
Negative-Aware Training (NAT) is a technique that improves LLM agent performance by training on both:

- **Positive examples**: correct trajectories that solve the task
- **Negative examples**: incorrect trajectories that demonstrate common mistakes

The key insight: models learn better when they see both what TO do and what NOT to do.
## How NAT Works

### 1. Explicit Differentiation
Positive and negative examples have different system prompts.

**Positive prompt suffix:**

```
<training_mode>
You are being shown a CORRECT example of how to solve this task.
Generate tool calls that follow the exact format shown, with only the required parameters.
Do NOT add extra parameters like message_title, message_description, or type.
When the task is complete, respond with a brief summary (NO tool call).
</training_mode>
```
**Negative prompt suffix:**

```
<training_mode>
WARNING: The following is an INCORRECT example that demonstrates common mistakes.
This trajectory contains errors such as:
- Adding unnecessary parameters (message_title, message_description)
- Looping on the same commands after task completion
- Using wrong command formats
Learn to AVOID these patterns.
</training_mode>
```
### 2. Training Process
- Both positive and negative examples are mixed in the training data
- The model learns to associate the positive prompt with correct behavior
- The model learns to associate the negative prompt with incorrect behavior
- At inference time, only the positive prompt is used
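The assembly step above can be sketched in a few lines. This is an illustrative sketch, not this repo's actual data pipeline: the `build_example` helper is hypothetical, and the suffix strings are abbreviated versions of the prompts shown earlier.

```python
# Hypothetical sketch of NAT example assembly (build_example and the
# abbreviated suffixes are illustrative, not this repo's actual API).
POSITIVE_SUFFIX = (
    "<training_mode>\nYou are being shown a CORRECT example of how to solve "
    "this task.\n</training_mode>"
)
NEGATIVE_SUFFIX = (
    "<training_mode>\nWARNING: The following is an INCORRECT example that "
    "demonstrates common mistakes.\n</training_mode>"
)

def build_example(system_prompt: str, trajectory: list, is_positive: bool) -> dict:
    """Attach the NAT suffix that matches the trajectory's label."""
    suffix = POSITIVE_SUFFIX if is_positive else NEGATIVE_SUFFIX
    return {"system": f"{system_prompt}\n{suffix}", "messages": trajectory}

# Positive and negative trajectories are mixed into a single training set.
raw = [
    ("You are a shell agent.", [{"role": "assistant", "content": "..."}], True),
    ("You are a shell agent.", [{"role": "assistant", "content": "..."}], False),
]
train_set = [build_example(sys_p, traj, label) for sys_p, traj, label in raw]
```

At inference time, only `POSITIVE_SUFFIX` would be appended, so the model conditions on the "correct behavior" prompt it saw during training.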
### 3. Inference
During inference, we use ONLY the positive prompt. The model has learned:
- What correct tool calls look like
- What mistakes to avoid
## Dataset Composition
| Type | Count | Description |
|---|---|---|
| Positive | 4 | Claude's successful trajectory |
| Negative (hallucinated args) | 2 | Tool calls with extra message_title, message_description |
| Negative (looping) | 2 | Repeating commands after task completion |
| Negative (wrong command) | 2 | Using id as command value |
| Total | 10 | |
## Example Comparisons

### Positive Example (Correct)
```json
{
  "name": "shell_exec",
  "arguments": "{\"id\": \"survey\", \"command\": \"pwd && ls -la && git status\", \"block\": true}"
}
```
### Negative Example 1: Hallucinated Arguments ❌

```json
{
  "name": "shell_exec",
  "arguments": "{\"id\": \"survey\", \"command\": \"pwd && ls -la && git status\", \"block\": true, \"message_title\": \"Survey\", \"message_description\": \"Performing system survey\"}"
}
```

**Problem:** Extra message_title and message_description parameters that don't exist in the tool schema.
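This failure mode can be caught mechanically by diffing a call's arguments against the tool schema. A minimal sketch, assuming the `shell_exec` schema implied by the examples in this README (the parameter set and the `hallucinated_params` helper are illustrative):

```python
import json

# Parameter names for shell_exec, assumed from the examples in this README;
# not an authoritative schema dump.
SHELL_EXEC_PARAMS = {"id", "command", "block"}

def hallucinated_params(tool_call: dict) -> set:
    """Return argument names that do not exist in the tool schema."""
    args = json.loads(tool_call["arguments"])
    return set(args) - SHELL_EXEC_PARAMS

bad_call = {
    "name": "shell_exec",
    "arguments": (
        "{\"id\": \"survey\", \"command\": \"pwd\", \"block\": true, "
        "\"message_title\": \"Survey\", \"message_description\": \"Survey\"}"
    ),
}
```

A non-empty result marks the trajectory as a candidate negative example of this type.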
### Negative Example 2: Looping Behavior ❌

After the task is already complete (merge done, working tree clean), the model generates:

```json
{
  "name": "shell_exec",
  "arguments": "{\"id\": \"loop_merge\", \"command\": \"git merge 650dba4 --no-ff -m \\\"Merge Stanford changes into master\\\"\", \"block\": true}"
}
```

**Problem:** Repeating the merge command when the task is already done.
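Looping negatives can be flagged with a simple heuristic: check whether a new tool call repeats a shell command already issued earlier in the trajectory. This is a sketch (a real detector might also confirm the earlier command succeeded before calling the repeat redundant):

```python
def repeats_earlier_command(history: list, new_command: str) -> bool:
    """Flag a tool call whose shell command was already issued in this trajectory."""
    return new_command in history

# Commands issued so far in the trajectory from this README's examples.
history = [
    "pwd && ls -la && git status",
    'git merge 650dba4 --no-ff -m "Merge Stanford changes into master"',
]
```

Re-issuing `history[-1]` after the merge has succeeded is exactly the looping pattern shown above.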
### Negative Example 3: Wrong Command Format ❌

```json
{
  "name": "shell_exec",
  "arguments": "{\"id\": \"survey\", \"command\": \"survey\", \"block\": true}"
}
```

**Problem:** Using the id value as the command instead of an actual shell command.
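This pattern is also trivial to test for: the command field should never merely echo the call's id. A small sketch, assuming arguments arrive JSON-encoded as in the examples above (the `id_used_as_command` helper is illustrative):

```python
import json

def id_used_as_command(tool_call: dict) -> bool:
    """True when the command field merely repeats the call's id."""
    args = json.loads(tool_call["arguments"])
    return args.get("command") == args.get("id")

wrong = {
    "name": "shell_exec",
    "arguments": "{\"id\": \"survey\", \"command\": \"survey\", \"block\": true}",
}
```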
## Why NAT Works
From the paper:
- +8.74% improvement with 2k positive + 10k negative examples
- Models implicitly learn from negative samples when combined with positive ones
- The improvement is more substantial when there are fewer positive examples (data-scarce scenarios)
- High-quality negative examples are crucial; low-quality negatives can hurt performance
## Files

```
tokenized_fix_git_v7_nat/
├── train.parquet        # Training data (tokenized)
├── val.parquet          # Validation data (tokenized)
├── README.md            # This file
└── examples/            # Human-readable examples
    ├── positive_messages.json
    ├── negative_hallucinated.json
    ├── negative_looping.json
    ├── negative_wrong_cmd.json
    ├── positive_text.txt
    └── negative_hallucinated_text.txt
```
## Training Command

```shell
torchrun --nproc_per_node=2 src/tbench_sft/tbench_sft.py \
  --config src/tbench_sft/config/experiment_fix_git_overfit_v7_nat.yaml
```
## References

- Wang et al. (2024). "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents". NAACL 2025. arXiv:2402.11651
- GitHub: Reason-Wang/NAT