Update model card with metadata, links, and sample usage
Hi! I'm Niels from the Hugging Face community science team. I've updated your model card to include:
- Relevant metadata (`pipeline_tag` and `library_name`).
- A link to your GitHub repository and the paper.
- A summary of the **TerminalTraj** pipeline.
- A sample usage snippet derived from your README to make it easier for users to get started.
This will improve the discoverability and usability of the model on the Hub.
README.md CHANGED

@@ -1,4 +1,56 @@
-
+---
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
+base_model: Qwen/Qwen2.5-Coder-7B
+tags:
+- terminal-agent
+- agent
+- code
+---
+
+# TerminalTraj-7B
+
+This is the 7B model presented in the paper [Large-Scale Terminal Agentic Trajectory Generation from Dockerized Environments](https://huggingface.co/papers/2602.01244).
+
+## Introduction
+
+Training agentic models for terminal-based tasks critically depends on high-quality terminal trajectories that capture realistic long-horizon interactions across diverse domains. **TerminalTraj** is a scalable pipeline that:
+1. Filters high-quality repositories to construct Dockerized execution environments.
+2. Generates Docker-aligned task instances.
+3. Synthesizes agent trajectories with executable validation code.
+
+Using TerminalTraj, the authors curated 32K Docker images and generated 50,733 verified terminal trajectories. This model is fine-tuned from the Qwen2.5-Coder-7B backbone, achieving significant performance improvements on TerminalBench.
+
+- **Repository:** [multimodal-art-projection/TerminalTraj](https://github.com/multimodal-art-projection/TerminalTraj)
+- **Paper:** [Large-Scale Terminal Agentic Trajectory Generation from Dockerized Environments](https://huggingface.co/papers/2602.01244)
+- **Dataset:** [m-a-p/TerminalTraj](https://huggingface.co/datasets/m-a-p/TerminalTraj)
+
+## Sample Usage
+
+You can use this model with the `transformers` library:
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+
+model_id = "m-a-p/TerminalTraj-7B"
+
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto"
+)
+
+# Inference example
+prompt = "Write a bash script to find all .py files in a directory and count the lines of code."
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+outputs = model.generate(**inputs, max_new_tokens=512)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
 
 ## Citation
 