
T-Bench Qwen SFT Multi-Task Clean v10

Model Description

This is a Qwen3-8B model fine-tuned with supervised learning on terminal bench trajectories. It is the "clean" variant: trained only on successful trajectories, with no negative examples.

Training Details

  • Base Model: Qwen/Qwen3-8B
  • Training Method: Supervised Fine-Tuning (SFT) on clean trajectories
  • Epochs: 30
  • Learning Rate: 5e-5
  • Max Length: 32768 tokens
  • Batch Size: 4 (2 per GPU with data parallelism)
  • Attention: FlashAttention 2
  • Precision: bfloat16
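The hyperparameters above can be gathered into a single configuration, e.g. for a TRL-style SFT run. This is only a sketch of how the listed values fit together; the actual training script is not published with this card, and the two-GPU data-parallel setup is inferred from the batch-size note.

```python
# Training hyperparameters as listed in this card, collected in one place.
# Sketch only: the real training script for this model is not published.
sft_hyperparams = {
    "model_name_or_path": "Qwen/Qwen3-8B",       # base model
    "num_train_epochs": 30,
    "learning_rate": 5e-5,
    "max_seq_length": 32768,                      # extended context length
    "per_device_train_batch_size": 2,             # "2 per GPU"
    "effective_batch_size": 4,                    # 2 per GPU x data parallelism
    "attn_implementation": "flash_attention_2",
    "bf16": True,
    "gradient_checkpointing": True,               # see Model Features below
}
```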

Dataset

Trained on clean, successful terminal bench trajectories demonstrating correct command execution patterns.

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Aznaur/tbench-qwen-sft-multitask-clean-v10",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Aznaur/tbench-qwen-sft-multitask-clean-v10",
    trust_remote_code=True
)
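Once loaded, the model can be prompted through the tokenizer's chat template. A minimal generation helper is sketched below; the `max_new_tokens` default and greedy decoding are illustrative choices, not settings documented for this model.

```python
def generate_reply(model, tokenizer, user_message, max_new_tokens=512):
    """Format a single-turn chat prompt and generate a reply.

    Sketch only: decoding settings here are illustrative defaults,
    not values documented for this model.
    """
    import torch  # imported lazily so the helper can be defined without torch

    messages = [{"role": "user", "content": user_message}]
    # Build input ids using the model's own chat template.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    with torch.no_grad():
        output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and decode only the new completion.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```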

Model Features

  • Context Length: 32768 tokens (extended from 16384)
  • Memory Efficient: Uses FlashAttention 2 and gradient checkpointing
  • Clean Training: Only successful trajectories, no negative examples
  • Long Context: Supports extended terminal sessions
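Even with a 32768-token window, very long terminal sessions can overflow the context. One common handling pattern is to keep only the most recent tokens of the session; the helper below is an illustrative sketch, not tooling shipped with this model.

```python
def truncate_to_context(token_ids, max_len=32768):
    """Keep only the most recent `max_len` tokens of a terminal session.

    Illustrative left-truncation policy for long sessions; not part of
    this model's released tooling.
    """
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[-max_len:]
```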

Hardware Requirements

  • GPU Memory: ~16GB minimum just for the bfloat16 weights; budget extra for activations and KV cache
  • Recommended: A100 40GB+ for optimal performance
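The ~16GB figure follows directly from the parameter count: bfloat16 stores each parameter in 2 bytes, so 8B parameters occupy about 16GB before activations and KV cache. A quick arithmetic check:

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate weight memory in decimal GB (bfloat16 = 2 bytes/param)."""
    return num_params * bytes_per_param / 1e9

# 8B parameters in bfloat16 -> roughly 16 GB of weights alone.
approx_gb = weight_memory_gb(8e9)
```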

License

This model inherits the license from the base Qwen3-8B model.
