EnvScaler-Qwen3-1.7B

Model Description

EnvScaler-Qwen3-1.7B is a tool-enhanced language model based on Qwen3-1.7B (Thinking Mode), trained using the EnvScaler framework for tool-interactive agent tasks. This model has been trained through Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL).

Training Process

This model was trained using a two-stage approach:

Stage 1: Supervised Fine-Tuning (SFT)

Stage 2: Reinforcement Learning (RL)

The training process enables the model to learn from both demonstration trajectories (SFT) and reinforcement signals (RL), resulting in improved performance on complex tool-interactive tasks.

How to Use

Basic Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "XXHStudyHard/EnvScaler-Qwen3-1.7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Use with function calling interface
# See EnvScaler project for full interaction examples

With EnvScaler Framework

For full integration with tool-interactive environments, please refer to the EnvScaler project documentation.

Related Resources

Citation

If you use this model, please cite our work:

@article{song2026envscaler,
  title={EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis},
  author={Song, Xiaoshuai and Chang, Haofei and Dong, Guanting and Zhu, Yutao and Dou, Zhicheng and Wen, Ji-Rong},
  journal={arXiv preprint arXiv:2601.05808},
  year={2026}
}

License

This model is licensed under the Apache 2.0 License, following the base Qwen3 model license.

Contact

For any questions or feedback, please contact: songxiaoshuai@ruc.edu.cn

Downloads last month
15
Safetensors
Model size
2B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for XXHStudyHard/EnvScaler-Qwen3-1.7B

Quantizations
1 model

Collection including XXHStudyHard/EnvScaler-Qwen3-1.7B

Paper for XXHStudyHard/EnvScaler-Qwen3-1.7B