Arctic-AWM-4B / README.md

nielsr HF Staff

Add pipeline tag, library name, and sample usage

47bef89 verified 29 days ago

3.59 kB

base_model:
  - Qwen/Qwen3-4B
language:
  - en
license: apache-2.0
tags:
  - agent
  - tool-use
  - reinforcement-learning
  - mcp
pipeline_tag: text-generation
library_name: transformers

Arctic-AWM-4B

Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

Zhaoyang Wang¹, Canwen Xu², Boyi Liu², Yite Wang², Siwei Han¹,
Zhewei Yao², Huaxiu Yao¹, Yuxiong He²

¹UNC-Chapel Hill ²Snowflake AI Research

Overview

Arctic-AWM-4B is a multi-turn tool-use agent model trained with agentic reinforcement learning on Qwen3-4B, using the fully synthetic environments from AgentWorldModel-1K. It was introduced in the paper Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning.

The model is trained to interact with tool-use environments exposed via a unified MCP (Model Context Protocol) interface, enabling strong multi-turn agentic capabilities.

Sample Usage

To use the model for agentic tasks, you can serve it using vLLM and interact with it using the awm CLI tool.

Serve the model

vllm serve Snowflake/Arctic-AWM-4B --host 127.0.0.1 --port 8000

Run the Agent Demo

After starting an MCP environment (see the GitHub repository for environment setup), you can run the agent:

awm agent \
    --task "show me the top 10 most expensive products" \
    --mcp_url http://localhost:8001/mcp \
    --vllm_url http://localhost:8000/v1 \
    --model Snowflake/Arctic-AWM-4B

Resources

Related resources are also available, please check:

Resource	Link
📄 Paper	📄 arxiv.org/abs/2602.10090
💻 Code	💻 Snowflake-Labs/agent-world-model
📦 AgentWorldModel-1K	🤗 Snowflake/AgentWorldModel-1K
🤖 Arctic-AWM-4B	🤗 Snowflake/Arctic-AWM-4B
🤖 Arctic-AWM-8B	🤗 Snowflake/Arctic-AWM-8B
🤖 Arctic-AWM-14B	🤗 Snowflake/Arctic-AWM-14B

Citation

If you find this resource useful, please kindly cite:

@article{wang2026agentworldmodelinfinity,
      title={Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning}, 
      author={Zhaoyang Wang and Canwen Xu and Boyi Liu and Yite Wang and Siwei Han and Zhewei Yao and Huaxiu Yao and Yuxiong He},
      year={2026},
      eprint={2602.10090},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2602.10090}, 
}