Model Details
📃 Paper • 🌐 Project Page • 🤗 PABU-Data • 🤗 Model (PABU-Agent-8B)
Model Description
PABU-Agent-8B is a Large Language Model (LLM) agent built on top of LLaMA‑3.1‑8B, fine-tuned for interactive decision making using step-level supervision from the PABU Dataset. The model is trained to operate in sequential action–observation environments while maintaining a compact belief state via Progress-Aware Belief Update (PABU).
Instead of conditioning on full interaction histories, the model learns to predict relative task progress at each step and selectively retain informative past interactions. This results in improved task completion and reduced interaction length across diverse long-horizon environments.
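The retention rule described above can be sketched as follows. This is an illustrative reading of the mechanism, not the released PABU code: it assumes the model emits a scalar progress estimate in [0, 1] with each action, and keeps a step in the belief buffer only when that estimate exceeds the best progress already stored. The names `Step` and `update_belief` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Step:
    observation: str
    action: str
    progress: float  # model-predicted relative task progress in [0, 1]

def update_belief(belief: list, step: Step) -> list:
    """Retain a step only if it advances predicted progress beyond the
    best progress already stored, keeping the belief buffer compact."""
    best = max((s.progress for s in belief), default=0.0)
    if step.progress > best:
        return belief + [step]
    return belief  # uninformative step: buffer unchanged
```

Under this rule the buffer grows monotonically in progress, so its length is bounded by the number of genuinely progress-making steps rather than the full interaction history.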
- Model type: Decoder-only causal language model with belief-state conditioning
- Language(s) (NLP): English
- License: Inherits LLaMA‑3.1 license and downstream dataset licenses
- Finetuned from model: LLaMA‑3.1‑8B
Model Sources
- Base Model: https://ai.meta.com/llama/
- Repository: https://github.com/Hunter-Jiang/Progress-Aware-Belief-Update
- Paper: PABU: Progress-Aware Belief Update for Efficient LLM Agents
Uses
Direct Use
- Acting as an autonomous LLM agent in text-based interactive environments
- Research on belief updating, memory selection, and long-horizon reasoning
- Benchmarking agent efficiency under fixed training trajectories
Downstream Use
- Further fine-tuning for specialized agent environments
- Integration into agent frameworks requiring compact state representations
Out-of-Scope Use
- General-purpose chat or instruction following without environment feedback
- Real-world decision making or safety-critical deployments
- Tasks requiring multimodal (vision, audio) perception
Bias, Risks, and Limitations
- Optimized for synthetic, text-based environments; real-world transfer is limited
- Progress signals are environment-dependent and may not generalize
- Inherits biases present in the base LLaMA‑3.1 model and environment text
Recommendations
Users should evaluate the model in their target environment and avoid extrapolating performance gains beyond AgentGym-style tasks.
How to Get Started with the Model
The model is intended to be used within an agent loop that alternates between observations and actions while maintaining a belief memory buffer, as described in PABU.
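A minimal sketch of such a loop is shown below. All names here (`env`, `generate`, `format_prompt`, `run_episode`) are placeholders, not the PABU codebase: `env` is assumed to expose `reset()`/`step()` for a text environment, and `generate` is assumed to wrap the fine-tuned model, returning an action plus a progress estimate.

```python
def format_prompt(belief, obs):
    """Concatenate the retained interactions with the current observation."""
    history = " ".join(f"{o}->{a}" for o, a, _ in belief)
    return f"{history} | {obs}".strip(" |")

def run_episode(env, generate, max_steps=30):
    belief = []                    # compact belief buffer (PABU-style)
    obs = env.reset()
    for _ in range(max_steps):
        # condition on the belief buffer, not the full interaction history
        prompt = format_prompt(belief, obs)
        action, progress = generate(prompt)
        # retain the step only if it advances the best stored progress
        best = max((p for _, _, p in belief), default=0.0)
        if progress > best:
            belief.append((obs, action, progress))
        obs, done = env.step(action)
        if done:
            break
    return belief
```

Because the prompt is built from the belief buffer rather than the full trajectory, context length stays roughly constant even on long-horizon tasks.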
Citation
@misc{jiang2026pabuprogressawarebeliefupdate,
  title={PABU: Progress-Aware Belief Update for Efficient LLM Agents},
  author={Haitao Jiang and Lin Ge and Hengrui Cai and Rui Song},
  year={2026},
  eprint={2602.09138},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2602.09138},
}