GPT-OSS AgentBoi

A parameter-efficient fine-tuning of GPT-OSS-20B focused on improving agentic reasoning, structured tool use, and ReAct-style instruction following.

This model was fine-tuned using LoRA adapters on the ReAct subset of Agent-FLAN with the goal of making GPT-OSS more reliable at multi-step reasoning, tool selection, action-observation workflows, and structured agent behavior.

Overview

Large language models are often strong conversationalists but can struggle with:

  • Multi-step planning
  • Tool selection and invocation
  • ReAct-style reasoning workflows
  • Structured action generation
  • Separating reasoning from final responses

GPT-OSS AgentBoi adapts GPT-OSS-20B toward these agent-oriented tasks while remaining trainable on consumer hardware through parameter-efficient fine-tuning.

Model Details

Item Value
Model Name GPT-OSS AgentBoi
Author shiv207
Base Model unsloth/gpt-oss-20b-unsloth-bnb-4bit
Training Method LoRA
Framework Unsloth
Dataset Agent-FLAN (ReAct subset)
Primary Task Agentic Tool Use
Language English
License Apache 2.0

Training Data

The model was fine-tuned using examples from the Agent-FLAN dataset, specifically the ReAct-style instruction trajectories.

These examples teach the model to:

  • Break complex tasks into intermediate steps
  • Decide when tool usage is appropriate
  • Generate structured actions
  • Follow action-observation loops
  • Produce concise final responses

Training Setup

Training was performed using:

  • GPT-OSS-20B
  • Unsloth
  • TRL
  • LoRA adapters
  • Google Colab Tesla T4 GPU

The objective was to improve agentic behavior while keeping training accessible on limited hardware.

Intended Use

This model is intended for:

  • AI agents
  • Tool-calling systems
  • Research assistants
  • Retrieval-augmented generation workflows
  • Multi-step planning tasks
  • Agentic reasoning experiments

Potential applications include:

  • Search agents
  • Knowledge retrieval systems
  • Function-calling assistants
  • Research copilots
  • Workflow automation agents

Example

User

Search for the latest SpaceX launch and summarize it.

Expected Agent Behavior

  1. Analyze the request.
  2. Determine that external information is required.
  3. Generate a structured search action.
  4. Process retrieved information.
  5. Produce a concise final answer.

The fine-tuning objective is to increase consistency in these workflows compared to the base model.

Loading the Model

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "shiv207/gpt_oss_AGENTBOI"
)

Limitations

  • Evaluated primarily through qualitative testing.
  • No formal benchmark suite was used.
  • Training utilized only a subset of Agent-FLAN.
  • Performance may vary on unseen tool schemas.
  • Not optimized for general-purpose instruction tuning beyond agent-oriented tasks.

Acknowledgments

This project builds upon the work of:

  • OpenAI for GPT-OSS and the Harmony conversation format.
  • Unsloth for efficient GPT-OSS fine-tuning support.
  • InternLM for the Agent-FLAN dataset.

Repository

Source code and training notebook:

GitHub: https://github.com/shiv207

Author

shiv207

If you find this project useful, feel free to open issues, share feedback, or build on top of it.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for shiv207/gpt_oss_AGENTBOI

Adapter
(72)
this model