Model Overview

Model Summary

OpenAI's gpt-oss family marks a significant shift toward open-weight releases for the organization, making advanced AI models more accessible. Released under the permissive Apache 2.0 license, these models are designed for strong reasoning, agentic capabilities, and versatile real-world applications. The family includes two text-only variants, a 21-billion and a 117-billion parameter model, trained on a dataset focused on STEM, coding, and general knowledge. This release lets developers and researchers run, customize, and build on these models on their own infrastructure, retaining full data privacy and control.

The gpt-oss models are built on an efficient Transformer architecture that leverages a Mixture-of-Experts (MoE) design. This allows the models to have a large total number of parameters while only activating a fraction for any given token, which significantly reduces computational cost and memory requirements during inference. Both models support a context length of up to 128,000 tokens and use techniques such as grouped multi-query attention and Rotary Positional Embeddings (RoPE) for improved efficiency. A key feature is their native quantization, which allows even the large model to run on a single high-end GPU and the smaller model to operate on consumer-grade hardware.
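To make the "activate only a fraction of parameters" idea concrete, here is a minimal, illustrative top-k MoE routing sketch in NumPy. It is not the actual gpt-oss router (the real expert MLPs, gating details, and load balancing differ); it only shows how a gate selects k experts per token so that the remaining expert weights are never touched.

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Route one token vector through the top-k of n experts (sketch)."""
    logits = x @ gate_w                       # gating scores, shape (n_experts,)
    topk = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                  # softmax over the selected experts only
    # Only the k chosen expert MLPs run; the others stay idle, which is
    # where the MoE compute and memory-bandwidth savings come from.
    return sum(w * np.tanh(x @ expert_ws[i]) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
expert_ws = rng.normal(size=(n_experts, d, d))
y = moe_forward(x, gate_w, expert_ws, k=2)    # combined output, shape (8,)
```

With k=2 of 4 experts active, only half the expert parameters participate in this forward pass; gpt-oss applies the same principle at far larger scale (e.g. 3.6B active of 21B total).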

Designed for practical deployment, the gpt-oss models offer features aimed at usability and trust. A unique capability is the adjustable reasoning effort, allowing users to toggle between low, medium, and high settings to balance performance with latency. The models also provide full access to their chain-of-thought reasoning process, which aids in debugging and understanding the model's outputs. With built-in support for tool use such as web browsing and code execution, these models are well-suited for building sophisticated AI agents and customized applications across a wide range of specialized tasks.
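The reasoning-effort toggle is exposed through the system prompt. The sketch below is a hypothetical helper (the exact gpt-oss chat template is defined elsewhere and is not reproduced here); it only illustrates validating and injecting the low/medium/high setting.

```python
# Hypothetical helper; the exact system-prompt format used by gpt-oss
# is an assumption here -- only the low/medium/high toggle is the point.
VALID_EFFORTS = ("low", "medium", "high")

def build_system_prompt(effort="medium"):
    """Compose a system prompt carrying the desired reasoning effort."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return f"You are a helpful assistant.\nReasoning: {effort}"

prompt = build_system_prompt("high")
```

Lower effort settings trade reasoning depth for latency, so a deployment might default to "low" for chat and switch to "high" for multi-step agentic tasks.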

For more details, please refer to the GPT OSS blog post and the GitHub repository.

Weights are released under the Apache 2.0 License. Keras model code is released under the Apache 2.0 License.

Installation

Keras and KerasHub can be installed with:

pip install -U -q keras-hub
pip install -U -q keras

JAX, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.

Available GPT OSS Presets

The following model checkpoints are provided by the Keras team. Full code examples for each are available below.

Preset           | Parameters | Description
gpt_oss_20b_en   | 20B        | 21 billion total parameters, 3.6 billion active, 128k context length, de-quantized from MXFP4.
gpt_oss_120b_en  | 120B       | 117 billion total parameters, 5.1 billion active, 128k context length, de-quantized from MXFP4.
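The parameter counts above translate into rough weight-memory estimates, which is why MXFP4 quantization matters for single-GPU deployment. The sketch below assumes roughly 4.25 bits per weight for MXFP4 (4-bit values plus shared block scales; the exact overhead is an assumption) and ignores activations and the KV cache.

```python
def est_weight_gib(n_params, bits_per_param):
    """Rough weight-memory estimate in GiB; ignores activations and KV cache."""
    return n_params * bits_per_param / 8 / 2**30

small_mxfp4 = est_weight_gib(21e9, 4.25)    # ~10.4 GiB: fits consumer GPUs
large_mxfp4 = est_weight_gib(117e9, 4.25)   # ~57.9 GiB: fits one 80 GB GPU
large_bf16 = est_weight_gib(117e9, 16)      # ~218 GiB: the de-quantized weights
```

The de-quantized Keras presets take the bf16-scale footprint on disk, while the MXFP4 figures explain the "single high-end GPU" claim for the 120B model.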
