---
base_model:
- openai/gpt-oss-120b
- MultiverseComputingCAI/HyperNova-60B
library_name: transformers
license: apache-2.0
---
<div align="center">
# HyperNova 60B 2602
### Powered by CompactifAI
[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
[Model on Hugging Face](https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602)
[Discord](https://discord.gg/8mT9FveN)
**Optimized for Efficient Inference** · **Reduced Memory Footprint** · **Native Tool Calling Support**
</div>
---
## Table of Contents
- [Highlights](#highlights)
- [Model Overview](#model-overview)
- [Key Characteristics](#key-characteristics)
- [Quick Start](#quick-start)
- [What's New in HyperNova 60B 2602](#whats-new-in-hypernova-60b-2602)
- [Tool Calling](#tool-calling)
- [Training & Fine-Tuning](#training--fine-tuning)
- [Architecture](#architecture)
- [Evaluation & Benchmarks](#evaluation--benchmarks)
- [Languages](#languages)
- [Intended Use](#intended-use)
- [Safety & Limitations](#safety--limitations)
- [Model Information](#model-information)
- [Citation](#citation)
---
## Model Overview
**HyperNova 60B 2602** is a compressed model derived from **[OpenAI’s gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b)**, developed by **Multiverse Computing**. The original gpt-oss-120b is an open-weight model (117B parameters, 5.1B active in MoE) designed for powerful reasoning, agentic tasks, and versatile developer use. This version is compressed with **CompactifAI**, Multiverse Computing’s proprietary technology, reducing parameter count and memory requirements while aiming to preserve strong reasoning.
The model is **instruction-tuned** and supports **native tool calling** (function calling with defined schemas, structured outputs, and agent-style workflows). HyperNova 60B 2602 is intended for the same broad use cases as gpt-oss-120b—reasoning, code generation, RAG, and tool-augmented applications—with **lower memory footprint** and deployment flexibility.
---
## Key Characteristics
| Characteristic | Description |
|-----------------------|-------------|
| Base model | [OpenAI gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (117B params, MoE; open-weight, Apache 2.0) |
| 🛠️ **Tool calling** | Native support; OpenAI-style function / tool calling schemas; agentic use (e.g. function calling, structured outputs) |
| 🧠 **Parameters** | 60B total parameters after CompactifAI compression (reduced vs. base 117B) |
| 📐 **Architecture** | Decoder-only Transformer (from gpt-oss lineage) |
| 🗜️ **Compression** | CompactifAI (proprietary compression technology) |
| Primary language | English |
| Other languages | Not formally evaluated |
---
## Quick Start
This model can be loaded with the **Transformers** API. Use `trust_remote_code=True` (required for the gpt-oss architecture). Recommended approach: `AutoModelForCausalLM` with `apply_chat_template`:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MultiverseComputingCAI/HyperNova-60B-2602"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "What is a Hypernova?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

attention_mask = torch.ones_like(inputs, dtype=torch.long)
outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    attention_mask=attention_mask,
)
reply = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(reply)
```
Alternatively you can use the `pipeline` API with `trust_remote_code=True`; the pipeline returns the full conversation structure, so extract the assistant message from `outputs[0]["generated_text"]` as needed.
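As a sketch of that extraction step, the helper below pulls the last assistant turn out of a pipeline-style result. The `mock_outputs` structure mirrors the list-of-message-dicts shape that chat pipelines return under `generated_text`; the content is a mock stand-in, not real model output.

```python
# Sketch: extracting the assistant reply from a chat-pipeline result.
# mock_outputs imitates the pipeline's output shape; it is not real model output.
def extract_assistant_reply(pipeline_outputs):
    """Return the content of the last assistant message in the conversation."""
    conversation = pipeline_outputs[0]["generated_text"]
    assistant_turns = [m for m in conversation if m["role"] == "assistant"]
    if not assistant_turns:
        raise ValueError("no assistant message in pipeline output")
    return assistant_turns[-1]["content"]

mock_outputs = [{
    "generated_text": [
        {"role": "user", "content": "What is a Hypernova?"},
        {"role": "assistant", "content": "A hypernova is an exceptionally energetic supernova."},
    ]
}]
print(extract_assistant_reply(mock_outputs))
```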
---
## What’s New in HyperNova 60B 2602
**HyperNova 60B 2602** is derived from **gpt-oss-120b**, retaining the base model’s strengths while reducing memory requirements and improving deployment flexibility.
### Summary
- **Derived from [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b):** same Apache 2.0 license and design goals (reasoning, agentic tasks, tool use), with a smaller footprint via CompactifAI.
- **Tool use:** Retains support for function calling, structured outputs, and agent-style workflows (OpenAI-style schemas).
- **Reasoning:** Compatible with configurable reasoning effort (e.g. low / medium / high in system prompt) where the format is preserved; full chain-of-thought available for debugging and analysis.
- **Evaluated** on tool-focused benchmarks (e.g. BFCL v4, Tau2-bench) and general benchmarks alongside other CompactifAI and gpt-oss variants.
---
## Tool Calling
HyperNova 60B 2602 supports **native tool use** and is well-suited for:
- **Function calling** with defined schemas
- **Structured outputs**
- **Agentic operations** (e.g. browser tasks, code execution where supported)
The model can detect when to invoke tools, emit structured JSON tool calls, and consume tool outputs to continue generation. Tool-calling behavior follows **OpenAI-style schemas**; compatibility refers to format and structure—exact parity with the base or other models is not guaranteed.
### Example Tool Call
```json
{
"name": "get_weather",
"arguments": {
"city": "Paris",
"date": "2026-02-10"
}
}
```
---
## Training & Fine-Tuning
### Base Model: gpt-oss-120b
The base model [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) was trained on OpenAI’s **harmony response format** and is intended for use with that format for correct behavior. It supports configurable reasoning levels (low / medium / high) and native tool use. See the [original model card](https://huggingface.co/openai/gpt-oss-120b) and [arXiv:2508.10925](https://arxiv.org/abs/2508.10925) for details.
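Per the gpt-oss model card, the reasoning level is selected with a directive in the system message. A minimal sketch of building such a conversation (the helper name is ours, not part of any library):

```python
# Sketch: setting a gpt-oss-style reasoning level via the system prompt,
# as described in the gpt-oss-120b model card. build_messages is illustrative.
def build_messages(user_prompt: str, reasoning: str = "medium") -> list:
    assert reasoning in {"low", "medium", "high"}
    return [
        {"role": "system", "content": f"Reasoning: {reasoning}"},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Prove that sqrt(2) is irrational.", reasoning="high")
# These messages would then go through tokenizer.apply_chat_template
# exactly as in the Quick Start example.
```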
### CompactifAI Compression & Optional Fine-Tuning
- **Compression:** CompactifAI was applied to produce a smaller, efficient model (60B parameters) while aiming to preserve reasoning and tool-use capabilities.
- **Optional fine-tuning:** This variant may include additional fine-tuning for tool calling and structured outputs; exact training details are model-specific.
---
## Architecture
### Model Specifications
| Specification | Value |
|-------------------|--------------------|
| Base model | [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (117B params, 5.1B active MoE) |
| Total parameters | 60B |
| Active parameters (MoE) | 4.8B |
---
## Evaluation & Benchmarks
### Evaluation Methodology
Benchmark scores were obtained with the following setups. Methodology varies by benchmark family.
#### MMLU-Pro, AIME25, GPQA:d (GPQA Diamond), LiveCodeBench
- **Evaluation framework**: [Lighteval](https://github.com/huggingface/lighteval)
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: medium
- **Decoding**: temperature = 0.6, max_tokens = 131072, top_p = 1.0, top_k = 0
- **Batch size**: 64
#### IFBench, AA-LCR, SciCode
- **Evaluation framework**: [Nemo-skills](https://github.com/NVIDIA/NeMo-Skills)
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: medium
- **Decoding**: temperature = 1.0, max_tokens = 131072, top_p = 1.0, top_k = 0
- **Batch size**: 64
#### BFCL v4 (17 splits)
- **Evaluation framework**: [EvalScope](https://github.com/EvalScope/EvalScope) 1.4.1
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: high
- **Decoding**: temperature = 0.6, max_tokens = 16384, parallel_tool_calls = true, tool-call parser openai
#### Tau2-bench (Telecom)
- **Evaluation framework**: [EvalScope](https://github.com/EvalScope/EvalScope) 1.4.1
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: high (agent `extra_body.reasoning_effort`)
- **Decoding (agent)**: temperature = 1.0, top_p = 1.0, min_tokens = 1
- **Decoding (judge / user simulator)**: temperature = 0.7, timeout = 600
- **Reproducibility**: subset telecom (default); max steps 100; repeats 3; tool-call parser openai (agent), hermes (judge)
#### Terminal-Bench Hard (Artificial Analysis subset)
- **Evaluation framework**: laude-institute/harbor 0.1.43
- **Inference library**: vLLM 0.15.0
- **Reasoning effort**: high
- **Decoding**: temperature = 1.0, top_p = 1.0, max-model-len = 131072
- **Reproducibility**: [Artificial Analysis subset](https://artificialanalysis.ai/methodology/intelligence-benchmarking#terminal-bench-hard)
- **Agent**: terminus-2, max episodes 100, repeats 3
### Quantitative Results
Scores are accuracy or benchmark-specific metrics. All reported numbers were obtained with the evaluation methodology described above.
| Benchmark | gpt-oss-20b | gpt-oss-120b | HyperNova 60B 2602 |
|-----------------------|-----------------------|------------------------|--------------------------|
| MMLU-Pro | 74 | 78 | 74 |
| BFCL v4 | 61 | 64 | 62 |
| Tau2-bench (Telecom) | 59 | 68 | 61 |
| AIME25 | 72 | 80 | 76 |
| GPQA:d | 63 | 69 | 69 |
| IFBench | 55 | 63 | 60 |
| SciCode | 34 | 38 | 32 |
| LiveCodeBench | 64 | 66 | 64 |
| Terminal Bench | 9 | 22 | 16 |
| AA-LCR | 37 | 50 | 36 |
| AA-Omnis. Index | -40 | -36 | -41 |
| AA-Omnis. Accuracy | 16 | 21 | 15 |
### Quantitative Results (Inference Performance)
Representative throughput and memory under the evaluation setup above. Comparison against **gpt-oss-20b** and **gpt-oss-120b** on the same hardware.
#### Performance evaluation conditions
The numbers in the table below were obtained under the following conditions:
- **Inference library**: vLLM 0.14.0
- **Hardware**: 4× NVIDIA H200 Tensor Core GPU
- **Conditions**: batch size=512, context length=512, decode length=256
- **Notes**: dtype=default
| Metric | gpt-oss-20b | gpt-oss-120b | HyperNova 60B 2602 | Hardware |
|----------------------------|--------------------------|--------------------------|--------------------------|-------------------------------|
| Tokens / second (decode) | 250 | 228 | 240 | 4× NVIDIA H200 Tensor Core GPU|
| Time to first token (ms) | 26 | 26 | 25 | 4× NVIDIA H200 Tensor Core GPU|
| Peak GPU memory (GB) | 13 | 61 | 32 | 4× NVIDIA H200 Tensor Core GPU|
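For reference, decode throughput figures like those in the table reduce to tokens generated over wall-clock decode time. The sketch below shows the arithmetic; the sample numbers are illustrative, not re-measurements of the models above.

```python
# Sketch: how a "tokens / second (decode)" figure is computed from raw
# measurements. The timing value below is illustrative only.
def decode_tokens_per_second(new_tokens: int, decode_seconds: float) -> float:
    """Aggregate decode throughput across the whole batch."""
    return new_tokens / decode_seconds

# e.g. 256 decode tokens per sequence at batch size 512 (as in the setup above),
# with a hypothetical total decode time of ~546 s:
total_tokens = 256 * 512
throughput = decode_tokens_per_second(total_tokens, 546.13)
print(round(throughput))
```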
---
## Languages
- **Primary language**: English
- **Other languages**: Not formally evaluated
The model was trained primarily on English-language data. Performance on other languages may vary and has not been systematically measured.
---
## Intended Use
### Recommended Use Cases
Aligned with [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) use cases, with the benefit of a smaller footprint:
- **Reasoning and analysis** (with configurable reasoning effort where supported)
- **Tool-augmented and agentic applications** (function calling, web browsing, code execution, structured outputs)
- **Code generation and reasoning**
- **Chatbots and virtual assistants**
- **Retrieval-augmented generation (RAG)**
- **Deployments** where gpt-oss-120b is desirable but memory or latency is constrained
### Out-of-Scope Uses
- Harmful, illegal, or deceptive content generation
- Impersonation of real individuals without consent
- High-risk decision-making without human oversight
- Surveillance or tracking of individuals
- Any use that violates applicable laws or regulations
---
## Safety & Limitations
### Known Limitations
- **English-centric** training data (inherited from base model).
- **Format:** For best results, use the same [harmony response format](https://huggingface.co/openai/gpt-oss-120b) as gpt-oss-120b where applicable; behavior may differ otherwise.
- **Tool calling** depends on correct schema and tool design; exact parity with gpt-oss-120b or other models is not guaranteed.
- **Compression** may affect some behaviors; evaluate for your use case.
### Recommendations
- Validate tool outputs before execution
- Use human oversight for critical applications
- Perform task-specific evaluation prior to deployment
---
## Model Information
| Field | Value |
|--------------|--------------------- |
| Model name | HyperNova 60B 2602 |
| Based on | [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) |
| Version | 2602 |
| Release date | 26 February 2026 |
| Developed by | Multiverse Computing |
| License | Apache 2.0 |
| Contact | business@multiversecomputing.com |
---
## Citation
If you use this model, please cite the base model and this variant:
```bibtex
@misc{openai2025gptoss120b,
title = {gpt-oss-120b \& gpt-oss-20b Model Card},
author = {OpenAI},
year = {2025},
eprint = {2508.10925},
archivePrefix = {arXiv},
primaryClass = {cs.CL},
url = {https://arxiv.org/abs/2508.10925}
}
@misc{hypernova60b2602,
title = {HyperNova 60B 2602: Model developed based on gpt-oss-120b},
author = {Multiverse Computing},
year = {2026},
url = {https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602},
note = {Model developed based on openai/gpt-oss-120b using CompactifAI technology}
}
```
**Built by [Multiverse Computing](https://www.multiversecomputing.com)** · [Report an issue](https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602/discussions) · [Discord](https://discord.gg/8mT9FveN)