---
base_model:
- openai/gpt-oss-120b
- MultiverseComputingCAI/Hypernova-60B-2602
library_name: transformers
license: apache-2.0
tags:
- text-generation
---
<div align="center">

# HyperNova 60B 2602

### Powered by CompactifAI

[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
[Model on Hugging Face](https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602)
[Discord](https://discord.gg/cGas9uStqp)

**Optimized for Efficient Inference** · **Reduced Memory Footprint** · **Native Tool Calling Support**

</div>

---

## Table of Contents

- [Model Overview](#model-overview)
- [Technical Deep Dive](#technical-deep-dive)
- [Key Characteristics](#key-characteristics)
- [Quick Start](#quick-start)
- [What's New in HyperNova 60B 2602](#whats-new-in-hypernova-60b-2602)
- [Tool Calling](#tool-calling)
- [Training & Fine-Tuning](#training--fine-tuning)
- [Architecture](#architecture)
- [Evaluation & Benchmarks](#evaluation--benchmarks)
- [Languages](#languages)
- [Intended Use](#intended-use)
- [Safety & Limitations](#safety--limitations)
- [Model Information](#model-information)
- [Citation](#citation)

---

## Model Overview

**HyperNova 60B 2602** is a model developed by **Multiverse Computing**, based on [OpenAI's gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b). The original gpt-oss-120b is an open-weight model (117B parameters, 5.1B active in MoE) designed for powerful reasoning, agentic tasks, and versatile developer use. This version is compressed with **CompactifAI**, Multiverse Computing's proprietary technology, reducing parameter count and memory requirements while aiming to preserve strong reasoning.

The model is **instruction-tuned** and supports **native tool calling** (function calling with defined schemas, structured outputs, and agent-style workflows). HyperNova 60B 2602 is intended for the same broad use cases as gpt-oss-120b, including reasoning, code generation, RAG, and tool-augmented applications, with a **lower memory footprint** and greater deployment flexibility.

## Technical Deep Dive

For a detailed explanation of the compression architecture, the compression process, and the benchmark results behind HyperNova 60B 2602, read [the full technical article by Johanna Angulo, Evaluation Manager at Multiverse Computing](https://multiversecomputing.com/papers/hypernova-60b-2602-same-intelligence-half-the-size-improved-tool-calling-capability).

---

## Key Characteristics

| Characteristic | Description |
|----------------|-------------|
| **Base model** | [OpenAI gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (117B params, MoE; open-weight, Apache 2.0) |
| 🛠️ **Tool calling** | Native support; OpenAI-style function/tool-calling schemas; agentic use (e.g. function calling, structured outputs) |
| 🧠 **Parameters** | 60B total after CompactifAI compression (down from 117B in the base model) |
| 📐 **Architecture** | Decoder-only Transformer (gpt-oss lineage) |
| 🗜️ **Compression** | CompactifAI (proprietary compression technology) |
| **Primary language** | English |
| **Other languages** | Not formally evaluated |

---
## Quick Start

This model can be loaded with the **Transformers** API. Use `trust_remote_code=True` (required for this model's architecture). The recommended approach is `AutoModelForCausalLM` with `apply_chat_template`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MultiverseComputingCAI/HyperNova-60B-2602"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread layers across available devices
    torch_dtype="auto",  # use the checkpoint's native dtype
    trust_remote_code=True,
)

# Build the prompt with the model's chat template.
messages = [{"role": "user", "content": "What is a Hypernova?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)
attention_mask = torch.ones_like(inputs, dtype=torch.long, device=inputs.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    attention_mask=attention_mask,
)

# Decode only the newly generated tokens.
reply = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(reply)
```

Alternatively, you can use the `pipeline` API with `trust_remote_code=True`; the pipeline returns the full conversation structure, so extract the assistant message from `outputs[0]["generated_text"]` as needed.

---

## What's New in HyperNova 60B 2602

**HyperNova 60B 2602** builds on **gpt-oss-120b**, retaining the base model's strengths while reducing memory use and improving deployment flexibility.

### Summary

- **Based on [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b):** same Apache 2.0 license and design goals (reasoning, agentic tasks, tool use), with a smaller footprint via CompactifAI.
- **Tool use:** retains support for function calling, structured outputs, and agent-style workflows (OpenAI-style schemas).
- **Reasoning:** compatible with configurable reasoning effort (e.g. low / medium / high in the system prompt) where the format is preserved; the full chain of thought is available for debugging and analysis.
- **Evaluation:** assessed on tool-focused benchmarks (e.g. BFCL v4, Tau2-bench) and general benchmarks alongside other CompactifAI and gpt-oss variants.
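
The reasoning-effort point can be sketched concretely. gpt-oss models conventionally take a `Reasoning: <level>` line in the system prompt; the `build_messages` helper below is purely illustrative (not part of any API), and the exact phrasing should be verified against this model's chat template:

```python
# Sketch of selecting a reasoning effort via the system prompt, following the
# gpt-oss "Reasoning: <level>" convention. build_messages is an illustrative
# helper, not a library function.

def build_messages(user_prompt: str, effort: str = "medium") -> list:
    """Build a chat message list carrying a reasoning-effort hint."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unsupported reasoning effort: {effort}")
    return [
        {"role": "system", "content": f"Reasoning: {effort}"},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Summarize the harmony response format.", effort="high")
print(messages[0]["content"])  # Reasoning: high
```

The resulting list can be passed directly to `tokenizer.apply_chat_template` as in the Quick Start example.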

---

## Tool Calling

HyperNova 60B 2602 supports **native tool use** and is well suited for:

- **Function calling** with defined schemas
- **Structured outputs**
- **Agentic operations** (e.g. browser tasks, code execution where supported)

The model can detect when to invoke tools, emit structured JSON tool calls, and consume tool outputs to continue generation. Tool-calling behavior follows **OpenAI-style schemas**; compatibility refers to format and structure, and exact parity with the base or other models is not guaranteed.

### Example Tool Call

```json
{
  "name": "get_weather",
  "arguments": {
    "city": "Paris",
    "date": "2026-02-10"
  }
}
```
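
To illustrate the emit-and-consume loop described above, here is a minimal sketch of handling such a tool call on the application side. The `get_weather` implementation and the `TOOLS` registry are hypothetical stand-ins, not part of the model or any library:

```python
import json

# Hypothetical tool; a real application would call an actual weather API here.
def get_weather(city: str, date: str) -> dict:
    return {"city": city, "date": date, "forecast": "unknown"}

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json: str) -> dict:
    """Parse a model-emitted tool call and invoke the matching function."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise KeyError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

raw = '{"name": "get_weather", "arguments": {"city": "Paris", "date": "2026-02-10"}}'
result = dispatch(raw)

# Return the result to the model as a tool message so generation can continue.
tool_message = {"role": "tool", "content": json.dumps(result)}
print(tool_message["content"])
```

In an agent loop, `tool_message` would be appended to the conversation and the model prompted again to produce the final answer.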

---

## Training & Fine-Tuning

### Base Model: gpt-oss-120b

The base model [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) was trained on OpenAI's **harmony response format** and should be used with that format for correct behavior. It supports configurable reasoning levels (low / medium / high) and native tool use. See the [original model card](https://huggingface.co/openai/gpt-oss-120b) and [arXiv:2508.10925](https://arxiv.org/abs/2508.10925) for details.

### CompactifAI Compression & Optional Fine-Tuning

- **Compression:** CompactifAI was applied to produce a smaller, more efficient model (60B parameters) while aiming to preserve reasoning and tool-use capabilities.
- **Optional fine-tuning:** this variant may include additional fine-tuning for tool calling and structured outputs; exact training details are model-specific.

---

## Architecture

### Model Specifications

| Specification | Value |
|------------------|-------|
| Base model | [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (117B total params, 5.1B active, MoE) |
| Total parameters | 60B (4.8B active, MoE) |
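
As a quick sanity check on the specification table, the reduction implied by these numbers (117B → 60B total, 5.1B → 4.8B active) can be computed directly, which is consistent with the roughly half-the-size framing used in the technical article:

```python
# Back-of-envelope check of the parameter counts in the table above.
base_total, compressed_total = 117e9, 60e9
base_active, compressed_active = 5.1e9, 4.8e9

total_reduction = 1 - compressed_total / base_total
active_kept = compressed_active / base_active
print(f"total parameters removed: {total_reduction:.1%}")  # ~48.7%
print(f"active parameters kept:   {active_kept:.1%}")      # ~94.1%
```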

---

## Evaluation & Benchmarks

### Evaluation Methodology

Benchmark scores were obtained with the following setups; the methodology varies by benchmark family.

#### MMLU-Pro, AIME25, GPQA-Diamond (GPQA:d), LiveCodeBench

- **Evaluation framework**: [Lighteval](https://github.com/huggingface/lighteval)
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: medium
- **Decoding**: temperature = 0.6, max_tokens = 131072, top_p = 1.0, top_k = 0
- **Batch size**: 64

#### IFBench, AA-LCR, SciCode

- **Evaluation framework**: [NeMo-Skills](https://github.com/NVIDIA/NeMo-Skills)
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: medium
- **Decoding**: temperature = 1.0, max_tokens = 131072, top_p = 1.0, top_k = 0
- **Batch size**: 64

#### BFCL v4 (17 splits)

- **Evaluation framework**: [EvalScope](https://github.com/EvalScope/EvalScope) 1.4.1
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: high
- **Decoding**: temperature = 0.6, max_tokens = 16384, parallel_tool_calls = true, tool-call parser: openai

#### Tau2-bench (Telecom)

- **Evaluation framework**: [EvalScope](https://github.com/EvalScope/EvalScope) 1.4.1
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: high (agent `extra_body.reasoning_effort`)
- **Decoding (agent)**: temperature = 1.0, top_p = 1.0, min_tokens = 1
- **Decoding (judge / user simulator)**: temperature = 0.7, timeout = 600
- **Reproducibility**: subset telecom (default); max steps 100; repeats 3; tool-call parser: openai (agent), hermes (judge)

#### Terminal-Bench Hard (Artificial Analysis subset)

- **Evaluation framework**: laude-institute/harbor 0.1.43
- **Inference library**: vLLM 0.15.0
- **Reasoning effort**: high
- **Decoding**: temperature = 1.0, top_p = 1.0, max-model-len = 131072
- **Reproducibility**: subset from [Artificial Analysis](https://artificialanalysis.ai/methodology/intelligence-benchmarking#terminal-bench-hard)
- **Agent**: terminus-2; max episodes 100; repeats 3

### Quantitative Results

Scores are accuracy or benchmark-specific metrics, obtained with the methodology described above.

| Benchmark | gpt-oss-20b | gpt-oss-120b | HyperNova 60B 2602 |
|-------------------------|-------------|--------------|--------------------|
| MMLU-Pro | 74 | 78 | 74 |
| BFCL v4 | 61 | 64 | 62 |
| Tau2-bench (Telecom) | 59 | 68 | 61 |
| AIME25 | 72 | 80 | 76 |
| GPQA:d | 63 | 69 | 69 |
| IFBench | 55 | 63 | 60 |
| SciCode | 34 | 38 | 32 |
| LiveCodeBench | 64 | 66 | 64 |
| Terminal-Bench Hard | 9 | 22 | 16 |
| AA-LCR | 37 | 50 | 36 |
| AA-Omniscience Index | -40 | -36 | -41 |
| AA-Omniscience Accuracy | 16 | 21 | 15 |
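
One convenient way to read the table is score retention relative to gpt-oss-120b. A small illustrative computation over a subset of rows, with values copied from the table:

```python
# Score retention of HyperNova 60B 2602 vs. gpt-oss-120b (subset of benchmarks).
scores_120b = {"MMLU-Pro": 78, "BFCL v4": 64, "AIME25": 80, "GPQA:d": 69}
scores_hn = {"MMLU-Pro": 74, "BFCL v4": 62, "AIME25": 76, "GPQA:d": 69}

retention = {name: scores_hn[name] / scores_120b[name] for name in scores_120b}
for name, ratio in retention.items():
    print(f"{name}: {ratio:.1%} of the base model's score")
```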

![Benchmark comparison](benchmarks_hypernova.png)
![Artificial Analysis benchmark comparison](benchmarks_hypernova_aa.png)

### Quantitative Results (Inference Performance)

Representative throughput and latency under the evaluation setup above, compared against **gpt-oss-120b** on the same hardware.

#### Performance evaluation conditions

- **Inference library**: vLLM 0.14.0
- **Hardware**: 1× NVIDIA H200 Tensor Core GPU
- **Conditions**: concurrency = 128

**Summary of Improvements:**

- **Throughput (tok/s)**: HyperNova is 39.5% faster
- **Median TTFT (ms)**: HyperNova is 50.8% faster

![Inference performance comparison](performance_hypernova.png)

---

## Languages

- **Primary language**: English
- **Other languages**: Not formally evaluated

The model was trained primarily on English-language data. Performance on other languages may vary and has not been systematically measured.

---

## Intended Use

### Recommended Use Cases

Aligned with [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) use cases, with the benefit of a smaller footprint:

- **Reasoning and analysis** (with configurable reasoning effort where supported)
- **Tool-augmented and agentic applications** (function calling, web browsing, code execution, structured outputs)
- **Code generation and reasoning**
- **Chatbots and virtual assistants**
- **Retrieval-augmented generation (RAG)**
- **Deployments** where gpt-oss-120b is desirable but memory or latency is constrained

### Out-of-Scope Uses

- Harmful, illegal, or deceptive content generation
- Impersonation of real individuals without consent
- High-risk decision-making without human oversight
- Surveillance or tracking of individuals
- Any use that violates applicable laws or regulations

---

## Safety & Limitations

### Known Limitations

- **English-centric** training data (inherited from the base model).
- **Format:** for best results, use the same [harmony response format](https://huggingface.co/openai/gpt-oss-120b) as gpt-oss-120b where applicable; behavior may differ otherwise.
- **Tool calling** depends on correct schema and tool design; exact parity with gpt-oss-120b or other models is not guaranteed.
- **Compression** may affect some behaviors; evaluate the model for your use case.

### Recommendations

- Validate tool outputs before execution
- Use human oversight for critical applications
- Perform task-specific evaluation prior to deployment

---

## Model Information

| Field | Value |
|--------------|-------|
| Model name | HyperNova 60B 2602 |
| Based on | [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) |
| Version | 2602 |
| Release date | 26/02/2026 |
| Developed by | Multiverse Computing |
| License | Apache 2.0 |
| Contact | business@multiversecomputing.com |

---

## Citation

If you use this model, please cite the base model and this variant:

```bibtex
@misc{openai2025gptoss120b,
  title         = {gpt-oss-120b \& gpt-oss-20b Model Card},
  author        = {OpenAI},
  year          = {2025},
  eprint        = {2508.10925},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url           = {https://arxiv.org/abs/2508.10925}
}

@misc{hypernova60b2602,
  title  = {HyperNova 60B 2602: A Compressed Model Based on gpt-oss-120b},
  author = {Multiverse Computing},
  year   = {2026},
  url    = {https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602},
  note   = {Based on openai/gpt-oss-120b, compressed with CompactifAI technology}
}
```

**Built by [Multiverse Computing](https://www.multiversecomputing.com)** · [Report an issue](https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602/discussions) · [Discord](https://discord.gg/8mT9FveN)