---
base_model:
- openai/gpt-oss-120b
- MultiverseComputingCAI/HyperNova-60B
library_name: transformers
license: apache-2.0
---

<div align="center">

# HyperNova 60B 2602

### Powered by CompactifAI

[](https://opensource.org/licenses/Apache-2.0)
[](https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602)
[](https://discord.gg/8mT9FveN)

**Optimized for Efficient Inference** · **Reduced Memory Footprint** · **Native Tool Calling Support**

</div>

---

## Table of Contents

- [Model Overview](#model-overview)
- [Key Characteristics](#key-characteristics)
- [Quick Start](#quick-start)
- [What's New in HyperNova 60B 2602](#whats-new-in-hypernova-60b-2602)
- [Tool Calling](#tool-calling)
- [Training & Fine-Tuning](#training--fine-tuning)
- [Architecture](#architecture)
- [Evaluation & Benchmarks](#evaluation--benchmarks)
- [Languages](#languages)
- [Intended Use](#intended-use)
- [Safety & Limitations](#safety--limitations)
- [Model Information](#model-information)
- [Citation](#citation)

---

## Model Overview

**HyperNova 60B 2602** is a model developed by **Multiverse Computing**, based on [OpenAI's gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b). The original gpt-oss-120b is an open-weight Mixture-of-Experts model (117B parameters, 5.1B active) designed for powerful reasoning, agentic tasks, and versatile developer use. This version is compressed with **CompactifAI**, Multiverse Computing's proprietary technology, which reduces parameter count and memory requirements while aiming to preserve strong reasoning.

The model is **instruction-tuned** and supports **native tool calling** (function calling with defined schemas, structured outputs, and agent-style workflows). HyperNova 60B 2602 is intended for the same broad use cases as gpt-oss-120b (reasoning, code generation, RAG, and tool-augmented applications), with a **lower memory footprint** and greater deployment flexibility.

---

## Key Characteristics

| Characteristic | Description |
|----------------|-------------|
| Base model | [OpenAI gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (117B params, MoE; open-weight, Apache 2.0) |
| 🛠️ **Tool calling** | Native support; OpenAI-style function / tool calling schemas; agentic use (e.g. function calling, structured outputs) |
| 🧠 **Parameters** | 60B total parameters after CompactifAI compression (reduced vs. base 117B) |
| 📐 **Architecture** | Decoder-only Transformer (from the gpt-oss lineage) |
| 🗜️ **Compression** | CompactifAI (proprietary compression technology) |
| Primary language | English |
| Other languages | Not formally evaluated |

---

## Quick Start

This model can be loaded with the **Transformers** API. Pass `trust_remote_code=True` (required for the gpt-oss architecture). Recommended approach: `AutoModelForCausalLM` with `apply_chat_template`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MultiverseComputingCAI/HyperNova-60B-2602"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "What is a Hypernova?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

# apply_chat_template returns input_ids only, so build a matching attention mask.
attention_mask = torch.ones_like(inputs)

outputs = model.generate(
    inputs,
    attention_mask=attention_mask,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens, skipping the prompt.
reply = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(reply)
```
Alternatively, you can use the `pipeline` API with `trust_remote_code=True`. Note that the pipeline returns the full conversation structure, so extract the assistant message from `outputs[0]["generated_text"]` as needed.
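
Because chat-style pipelines return the whole conversation rather than just the reply, a small helper can pull out the assistant turn. This is a minimal sketch; the `mock_outputs` structure below mirrors the role/content message list that chat pipelines produce, and is illustrative rather than real model output:

```python
def extract_reply(outputs):
    """Pull the last assistant message out of a chat-pipeline result.

    Chat-style text-generation pipelines return the whole conversation in
    outputs[0]["generated_text"] as a list of role/content dicts.
    """
    conversation = outputs[0]["generated_text"]
    # The model's answer is the final assistant turn.
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return message["content"]
    raise ValueError("no assistant message in pipeline output")

# Example with the structure a chat pipeline produces (mocked content):
mock_outputs = [{
    "generated_text": [
        {"role": "user", "content": "What is a Hypernova?"},
        {"role": "assistant", "content": "A hypernova is an exceptionally energetic supernova."},
    ]
}]
print(extract_reply(mock_outputs))
```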

---

## What's New in HyperNova 60B 2602

**HyperNova 60B 2602** is developed from **gpt-oss-120b**, retaining the base model's strengths while reducing memory use and improving deployment flexibility.

### Summary

- **Developed from [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b):** same Apache 2.0 license and design goals (reasoning, agentic tasks, tool use), with a smaller footprint via CompactifAI.
- **Tool use:** retains support for function calling, structured outputs, and agent-style workflows (OpenAI-style schemas).
- **Reasoning:** compatible with configurable reasoning effort (e.g. low / medium / high in the system prompt) where the format is preserved; full chain-of-thought is available for debugging and analysis.
- **Evaluated** on tool-focused benchmarks (e.g. BFCL v4, Tau2-bench) and general benchmarks alongside other CompactifAI and gpt-oss variants.
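
Reasoning effort can be selected by prepending a system message, following the gpt-oss convention of stating `Reasoning: low|medium|high` in the system prompt. A minimal sketch (the helper name is our own; whether every effort level behaves identically after compression should be verified for your workload):

```python
def with_reasoning_effort(user_prompt: str, effort: str = "medium"):
    """Build a chat message list that requests a given reasoning effort,
    using the gpt-oss system-prompt convention ("Reasoning: <level>")."""
    assert effort in {"low", "medium", "high"}
    return [
        {"role": "system", "content": f"Reasoning: {effort}"},
        {"role": "user", "content": user_prompt},
    ]

messages = with_reasoning_effort("Prove that sqrt(2) is irrational.", "high")
# Pass `messages` to tokenizer.apply_chat_template(...) as in Quick Start.
```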

---

## Tool Calling

HyperNova 60B 2602 supports **native tool use** and is well suited for:

- **Function calling** with defined schemas
- **Structured outputs**
- **Agentic operations** (e.g. browser tasks, code execution where supported)

The model can detect when to invoke tools, emit structured JSON tool calls, and consume tool outputs to continue generation. Tool-calling behavior follows **OpenAI-style schemas**; compatibility refers to format and structure, and exact parity with the base or other models is not guaranteed.

### Example Tool Call

```json
{
  "name": "get_weather",
  "arguments": {
    "city": "Paris",
    "date": "2026-02-10"
  }
}
```
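
Since the model emits tool calls as JSON, it is worth validating a call against its schema before executing anything. A minimal sketch, assuming an OpenAI-style function schema for the illustrative `get_weather` tool above (the schema and helper are ours, not part of the model):

```python
import json

# Hypothetical OpenAI-style tool schema matching the example call above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a city on a date.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "date": {"type": "string", "description": "YYYY-MM-DD"},
            },
            "required": ["city", "date"],
        },
    },
}

def validate_tool_call(raw_call: str, tool: dict) -> dict:
    """Parse a model-emitted tool call and check the required arguments
    against the schema before dispatching to the real function."""
    call = json.loads(raw_call)
    fn = tool["function"]
    if call["name"] != fn["name"]:
        raise ValueError(f"unknown tool: {call['name']}")
    missing = set(fn["parameters"]["required"]) - set(call["arguments"])
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    return call["arguments"]

raw = '{"name": "get_weather", "arguments": {"city": "Paris", "date": "2026-02-10"}}'
print(validate_tool_call(raw, weather_tool))
```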

---

## Training & Fine-Tuning

### Base Model: gpt-oss-120b

The base model [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) was trained with OpenAI's **harmony response format** and should be used with that format for correct behavior. It supports configurable reasoning levels (low / medium / high) and native tool use. See the [original model card](https://huggingface.co/openai/gpt-oss-120b) and [arXiv:2508.10925](https://arxiv.org/abs/2508.10925) for details.

### CompactifAI Compression & Optional Fine-Tuning

- **Compression:** CompactifAI was applied to produce a smaller, more efficient model (60B parameters) while aiming to preserve reasoning and tool-use capabilities.
- **Optional fine-tuning:** this variant may include additional fine-tuning for tool calling and structured outputs; exact training details are model-specific.

---

## Architecture

### Model Specifications

| Specification | Value |
|------------------|-------|
| Base model | [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) (117B params, 5.1B active, MoE) |
| Total parameters | 60B (4.8B active, MoE) |

---

## Evaluation & Benchmarks

### Evaluation Methodology

Benchmark scores were obtained with the following setups; methodology varies by benchmark family.

#### MMLU-Pro, AIME25, GPQA:d, LiveCodeBench

- **Evaluation framework**: [Lighteval](https://github.com/huggingface/lighteval)
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: medium
- **Decoding**: temperature = 0.6, max_tokens = 131072, top_p = 1.0, top_k = 0
- **Batch size**: 64

#### IFBench, AA-LCR, SciCode

- **Evaluation framework**: [NeMo-Skills](https://github.com/NVIDIA/NeMo-Skills)
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: medium
- **Decoding**: temperature = 1.0, max_tokens = 131072, top_p = 1.0, top_k = 0
- **Batch size**: 64

#### BFCL v4 (17 splits)

- **Evaluation framework**: [EvalScope](https://github.com/EvalScope/EvalScope) 1.4.1
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: high
- **Decoding**: temperature = 0.6, max_tokens = 16384, parallel_tool_calls = true, tool-call parser: openai

#### Tau2-bench (Telecom)

- **Evaluation framework**: [EvalScope](https://github.com/EvalScope/EvalScope) 1.4.1
- **Inference library**: vLLM 0.14.0
- **Reasoning effort**: high (agent `extra_body.reasoning_effort`)
- **Decoding (agent)**: temperature = 1.0, top_p = 1.0, min_tokens = 1
- **Decoding (judge / user simulator)**: temperature = 0.7, timeout = 600
- **Reproducibility**: subset: telecom (default); max steps: 100; repeats: 3; tool-call parser: openai (agent), hermes (judge)

#### Terminal-Bench Hard (Artificial Analysis subset)

- **Evaluation framework**: laude-institute/harbor 0.1.43
- **Inference library**: vLLM 0.15.0
- **Reasoning effort**: high
- **Decoding**: temperature = 1.0, top_p = 1.0, max-model-len = 131072
- **Reproducibility**: subset from [Artificial Analysis](https://artificialanalysis.ai/methodology/intelligence-benchmarking#terminal-bench-hard)
- **Agent**: terminus-2; max episodes: 100; repeats: 3

### Quantitative Results

Scores are accuracy or benchmark-specific metrics, obtained with the methodology described above.

| Benchmark | gpt-oss-20b | gpt-oss-120b | HyperNova 60B 2602 |
|-----------------------|-------------|--------------|--------------------|
| MMLU-Pro | 74 | 78 | 74 |
| BFCL v4 | 61 | 64 | 62 |
| Tau2-bench (Telecom) | 59 | 68 | 61 |
| AIME25 | 72 | 80 | 76 |
| GPQA:d | 63 | 69 | 69 |
| IFBench | 55 | 63 | 60 |
| SciCode | 34 | 38 | 32 |
| LiveCodeBench | 64 | 66 | 64 |
| Terminal Bench | 9 | 22 | 16 |
| AA-LCR | 37 | 50 | 36 |
| AA-Omnis. Index | -40 | -36 | -41 |
| AA-Omnis. Accuracy | 16 | 21 | 15 |

### Quantitative Results (Inference Performance)

Representative throughput and memory under the evaluation setup above, comparing against **gpt-oss-20b** and **gpt-oss-120b** on the same hardware.

#### Performance evaluation conditions

- **Inference library**: vLLM 0.14.0
- **Hardware**: 4× NVIDIA H200 Tensor Core GPU
- **Conditions**: batch size = 512, context length = 512, decode length = 256
- **Notes**: dtype = default

| Metric | gpt-oss-20b | gpt-oss-120b | HyperNova 60B 2602 | Hardware |
|----------------------------|-----|-----|-----|--------------------------------|
| Tokens / second (decode) | 250 | 228 | 240 | 4× NVIDIA H200 Tensor Core GPU |
| Time to first token (ms) | 26 | 26 | 25 | 4× NVIDIA H200 Tensor Core GPU |
| Peak GPU memory (GB) | 13 | 61 | 32 | 4× NVIDIA H200 Tensor Core GPU |

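
As a quick sanity check, the relative savings can be computed directly from the parameter counts and the peak-memory column of the table above:

```python
# Back-of-the-envelope comparison using the figures reported in this card:
# 117B -> 60B parameters, and 61 GB -> 32 GB peak GPU memory vs gpt-oss-120b.
base_params, compressed_params = 117e9, 60e9
base_mem_gb, compressed_mem_gb = 61, 32

param_reduction = 1 - compressed_params / base_params
mem_reduction = 1 - compressed_mem_gb / base_mem_gb

print(f"parameters: {param_reduction:.0%} smaller")      # parameters: 49% smaller
print(f"peak GPU memory: {mem_reduction:.0%} smaller")   # peak GPU memory: 48% smaller
```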

---

## Languages

- **Primary language**: English
- **Other languages**: Not formally evaluated

The model was trained primarily on English-language data. Performance on other languages may vary and has not been systematically measured.

---

## Intended Use

### Recommended Use Cases

Aligned with [gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) use cases, with the benefit of a smaller footprint:

- **Reasoning and analysis** (with configurable reasoning effort where supported)
- **Tool-augmented and agentic applications** (function calling, web browsing, code execution, structured outputs)
- **Code generation and reasoning**
- **Chatbots and virtual assistants**
- **Retrieval-augmented generation (RAG)**
- **Deployments** where gpt-oss-120b is desirable but memory or latency is constrained

### Out-of-Scope Uses

- Harmful, illegal, or deceptive content generation
- Impersonation of real individuals without consent
- High-risk decision-making without human oversight
- Surveillance or tracking of individuals
- Any use that violates applicable laws or regulations

---

## Safety & Limitations

### Known Limitations

- **English-centric** training data (inherited from the base model).
- **Format:** for best results, use the same [harmony response format](https://huggingface.co/openai/gpt-oss-120b) as gpt-oss-120b where applicable; behavior may differ otherwise.
- **Tool calling** depends on correct schema and tool design; exact parity with gpt-oss-120b or other models is not guaranteed.
- **Compression** may affect some behaviors; evaluate the model for your use case.

### Recommendations

- Validate tool outputs before execution
- Use human oversight for critical applications
- Perform task-specific evaluation prior to deployment

---

## Model Information

| Field | Value |
|--------------|-------|
| Model name | HyperNova 60B 2602 |
| Based on | [openai/gpt-oss-120b](https://huggingface.co/openai/gpt-oss-120b) |
| Version | 2602 |
| Release date | 2026-02-26 |
| Developed by | Multiverse Computing |
| License | Apache 2.0 |
| Contact | business@multiversecomputing.com |

---

## Citation

If you use this model, please cite the base model and this variant:

```bibtex
@misc{openai2025gptoss120b,
  title         = {gpt-oss-120b \& gpt-oss-20b Model Card},
  author        = {OpenAI},
  year          = {2025},
  eprint        = {2508.10925},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url           = {https://arxiv.org/abs/2508.10925}
}

@misc{hypernova60b2602,
  title  = {HyperNova 60B 2602: Model developed based on gpt-oss-120b},
  author = {Multiverse Computing},
  year   = {2026},
  url    = {https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602},
  note   = {Model developed based on openai/gpt-oss-120b using CompactifAI technology}
}
```

**Built by [Multiverse Computing](https://www.multiversecomputing.com)** · [Report an issue](https://huggingface.co/MultiverseComputingCAI/HyperNova-60B-2602/discussions) · [Discord](https://discord.gg/8mT9FveN)