---
license: apache-2.0
pipeline_tag: text-generation
---

# 🧠 OpenModel-1T-A50B-Instruct

**Repository:** `thenexthub/OpenModel-1T-A50B-Instruct`
**Organization:** NeXTHub
**Model Type:** Mixture-of-Experts (MoE) Large Language Model
**Parameters:** 1 Trillion total | 50 Billion active per forward pass
**Context Length:** 128K tokens
**Architecture:** Evo-CoT MoE Transformer (Evolutionary Chain-of-Thought)
**Training Tokens:** 20+ Trillion reasoning-dense, high-quality tokens

---

## 🔍 Overview

**OpenModel-1T-A50B-Instruct** represents a major leap in NeXTHub’s pursuit of scalable, efficient, and deeply reasoning general-purpose AI.
The model pairs trillion-parameter scale with a **Mixture-of-Experts (MoE)** design in which only **50 billion parameters** are activated and dynamically routed per token, balancing raw capacity with compute efficiency.

At its core, OpenModel-1T applies an **Evolutionary Chain-of-Thought (Evo-CoT)** process across the mid-training and post-training phases, allowing reasoning patterns to “evolve” across checkpoints rather than merely optimize static objectives. This is intended to support emergent meta-reasoning, recursive planning, and adaptive self-correction, raising the bar for interpretability and coherence.

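### 🚀 Quick Start (illustrative)

The sketch below shows one way to query the checkpoint. It is a minimal example that *assumes* the repository exposes the standard Hugging Face `transformers` text-generation interface with a chat template; the actual loading requirements (remote code, quantization, multi-GPU or multi-node sharding for a model of this size, and the prompt format) are not specified in this card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "thenexthub/OpenModel-1T-A50B-Instruct"

# Assumption: standard AutoModel loading works for this repo. A 1T-parameter MoE
# checkpoint will in practice require multi-GPU/multi-node sharding;
# device_map="auto" and trust_remote_code=True are placeholders, not confirmed flags.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

# Assumption: the tokenizer ships a chat template for the Instruct variant.
messages = [{"role": "user", "content": "Prove that the sum of two even integers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
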
---

## ⚙️ Key Features

* 🧩 **1T Total / 50B Active MoE Design:** Trillion-parameter scale with sparse activation for exceptional throughput efficiency (a schematic routing example follows this list).
* 🧠 **Evo-CoT Training:** Evolutionary chain-of-thought reinforcement, in which the model learns to reason *about* its own reasoning.
* 📚 **20T+ Token Corpus:** Pre-trained on a curated, reasoning-dense dataset spanning code, math, science, multilingual text, and human reasoning.
* ⏱️ **128K Context Window:** Long-context comprehension across entire projects, books, or datasets.
* 🧮 **Reasoning-Optimized Objective:** A curriculum that emphasizes precision in long-form logic and mathematical reasoning.
* 🧩 **Cross-Domain Instruction Tuning:** Fine-tuned for professional reasoning, code synthesis, mathematics, and complex dialogue.

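To make the sparse-activation idea concrete: with 50B of 1T parameters active, only about 5% of the network's weights participate in each forward pass. The snippet below is a *generic* top-k MoE routing sketch for illustration only; the expert count, `k`, and router design used in OpenModel-1T are not disclosed in this card and are hypothetical here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Generic top-k mixture-of-experts layer (illustrative only; the expert
    count and k below are placeholders, not OpenModel-1T's actual values)."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int = 64, k: int = 4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Score every expert, keep only the top-k per token.
        logits = self.router(x)                              # (tokens, n_experts)
        weights, idx = torch.topk(logits, self.k, dim=-1)    # (tokens, k)
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    # Only the selected tokens pay the cost of running this expert.
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

Because only `k` of the `n_experts` feed-forward blocks run for any given token, a trillion-parameter MoE can keep per-token compute closer to that of a dense ~50B model.
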
---

## 📊 Evaluation

OpenModel-1T-A50B-Instruct was evaluated against both **open-source** and **closed-source** state-of-the-art models, including:

* **DeepSeek-V3.1-Terminus**
* **Kimi-K2-Instruct-0905**
* **GPT-5-main (API)**
* **Gemini-2.5-Pro (API)**

### 🧩 Benchmark Results

| Domain | Benchmark | OpenModel-1T-A50B-Instruct | SOTA Comparison |
| :--- | :--- | :--- | :--- |
| **Mathematics (Competition-Level)** | AIME-25 | **Extends the Pareto frontier** of reasoning length vs. accuracy | ✓ Superior |
| **Professional Math** | MATH-500 | **+6.2%** over DeepSeek-V3.1 | ✓ Superior |
| **Logical Reasoning** | ARC-C / GPQA | **State-of-the-art coherence** with a low hallucination rate | ✓ Superior |
| **Code Generation** | HumanEval+ / MBPP+ | **~8% higher pass@1** than Kimi-K2-Instruct | ✓ Superior |
| **General Dialogue** | MT-Bench | Comparable to GPT-5-main; improved factual grounding | ✓ On Par / Better in Logic Depth |

---

## 🧬 Design Philosophy

OpenModel-1T was built not just to scale intelligence, but to **evolve it**.
The Evo-CoT process simulates intellectual growth, allowing reasoning pathways to mutate, recombine, and self-select under performance feedback, akin to neural evolution.
This architecture fuses **cognitive diversity** with **efficiency**, enabling the model to “think deeper, not longer.”

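This card does not describe how Evo-CoT is implemented. The sketch below is only a schematic generate / score / select loop matching the description above; the candidate generation, scoring, population size, and survivor count are hypothetical placeholders, not NeXTHub's training code.

```python
from typing import Callable, List

def evo_cot_round(
    prompts: List[str],
    generate: Callable[[str], str],      # hypothetical: samples one reasoning trace per call
    score: Callable[[str, str], float],  # hypothetical: rewards correct, coherent reasoning
    population: int = 8,
    survivors: int = 2,
) -> List[str]:
    """One schematic 'evolutionary' selection round over chain-of-thought traces.

    For each prompt, sample a population of reasoning traces, keep the
    highest-scoring ones, and return them as targets for the next checkpoint.
    This illustrates the idea described above, not the actual Evo-CoT procedure.
    """
    selected: List[str] = []
    for prompt in prompts:
        traces = [generate(prompt) for _ in range(population)]
        traces.sort(key=lambda trace: score(prompt, trace), reverse=True)
        selected.extend(traces[:survivors])
    return selected
```

Repeating such rounds across checkpoints is what would let reasoning patterns "evolve" under feedback rather than being optimized against a single static objective.
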
---

## 💡 Applications

* Autonomous code generation and debugging
* AI-assisted scientific research
* Complex data analytics and mathematical modeling
* Multi-agent collaboration and orchestration
* Educational tutoring and theorem proving

---

## 🛡️ Responsible AI

OpenModel-1T was trained with strict filtering of unsafe, biased, or synthetic low-fidelity data.
Safety layers include prompt-level moderation, reasoning self-checks, and toxicity filters.
The model is designed not to produce or endorse harmful, biased, or illegal content.

---

## 📦 Technical Specs

| Specification | Detail |
| :--- | :--- |
| **Total Parameters** | 1 Trillion |
| **Active Parameters** | 50 Billion |
| **Architecture** | Transformer-MoE with Evo-CoT |
| **Training Tokens** | 20 Trillion+ |
| **Context Length** | 128K tokens |
| **Precision** | FP8 / BF16 hybrid |
| **License** | Apache-2.0 with AI-Responsible Use Addendum |

---

## 🧭 Citation

If you use OpenModel-1T in your research or products, please cite:

```bibtex
@misc{thenexthub2025openmodel1t,
  title        = {OpenModel-1T-A50B-Instruct: Open Source, Trillion-Scale MoE Model with Evolutionary Chain-of-Thought},
  author       = {NeXTHub},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/thenexthub/OpenModel-1T-A50B-Instruct}}
}
```