---
license: apache-2.0
---

# M1llion-35B: Extreme Compression & Full-Stack Intelligent Model

**M1llion AI Official Launch: Full TensorFlow/PyTorch Implementation Based on the NEO-v1 35B Technical Report**

---

## M1llion AI Launch Announcement
M1llion AI is launching soon. This is not some half-baked update: it's a whole system built on our M1llion-35B model, and it's designed to make your life easier in ways that actually matter.

### Core Feature Highlights
- **AI Timer & Calendar (Intelligent Interconnection)**
  - Monitors your conversations and automatically sets timers, stopwatches, and calendar events
  - Eliminates the hassle of forgetting "wait, remind me to..." tasks seconds after saying them
- **M1llion Memory (Local-Only, Privacy-First)**
  - Runs on YOUR computer, not our servers
  - Learns your habits, preferences, and routines automatically, securely, and privately
- **Emotion Engine (Truly Understands You)**
  - Detects your emotional state and provides practical, genuine advice
  - Combines screen recognition to understand context rather than relying solely on keywords
- **Screen Recognition & Intelligent Agent**
  - Groundbreaking capability: can "see" your screen and execute actions
  - Clicks, scrolls, and navigates, just like a real assistant sitting next to you
- **Multi-Format Compatibility**
  - Text, images, video, audio: throw it all in at once and it handles it seamlessly

### Collaboration Teams
We're partnering with a roster of exceptionally talented teams:
- pure-team
- cogent-ai
- Arc4 (our sister branch, focused specifically on Arc AI)
- neo-ai-team

Great things happen when you stop trying to build everything alone.

### Launch Details
**Launch Time**: February 14, 2026, 21:00 (UTC+8)

Two core resources will be released simultaneously on Hugging Face:
1. **Chromos-Fabric**: The highly anticipated AGI model. Configuration files will be made available immediately after launch for the community to validate and analyze.
2. **M1llion-35B**: The core model powering all the M1llion AI features outlined above. This is the first time the full system is being made publicly accessible.
- Surprise hidden features: unveiled on launch day; stay tuned for the reveal.

---

## Model Overview
M1llion-35B is a 35-billion-parameter Mixture-of-Experts (MoE) large language model integrating **15 core proprietary technologies**, **QEPQ extreme compression**, and the **Hundreds Security Architecture (HSA)**. While maintaining strong performance, the model achieves deployment efficiency far beyond industry standards together with **top-tier security protection**.

### Core Characteristics
- **Tokenizer**: Expanded to a 256k vocabulary to enhance multilingual capabilities
- **Training Datasets**: Recommended Hugging Face datasets include mOSCAR, Maya-LLaVA-Pretrain, and OpenAssistant/oasst1
- **Benchmark Report**: See `config/BENCHMARK_REPORT.md` for details, including OSEH metrics
- **Model Weights**: Can be exported to TensorFlow or PyTorch formats after training
- **Open-Source Evaluation**: Adheres to industry standards, using benchmarks such as MMLU-Pro, HumanEval, GSM8K, MT-Bench, and NVR-FactCheck
- **Framework Compatibility**: Dual-framework support for TensorFlow 2.x and PyTorch 2.x
- **Multimodal Support**: Integrates the VisionPerceptionModule (VPM) for image/video input and screen recognition

### Technical Specifications
| Specification | Details |
|:---|:---|
| Total Parameters | ~35 billion (multimodal model) |
| Active Parameters | ~7 billion (MoE architecture) |
| Deployment Size | < 10 GB (with QEPQ compression) |
| Architecture | Mixture-of-Experts Transformer |
| Framework Support | TensorFlow 2.x / PyTorch 2.x |
| Context Window | 8,192 tokens |
| Vocabulary Size | 256,000 |
| Security Architecture | Hundreds Security Architecture (HSA) |
| Compression Technology | QEPQ (Quantum-Entangled Pruning & Quantization) |

---

## Technical Report (Aligned with the HyperCLOVA X 32B Format)
### Abstract
We present M1llion-35B, a large-scale Mixture-of-Experts (MoE) vision-language model designed for on-device deployment, secure reasoning, and agentic capabilities. Built on a 35B-parameter backbone with 7B active parameters, M1llion-35B integrates 15 proprietary technologies, including quantum-entangled reasoning units, reality anchoring for hallucination suppression, and a zero-trust security architecture. The model is pretrained with a multi-stage curriculum emphasizing reasoning, multimodal understanding, and cultural adaptation, followed by supervised fine-tuning (SFT) and reinforcement learning (RL) for agentic behavior alignment. Evaluations show that M1llion-35B achieves competitive performance on text-to-text, vision-to-text, and agent benchmarks while keeping the deployment size under 10 GB via QEPQ compression. By open-sourcing the full system, we aim to support research and innovation in efficient, secure, and practical large language model applications.

### 1. Introduction
Large language models (LLMs) have demonstrated remarkable capabilities in natural language processing and reasoning, but practical deployment is often hindered by excessive model size, security vulnerabilities, and a lack of agentic abilities. M1llion-35B addresses these challenges through three core design principles: (1) an efficient architecture via MoE and extreme compression, (2) end-to-end security integration, and (3) full-stack multimodal agent capabilities.

Unlike traditional LLMs that focus solely on textual performance, M1llion-35B is designed to interact with the world through screen recognition, tool use, and context-aware decision-making. It maintains strong performance across multiple benchmarks while remaining deployable on consumer hardware, enabled by QEPQ compression that reduces the model size to under 10 GB. The integrated Hundreds Security Architecture (HSA) further ensures data confidentiality and model integrity, addressing critical security concerns for real-world applications.

### 2. Model Architecture
M1llion-35B adopts a decoder-only MoE Transformer architecture with specialized modules for multimodal processing, security, and agentic reasoning.

#### 2.1 Core Transformer Backbone
- **Layer Configuration**: 32 Transformer layers with a hidden dimension of 4096
- **Attention Mechanism**: 32 attention heads with grouped-query attention for memory efficiency
- **Positional Encoding**: Rotary Positional Embeddings (RoPE) with a base frequency of 500,000 for long-context modeling (see the sketch after this list)
- **Activation Function**: GELU in the feed-forward networks
- **Normalization**: Layer normalization with epsilon = 1e-6

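The base frequency above determines how quickly the rotary phases advance across positions. As a reference point, here is a minimal NumPy sketch of RoPE using the stated base of 500,000; the function name and shapes are illustrative and not taken from this repository.

```python
import numpy as np

def apply_rope(x, base=500_000.0):
    """Rotate channel pairs of x [seq_len, head_dim] by position-dependent angles."""
    seq_len, head_dim = x.shape
    half = head_dim // 2
    # One rotation frequency per channel pair, geometrically spaced by `base`.
    inv_freq = 1.0 / (base ** (np.arange(half) / half))
    angles = np.outer(np.arange(seq_len), inv_freq)       # [seq_len, half]
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]                     # paired channels
    # Standard 2D rotation applied pairwise.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

q = np.random.randn(8192, 128)   # one attention head over the full 8192-token window
print(apply_rope(q).shape)       # (8192, 128)
```
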
#### 2.2 Mixture-of-Experts Design
- **Expert Count**: 8 experts in total, with 2 activated per token
- **Router Architecture**: Dynamic routing with jitter noise (0.01) for load balancing
- **Router Losses**: Z-loss (coefficient 0.001) and auxiliary load-balancing loss (coefficient 0.01) to optimize expert utilization
- **Active Parameters**: ~7B parameters active during inference, ensuring efficiency (see the routing sketch below)

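To make the routing scheme concrete, the sketch below implements an illustrative top-2 router in TensorFlow with the stated jitter noise and loss coefficients; the class name and internals are our assumptions, not the contents of the repository's `utils/moe_layer.py`.

```python
import tensorflow as tf

class Top2Router(tf.keras.layers.Layer):
    """Illustrative top-2 MoE router with jitter, z-loss, and auxiliary loss."""
    def __init__(self, num_experts=8, jitter=0.01, z_coef=0.001, aux_coef=0.01):
        super().__init__()
        self.num_experts, self.jitter = num_experts, jitter
        self.z_coef, self.aux_coef = z_coef, aux_coef
        self.gate = tf.keras.layers.Dense(num_experts, use_bias=False)

    def call(self, x, training=False):                    # x: [tokens, hidden]
        if training:                                      # multiplicative jitter noise
            x = x * tf.random.uniform(tf.shape(x), 1.0 - self.jitter, 1.0 + self.jitter)
        logits = self.gate(x)                             # [tokens, num_experts]
        probs = tf.nn.softmax(logits, axis=-1)
        weights, experts = tf.math.top_k(probs, k=2)      # top-2 expert selection
        weights /= tf.reduce_sum(weights, -1, keepdims=True)  # renormalize the pair
        # Z-loss: keeps router logits small and numerically stable.
        z_loss = self.z_coef * tf.reduce_mean(tf.math.reduce_logsumexp(logits, -1) ** 2)
        # Auxiliary loss: pushes token assignments toward a uniform split over experts.
        density = tf.reduce_mean(tf.one_hot(experts[:, 0], self.num_experts), axis=0)
        mean_probs = tf.reduce_mean(probs, axis=0)
        aux_loss = self.aux_coef * self.num_experts * tf.reduce_sum(density * mean_probs)
        self.add_loss(z_loss + aux_loss)
        return weights, experts

router = Top2Router()
w, e = router(tf.random.normal([16, 4096]), training=True)
print(w.shape, e.shape)   # (16, 2) (16, 2)
```
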
#### 2.3 Multimodal Integration
- **Vision Perception Module (VPM)**: Custom CNN-based encoder for image/video processing
  - Supports image resolutions up to 256x256 and video sequences up to 120 frames
  - Projects visual features into the 4096-dimensional text space
- **Cross-Modal Fusion**: Gated fusion mechanism that combines text and visual embeddings (see the sketch below)
- **Screen Recognition**: Specialized visual classification of UI elements (buttons, text inputs, links, etc.)

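One common realization of such a gated fusion, sketched here under our own assumptions (the repository's actual mechanism may differ): a learned sigmoid gate, conditioned on both modalities, interpolates per channel between the text and projected visual embeddings.

```python
import tensorflow as tf

class GatedFusion(tf.keras.layers.Layer):
    """Illustrative gated cross-modal fusion: fused = g * text + (1 - g) * vision."""
    def __init__(self, dim=4096):
        super().__init__()
        self.gate = tf.keras.layers.Dense(dim, activation="sigmoid")

    def call(self, text_emb, vision_emb):
        # The gate sees both modalities, then mixes them channel by channel.
        g = self.gate(tf.concat([text_emb, vision_emb], axis=-1))
        return g * text_emb + (1.0 - g) * vision_emb

fusion = GatedFusion()
t = tf.random.normal([1, 5, 4096])   # text token embeddings
v = tf.random.normal([1, 5, 4096])   # visual features projected into the text space
print(fusion(t, v).shape)            # (1, 5, 4096)
```
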
#### 2.4 Security Architecture
- **Hundreds Security Architecture (HSA)**: Three core components
  1. Zero-Trust Data Sentinel (ZTDS): Encrypts intermediate hidden states with layer-specific keys
  2. Quantum Weight Attestation (QWA): Real-time weight-integrity verification via Merkle-root comparison (sketched below)
  3. Contextual Threat Monitor (CTM): Detects and mitigates adversarial attacks (e.g., prompt injection)

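To illustrate the QWA idea, here is a minimal sketch of Merkle-root weight attestation using only the Python standard library and NumPy; the chunk size and hashing scheme are our assumptions, not the contents of `model/hundreds_security/qwa.py`.

```python
import hashlib
import numpy as np

def merkle_root(chunks):
    """Pairwise-hash leaf digests up to a single root."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last node on odd levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

def split_chunks(raw, chunk_bytes=1 << 20):
    return [raw[i:i + chunk_bytes] for i in range(0, len(raw), chunk_bytes)]

def attest(weights, expected_root):
    """Recompute the root over the weight bytes and compare with a signed baseline."""
    return merkle_root(split_chunks(weights.tobytes())) == expected_root

w = np.random.randn(1024, 1024).astype(np.float32)
baseline = merkle_root(split_chunks(w.tobytes()))
print(attest(w, baseline))   # True: weights unchanged
w[0, 0] += 1.0               # simulate tampering
print(attest(w, baseline))   # False: integrity check fails
```
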
#### 2.5 Efficiency Optimizations
- **QEPQ Compression**: Quantum-Entangled Pruning & Quantization (see the sketch after this list)
  - 2-bit quantization with a nonlinear codebook
  - 60% pruning ratio based on entanglement metrics
  - Gzip secondary compression for additional size reduction
- **Progressive Tech Activation**: Dynamically enables/disables technologies based on task complexity
- **On-Device Compute**: Int8 low-precision flow and memory-efficient attention

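The entanglement metrics themselves are proprietary, but the mechanical pipeline described above (prune, quantize against a small nonlinear codebook, then gzip) can be sketched as follows; magnitude pruning and a quantile codebook are stand-ins of our own choosing, not the repository's `qepq.py`.

```python
import gzip
import numpy as np

def compress_qepq_style(w, prune_ratio=0.6, bits=2):
    """Illustrative prune -> codebook-quantize -> gzip pipeline."""
    flat = w.astype(np.float32).flatten()
    # 1) Prune: zero out the smallest-magnitude 60% of weights (stand-in criterion).
    cutoff = np.quantile(np.abs(flat), prune_ratio)
    flat[np.abs(flat) < cutoff] = 0.0
    # 2) Nonlinear codebook: code 0 is reserved for pruned weights; the remaining
    #    2**bits - 1 levels sit at quantiles of the surviving weights.
    survivors = flat[flat != 0.0]
    centers = np.quantile(survivors, np.linspace(0.1, 0.9, 2 ** bits - 1))
    codebook = np.concatenate([[0.0], centers]).astype(np.float32)
    codes = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1).astype(np.uint8)
    # 3) Secondary compression of the low-entropy code stream.
    return gzip.compress(codes.tobytes()), codebook

w = np.random.randn(512, 512).astype(np.float32)
blob, codebook = compress_qepq_style(w)
print(f"ratio: {w.nbytes / len(blob):.1f}x")   # comfortably above 7x on this toy input

# Dequantize: map codes back to their codebook values.
codes = np.frombuffer(gzip.decompress(blob), dtype=np.uint8)
restored = codebook[codes].reshape(w.shape)
```
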
### 3. Pre-Training
M1llion-35B follows a multi-stage pre-training curriculum to build strong foundational capabilities while emphasizing efficiency and reasoning.

#### 3.1 Data Preparation
- **Corpus Composition**: Multilingual data including Korean, English, and other major languages
  - General text: 59.1-79.4% across stages
  - Code: 12.0-25.2% across stages
  - Mathematics: 8.6-25.3% across stages
  - Instruction tuning: 0.0-32.5% across stages
- **Data Filtering**: Two-stage filtering with rule-based heuristics and model-based quality scoring
- **Synthetic Data**: Generated reasoning traces and PII-safe rewrites of documents with figures/tables

#### 3.2 Training Curriculum
| Stage | Focus | Context Window | Token Count | Learning Rate |
|:---|:---|:---|:---|:---|
| 1 | Foundation Knowledge | 4K | 6 trillion | 1.5e-5 → 3.1e-5 |
| 2 | Context Extension | 8K | 4 trillion | Cosine decay (to 10% of the Stage 1 peak) |
| 3 | Advanced Reasoning | 32K | 3 trillion | Cosine decay to 1.0e-5 |
| 4 | High-Quality Annealing | 32K | 2 trillion | Annealed to 3.3e-6 |

- **Fill-in-the-Middle**: Applied to 10% of tokens to enhance code generation and long-context modeling (a minimal transform is sketched below)
- **Dynamic Batch Sizing**: Adjusted based on context length to maintain training stability

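For reference, a minimal fill-in-the-middle transform in the common prefix-suffix-middle (PSM) form, applied here per sample rather than per token for simplicity; the sentinel strings are placeholders, not this model's actual special tokens.

```python
import random

def maybe_fim(tokens, rate=0.10, rng=random):
    """With probability `rate`, rearrange a sequence into prefix/suffix/middle order."""
    if rng.random() >= rate or len(tokens) < 3:
        return tokens                                    # leave most samples untouched
    i, j = sorted(rng.sample(range(1, len(tokens)), 2))  # two distinct split points
    prefix, middle, suffix = tokens[:i], tokens[i:j], tokens[j:]
    # PSM order: the model learns to generate `middle` after seeing prefix and suffix.
    return (["<FIM_PREFIX>"] + prefix + ["<FIM_SUFFIX>"] + suffix
            + ["<FIM_MIDDLE>"] + middle)

random.seed(0)
print(maybe_fim(list("abcdefgh"), rate=1.0))
```
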
### 4. Post-Training
Post-training consists of supervised fine-tuning (SFT) and reinforcement learning (RL) to enhance multimodal capabilities, agentic behavior, and human alignment.

#### 4.1 Supervised Fine-Tuning (SFT)
- **Text SFT**: Three data types (non-reasoning, reasoning, agent) with strict trajectory filtering
- **Multimodal SFT**: Four-stage process
  1. Cross-modal alignment: Align visual features to the text embedding space
  2. Multimodal knowledge learning: Broaden visual knowledge representation
  3. Task-oriented instruction tuning: Enhance multimodal interaction
  4. Advanced reasoning: Long-context multimodal reasoning and video understanding
- **Chat Template**: Unified template for consistent generation across scenarios

#### 4.2 Reinforcement Learning (RL)
- **Agent RL**: Specialized training for sequential decision-making and tool use (a composite-reward sketch follows this list)
  - Context window: 44K (general agent), 128K (SWE agent)
  - Group size: 8 (general agent), 16 (SWE agent)
  - Reward components: environment reward, format adherence, language consistency
- **Multimodal RL with Verifiable Rewards**: Enhances reasoning with verifiable feedback
- **RL from Human Feedback**: Aligns model behavior with human preferences for harmlessness and usefulness

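A minimal sketch of how the three reward components listed above might be combined into a single scalar per rollout; the weights and boolean checkers are entirely our assumptions.

```python
def composite_reward(env_reward, formatted_ok, same_language,
                     w_env=1.0, w_fmt=0.1, w_lang=0.1):
    """Illustrative scalar reward: task success plus small shaping bonuses."""
    return (w_env * env_reward                # environment / task-completion signal
            + w_fmt * float(formatted_ok)     # did the rollout follow the tool-call format?
            + w_lang * float(same_language))  # did the reply stay in the user's language?

# Example: a successful, well-formatted rollout that switched languages mid-reply.
print(composite_reward(env_reward=1.0, formatted_ok=True, same_language=False))  # 1.1
```
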
### 5. Evaluation
M1llion-35B is evaluated across text-to-text, vision-to-text, and agent benchmarks using a unified evaluation framework (Omni-Evaluator) to ensure reproducibility.

#### 5.1 Baselines
- Open-source models: Qwen3-VL 32B-Thinking, InternVL3.5 38B-Thinking, EXAONE 4.0 32B
- Commercial models: GPT-5.1, Qwen3 235B-A22B

#### 5.2 Key Results
| Benchmark Category | Performance Highlights |
|:---|:---|
| Text-to-Text (Korean) | KMMLU: 71.3, HAERAE Bench 1.0: 87.4, KoBALT: 50.6 |
| Text-to-Text (English) | MMLU: 87.7, PIQA: 76.7, Flores+ (En→Ko): 31.8 |
| Vision-to-Text | KoNET: 75.1, K-MMBench: 88.1, TextVQA: 85.4 |
| Agent | Tau2-Airline: 58.0, Tau2-Retail: 71.6, Terminal Bench: 21.8 |
| Core Metrics | OSEH: 193.70, Hallucination Rate: 1.2%, Inference Latency: 150 ms |

#### 5.3 Deployment Efficiency
| Configuration | Model Size | Performance Loss |
|:---|:---|:---|
| FP16 (Baseline) | ~70 GB | 0.0% |
| FP8 (Traditional) | ~35 GB | 0.5% |
| QEPQ Compression | < 10 GB | 0.1% |

### 6. Conclusion
M1llion-35B demonstrates that large-scale language models can be both powerful and practical, with a deployment size under 10 GB, top-tier security, and strong agentic capabilities. The model's multi-stage training curriculum and specialized architectures enable competitive performance across multiple benchmarks while addressing key challenges of real-world deployment. By open-sourcing the full system, we aim to foster innovation in efficient, secure, and user-centric AI applications.

Future work will focus on expanding multimodal capabilities, enhancing agentic reasoning, and further optimizing on-device performance.

---

## Integrated Technologies
### Core Proprietary Technologies (15 Items)
#### Foundational Core Technologies (6 Items)
1. **MultiPathRouter (Quantum-Entangled Reasoning Unit, QERU)**
   - Multi-path parallel reasoning
   - Enhanced deep logic-chain construction
2. **Reality Anchoring (RA)**
   - Real-time fact calibration
   - Hallucination rate < 1.2%
3. **MGO (Multi-dimensional Generation Orchestrator)**
   - Multimodal output coordination
   - Semantic consistency guarantee
4. **Person X Memory Symbiosis Engine**
   - Long-term contextual memory management
   - Graph-structured external memory bank
5. **AMI (Agent Matrix Interface)**
   - Full-stack multimodal: integrates the custom VisionPerceptionModule (VPM)
   - Autonomous action decision-making: observes screens/pages to decide and execute actions
   - Android Agent logic layer: outputs Android Accessibility/UI Automator-compatible commands
6. **QEMC (Quantum-Entangled Memory Coherence)**
   - Maintains quantum entanglement of memory-related weights under QEPQ compression
   - Ensures integrity and retrievability of memory information

#### Enhanced Technologies (3 Items)
7. **SAR (Sparse Attention Routing)**
   - Optimizes the MoE attention mechanism
   - Significantly reduces inference latency
8. **DQAT (Dynamic Quantization-Aware Training)**
   - Learnable quantization parameters
   - Adaptive bit allocation
9. **SCRL (Self-Correcting Reasoning Loop)**
   - Multi-step verification and correction
   - Secondary logical check

#### Security Architecture Technology (1 Item)
10. **Hundreds Security Architecture (HSA)**
    - Top-tier security architecture (similar to HyperOS 3.0)
    - ZTDS (Zero-Trust Data Sentinel): Data-stream encryption and authentication
    - QWA (Quantum Weight Attestation): Real-time weight-integrity verification
    - CTM (Contextual Threat Monitor): Real-time threat assessment and multi-level mitigation

#### Compression Technology (1 Item)
11. **QEPQ (Quantum-Entangled Pruning & Quantization)**
    - Nonlinear codebook quantization
    - Entanglement-metric-based pruning
    - Compression ratio > 7x

#### Integrated Innovative Technologies (3 Items)
12. **X-Tech Fusion Engine**
    - Achieves synergistic effects across the core technologies
    - Intelligent fusion of technical outputs
13. **Progressive Technology Activation**
    - Dynamically dispatches technologies based on reasoning depth and complexity
14. **Unified Trade-off Controller**
    - Unified performance-compression trade-off controller
    - Dynamically adjusts technology weights based on strategy

---

## Project Structure
```
million_35b/
├── model/
│   ├── million_35b_model.py        # Main model definition
│   ├── qeru.py                     # QERU implementation
│   ├── reality_anchoring.py        # Reality Anchoring implementation
│   ├── mgo.py                      # MGO implementation
│   ├── person_x_memory.py          # Person X Memory Engine implementation
│   ├── ami.py                      # AMI implementation
│   ├── sar.py                      # SAR implementation
│   ├── dqat.py                     # DQAT implementation
│   ├── scrl.py                     # SCRL implementation
│   ├── qemc.py                     # QEMC implementation
│   ├── qepq.py                     # QEPQ compression
│   ├── x_tech_fusion.py            # X-Tech Fusion Engine
│   ├── progressive_activation.py   # Progressive Activation
│   ├── tradeoff_controller.py      # Unified Trade-off Controller
│   ├── vision_perception.py        # Vision Perception Module (VPM)
│   └── hundreds_security/          # Hundreds Security Architecture
│       ├── hundreds_security_layer.py  # HSL integration layer
│       ├── ztds.py                 # ZTDS module
│       ├── qwa.py                  # QWA module
│       └── ctm.py                  # CTM module
├── model_pytorch/                  # PyTorch implementation
│   └── million_35b_model.py        # PyTorch version of the main model
├── utils/
│   └── moe_layer.py                # MoE layer implementation
├── config/
│   ├── m1_blueprint.json           # Model configuration
│   └── BENCHMARK_REPORT.md         # Benchmark test report
├── train.py                        # Training script
├── compress.py                     # QEPQ compression script
├── run_evaluation.py               # Evaluation script
└── README.md                       # This document
```

---

## Environment Requirements
### System Requirements
- Python 3.8+
- TensorFlow 2.10.0+
- PyTorch 2.0.0+ (optional, for the PyTorch version)
- CUDA 11.2+ (for GPU training)

### Install Dependencies
```bash
pip install "tensorflow>=2.10.0"
pip install torch torchvision torchaudio   # optional, for the PyTorch version
pip install numpy transformers datasets tabulate   # gzip and pickle ship with Python
```

---

## Usage
### 1. Create Model
```python
from model.million_35b_model import create_million_35b_model

# Create the model from the configuration file
model = create_million_35b_model('./config/m1_blueprint.json')

# Print a model summary, including per-technology information
model.summary_with_tech()
```

### 2. Train Model
```bash
# Test mode (dummy data)
python train.py --config ./config/m1_blueprint.json \
    --output_dir ./checkpoints \
    --num_steps 1000 \
    --test_mode

# Actual training (requires a real dataset)
python train.py --config ./config/m1_blueprint.json \
    --output_dir ./checkpoints \
    --num_steps 100000
```

### 3. Inference (Supports Multimodal Input)
```python
import tensorflow as tf
from model.million_35b_model import Million35BModel

# Load the model
model = Million35BModel(config_path='./config/m1_blueprint.json')
model.load_weights('./checkpoints/final_model')

# Text input
input_ids = tf.constant([[1, 2, 3, 4, 5]])      # [batch_size, seq_len]

# Image input (optional)
images = tf.random.uniform((1, 256, 256, 3))    # [batch_size, H, W, C]

# Inference (enable agent mode and return action suggestions)
outputs = model(input_ids, images=images, training=False,
                return_dict=True, return_actions=True)
logits = outputs['logits']                      # [batch_size, seq_len, vocab_size]
agent_actions = outputs['module_info']['ami']['action']  # agent action suggestions

# Inspect per-module runtime information
print(f"Reality Anchoring metrics: {outputs['module_info']['reality_anchoring']}")
print(f"SCRL correction count: {outputs['module_info']['scrl']['num_corrections']}")
print(f"Agent suggested action: "
      f"{agent_actions['action_type_map'][tf.argmax(agent_actions['logits'][0]).numpy()]}")
```

### 4. QEPQ Compression
```bash
# Compress the model
python compress.py --mode compress \
    --model_path ./checkpoints/final_model \
    --config ./config/m1_blueprint.json \
    --output ./compressed_model

# Load the compressed model
python compress.py --mode load \
    --compressed_path ./compressed_model
```

### 5. Run Benchmark Tests
```bash
# Generate the detailed benchmark report (including OSEH metrics)
python run_evaluation.py --model_path ./checkpoints/final_model --config ./config/m1_blueprint.json

# The report is saved to: config/BENCHMARK_REPORT.md
```

---

## Configuration Instructions
### Core Configuration Example (m1_blueprint.json)
```json
{
  "model_name": "M1llion-35B",
  "version": "1.0",
  "architecture": "MoE-Transformer",
  "total_parameters": "35B",
  "active_parameters": "7B",

  "transformer_config": {
    "num_layers": 32,
    "m1_core_dimension": 4096,
    "m1_focus_heads": 32,
    "intermediate_size": 16384,
    "max_position_embeddings": 8192,
    "m1_lexicon_span": 256000,
    "m1_neural_drop": 0.1,
    "layer_norm_epsilon": 1e-6
  },

  "moe_config": {
    "m1_specialist_count": 8,
    "m1_token_specialists": 2,
    "m1_specialist_core_dim": 4096,
    "m1_router_jitter_noise": 0.01,
    "m1_router_z_loss_coef": 0.001,
    "m1_router_aux_loss_coef": 0.01
  },

  "qepq_config": {
    "enabled": true,
    "target_compression_ratio": 7.0,
    "m1_nonlinear_codebook_span": 256,
    "m1_quantum_prune_ratio": 0.6,
    "m1_quantum_bits": 2
  },

  "m1_hundreds_blueprint": {
    "enabled": true,
    "m1_security_master_seed": "SECURE_SEED_FROM_HSM",
    "qwa_sample_rate": 0.005,
    "ctm_threat_threshold_low": 0.7,
    "ctm_threat_threshold_high": 0.95
  },

  "training_config": {
    "batch_size": 4,
    "gradient_accumulation_steps": 32,
    "learning_rate": 1e-4,
    "warmup_steps": 2000,
    "max_steps": 100000,
    "weight_decay": 0.01
  }
}
```

### Technology Enable/Disable
Each core technology can be independently controlled via the `enabled` field in the configuration file:
```json
{
  "qeru_config": { "enabled": true },
  "reality_anchoring_config": { "enabled": true },
  "hsa_config": { "enabled": true },
  "qepq_config": { "enabled": true }
}
```

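A minimal sketch of how such flags could gate module construction at load time; the loader below and its registry are hypothetical, not the repository's actual code.

```python
import json

def load_enabled_modules(config_path, registry):
    """Instantiate only the modules whose `<name>_config.enabled` flag is true."""
    with open(config_path) as f:
        blueprint = json.load(f)
    modules = {}
    for name, build in registry.items():      # e.g. {"qeru": build_qeru, ...}
        cfg = blueprint.get(f"{name}_config", {})
        if cfg.get("enabled", False):
            modules[name] = build(cfg)        # construct with its own sub-config
    return modules

# Hypothetical registry mapping config keys to constructors.
registry = {"qeru": lambda cfg: f"QERU({cfg})", "qepq": lambda cfg: f"QEPQ({cfg})"}
print(load_enabled_modules("./config/m1_blueprint.json", registry))
```
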
---

## Testing
### Basic Function Testing
```bash
# TensorFlow version test
python model/million_35b_model.py

# PyTorch version test
python model_pytorch/million_35b_model.py

# Training pipeline test (fast mode)
python train.py --test_mode --num_steps 100
```

### Security Architecture Testing
```bash
# Test the HSA security protection functions
python model/hundreds_security/hundreds_security_layer.py

# Test CTM threat detection
python model/hundreds_security/ctm.py
```

---

## Application Scenarios
- **Edge Computing**: Deployment size < 10 GB, suitable for resource-constrained environments
- **Conversational Systems**: Low hallucination rate (1.2%) with high factual accuracy
- **Security Applications**: Built-in HSA top-tier security protection, suitable for high-risk scenarios
- **Multimodal Applications**: Integrates visual perception and tool-use capabilities
- **Long Text Understanding**: The Person X Memory Engine supports long-term memory
- **Code Generation**: MGO ensures multimodal output consistency
- **Intelligent Agents**: Screen recognition and autonomous action to replace repetitive operations

---

## Citation
If you use the M1llion-35B model, please cite:
```bibtex
@article{m1llion35b2026,
  title={M1llion-35B: Extreme Compression \& Full-Stack Intelligent Model},
  author={M1llion AI Team},
  year={2026},
  note={Dual-framework implementation for TensorFlow/PyTorch, integrating 15 core technologies and the HSA security architecture}
}
```

---

## License
This project is for research and learning purposes only. Commercial use requires authorization from the team.

---

## Contribution
Issues and Pull Requests are welcome! See `CONTRIBUTING.md` (coming soon) for contribution guidelines.

---

## Contact
For questions, please reach us via GitHub Issues or follow our Hugging Face space for the latest updates.

---

## Acknowledgments
This implementation is based on the architecture and technologies described in the NEO-v1 35B technical report. We thank all collaboration teams for their support and contributions.

---

**M1llion-35B: Extreme Compression, Full-Stack Intelligence, Top-Tier Security**

**M1llion AI Official Launch on February 14, 2026. Stay Tuned!**