---
license: cc-by-nc-nd-4.0
short_description: Upcoming Flagship LLM series
---
# Matrix Lattice: Full Architecture Specification

**Agentic + Multimodal Frontier MoE Family | Matrix.Corp**

---

## Overview

Matrix Lattice is Matrix.Corp's flagship frontier model family. It is designed from the ground up for deployment by inference providers (Novita, Hyperbolic, Together, Fireworks, etc.) and is accessed via an OpenAI-compatible API. The family is agentic-first and natively multimodal, supports 1M+ token context, and uses an MoE architecture that keeps active parameters far below the total count.

| Model | Total Params | Active Params | Experts | Context | Target Hardware |
|---|---|---|---|---|---|
| Lattice-120B | 120B | ~22B active | 64 experts, top-4 | 1M tokens | 4× H100 / 8× p300a |
| Lattice-430B | 430B | ~38B active | 128 experts, top-4 | 1M tokens | 16× H100 / 28× p300a |
| Lattice-671B | 671B | ~47B active | 256 experts, top-4 | 1M tokens | 32× H100 / 48× p300a |

---

## Base Lineage

Mixed distillation approach:

- **DeepSeek-V3 / R1**: MLA attention, MoE routing strategy, math/reasoning capability
- **Llama 4 Scout/Maverick**: multimodal vision encoder architecture, instruction following, long-context iRoPE scaling
- **Custom Matrix.Corp additions**: 17 novel modules, lattice routing, agentic infrastructure

---
## Core Public Architectures Used

### 1. Multi-Head Latent Attention (MLA) – DeepSeek-V3

Compresses the KV cache via low-rank projection. At 1M context a standard KV cache is infeasible; MLA makes it viable, reducing KV cache size by ~90% vs standard MHA.
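As a back-of-envelope illustration of why this matters at 1M context, the sketch below compares per-token cache cost for standard MHA against a single compressed latent. The head counts and latent size are hypothetical, not Lattice's actual configuration:

```python
def mha_cache_bytes(seq_len, n_layers, n_heads, head_dim, bytes_per_elem=2):
    # Standard MHA stores full K and V per head, per layer, per token.
    return seq_len * n_layers * 2 * n_heads * head_dim * bytes_per_elem

def mla_cache_bytes(seq_len, n_layers, latent_dim, bytes_per_elem=2):
    # MLA stores one low-rank latent per layer, per token; K and V are
    # re-projected from it at attention time.
    return seq_len * n_layers * latent_dim * bytes_per_elem

# With a hypothetical 128-head, 128-dim config and a 576-dim latent,
# the MHA cache is ~57x larger than the MLA cache at any sequence length.
```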
### 2. Mixture of Experts (MoE) – DeepSeek-V3 Style

- Shared experts (always active) + routed experts (top-k per token)
- Fine-grained expert segmentation: more, smaller experts rather than fewer large ones
- Load balancing via an auxiliary-loss-free strategy (sequence-level bias, no loss penalty)
- Expert capacity: no token dropping, dynamic overflow routing
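A minimal sketch of the shared + routed top-k selection described above. The softmax-over-selected-experts gating is one common choice and is illustrative, not the exact Lattice gating function:

```python
import math

def route_token(logits, k=4, n_shared=1):
    """Pick top-k routed experts plus the always-active shared experts."""
    routed = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Combination weights: softmax over the selected experts' scores only.
    sel = [logits[i] for i in routed]
    m = max(sel)
    exps = [math.exp(s - m) for s in sel]
    weights = [e / sum(exps) for e in exps]
    shared = [f"shared_{i}" for i in range(n_shared)]  # bypass the router
    return shared, routed, weights
```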
### 3. Mixture of Depths (MoD) – Google Research

Tokens dynamically skip transformer layers based on a learned routing decision. Easy tokens skip up to 50% of layers; hard tokens (reasoning, code, structured output) use all layers. Net result: ~30% compute reduction at the same quality.
### 4. iRoPE / YaRN Scaling – Llama 4 / YaRN paper

Interleaved NTK-aware RoPE scaling for 1M+ context without positional degradation. Alternating full-attention and sliding-window layers: full attention every 4th layer, an 8K sliding window on the intermediate layers.

### 5. Sliding Window Attention – Mistral

8K sliding window on non-full-attention layers: O(n) memory for most layers, O(n²) only on the full-attention layers.
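The interleaving described in the two sections above reduces to a simple per-layer attention map; a sketch of the assumed pattern:

```python
def attention_pattern(n_layers, full_every=4, window=8192):
    # Full attention on every 4th layer, sliding-window attention elsewhere,
    # per the iRoPE-style interleaving described above.
    return ["full" if (i + 1) % full_every == 0 else f"window-{window}"
            for i in range(n_layers)]
```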
### 6. Speculative Decoding – Google DeepMind

Each Lattice model ships with a paired draft model (Lattice-120B-Draft at ~4B params) that shares embedding weights with the main model, giving a 3-5× inference speedup on provider hardware.
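The draft/verify loop behind this looks roughly like the greedy sketch below. Real speculative decoding uses a probabilistic acceptance rule over the draft and target distributions; this greedy version is only illustrative:

```python
def speculative_step(draft_tokens, target_next):
    """target_next(prefix) -> the large model's greedy next token."""
    out, prefix = [], []
    for t in draft_tokens:
        want = target_next(prefix)
        if want == t:          # target agrees: accept the cheap draft token
            out.append(t)
            prefix.append(t)
        else:                  # disagreement: emit the target's token, stop
            out.append(want)
            break
    else:
        out.append(target_next(prefix))  # all drafts accepted: bonus token
    return out
```

The win comes from verifying several draft tokens in one forward pass of the large model instead of one pass per token.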
### 7. Multimodal Vision Encoder – Llama 4 / InternVL lineage

- ViT-based image encoder (6B params, separate from the LM)
- Cross-attention visual tokens injected at every 4th layer
- Supports: images, video frames, documents, charts, screenshots
- Patch resolution: 448×448 base, up to 4K via dynamic tiling
- Audio: separate audio encoder (Whisper-large-v3 lineage) for speech/sound understanding

---
## 17 Custom Modules

### Module 1 – EQ Engine V2

Upgraded from Zenith's V1. Now tracks the emotional arc across the **entire conversation**, not just per layer.

- Persistent emotional state vector across turns (GRU with conversation-length memory)
- 12-emotion classification (expanded from 8)
- Frustration trajectory prediction: detects escalation before it peaks
- Per-user emotional baseline calibration (inferred from the first 3 turns)
- Feeds into the Persona Stability Enforcer (Module 14)
- Always FP16, never quantized
### Module 2 – Lattice Router

Custom MoE routing built specifically for this architecture, not standard top-k.

- Hierarchical routing: token → domain cluster → expert group → individual expert
- Domain clusters: Reasoning, Code, Vision, Language, Agentic, Science, Creative, Safety
- Experts self-label during training via a contrastive specialization loss
- Router is inspectable at inference: the API exposes which expert cluster handled each segment
- Load-aware routing: aware of current server load, can shift to less-used experts
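The three-stage decision can be sketched as successive argmax steps. The score shapes below are hypothetical; in the real router all three stages are learned end-to-end:

```python
DOMAIN_CLUSTERS = ["Reasoning", "Code", "Vision", "Language",
                   "Agentic", "Science", "Creative", "Safety"]

def hierarchical_route(cluster_scores, group_scores, expert_scores):
    # token -> domain cluster -> expert group -> individual expert,
    # each stage an argmax over that stage's learned scores.
    cluster = max(cluster_scores, key=cluster_scores.get)
    group = max(group_scores[cluster], key=group_scores[cluster].get)
    expert = max(expert_scores[(cluster, group)],
                 key=expert_scores[(cluster, group)].get)
    return cluster, group, expert
```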
### Module 3 – Confidence Calibration Head

Runs in parallel with the LM head on every token.

- Outputs epistemic uncertainty [0-1] per token
- Aggregated to sentence/paragraph level for API response metadata
- Trained on calibration data: the model is rewarded for accurate uncertainty, not just correct answers
- Exposed via the API as an `X-Lattice-Confidence` header per response chunk
- Feeds into the Knowledge Boundary Detector (Module 17)

### Module 4 – Native Tool Schema Reasoner

Not prompt-based function calling; a dedicated architecture.

- Separate attention heads trained exclusively on tool/API schemas
- Supports: JSON Schema, OpenAPI 3.x, GraphQL, SQL DDL
- Schemas tokenized as structured graphs, not flat text
- Tool call planner: generates multi-step tool execution plans before the first call
- Parallel tool dispatch: can issue multiple tool calls simultaneously
- Tool result integrator: dedicated cross-attention for injecting tool results
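Since tool definitions arrive in the standard OpenAI shape (JSON Schema inside a function wrapper), a tool passed to Lattice would look like the usual structure. `get_weather` is a hypothetical example tool, not part of the spec:

```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Look up current weather for a city",
        "parameters": {          # plain JSON Schema
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
```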
### Module 5 – Multi-Agent Coordination Layer (MACL)

Designed for multi-agent systems where multiple Lattice instances talk to each other.

- Structured agent message format: role, task_id, confidence, partial_result, handoff_request
- Agent role awareness: knows whether it is orchestrator, subagent, critic, or executor
- Shared scratchpad attention: multiple agents can attend to the same working memory
- Conflict resolution head: a dedicated reasoning path for when two agents disagree
- Exposed via the API as the `lattice-agent-protocol` extension

### Module 6 – Hierarchical Context Compression Engine (HCCE)

Makes 1M+ context actually usable, not just theoretically supported.

- Every 32K tokens: compress to a summary embedding + key-fact store
- Every 128K tokens: a meta-summary of summaries
- Most recent 32K: always full resolution
- Older context: summary + retrievable detail on demand
- Learned compression: trained to preserve causally important information
- Compression ratio: ~20:1 on narrative text, ~5:1 on code/structured data
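A rough footprint model for the scheme above: the most recent 32K block stays at full resolution, while everything older shrinks by roughly the narrative-text ratio (~20:1). A sketch, ignoring meta-summaries and the key-fact store:

```python
def hcce_footprint(context_len, block=32_768, ratio=20):
    # Effective attended tokens: full-resolution recent block plus
    # ~1/ratio summary tokens for all older context.
    recent = min(context_len, block)
    older = max(context_len - block, 0)
    return recent + older // ratio

# At a full 1,048,576-token context, the model attends to ~84K effective
# tokens rather than 1M under these assumptions.
```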
### Module 7 – Structured Output Enforcer (SOE)

Guaranteed valid structured outputs, not retry-based.

- Constrained decoding via token masking against the target schema
- Supports: JSON, YAML, XML, Markdown, CSV, Python, SQL, HTML
- Zero-shot: give it a Pydantic model or JSON Schema, get guaranteed-valid output
- Partial streaming: streams valid partial JSON as tokens generate
- Integrated with the Tool Schema Reasoner (Module 4) for tool call outputs
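Token masking against a schema can be illustrated with a toy version that checks candidate tokens against a finite set of valid outputs. A real enforcer compiles the schema into an automaton over the tokenizer vocabulary rather than enumerating strings:

```python
VALID_OUTPUTS = ['{"a": 1}', '{"b": 2}']  # stand-in for schema-derived strings

def allowed_next(partial, vocab):
    # Keep only tokens whose continuation is still a prefix of some valid
    # output; everything else is masked (logit forced to -inf in practice).
    return [t for t in vocab
            if any(v.startswith(partial + t) for v in VALID_OUTPUTS)]
```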
### Module 8 – Causal Reasoning Graph (CRG)

Builds an explicit internal cause-effect graph during generation.

- Each reasoning step adds nodes and edges to the internal graph
- Graph attention: later reasoning steps attend to the causal graph, not just the token sequence
- Detects reasoning loops and contradiction chains
- Optionally exposed via the API as a structured reasoning trace
- Improves performance on multi-hop questions, legal reasoning, scientific causality

### Module 9 – Temporal Awareness Module

Time is a first-class concept.

- Dedicated temporal embeddings: absolute dates, relative references ("last week"), durations
- Timeline builder: constructs event timelines from unstructured text
- Temporal consistency checker: flags contradictions in event ordering
- Knowledge cutoff awareness: trained to know what it does and doesn't know about recent events
- Feeds into the Knowledge Boundary Detector (Module 17)
### Module 10 – Cross-Lingual Semantic Alignment Layer

50+ language support with deep semantic alignment, not surface translation.

- Language-agnostic semantic embedding space
- Code-switching aware: handles mixed-language inputs naturally
- Script normalization: handles CJK, Arabic RTL, and Devanagari natively at the tokenizer level
- Dialect modeling: distinguishes Brazilian vs European Portuguese, Simplified vs Traditional Chinese
- Translation quality head: can score its own translation outputs

### Module 11 – Safety Reasoning Module (SRM)

Auditable, explainable safety: a key differentiator for inference providers.

- Dedicated safety reasoning chain before generation (not post-hoc filtering)
- Produces an explicit safety trace: what risk was considered, what was ruled out, and why
- Granular harm taxonomy: 47 harm categories with confidence scores
- Provider-configurable: API operators can tune safety thresholds per deployment
- Audit log: safety decisions logged in a structured format for compliance
- Separate from the EQ Engine: safety is logic-based, not emotion-based
### Module 12 – Vision-Language Grounding Module

Deep integration between visual and language understanding.

- Object-level grounding: links text references to bounding-box regions
- Chart/diagram interpreter: specialized attention for data visualizations
- Document layout understanding: OCR + structure (tables, headings, columns)
- Screenshot-to-code: dedicated pathway for UI → code generation
- Video temporal grounding: links text references to specific frames

### Module 13 – Long-Horizon Task Planner

Agentic planning as a first-class capability.

- Task decomposition head: breaks goals into subtask DAGs
- Dependency resolver: identifies which subtasks block others
- Progress tracker: maintains task state across long conversations
- Replanning trigger: detects when a plan needs revision based on new information
- Integrates with the MACL (Module 5) for distributing tasks across agents
- Outputs structured task graphs via the API
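Resolving dependencies in a subtask DAG reduces to a topological sort; a minimal sketch using Kahn's algorithm, with hypothetical task names:

```python
from collections import deque

def execution_order(deps):
    # deps: {task: [tasks it depends on]}. Returns an order where every task
    # comes after its dependencies; a cycle means the plan needs revision.
    indeg = {t: len(d) for t, d in deps.items()}
    children = {t: [] for t in deps}
    for t, ds in deps.items():
        for d in ds:
            children[d].append(t)
    q = deque(sorted(t for t, n in indeg.items() if n == 0))
    order = []
    while q:
        t = q.popleft()
        order.append(t)
        for c in sorted(children[t]):
            indeg[c] -= 1
            if indeg[c] == 0:
                q.append(c)
    if len(order) != len(deps):
        raise ValueError("cycle detected: plan needs revision")
    return order
```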
### Module 14 – Persona Stability Enforcer (PSE)

Maintains consistent identity, tone, and personality across million-token contexts.

- Persona embedding: an operator-defined persona injected as persistent memory
- Style consistency loss during training: penalizes tone drift
- Character consistency checker: ensures factual claims about the self don't contradict
- Fed by EQ Engine V2: adjusts warmth/formality dynamically, but within persona bounds
- Critical for long-running API deployments and character-based applications

### Module 15 – API Telemetry & Observability Hooks

Built into the model, not bolted on by the provider.

- Per-token latency profiling embedded in the forward pass
- Expert utilization stats per request
- Context compression events flagged in the stream
- Confidence + uncertainty exposed per chunk
- Module activation trace: which of the 17 modules fired for each request
- All exposed as structured SSE metadata alongside the token stream
### Module 16 – Code Intelligence Engine (CIE)

Goes beyond code completion to full software-engineering understanding.

- AST-aware attention: code is parsed to an AST and structural tokens are injected
- Multi-file context graph: understands cross-file dependencies
- Runtime simulation head: predicts execution behavior without running code
- Bug pattern library: trained on the CVE database + common bug taxonomies
- Test generation: given code, generates a comprehensive test suite
- Integrates with the Tool Schema Reasoner for build/exec tool use

### Module 17 – Knowledge Boundary Detector (KBD)

Knows what it doesn't know.

- Hallucination risk scorer per claim
- Sources: Confidence Calibration Head + Temporal Awareness Module + retrieval signal
- Claim classification: known / uncertain / likely-hallucination / outside-training
- Citation need detector: flags claims that should be sourced
- Self-consistency checker: runs 3 forward passes on uncertain claims and checks agreement
- Exposed via the API: `X-Lattice-Hallucination-Risk` per response
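The self-consistency check is a majority-vote signal over repeated samples; a minimal sketch of the agreement score:

```python
from collections import Counter

def self_consistency(samples):
    # Fraction of forward passes agreeing with the modal answer;
    # 3 passes give scores in {1/3, 2/3, 1.0}.
    _, count = Counter(samples).most_common(1)[0]
    return count / len(samples)
```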
---

## Hardware & Inference Specs

### Lattice-120B

| Config | Active Params | VRAM | TPS (est.) |
|---|---|---|---|
| BF16 | ~22B | ~240GB | ~35 TPS |
| INT8 | ~22B | ~120GB | ~70 TPS |
| INT4 | ~22B | ~60GB | ~130 TPS |

Target: 4× H100 80GB (INT8) or 8× p300a (INT4)

### Lattice-430B

| Config | Active Params | VRAM | TPS (est.) |
|---|---|---|---|
| BF16 | ~38B | ~860GB | ~18 TPS |
| INT8 | ~38B | ~430GB | ~38 TPS |
| INT4 | ~38B | ~215GB | ~72 TPS |

Target: 8× H100 80GB (INT4) or 28× p300a (INT4)

### Lattice-671B

| Config | Active Params | VRAM | TPS (est.) |
|---|---|---|---|
| BF16 | ~47B | ~1.34TB | ~12 TPS |
| INT8 | ~47B | ~671GB | ~26 TPS |
| INT4 | ~47B | ~336GB | ~50 TPS |

Target: 32× H100 80GB (INT4) or 48× p300a (INT4)
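The VRAM columns above follow directly from weight storage (KV cache and activations excluded): total parameters times bytes per parameter. A sanity-check helper:

```python
def weight_vram_gb(total_params_billion, bits):
    # Billions of params x (bits / 8) bytes per param = gigabytes of weights.
    return total_params_billion * bits / 8

# 120B at BF16 -> 240 GB; 430B at INT4 -> 215 GB; 671B at INT8 -> 671 GB,
# matching the tables above.
```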
---

## Training Strategy

### Phase 1 – Foundation (all sizes)
- Mixed distillation from DeepSeek-V3, DeepSeek-R1, Llama 4 Scout/Maverick
- Data: web text, code, scientific papers, books, multimodal datasets
- Context: start at 8K, scale to 1M via curriculum
- MoE load-balancing stabilization

### Phase 2 – Module Integration
- Each of the 17 modules trained with task-specific auxiliary losses
- Module loss weights tuned per module (see training_config.py)
- Modules frozen in turn as they converge

### Phase 3 – Agentic Fine-tuning
- Tool use, multi-agent coordination, long-horizon task completion
- Synthetic agentic trajectories generated by Lattice-120B to bootstrap the larger models
- RLHF / GRPO on agentic task completion + safety

### Phase 4 – Alignment & Safety
- Safety Reasoning Module fine-tuning on the harm taxonomy
- Constitutional-AI-style self-critique
- Red-team adversarial fine-tuning
---

## API Design (Inference Provider Ready)

OpenAI-compatible with Lattice extensions:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.provider.com/v1",
    api_key="your-key",
)

response = client.chat.completions.create(
    model="matrix-lattice-671b",
    messages=[{"role": "user", "content": "Your prompt"}],
    tools=[...],  # Native tool schemas
    extra_body={
        "lattice": {
            "expose_confidence": True,
            "expose_module_trace": False,
            "expose_reasoning_graph": False,
            "safety_tier": "standard",  # standard | strict | minimal
            "persona": "helpful-assistant",
            "agent_role": "orchestrator",  # orchestrator | subagent | critic
        }
    },
)

# Response includes standard OpenAI fields PLUS:
# response.lattice.confidence_scores
# response.lattice.active_modules
# response.lattice.hallucination_risk
# response.lattice.expert_clusters_used
```
---

## Status

- 🔴 Planned: architecture specification complete
- Training infrastructure: TBD
- Timeline: TBD (depends on compute access at scale)

## HuggingFace

- `Matrix-Corp/Lattice-120B-V1` (planned)
- `Matrix-Corp/Lattice-430B-V1` (planned)
- `Matrix-Corp/Lattice-671B-V1` (planned)
- Collection: `Matrix-Corp/lattice-v1` (planned)
|