ianshank committed
Commit
6d50ac4
·
verified ·
1 Parent(s): 0e196af

Upload MangoMAS MoE-7M model weights and config

Files changed (3):
  1. README.md +112 -0
  2. config.json +12 -0
  3. model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,112 @@
---
language: en
license: mit
library_name: pytorch
tags:
- mixture-of-experts
- multi-agent
- neural-routing
- cognitive-architecture
- reinforcement-learning
pipeline_tag: text-classification
---

# MangoMAS-MoE-7M

A ~7 million parameter **Mixture-of-Experts** (MoE) neural routing model for multi-agent task orchestration.

## Model Architecture

```
Input (64-dim feature vector from featurize64())
      │
┌─────┴─────┐
│   GATE    │  Linear(64→512) → ReLU → Linear(512→16) → Softmax
└─────┬─────┘
      │
╔═══════════════════════════════════════════════════╗
║  16 Expert Towers (parallel)                      ║
║  Each: Linear(64→512) → ReLU → Linear(512→512)    ║
║        → ReLU → Linear(512→256)                   ║
╚═══════════════════════════════════════════════════╝
      │
Weighted Sum (gate_weights × expert_outputs)
      │
Classifier Head: Linear(256→N_classes)
      │
Output Logits
```
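
The diagram above translates almost line-for-line into PyTorch. The sketch below is a reconstruction from the shapes shown, not the shipped `moe_model.py` (which may differ in details); its parameter count comes out to exactly the `parameter_count` recorded in `config.json`.

```python
import torch
import torch.nn as nn

class MoE7MSketch(nn.Module):
    """Minimal reconstruction of the architecture diagram above."""

    def __init__(self, num_classes=10, num_experts=16, input_dim=64):
        super().__init__()
        # GATE: Linear(64->512) -> ReLU -> Linear(512->16) -> Softmax
        self.gate = nn.Sequential(
            nn.Linear(input_dim, 512), nn.ReLU(),
            nn.Linear(512, num_experts), nn.Softmax(dim=-1),
        )
        # 16 parallel expert towers: 64 -> 512 -> 512 -> 256
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(input_dim, 512), nn.ReLU(),
                nn.Linear(512, 512), nn.ReLU(),
                nn.Linear(512, 256),
            )
            for _ in range(num_experts)
        )
        self.head = nn.Linear(256, num_classes)  # classifier head

    def forward(self, x):
        weights = self.gate(x)                               # (B, E)
        outs = torch.stack([e(x) for e in self.experts], 1)  # (B, E, 256)
        mixed = (weights.unsqueeze(-1) * outs).sum(dim=1)    # (B, 256)
        return self.head(mixed), weights

model = MoE7MSketch()
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 6880282
```

Note this is a dense MoE: every expert runs on every input and the gate only reweights their outputs, which is tractable at this ~7M scale.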

### Parameter Count

| Component | Parameters |
|-----------|-----------|
| Gate Network | 64×512 + 512 + 512×16 + 16 = 41,488 |
| 16 Expert Towers | 16 × (64×512 + 512 + 512×512 + 512 + 512×256 + 256) = 6,836,224 |
| Classifier Head | 256×10 + 10 = 2,570 |
| **Total** | **6,880,282 (~6.88M)** |

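The totals above can be checked with a few lines of arithmetic (each `Linear` contributes weights plus biases); the result matches the `parameter_count` field in `config.json`:

```python
def linear_params(n_in, n_out):
    # weights + biases of a torch.nn.Linear(n_in, n_out)
    return n_in * n_out + n_out

gate = linear_params(64, 512) + linear_params(512, 16)   # 41,488
expert = (linear_params(64, 512) + linear_params(512, 512)
          + linear_params(512, 256))
experts = 16 * expert                                    # 6,836,224
head = linear_params(256, 10)                            # 2,570

total = gate + experts + head
print(total)  # 6880282
```
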
## Input: 64-Dimensional Feature Vector

The model consumes a 64-dimensional feature vector produced by `featurize64()`:

- **Dims 0-31**: Hash-based sinusoidal encoding (content fingerprint)
- **Dims 32-47**: Domain tag detection (code, security, architecture, etc.)
- **Dims 48-55**: Structural signals (length, punctuation, questions)
- **Dims 56-59**: Sentiment polarity estimates
- **Dims 60-63**: Novelty/complexity scores

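The real `featurize64()` ships with the MangoMAS codebase. Purely as an illustration of the layout above, a minimal stand-in might populate the blocks like this; the hash scheme, tag vocabulary, and heuristics here are invented for the sketch (the sentiment and novelty dims are left unfilled), not the actual implementation:

```python
import hashlib
import math

# Hypothetical tag vocabulary for dims 32-47; the real list is MangoMAS's own.
TAGS = ["code", "security", "architecture", "api", "test", "deploy", "data",
        "ml", "ui", "db", "auth", "network", "doc", "perf", "bug", "infra"]

def featurize64_sketch(text: str) -> list:
    vec = [0.0] * 64
    # Dims 0-31: hash-based sinusoidal content fingerprint
    h = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    for i in range(32):
        vec[i] = math.sin(h / (10000 ** (i / 32)))
    # Dims 32-47: crude domain-tag detection by substring match
    lowered = text.lower()
    for i, tag in enumerate(TAGS):
        vec[32 + i] = 1.0 if tag in lowered else 0.0
    # Dims 48-55: structural signals (length, questions, punctuation density)
    vec[48] = min(len(text) / 512, 1.0)
    vec[49] = text.count("?") / max(len(text), 1)
    vec[50] = sum(c in ".,;:" for c in text) / max(len(text), 1)
    # Dims 56-59 (sentiment) and 60-63 (novelty/complexity): zeros here.
    return vec

features = featurize64_sketch("Design a secure REST API with authentication")
print(len(features))  # 64
```
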
## Training

- **Optimizer**: AdamW (lr=1e-4, weight_decay=0.01)
- **Updates**: Online learning from routing feedback
- **Minimum reward threshold**: 0.1
- **Device**: CPU / MPS / CUDA (auto-detected)

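MangoMAS's actual feedback loop is not included in this repository. As a rough sketch of the settings listed above, one reward-gated online step might look like the following; the stand-in model, the reward-weighted cross-entropy objective, and the update signature are all assumptions, not the platform's code:

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(64, 10)  # stand-in for MixtureOfExperts7M
# Auto-detect device as described above: CUDA, then MPS, then CPU.
device = ("cuda" if torch.cuda.is_available()
          else "mps" if torch.backends.mps.is_available()
          else "cpu")
model.to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.01)

def online_update(x, chosen_class, reward, min_reward=0.1):
    """One reward-gated update; feedback below the threshold is skipped."""
    if reward < min_reward:
        return None
    logits = model(x.to(device))
    # Hypothetical objective: reward-weighted cross-entropy on the routed class.
    loss = reward * F.cross_entropy(logits, chosen_class.to(device))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

loss = online_update(torch.randn(1, 64), torch.tensor([3]), reward=0.8)
```
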
## Usage

```python
import torch
from moe_model import MixtureOfExperts7M, featurize64

# Create model
model = MixtureOfExperts7M(num_classes=10, num_experts=16)

# Extract features
features = featurize64("Design a secure REST API with authentication")
x = torch.tensor([features], dtype=torch.float32)

# Forward pass
logits, gate_weights = model(x)
print(f"Expert weights: {gate_weights}")
print(f"Top expert: {gate_weights.argmax().item()}")
```

## Intended Use

This model is part of the **MangoMAS** multi-agent orchestration platform. It routes incoming tasks to the most appropriate expert agents based on the task's semantic content.

**Primary use cases:**

- Multi-agent task routing
- Expert selection for cognitive cell orchestration
- Research demonstration of MoE architectures

## Interactive Demo

Try the model live on the [MangoMAS HuggingFace Space](https://huggingface.co/spaces/ianshank/MangoMAS).

## Citation

```bibtex
@software{mangomas2026,
  title={MangoMAS: Multi-Agent Cognitive Architecture},
  author={Shanker, Ian},
  year={2026},
  url={https://github.com/ianshank/MangoMAS}
}
```

## Author

Built by [Ian Shanker](https://huggingface.co/ianshank), MangoMAS Engineering
config.json ADDED
@@ -0,0 +1,12 @@
{
  "model_type": "MixtureOfExperts7M",
  "num_classes": 10,
  "num_experts": 16,
  "input_dim": 64,
  "expert_hidden1": 512,
  "expert_hidden2": 512,
  "expert_output_dim": 256,
  "gate_hidden": 512,
  "parameter_count": 6880282,
  "framework": "pytorch"
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0fe5f5a7afeb0e16c82289fd12933adbc2a9ac92461a291a74a4ecd97b26ec82
size 27547547