amewebstudio committed on
Commit b62e54b · verified · 1 Parent(s): 5fe4d99

Upload 3 files

Files changed (3):
  1. README.md +175 -98
  2. cognitive_modules.py +1206 -0
  3. setup.py +105 -0
README.md CHANGED
@@ -1,6 +1,6 @@
  # COGNITIVE-CORE Framework

- > Universal standard for Ame Web Studio cognitive architectures

  ## 🏗️ Structure

@@ -9,151 +9,228 @@ cognitive-core/
  ├── __init__.py             # Package exports
  ├── cognitive_base.py       # Base classes (Config, Modules, PreTrainedModel)
  ├── cognitive_checkpoint.py # Load/save with automatic remapping
  ├── cognitive_utils.py      # Utilities (device, memory, tokens)
  └── README.md               # This documentation
  ```

  ## 🚀 Installation

- ```python
- # Add to your model
- import sys
- sys.path.append("/path/to/standardisation")
-
- from cognitive_core import (
-     CognitiveConfig,
-     CognitivePreTrainedModel,
-     setup_environment,
-     get_device
- )
- ```

- ## 📖 Usage Guide

- ### 1. Create a Configuration

- ```python
- from cognitive_core import CognitiveConfig

- class MyModelConfig(CognitiveConfig):
-     model_type = "my_cognitive_model"
-
-     def __init__(
-         self,
-         vocab_size: int = 50000,
-         # ... your parameters
-         **kwargs
-     ):
-         super().__init__(**kwargs)
-         self.vocab_size = vocab_size
  ```

- ### 2. Create a Cognitive Model

- ```python
- from cognitive_core import CognitivePreTrainedModel, CognitiveModule
- import torch
- import torch.nn as nn

- class MyMemoryModule(CognitiveModule):
-     def __init__(self, config):
-         super().__init__(config)
-         self.memory = nn.Parameter(torch.randn(1000, config.d_model))
-
-     def forward(self, x, **kwargs):
-         # Your logic
-         return {"output": x, "memory_used": True}
-
-     def reset_state(self):
-         pass

- class MyModel(CognitivePreTrainedModel):
-     config_class = MyModelConfig
-
-     def __init__(self, config):
-         super().__init__(config)
-         self.embeddings = nn.Embedding(config.vocab_size, config.d_model)
-         self.memory = MyMemoryModule(config)
-         self.lm_head = nn.Linear(config.d_model, config.vocab_size)
-         self.post_init()
-
-     def forward(self, input_ids, **kwargs):
-         x = self.embeddings(input_ids)
-         mem_out = self.memory(x)
-         logits = self.lm_head(mem_out["output"])
-         return logits
  ```

- ### 3. Automatic Loading

- The framework automatically handles:
- - ✅ Key remapping (with/without the `model.` prefix)
- - ✅ Checkpoint validation
- - ✅ HuggingFace compatibility

  ```python
- from cognitive_core import load_cognitive_checkpoint

- # Load a custom checkpoint
- info = load_cognitive_checkpoint(model, "path/to/checkpoint.pt", verbose=True)
- print(f"Loaded keys: {info['validation']['matched_keys']}")
  ```

- ### 4. Environment Setup (Kaggle/Colab)

- ```python
- from cognitive_core import setup_environment, get_device, get_hf_token

- # Configure the HuggingFace cache in a writable directory
- cache_dir = setup_environment()

- # Automatic GPU/CPU detection
- device = get_device()

- # Retrieve the HuggingFace token
- token = get_hf_token()
- ```

- ## 🔧 Available Modules

  | Module | Description |
  |--------|-------------|
- | `CognitiveConfig` | Base configuration inheriting from PretrainedConfig |
- | `CognitiveModule` | Abstract interface for cognitive modules |
- | `MemoryModule` | Interface for memory modules (store/retrieve) |
- | `TemporalModule` | Interface for temporal modules (predict) |
- | `WorldModelModule` | Interface for world models (update/imagine) |
- | `CognitivePreTrainedModel` | HuggingFace model with automatic remapping |

- ## 🎯 Use Cases

- ### Cognitive Vision
  ```python
- class CognitiveViTConfig(CognitiveConfig):
-     model_type = "cognitive_vit"
-     # ... vision config
  ```

- ### World Model
  ```python
- class CognitiveWorldConfig(CognitiveConfig):
      model_type = "cognitive_world"
-     # ... world model config
  ```

- ### Multimodal
  ```python
- class CognitiveMultimodalConfig(CognitiveConfig):
      model_type = "cognitive_multimodal"
-     vision_enabled = True
-     audio_enabled = True
  ```

  ## 📊 Standard Guarantees

- - ✅ **Weight integrity** - No weights are silently reinitialized
- - ✅ **HuggingFace compatibility** - AutoModel works natively
  - ✅ **Portability** - Kaggle, Colab, Local without modification
- - ✅ **Extensibility** - Add your own modules easily

  ## 📄 License

  # COGNITIVE-CORE Framework

+ > 🧠 Universal standard for Ame Web Studio cognitive architectures

  ## 🏗️ Structure

  ├── __init__.py             # Package exports
  ├── cognitive_base.py       # Base classes (Config, Modules, PreTrainedModel)
  ├── cognitive_checkpoint.py # Load/save with automatic remapping
+ ├── cognitive_modules.py    # 🆕 ALL reusable cognitive modules
+ ├── cognitive_training.py   # Training utilities
  ├── cognitive_utils.py      # Utilities (device, memory, tokens)
  └── README.md               # This documentation
  ```

  ## 🚀 Installation

+ ### Option 1: Via pip (recommended)

+ ```bash
+ # Standard installation
+ pip install cognitive-core
+
+ # With Vision support
+ pip install "cognitive-core[vision]"
+
+ # With Audio support
+ pip install "cognitive-core[audio]"
+
+ # With training support (WandB/TensorBoard)
+ pip install "cognitive-core[training]"
+
+ # Full installation
+ pip install "cognitive-core[all]"
+ ```

+ ### Option 2: Via Git (latest version)

+ ```bash
+ pip install git+https://github.com/Volgat/nexus-standardisation.git
+ ```

+ ### Option 3: Via HuggingFace

+ ```bash
+ pip install git+https://huggingface.co/amewebstudio/cognitive-core
+ ```

+ ### Option 4: Development mode (local)
+
+ ```bash
+ git clone https://github.com/Volgat/nexus-standardisation.git
+ cd nexus-standardisation
+ pip install -e .
+ ```

+ If you are not using pip, you can simply add the path:

  ```python
+ import sys
+ sys.path.append("/path/to/standardisation")
+
+ from cognitive_core import *
  ```

+ ## 📦 Available Modules

+ ### Normalization
+ | Module | Description |
+ |--------|-------------|
+ | `RMSNorm` | Root Mean Square normalization (more efficient than LayerNorm) |

+ ### Positional Encoding
+ | Module | Description |
+ |--------|-------------|
+ | `RotaryEmbedding` | RoPE with scaling for long contexts |
+ | `SinusoidalPositionalEncoding` | Classical sinusoidal encoding |

+ ### Attention
+ | Module | Description |
+ |--------|-------------|
+ | `GroupedQueryAttention` | GQA with RoPE and KV-cache |
+ | `CrossAttention` | Cross-attention for multimodal fusion |

+ ### Feed-Forward Networks
+ | Module | Description |
+ |--------|-------------|
+ | `SwiGLU` | SwiGLU activation (performs better than GELU) |
+ | `MLP` | Standard MLP with GELU |

+ ### Mixture of Experts
+ | Module | Description |
+ |--------|-------------|
+ | `Expert` | Single expert with SwiGLU |
+ | `SparseMoE` | Sparse MoE with top-k routing |

+ ### Memory Systems
+ | Module | Description |
+ |--------|-------------|
+ | `ContrastiveLPOL` | LPOL memory with 9 knowledge domains |
+ | `MultiScaleMemory` | Short-/long-term memory with consolidation |
+ | `EpisodicMemory` | Episodic memory for experiences |

+ ### World Model
+ | Module | Description |
+ |--------|-------------|
+ | `WorldBuffer` | Single world buffer with prediction |
+ | `MultiWorldBuffer` | Multi-domain buffers (physical, social, abstract, temporal) |

+ ### Internal State
+ | Module | Description |
+ |--------|-------------|
+ | `NonVerbalTension` | Tension tracker driven by prediction error |
+ | `InternalState` | Complete internal cognitive state |

+ ### Dream & Identity
+ | Module | Description |
+ |--------|-------------|
+ | `DreamPhase` | Dream phase for memory consolidation |
+ | `SelfTrace` | Identity tracking over time |

+ ### Neurogenesis
  | Module | Description |
  |--------|-------------|
+ | `NeurogenesisLayer` | Layer with dynamic neuron birth/death |

+ ### EARCP
+ | Module | Description |
+ |--------|-------------|
+ | `EARCPModule` | Ensemble Auto-Regulated Coherence Protocol |

+ ### VAE (Vision/World Model)
+ | Module | Description |
+ |--------|-------------|
+ | `VAEEncoder` | Convolutional VAE encoder |
+ | `VAEDecoder` | Convolutional VAE decoder |

+ ### Universal Latent Space
+ | Module | Description |
+ |--------|-------------|
+ | `UniversalLatentSpace` | ULS for cross-modal alignment (text, vision, audio) |
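The `RMSNorm` row above boils down to one formula: divide each feature vector by its root mean square, then scale by a learned per-dimension weight. A minimal pure-Python sketch of that arithmetic (illustrative only, not the package's torch implementation):

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then apply a per-dimension weight."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms * w for v, w in zip(x, weight)]

# With unit weights, [3, 4] has RMS sqrt((9 + 16) / 2) ≈ 3.536
normalized = rms_norm([3.0, 4.0], [1.0, 1.0])
```

Unlike LayerNorm there is no mean subtraction and no bias, which is why it needs fewer operations per token.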
+ ## 🎯 Usage Examples

+ ### Cognitive Language Model

  ```python
+ import torch.nn as nn
+
+ from cognitive_core import (
+     CognitiveConfig, CognitivePreTrainedModel,
+     GroupedQueryAttention, SparseMoE, ContrastiveLPOL,
+     MultiScaleMemory, RMSNorm
+ )
+
+ class MyLLMConfig(CognitiveConfig):
+     model_type = "cognitive_llm"
+     vocab_size = 50000
+
+ class MyCognitiveLayer(nn.Module):
+     def __init__(self, config):
+         super().__init__()
+         self.attn = GroupedQueryAttention(config.d_model, config.n_heads)
+         self.moe = SparseMoE(config.d_model, config.d_ff, num_experts=8)
+         self.norm1 = RMSNorm(config.d_model)
+         self.norm2 = RMSNorm(config.d_model)
+
+     def forward(self, x):
+         x = x + self.attn(self.norm1(x))[0]
+         moe_out, aux = self.moe(self.norm2(x))
+         return x + moe_out, aux
  ```
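`SparseMoE` in the layer above routes each token by softmaxing the router logits, keeping only the top-k experts, and renormalizing the kept weights so they sum to 1. A stdlib-only sketch of that routing step with hypothetical logits (the real module does this over batched tensors):

```python
import math

def top_k_route(logits, k=2):
    """Softmax router logits, keep the top-k experts, renormalize their weights."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept = sum(probs[i] for i in top)
    return {i: probs[i] / kept for i in top}

# Experts 0 and 1 win; their renormalized weights sum to 1
routing = top_k_route([2.0, 1.0, 0.5, -1.0], k=2)
```

Only the selected experts run on a given token, which is what keeps the compute cost sparse even as `num_experts` grows.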
 
+ ### Cognitive World Model
+
  ```python
+ import torch.nn as nn
+
+ from cognitive_core import (
+     CognitiveConfig, VAEEncoder, VAEDecoder,
+     MultiWorldBuffer, EpisodicMemory, NeurogenesisLayer
+ )
+
+ class WorldModelConfig(CognitiveConfig):
      model_type = "cognitive_world"
+     world_state_dim = 256
+
+ class CognitiveWorldModel(nn.Module):
+     def __init__(self, config):
+         super().__init__()
+         self.encoder = VAEEncoder(in_channels=3, latent_dim=256)
+         self.decoder = VAEDecoder(latent_dim=256, out_channels=3)
+         self.world = MultiWorldBuffer(config.d_model, config)
+         self.memory = EpisodicMemory(config.d_model, config)
+         self.neurogenesis = NeurogenesisLayer(256, 64, config)
  ```

+ ### Vision-Language Multimodal
+
  ```python
+ import torch.nn as nn
+
+ from cognitive_core import (
+     CognitiveConfig, UniversalLatentSpace, CrossAttention,
+     ContrastiveLPOL, DreamPhase, SelfTrace
+ )
+
+ class MultimodalConfig(CognitiveConfig):
      model_type = "cognitive_multimodal"
+
+ class CognitiveMultimodal(nn.Module):
+     def __init__(self, config):
+         super().__init__()
+         self.uls = UniversalLatentSpace(config.d_model, config)
+         self.cross_attn = CrossAttention(config.d_model)
+         self.memory = ContrastiveLPOL(config.d_model, config)
+         self.dream = DreamPhase(config.d_model, config)
+         self.self_trace = SelfTrace(config.d_model, config)
+
+     def forward(self, text_features, vision_features):
+         # Fusion in the universal latent space
+         uls_out = self.uls({"text": text_features, "vision": vision_features})
+         # Cross-attention
+         fused = self.cross_attn(text_features, vision_features)
+         # Memory
+         mem_out = self.memory(fused)
+         return mem_out["output"]
  ```
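At its core, the `CrossAttention` call in the example reduces to: score each vision key against a text query with a scaled dot product, softmax the scores, and take the weighted sum of the values. A stdlib-only sketch for a single query over toy 2-d keys and values (illustrative, not the module's batched multi-head API):

```python
import math

def cross_attend(query, keys, values):
    """softmax(q·k / sqrt(d)) over the keys, then a weighted sum of the values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# The query aligns with the first key, so the first value dominates the output
fused = cross_attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]])
```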
 
  ## 📊 Standard Guarantees

+ - ✅ **Agnostic** - Works for LLM, Vision, Audio, World Model, Multimodal
+ - ✅ **Composable** - All modules are independent and combinable
+ - ✅ **HuggingFace-compatible** - Inherits from PreTrainedModel
+ - ✅ **Auto-remapping** - Handles checkpoint format differences
  - ✅ **Portability** - Kaggle, Colab, Local without modification

  ## 📄 License

cognitive_modules.py ADDED
@@ -0,0 +1,1206 @@
"""
COGNITIVE-CORE: Reusable Cognitive Modules
===========================================

Complete library of cognitive modules that can be composed to build
any cognitive model: vision, language, world model, multimodal, etc.

All modules are agnostic and can be configured for different use cases.

Copyright © 2026 Mike Amega (Logo) - Ame Web Studio
License: Proprietary - All Rights Reserved
"""

import math
import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Dict, List, Optional, Any, Tuple
from collections import deque
from abc import ABC, abstractmethod

from .cognitive_base import CognitiveConfig, CognitiveModule


# ==============================================================================
# SECTION 1: NORMALIZATION LAYERS
# ==============================================================================


class RMSNorm(nn.Module):
    """Root Mean Square Layer Normalization - more efficient than LayerNorm."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.sqrt(torch.mean(x**2, dim=-1, keepdim=True) + self.eps)
        return x / rms * self.weight

# ==============================================================================
# SECTION 2: POSITIONAL ENCODINGS
# ==============================================================================


class RotaryEmbedding(nn.Module):
    """Rotary Position Embedding (RoPE) with scaling support."""

    def __init__(
        self, dim: int, max_seq_len: int = 4096, base: int = 10000, scaling: float = 1.0
    ):
        super().__init__()
        self.dim = dim
        self.scaling = scaling

        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq)

        t = torch.arange(max_seq_len).float() / scaling
        freqs = torch.einsum("i,j->ij", t, inv_freq)
        emb = torch.cat([freqs, freqs], dim=-1)
        self.register_buffer("cos_cache", emb.cos()[None, None, :, :])
        self.register_buffer("sin_cache", emb.sin()[None, None, :, :])

    def forward(
        self, q: torch.Tensor, k: torch.Tensor, seq_len: int, offset: int = 0
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        cos = self.cos_cache[:, :, offset : offset + seq_len, :].to(q.dtype)
        sin = self.sin_cache[:, :, offset : offset + seq_len, :].to(q.dtype)
        q_rot = (q * cos) + (self._rotate_half(q) * sin)
        k_rot = (k * cos) + (self._rotate_half(k) * sin)
        return q_rot, k_rot

    def _rotate_half(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = x[..., : x.shape[-1] // 2], x[..., x.shape[-1] // 2 :]
        return torch.cat([-x2, x1], dim=-1)


class SinusoidalPositionalEncoding(nn.Module):
    """Classical sinusoidal positional encoding."""

    def __init__(self, d_model: int, max_seq_len: int = 4096, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)

        pe = torch.zeros(max_seq_len, d_model)
        position = torch.arange(0, max_seq_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
        )

        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        pe = pe.unsqueeze(0)

        self.register_buffer("pe", pe)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.pe[:, : x.size(1)]
        return self.dropout(x)

# ==============================================================================
# SECTION 3: ATTENTION MECHANISMS
# ==============================================================================


class GroupedQueryAttention(nn.Module):
    """Grouped Query Attention (GQA) with RoPE and KV-Cache support."""

    def __init__(
        self,
        d_model: int,
        n_heads: int = 8,
        n_kv_heads: int = 4,
        max_seq_len: int = 4096,
        dropout: float = 0.1,
        use_rope: bool = True,
    ):
        super().__init__()
        self.n_heads = n_heads
        self.n_kv_heads = n_kv_heads
        self.head_dim = d_model // n_heads
        self.n_rep = n_heads // n_kv_heads
        self.scale = self.head_dim**-0.5

        self.q_proj = nn.Linear(d_model, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, d_model, bias=False)

        self.dropout = nn.Dropout(dropout)
        self.rope = RotaryEmbedding(self.head_dim, max_seq_len) if use_rope else None

    def _repeat_kv(self, x: torch.Tensor) -> torch.Tensor:
        if self.n_rep == 1:
            return x
        B, n_kv, T, D = x.shape
        return (
            x[:, :, None, :, :]
            .expand(B, n_kv, self.n_rep, T, D)
            .reshape(B, self.n_heads, T, D)
        )

    def forward(
        self,
        x: torch.Tensor,
        mask: Optional[torch.Tensor] = None,
        kv_cache: Optional[Tuple[torch.Tensor, torch.Tensor]] = None,
        use_cache: bool = False,
    ) -> Tuple[torch.Tensor, Optional[Tuple]]:
        B, T, C = x.shape

        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, T, self.n_kv_heads, self.head_dim).transpose(1, 2)

        offset = 0
        if kv_cache is not None:
            k_cache, v_cache = kv_cache
            offset = k_cache.size(2)
            k = torch.cat([k_cache, k], dim=2)
            v = torch.cat([v_cache, v], dim=2)

        if self.rope is not None:
            q, _ = self.rope(q, q, T, offset)
            _, k = self.rope(k, k, k.size(2), 0)

        k = self._repeat_kv(k)
        v = self._repeat_kv(v)

        attn = (q @ k.transpose(-2, -1)) * self.scale
        if mask is not None:
            attn = attn.masked_fill(mask == 0, float("-inf"))

        attn = F.softmax(attn, dim=-1)
        attn = self.dropout(attn)

        out = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        out = self.o_proj(out)

        new_cache = None
        if use_cache:
            k_to_cache = (
                self.k_proj(x)
                .view(B, T, self.n_kv_heads, self.head_dim)
                .transpose(1, 2)
            )
            v_to_cache = (
                self.v_proj(x)
                .view(B, T, self.n_kv_heads, self.head_dim)
                .transpose(1, 2)
            )
            if kv_cache is not None:
                k_to_cache = torch.cat([kv_cache[0], k_to_cache], dim=2)
                v_to_cache = torch.cat([kv_cache[1], v_to_cache], dim=2)
            new_cache = (k_to_cache, v_to_cache)

        return out, new_cache


class CrossAttention(nn.Module):
    """Cross-attention for multimodal fusion."""

    def __init__(self, d_model: int, n_heads: int = 8, dropout: float = 0.1):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        self.scale = self.head_dim**-0.5

        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        self.k_proj = nn.Linear(d_model, d_model, bias=False)
        self.v_proj = nn.Linear(d_model, d_model, bias=False)
        self.o_proj = nn.Linear(d_model, d_model, bias=False)
        self.dropout = nn.Dropout(dropout)

    def forward(
        self,
        query: torch.Tensor,
        key_value: torch.Tensor,
        mask: Optional[torch.Tensor] = None,
    ) -> torch.Tensor:
        B, T, C = query.shape
        _, S, _ = key_value.shape

        q = self.q_proj(query).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = (
            self.k_proj(key_value)
            .view(B, S, self.n_heads, self.head_dim)
            .transpose(1, 2)
        )
        v = (
            self.v_proj(key_value)
            .view(B, S, self.n_heads, self.head_dim)
            .transpose(1, 2)
        )

        attn = (q @ k.transpose(-2, -1)) * self.scale
        if mask is not None:
            attn = attn.masked_fill(mask == 0, float("-inf"))

        attn = F.softmax(attn, dim=-1)
        attn = self.dropout(attn)

        out = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        return self.o_proj(out)

# ==============================================================================
# SECTION 4: FEEDFORWARD NETWORKS
# ==============================================================================


class SwiGLU(nn.Module):
    """SwiGLU activation - better than GELU for transformers."""

    def __init__(self, d_model: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        hidden = int(d_ff * 2 / 3)
        hidden = ((hidden + 63) // 64) * 64  # Align to 64

        self.w1 = nn.Linear(d_model, hidden, bias=False)
        self.w2 = nn.Linear(hidden, d_model, bias=False)
        self.w3 = nn.Linear(d_model, hidden, bias=False)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.dropout(self.w2(F.silu(self.w1(x)) * self.w3(x)))


class MLP(nn.Module):
    """Standard MLP with GELU activation."""

    def __init__(self, d_model: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),
            nn.Dropout(dropout),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# ==============================================================================
# SECTION 5: SPARSE MIXTURE OF EXPERTS
# ==============================================================================


class Expert(nn.Module):
    """Single expert module."""

    def __init__(self, d_model: int, d_ff: int, expert_type: str = "general"):
        super().__init__()
        self.expert_type = expert_type
        self.ffn = SwiGLU(d_model, d_ff)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.ffn(x)


class SparseMoE(nn.Module):
    """Sparse Mixture of Experts with Top-K routing."""

    def __init__(
        self,
        d_model: int,
        d_ff: int,
        num_experts: int = 8,
        top_k: int = 2,
        expert_types: Optional[List[str]] = None,
        aux_loss_weight: float = 0.01,
    ):
        super().__init__()
        self.num_experts = num_experts
        self.top_k = top_k
        self.aux_loss_weight = aux_loss_weight

        if expert_types is None:
            expert_types = ["general"]

        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            [
                Expert(d_model, d_ff, expert_types[i % len(expert_types)])
                for i in range(num_experts)
            ]
        )

    def forward(self, x: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        B, T, C = x.shape
        x_flat = x.view(-1, C)

        router_logits = self.router(x_flat)
        topk_weights, topk_indices = torch.topk(
            F.softmax(router_logits, dim=-1), self.top_k, dim=-1
        )
        topk_weights = topk_weights / topk_weights.sum(dim=-1, keepdim=True)

        output = torch.zeros_like(x_flat)

        for i, expert in enumerate(self.experts):
            mask = (topk_indices == i).any(dim=-1)
            if not mask.any():
                continue
            expert_weight = torch.where(
                topk_indices == i, topk_weights, torch.zeros_like(topk_weights)
            ).sum(dim=-1)
            expert_out = expert(x_flat[mask])
            output[mask] += expert_out * expert_weight[mask].unsqueeze(-1)

        # Auxiliary load balancing loss
        router_probs = F.softmax(router_logits, dim=-1)
        expert_usage = router_probs.mean(dim=0)
        aux_loss = (
            self.num_experts
            * (expert_usage * expert_usage).sum()
            * self.aux_loss_weight
        )

        return output.view(B, T, C), aux_loss

369
+ # ==============================================================================
370
+ # SECTION 6: MEMORY SYSTEMS
371
+ # ==============================================================================
372
+
373
+
374
+ class ContrastiveLPOL(CognitiveModule):
375
+ """
376
+ LPOL Memory System with configurable knowledge domains.
377
+ Uses contrastive learning for memory retrieval.
378
+ """
379
+
380
+ def __init__(
381
+ self,
382
+ d_model: int,
383
+ config: CognitiveConfig,
384
+ domains: Optional[List[str]] = None,
385
+ slots_per_domain: int = 512,
386
+ retrieval_k: int = 8,
387
+ ):
388
+ super().__init__(config)
389
+
390
+ if domains is None:
391
+ domains = [
392
+ "semantic",
393
+ "episodic",
394
+ "procedural",
395
+ "spatial",
396
+ "temporal",
397
+ "causal",
398
+ "social",
399
+ "emotional",
400
+ "conceptual",
401
+ ]
402
+
403
+ self.domains = domains
404
+ self.k = retrieval_k
405
+
406
+ self.memories = nn.ParameterDict(
407
+ {
408
+ domain: nn.Parameter(torch.randn(slots_per_domain, d_model) * 0.01)
409
+ for domain in domains
410
+ }
411
+ )
412
+
413
+ self.domain_clf = nn.Sequential(
414
+ nn.Linear(d_model, len(domains) * 2),
415
+ nn.GELU(),
416
+ nn.Linear(len(domains) * 2, len(domains)),
417
+ )
418
+
419
+ self.q_proj = nn.Linear(d_model, d_model)
420
+ self.k_proj = nn.Linear(d_model, d_model)
421
+ self.v_proj = nn.Linear(d_model, d_model)
422
+ self.out_proj = nn.Linear(d_model * 2, d_model)
423
+
424
+ def forward(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
425
+ B, T, C = x.shape
426
+
427
+ domain_probs = F.softmax(self.domain_clf(x.mean(dim=1)), dim=-1)
428
+ all_mem = torch.cat([self.memories[d] for d in self.domains], dim=0)
429
+
430
+ q = self.q_proj(x)
431
+ k = self.k_proj(all_mem)
432
+ v = self.v_proj(all_mem)
433
+
434
+ sim = torch.matmul(q, k.T) / math.sqrt(C)
435
+ topk_sim, topk_idx = torch.topk(sim, min(self.k, all_mem.size(0)), dim=-1)
436
+ weights = F.softmax(topk_sim, dim=-1)
437
+ retrieved = (weights.unsqueeze(-1) * v[topk_idx]).sum(dim=2)
438
+ output = self.out_proj(torch.cat([x, retrieved], dim=-1))
439
+
440
+ return {
441
+ "output": output,
442
+ "domain_probs": domain_probs,
443
+ "retrieval_weights": weights,
444
+ }
445
+
446
+ def reset_state(self):
447
+ pass
448
+
449
+ def update_memory(self, x: torch.Tensor, domain: str, lr: float = 0.01):
450
+ """Online memory update."""
451
+ if domain in self.memories:
452
+ with torch.no_grad():
453
+ mem = self.memories[domain]
454
+ sim = F.cosine_similarity(
455
+ x.mean(dim=1, keepdim=True), mem.unsqueeze(0), dim=-1
456
+ )
457
+ _, idx = sim.min(dim=-1)
458
+ mem[idx] = (1 - lr) * mem[idx] + lr * x.mean(dim=1)
459
+
460
+
461
class MultiScaleMemory(CognitiveModule):
    """Short-term and long-term memory with consolidation."""

    def __init__(
        self,
        d_model: int,
        config: CognitiveConfig,
        short_term_dim: int = 512,
        long_term_dim: int = 256,
        st_decay: float = 0.95,
        lt_decay: float = 0.99,
        consolidation_threshold: float = 0.7,
    ):
        super().__init__(config)

        self.st_decay = st_decay
        self.lt_decay = lt_decay
        self.consolidation_threshold = consolidation_threshold

        # Short-term memory
        self.st_compress = nn.Sequential(
            nn.Linear(d_model, short_term_dim),
            nn.GELU(),
            nn.Linear(short_term_dim, short_term_dim),
        )
        self.st_gate = nn.GRUCell(short_term_dim, short_term_dim)

        # Long-term memory
        self.consolidation = nn.Sequential(
            nn.Linear(short_term_dim + long_term_dim, 256),
            nn.SiLU(),
            nn.Linear(256, 1),
            nn.Sigmoid(),
        )
        self.st_to_lt = nn.Linear(short_term_dim, long_term_dim)
        self.lt_gate = nn.GRUCell(long_term_dim, long_term_dim)

        # Fusion
        self.fusion = nn.Sequential(
            nn.Linear(short_term_dim + long_term_dim, d_model), nn.Tanh()
        )

        # State buffers
        self.register_buffer("st_state", torch.zeros(1, short_term_dim))
        self.register_buffer("lt_state", torch.zeros(1, long_term_dim))

    def forward(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        B = x.size(0)
        h_compressed = self.st_compress(x.mean(dim=1))

        st_prev = self.st_state.expand(B, -1)
        st_new = self.st_decay * st_prev + (1 - self.st_decay) * self.st_gate(
            h_compressed, st_prev
        )

        lt_prev = self.lt_state.expand(B, -1)
        consolidation_score = self.consolidation(torch.cat([st_new, lt_prev], dim=-1))

        if (consolidation_score > self.consolidation_threshold).any():
            lt_input = self.st_to_lt(st_new)
            lt_new = self.lt_decay * lt_prev + (1 - self.lt_decay) * self.lt_gate(
                lt_input, lt_prev
            )
        else:
            lt_new = lt_prev

        self.st_state = st_new[:1].detach()
        self.lt_state = lt_new[:1].detach()

        fused = self.fusion(torch.cat([st_new, lt_new], dim=-1))

        return {
            "st": st_new,
            "lt": lt_new,
            "fused": fused,
            "consolidation_score": consolidation_score.mean().item(),
        }

    def reset_state(self):
        self.st_state.zero_()
        self.lt_state.zero_()


class EpisodicMemory(CognitiveModule):
    """Episodic memory for experience storage and retrieval."""

    def __init__(self, d_model: int, config: CognitiveConfig, max_episodes: int = 1000):
        super().__init__(config)

        self.encoder = nn.Sequential(
            nn.Linear(d_model, d_model // 2),
            nn.GELU(),
            nn.Linear(d_model // 2, d_model),
        )

        self.register_buffer("episodes", torch.zeros(max_episodes, d_model))
        self.register_buffer("count", torch.tensor(0))
        self.max = max_episodes

    def forward(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        encoded = self.encoder(x)
        return {"encoded": encoded}

    def store(self, x: torch.Tensor):
        """Store an experience."""
        with torch.no_grad():
            idx = self.count.item() % self.max
            self.episodes[idx] = x.mean(dim=(0, 1)) if x.dim() == 3 else x.mean(dim=0)
            self.count += 1

    def retrieve(self, query: torch.Tensor, k: int = 5) -> torch.Tensor:
        """Retrieve the k most similar episodes."""
        n = min(self.count.item(), self.max)
        if n == 0:
            return torch.zeros_like(query)

        episodes = self.episodes[:n]
        sim = F.cosine_similarity(query.unsqueeze(1), episodes.unsqueeze(0), dim=-1)
        _, indices = sim.topk(min(k, n), dim=-1)
        return episodes[indices].mean(dim=1)

    def reset_state(self):
        self.count.zero_()


# ==============================================================================
# SECTION 7: WORLD MODEL COMPONENTS
# ==============================================================================


class WorldBuffer(CognitiveModule):
    """Single-domain world buffer with state prediction."""

    def __init__(self, d_model: int, config: CognitiveConfig, domain: str = "physical"):
        super().__init__(config)
        self.domain = domain

        state_dim = getattr(config, "world_state_dim", 256)

        self.encoder = nn.Sequential(
            nn.Linear(d_model, state_dim), nn.GELU(), nn.Linear(state_dim, state_dim)
        )

        self.dynamics = nn.GRUCell(state_dim, state_dim)

        self.predictor = nn.Sequential(
            nn.Linear(state_dim, state_dim), nn.Tanh(), nn.Linear(state_dim, state_dim)
        )

        self.register_buffer("state", torch.zeros(1, state_dim))
        self.register_buffer("prediction", torch.zeros(1, state_dim))
        self.register_buffer("surprise", torch.tensor(0.0))

    def forward(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        if x.dim() == 3:
            x = x.mean(dim=1)

        encoded = self.encoder(x)

        # Compute surprise
        if self.prediction.norm() > 0:
            surprise = F.mse_loss(
                encoded, self.prediction.expand(encoded.size(0), -1)
            ).item()
        else:
            surprise = 0.0

        self.surprise = torch.tensor(surprise)

        # Update state
        new_state = self.dynamics(encoded, self.state.expand(encoded.size(0), -1))
        update_rate = getattr(self.config, "world_update_rate", 0.1)
        self.state = (
            update_rate * new_state[:1] + (1 - update_rate) * self.state
        ).detach()
        self.prediction = self.predictor(self.state).detach()

        return {"surprise": surprise, "state": new_state}

    def reset_state(self):
        self.state.zero_()
        self.prediction.zero_()
        self.surprise.zero_()


class MultiWorldBuffer(CognitiveModule):
    """Multi-domain world model buffers."""

    def __init__(
        self, d_model: int, config: CognitiveConfig, domains: Optional[List[str]] = None
    ):
        super().__init__(config)

        if domains is None:
            domains = ["physical", "social", "abstract", "temporal"]

        self.world_buffers = nn.ModuleDict(
            {d: WorldBuffer(d_model, config, d) for d in domains}
        )
        self.register_buffer("aggregate_surprise", torch.tensor(0.0))

    def forward(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        results = {}
        total_surprise = 0.0

        for domain, buffer in self.world_buffers.items():
            result = buffer(x)
            results[domain] = result
            total_surprise += result["surprise"]

        self.aggregate_surprise = torch.tensor(total_surprise / len(self.world_buffers))

        return {
            "domain_results": results,
            "aggregate_surprise": self.aggregate_surprise.item(),
        }

    def reset_state(self):
        for buffer in self.world_buffers.values():
            buffer.reset_state()


# ==============================================================================
# SECTION 8: INTERNAL STATE SYSTEMS
# ==============================================================================


class NonVerbalTension(nn.Module):
    """Tracks prediction error as an internal tension signal."""

    def __init__(self, integration_rate: float = 0.1, buffer_size: int = 100):
        super().__init__()
        self.integration_rate = integration_rate
        self.register_buffer("prediction_errors", torch.zeros(buffer_size))
        self.register_buffer("error_idx", torch.tensor(0))
        self.register_buffer("integrated_tension", torch.tensor(0.0))

    def update(self, pred: torch.Tensor, actual: torch.Tensor):
        with torch.no_grad():
            error = F.mse_loss(pred.float(), actual.float()).item()
            idx = self.error_idx.item() % len(self.prediction_errors)
            self.prediction_errors[idx] = error
            self.error_idx += 1

    def integrate(self) -> float:
        n = min(self.error_idx.item(), len(self.prediction_errors))
        if n > 0:
            raw = self.prediction_errors[:n].mean().item()
            self.integrated_tension = (
                1 - self.integration_rate
            ) * self.integrated_tension + self.integration_rate * raw
        return self.integrated_tension.item()


class InternalState(CognitiveModule):
    """Complete internal cognitive state tracker."""

    def __init__(self, d_model: int, config: CognitiveConfig):
        super().__init__(config)

        internal_dim = getattr(config, "internal_state_dim", 128)
        latent_dim = getattr(config, "latent_state_dim", 768)

        self.tension = NonVerbalTension()

        self.encoder = nn.Sequential(nn.Linear(latent_dim, internal_dim), nn.Tanh())

        self.register_buffer("discomfort", torch.zeros(1, internal_dim))

    def forward(
        self,
        fused: torch.Tensor,
        pred: Optional[torch.Tensor] = None,
        actual: Optional[torch.Tensor] = None,
        **kwargs,
    ) -> Dict[str, Any]:
        if pred is not None and actual is not None:
            self.tension.update(pred, actual)

        tension = self.tension.integrate()

        encoded = self.encoder(fused)
        if encoded.dim() == 3:
            encoded = encoded.mean(dim=1)

        self.discomfort = 0.9 * self.discomfort + 0.1 * encoded[:1].detach()

        return {
            "tension": tension,
            "discomfort": self.discomfort,
            "encoded_state": encoded,
        }

    def reset_state(self):
        self.discomfort.zero_()


# ==============================================================================
# SECTION 9: DREAM & SELF-TRACE
# ==============================================================================


class DreamPhase(CognitiveModule):
    """Dream phase for memory consolidation."""

    def __init__(
        self,
        d_model: int,
        config: CognitiveConfig,
        buffer_size: int = 256,
        dream_threshold: float = 0.7,
    ):
        super().__init__(config)

        internal_dim = getattr(config, "internal_state_dim", 128)

        self.buffer = deque(maxlen=buffer_size)
        self.is_dreaming = False
        self.dream_steps = 0
        self.dream_threshold = dream_threshold
        self.total_dreams = 0

        self.consolidator = nn.Sequential(
            nn.Linear(internal_dim, internal_dim),
            nn.GELU(),
            nn.Linear(internal_dim, internal_dim),
            nn.Tanh(),
        )

    def forward(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        return {"is_dreaming": self.is_dreaming, "dream_steps": self.dream_steps}

    def record(self, state: torch.Tensor, tension: float):
        """Record state for potential dream consolidation."""
        self.buffer.append((state.detach().cpu(), tension))

    def should_dream(self) -> bool:
        if len(self.buffer) < 10:
            return False
        recent = [t for _, t in list(self.buffer)[-10:]]
        return sum(recent) / len(recent) > self.dream_threshold

    def enter_dream(self):
        self.is_dreaming = True
        self.dream_steps = 0
        self.total_dreams += 1

    def dream_step(self, identity: torch.Tensor) -> Optional[torch.Tensor]:
        """Execute one dream consolidation step."""
        if not self.is_dreaming or len(self.buffer) == 0:
            return None

        self.dream_steps += 1

        # Sample from the buffer
        idx = torch.randint(0, len(self.buffer), (1,)).item()
        state, _ = self.buffer[idx]
        state = state.to(identity.device)

        # Consolidate
        consolidated = self.consolidator(state)

        # Exit the dream after enough steps
        if self.dream_steps > 50:
            self.is_dreaming = False

        return consolidated

    def reset_state(self):
        self.buffer.clear()
        self.is_dreaming = False
        self.dream_steps = 0


class SelfTrace(CognitiveModule):
    """Identity tracking across time."""

    def __init__(self, d_model: int, config: CognitiveConfig):
        super().__init__(config)

        internal_dim = getattr(config, "internal_state_dim", 128)

        self.register_buffer("identity", torch.zeros(1, internal_dim))
        self.register_buffer("n_traces", torch.tensor(0))

    def forward(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        return {"identity": self.identity, "n_traces": self.n_traces.item()}

    def record(self, state: torch.Tensor, tension: float):
        """Update identity based on state and tension."""
        with torch.no_grad():
            if state.dim() > 2:
                state = state.mean(dim=1)

            # Weight by tension (high tension = more salient)
            weight = min(0.1, 0.01 * max(1.0, tension))
            self.identity = (1 - weight) * self.identity + weight * state[:1]
            self.n_traces += 1

    def get_identity(self) -> torch.Tensor:
        return self.identity

    def reset_state(self):
        self.identity.zero_()
        self.n_traces.zero_()


# ==============================================================================
# SECTION 10: NEUROGENESIS
# ==============================================================================


class NeurogenesisLayer(CognitiveModule):
    """Layer with dynamic neuron birth/death based on usage."""

    def __init__(
        self,
        input_dim: int,
        n_neurons: int,
        config: CognitiveConfig,
        max_neurons: int = 256,
        usage_decay: float = 0.99,
        birth_threshold: float = 0.8,
        death_threshold: float = 0.01,
    ):
        super().__init__(config)

        self.input_dim = input_dim
        self.max_neurons = max_neurons
        self.usage_decay = usage_decay
        self.birth_threshold = birth_threshold
        self.death_threshold = death_threshold

        self.weights = nn.Parameter(torch.randn(max_neurons, input_dim) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_neurons))

        self.register_buffer("n_neurons", torch.tensor(n_neurons))
        self.register_buffer("usage", torch.ones(max_neurons))
        self.register_buffer("lifetime", torch.zeros(max_neurons))
        self.register_buffer("births", torch.tensor(0))
        self.register_buffer("deaths", torch.tensor(0))

    def forward(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        n = self.n_neurons.item()
        out = torch.tanh(F.linear(x, self.weights[:n], self.bias[:n]))

        with torch.no_grad():
            # EMA of per-neuron usage from mean absolute activation,
            # averaged over all leading (batch/sequence) dimensions
            activation = out.abs().reshape(-1, out.size(-1)).mean(dim=0)
            self.usage[:n] = (
                self.usage_decay * self.usage[:n]
                + (1 - self.usage_decay) * activation[:n]
            )
            self.lifetime[:n] += 1

        return {
            "output": out,
            "n_neurons": n,
            "avg_usage": self.usage[:n].mean().item(),
        }

    def maybe_birth(self, coherence: float) -> bool:
        """Try to add a neuron if coherence is high."""
        n = self.n_neurons.item()
        if coherence > self.birth_threshold and n < self.max_neurons:
            with torch.no_grad():
                nn.init.normal_(self.weights[n], std=0.02)
                self.bias[n] = 0
                self.usage[n] = 1.0
                self.lifetime[n] = 0
                self.n_neurons += 1
                self.births += 1
            return True
        return False

    def maybe_death(self) -> int:
        """Remove underused neurons."""
        n = self.n_neurons.item()
        if n <= 8:
            return 0

        dead = 0
        with torch.no_grad():
            for i in range(n - 1, 7, -1):
                if self.usage[i] < self.death_threshold and self.lifetime[i] > 100:
                    # Swap with the last active neuron
                    last = self.n_neurons.item() - 1
                    if i < last:
                        self.weights.data[i] = self.weights.data[last]
                        self.bias.data[i] = self.bias.data[last]
                        self.usage[i] = self.usage[last]
                        self.lifetime[i] = self.lifetime[last]
                    self.n_neurons -= 1
                    self.deaths += 1
                    dead += 1
        return dead

    def get_stats(self) -> Dict[str, Any]:
        n = self.n_neurons.item()
        return {
            "total_neurons": n,
            "births": self.births.item(),
            "deaths": self.deaths.item(),
            "avg_usage": self.usage[:n].mean().item() if n > 0 else 0,
        }

    def reset_state(self):
        pass


# ==============================================================================
# SECTION 11: EARCP MODULE
# ==============================================================================


class EARCPModule(CognitiveModule):
    """
    Ensemble Auto-Regulated Coherence Protocol.
    Compresses hidden states and regulates information flow.
    """

    def __init__(self, d_model: int, config: CognitiveConfig):
        super().__init__(config)

        latent_dim = getattr(config, "latent_state_dim", 768)
        d_ff = getattr(config, "d_ff", 2048)

        self.compress = nn.Sequential(
            nn.Linear(d_model, (d_model + latent_dim) // 2),
            nn.SiLU(),
            nn.Linear((d_model + latent_dim) // 2, latent_dim),
        )

        self.state_gate = nn.Linear(latent_dim * 2, latent_dim)

        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(latent_dim, d_model)
        self.v_proj = nn.Linear(latent_dim, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

        self.coherence_proc = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model)
        )

        # Initialize small for the residual path
        nn.init.zeros_(self.out_proj.weight)
        nn.init.zeros_(self.coherence_proc[-1].weight)

    def forward(self, h: torch.Tensor, fused: torch.Tensor, **kwargs) -> Dict[str, Any]:
        h_compressed = self.compress(h.mean(dim=1))

        gate = torch.sigmoid(self.state_gate(torch.cat([h_compressed, fused], dim=-1)))
        state = (1 - gate) * fused + gate * h_compressed

        q = self.q_proj(h)
        k = self.k_proj(state).unsqueeze(1)
        v = self.v_proj(state).unsqueeze(1)

        attn = F.softmax(q @ k.transpose(-2, -1) / math.sqrt(h.size(-1)), dim=-1)
        h = h + 0.02 * self.out_proj(attn @ v)
        h = h + 0.1 * self.coherence_proc(h)

        coherence = torch.sigmoid(h.mean()).item()

        return {"hidden": h, "state": state, "coherence": coherence}

    def reset_state(self):
        pass


# ==============================================================================
# SECTION 12: VAE COMPONENTS (for World Models / Vision)
# ==============================================================================


class VAEEncoder(nn.Module):
    """Convolutional VAE encoder for visual inputs."""

    def __init__(
        self,
        in_channels: int = 3,
        latent_dim: int = 256,
        channels: Optional[List[int]] = None,
    ):
        super().__init__()

        if channels is None:
            channels = [32, 64, 128, 256]

        layers = []
        prev_c = in_channels

        for c in channels:
            layers.extend(
                [
                    nn.Conv2d(prev_c, c, 4, 2, 1),
                    nn.BatchNorm2d(c),
                    nn.LeakyReLU(0.2, inplace=True),
                ]
            )
            prev_c = c

        self.encoder = nn.Sequential(*layers)

        # Flattened size after the conv stack (assumes 64x64 input)
        self.flat_size = channels[-1] * 4 * 4

        self.fc_mu = nn.Linear(self.flat_size, latent_dim)
        self.fc_logvar = nn.Linear(self.flat_size, latent_dim)

    def forward(
        self, x: torch.Tensor
    ) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
        h = self.encoder(x)
        h = h.view(h.size(0), -1)

        mu = self.fc_mu(h)
        logvar = self.fc_logvar(h)

        # Reparameterization trick
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        z = mu + eps * std

        return z, mu, logvar


class VAEDecoder(nn.Module):
    """Convolutional VAE decoder for visual outputs."""

    def __init__(
        self,
        latent_dim: int = 256,
        out_channels: int = 3,
        channels: Optional[List[int]] = None,
    ):
        super().__init__()

        if channels is None:
            channels = [256, 128, 64, 32]

        self.fc = nn.Linear(latent_dim, channels[0] * 4 * 4)
        self.init_channels = channels[0]

        layers = []
        for i in range(len(channels) - 1):
            layers.extend(
                [
                    nn.ConvTranspose2d(channels[i], channels[i + 1], 4, 2, 1),
                    nn.BatchNorm2d(channels[i + 1]),
                    nn.ReLU(inplace=True),
                ]
            )

        # Final layer
        layers.extend(
            [nn.ConvTranspose2d(channels[-1], out_channels, 4, 2, 1), nn.Sigmoid()]
        )

        self.decoder = nn.Sequential(*layers)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        h = self.fc(z)
        h = h.view(h.size(0), self.init_channels, 4, 4)
        return self.decoder(h)


# ==============================================================================
# SECTION 13: UNIVERSAL LATENT SPACE
# ==============================================================================


class UniversalLatentSpace(CognitiveModule):
    """Universal latent space for cross-modal alignment."""

    def __init__(
        self,
        d_model: int,
        config: CognitiveConfig,
        uls_dim: int = 1024,
        n_anchors: int = 64,
    ):
        super().__init__(config)

        self.uls_dim = uls_dim

        self.anchors = nn.Parameter(torch.randn(n_anchors, uls_dim) * 0.02)

        # Modality projections
        self.text_to_uls = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.GELU(),
            nn.Linear(d_model, uls_dim),
            RMSNorm(uls_dim),
        )

        self.vision_to_uls = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.GELU(),
            nn.Linear(d_model, uls_dim),
            RMSNorm(uls_dim),
        )

        self.audio_to_uls = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.GELU(),
            nn.Linear(d_model, uls_dim),
            RMSNorm(uls_dim),
        )

        self.uls_to_model = nn.Sequential(
            nn.Linear(uls_dim, d_model),
            nn.GELU(),
            nn.Linear(d_model, d_model),
            RMSNorm(d_model),
        )

        self.anchor_attn = nn.MultiheadAttention(uls_dim, num_heads=4, batch_first=True)

    def forward(self, features: Dict[str, torch.Tensor], **kwargs) -> Dict[str, Any]:
        unified_features = []

        if "text" in features and features["text"] is not None:
            unified_features.append(self.text_to_uls(features["text"]))

        if "vision" in features and features["vision"] is not None:
            unified_features.append(self.vision_to_uls(features["vision"]))

        if "audio" in features and features["audio"] is not None:
            unified_features.append(self.audio_to_uls(features["audio"]))

        if not unified_features:
            B = 1
            device = self.anchors.device
            unified = torch.zeros(B, 1, self.uls_dim, device=device)
        else:
            # Average across all available modalities
            unified = torch.stack(unified_features, dim=0).mean(dim=0)

        # Anchor attention
        anchors_expanded = self.anchors.unsqueeze(0).expand(unified.size(0), -1, -1)
        enhanced, _ = self.anchor_attn(unified, anchors_expanded, anchors_expanded)
        enhanced = unified + 0.1 * enhanced

        output = self.uls_to_model(enhanced)

        return {"unified": unified, "enhanced": enhanced, "output": output}

    def reset_state(self):
        pass
setup.py ADDED
@@ -0,0 +1,105 @@
"""
COGNITIVE-CORE: Universal Cognitive Architecture Framework
===========================================================

A robust, agnostic framework for building cognitive AI models.
Supports vision, language, world model, audio, and multimodal architectures.

Installation:
    pip install cognitive-core

Or from HuggingFace:
    pip install git+https://huggingface.co/amewebstudio/cognitive-core

Copyright © 2026 Mike Amega (Logo) - Ame Web Studio
License: Proprietary - All Rights Reserved
"""

from setuptools import setup, find_packages

with open("cognitive-core/README.md", "r", encoding="utf-8") as f:
    long_description = f.read()

setup(
    name="cognitive-core",
    version="1.0.0",
    author="Mike Amega",
    author_email="contact@amewebstudio.com",
    description="Universal Cognitive Architecture Framework for AI Models",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/Volgat/nexus-standardisation",
    project_urls={
        "HuggingFace": "https://huggingface.co/amewebstudio/cognitive-core",
        "Documentation": "https://github.com/Volgat/nexus-standardisation#readme",
        "Bug Tracker": "https://github.com/Volgat/nexus-standardisation/issues",
    },
    # Map the hyphenated source directory onto the importable package name
    packages=["cognitive_core"],
    package_dir={"cognitive_core": "cognitive-core"},
    classifiers=[
        "Development Status :: 4 - Beta",
        "Intended Audience :: Developers",
        "Intended Audience :: Science/Research",
        "License :: Other/Proprietary License",
        "Operating System :: OS Independent",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
        "Programming Language :: Python :: 3.12",
        "Topic :: Scientific/Engineering :: Artificial Intelligence",
        "Topic :: Software Development :: Libraries :: Python Modules",
    ],
    python_requires=">=3.8",
    install_requires=[
        "torch>=2.0.0",
        "transformers>=4.35.0",
        "datasets>=2.14.0",
        "huggingface_hub>=0.19.0",
        "accelerate>=0.24.0",
    ],
    extras_require={
        "dev": [
            "pytest>=7.0.0",
            "black>=23.0.0",
            "ruff>=0.1.0",
        ],
        "training": [
            "wandb>=0.15.0",
            "tensorboard>=2.14.0",
        ],
        "vision": [
            "torchvision>=0.15.0",
            "pillow>=9.0.0",
        ],
        "audio": [
            "torchaudio>=2.0.0",
            "librosa>=0.10.0",
        ],
        "all": [
            "wandb>=0.15.0",
            "tensorboard>=2.14.0",
            "torchvision>=0.15.0",
            "pillow>=9.0.0",
            "torchaudio>=2.0.0",
            "librosa>=0.10.0",
        ],
    },
    keywords=[
        "cognitive-ai",
        "neural-network",
        "transformer",
        "llm",
        "world-model",
        "multimodal",
        "huggingface",
        "pytorch",
        "deep-learning",
        "neurogenesis",
        "memory-system",
    ],
    include_package_data=True,
    zip_safe=False,
)