| --- |
| license: mit |
| language: |
| - en |
| - fr |
| - code |
| tags: |
| - non-transformer |
| - cognitive-routing |
| - hierarchical-memory |
| - character-level |
| - aicl |
| - text-generation |
| - custom-architecture |
| pipeline_tag: text-generation |
| library_name: pytorch |
| --- |
| |
| # CogNet-40M |
|
|
| A 39.7M-parameter non-transformer language model that combines hierarchical memory with cognitive routing whose cost scales as O(n) in sequence length. |
|
|
| ## Architecture |
|
|
| | Component | Detail | |
| |-----------|--------| |
| | Architecture | Non-transformer (Cognitive Routing) | |
| | Parameters | 39,718,536 (~40M) | |
| | Hidden Dim | 512 | |
| | Blocks | 6 cognitive blocks | |
| | Channels | 6 routing channels × 128 dim | |
| | FF Dim | 1024 | |
| | Max Seq Len | 256 | |
| | Tokenizer | Character-level (136 vocab) | |
|
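To illustrate what a character-level tokenizer does, here is a minimal sketch. The `CharTokenizer` class, its reserved unknown-character id, and its sorted vocabulary order are assumptions made for this example. They do not describe the actual format of `tokenizer_v3.json` or its 136-symbol vocabulary.

```python
class CharTokenizer:
    """Hypothetical character-level tokenizer; not the project's actual class."""

    def __init__(self, chars, unk="\ufffd"):
        # Assumption: id 0 is reserved for characters outside the vocabulary,
        # and the remaining characters get ids in sorted order.
        symbols = [unk] + sorted(set(chars) - {unk})
        self.itos = symbols
        self.stoi = {ch: i for i, ch in enumerate(symbols)}
        self.unk_id = 0

    @property
    def vocab_size(self):
        return len(self.itos)

    def encode(self, text):
        # One id per character, so sequence length equals text length.
        return [self.stoi.get(ch, self.unk_id) for ch in text]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)


tok = CharTokenizer("Once upon a time")
ids = tok.encode("Once upon a time")
assert tok.decode(ids) == "Once upon a time"
```

Because each character maps to exactly one id, a 256-token context corresponds to 256 characters of text.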
|
| ## Hierarchical Memory |
|
|
| - Working Memory (32 slots): Active processing |
| - Episodic Memory (64 slots): Short-term recall |
| - Semantic Memory (128 slots): Long-term knowledge |
|
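The hyperparameters listed so far can be gathered into a single configuration object. The sketch below is illustrative only: the `CogNetConfig` name and its field names are assumptions and may not match the keys in `config.json`. Only the values are taken from this card.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CogNetConfig:
    # Field names are hypothetical; values are the ones stated in this card.
    vocab_size: int = 136
    hidden_dim: int = 512
    num_blocks: int = 6
    num_channels: int = 6
    channel_dim: int = 128
    ff_dim: int = 1024
    max_seq_len: int = 256
    working_slots: int = 32
    episodic_slots: int = 64
    semantic_slots: int = 128

    @property
    def total_memory_slots(self):
        # Sum of the three memory tiers.
        return self.working_slots + self.episodic_slots + self.semantic_slots


config = CogNetConfig()
print(asdict(config))
print(config.total_memory_slots)  # 224
```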
|
| ## Training |
|
|
| | Metric | Value | |
| |--------|-------| |
| | Steps | 50,000 | |
| | Batch Size | 64 | |
| | LR | 3e-4 (cosine schedule) | |
| | Precision | FP16 (automatic mixed precision) | |
| | GPU | RTX 5060 Ti 16GB | |
| | Final Loss | ~0.005 | |
| | Final PPL | ~1.01 | |
|
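Two of the figures in the training table can be checked with a few lines of arithmetic. The sketch below assumes a plain cosine decay from the peak learning rate to zero over all 50,000 steps, with no warmup. The card does not say whether warmup or a nonzero floor was used. It also assumes the reported loss is a mean cross-entropy in nats, in which case perplexity is its exponential.

```python
import math


def cosine_lr(step, total_steps=50_000, peak_lr=3e-4, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr (assumed: no warmup)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def perplexity(mean_loss_nats):
    """Perplexity is exp(mean cross-entropy) when the loss is in nats."""
    return math.exp(mean_loss_nats)


print(cosine_lr(0))                 # peak learning rate at the first step
print(cosine_lr(25_000))            # halfway through training
print(round(perplexity(0.005), 3))  # 1.005
```

Under these assumptions, a loss of about 0.005 gives a perplexity of about 1.005, which matches the reported ~1.01 after rounding.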
|
| ## Quick Start |
|
|
| ```python |
| from inference import CogNetInference |
| ai = CogNetInference("cognet_best.pt", "tokenizer_v3.json") |
| print(ai.generate("Once upon a time")) |
| ``` |
|
|
| ## AICL Integration |
|
|
| CogNet is the native AI engine of AICL (Architecture Compilation Language), where it handles code generation, diagnosis, and repair. |
|
|
| ## Files |
|
|
| | File | Size | Description | |
| |------|------|-------------| |
| | cognet_best.pt | 152 MB | FP32 checkpoint | |
| | cognet_fp16.pt | 77 MB | FP16 checkpoint | |
| | tokenizer_v3.json | - | Char tokenizer (136 vocab) | |
| | config.json | - | Model config | |
| | cognet_model.py | - | Architecture source | |
| | inference.py | - | Inference script | |
|
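As a sanity check on the checkpoint sizes, the raw weight storage can be estimated from the parameter count. This is a rough sketch that assumes every parameter is stored at a single precision and ignores any metadata or extra state saved in the files. The helper name is illustrative.

```python
NUM_PARAMS = 39_718_536  # parameter count stated in this card


def weights_size_mib(num_params, bytes_per_param):
    """Size of the raw weights in MiB (2**20 bytes), excluding file overhead."""
    return num_params * bytes_per_param / 2**20


print(round(weights_size_mib(NUM_PARAMS, 4), 1))  # 151.5 (FP32)
print(round(weights_size_mib(NUM_PARAMS, 2), 1))  # 75.8 (FP16)
```

These estimates are close to the listed file sizes. The small remaining gap presumably comes from serialization overhead or tensors kept at a different precision.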
|
| ## Roadmap |
|
|
| - [x] CogNet-40M (39.7M) |
| - [x] Hugging Face integration |
| - [x] AICL native engine |
| - [ ] CogNet-1B (1B params) |
| - [ ] ONNX export |
|
|
| Released under the MIT License. Built with PyTorch on an RTX 5060 Ti via QuickPod. |
|
|