metadata
license: mit
language:
  - en
  - fr
  - code
tags:
  - non-transformer
  - cognitive-routing
  - hierarchical-memory
  - character-level
  - aicl
  - text-generation
  - custom-architecture
pipeline_tag: text-generation
library_name: pytorch

CogNet-40M

A 39.7M-parameter non-transformer language model with O(n) cognitive routing and hierarchical memory.

Architecture

| Component | Detail |
|---|---|
| Architecture | Non-transformer (Cognitive Routing) |
| Parameters | 39,718,536 (~40M) |
| Hidden Dim | 512 |
| Blocks | 6 cognitive blocks |
| Channels | 6 routing channels × 128 dim |
| FF Dim | 1024 |
| Max Seq Len | 256 |
| Tokenizer | Character-level (136 vocab) |
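
To illustrate what character-level tokenization with a fixed maximum sequence length looks like, here is a minimal sketch. The `CharTokenizer` class, its method names, and the unknown-character handling are hypothetical and are not taken from `tokenizer_v3.json` or `inference.py`; only the 256-token limit comes from the table above.

```python
class CharTokenizer:
    """Hypothetical character-level tokenizer (NOT the tokenizer_v3.json format)."""

    def __init__(self, corpus: str, unk: str = "\ufffd"):
        # One vocabulary entry per distinct character; id 0 is reserved
        # for characters never seen in the corpus (an assumption).
        chars = sorted(set(corpus))
        self.itos = [unk] + [c for c in chars if c != unk]
        self.stoi = {c: i for i, c in enumerate(self.itos)}

    @property
    def vocab_size(self) -> int:
        return len(self.itos)

    def encode(self, text: str, max_len: int = 256) -> list[int]:
        # Map each character to its id, then truncate to the max sequence length.
        ids = [self.stoi.get(c, 0) for c in text]
        return ids[:max_len]

    def decode(self, ids: list[int]) -> str:
        return "".join(self.itos[i] for i in ids)


tok = CharTokenizer("hello world")
ids = tok.encode("hello")
print(ids, tok.decode(ids))
```

With a character-level vocabulary every token is one character, so a 256-token window covers at most 256 characters of text.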

Hierarchical Memory

  • Working Memory (32 slots): Active processing
  • Episodic Memory (64 slots): Short-term recall
  • Semantic Memory (128 slots): Long-term knowledge
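
The three tiers above can be written down as a small data structure to make the slot budget explicit. This is only a sketch: the `MemoryTier` type and the assumption that each slot is a single vector of width 512 (the hidden dim) are illustrative and not confirmed by `cognet_model.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTier:
    name: str
    slots: int
    role: str


# Slot counts and roles as listed in this README.
TIERS = (
    MemoryTier("working", 32, "active processing"),
    MemoryTier("episodic", 64, "short-term recall"),
    MemoryTier("semantic", 128, "long-term knowledge"),
)


def total_slots(tiers=TIERS) -> int:
    return sum(t.slots for t in tiers)


def slot_floats(tiers=TIERS, slot_dim: int = 512) -> int:
    # ASSUMPTION: each slot stores one vector of width slot_dim.
    return total_slots(tiers) * slot_dim


print(total_slots())   # 224 slots across the three tiers
print(slot_floats())   # 114688 values under the width assumption
```

Each tier has twice the slots of the one before it. Because the slot count is fixed, the cost of reading such a memory would not grow with sequence length, under the assumptions stated here.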

Training

| Metric | Value |
|---|---|
| Steps | 50,000 |
| Batch Size | 64 |
| LR | 3e-4 (cosine) |
| Precision | FP16 AMP |
| GPU | RTX 5060 Ti 16GB |
| Final Loss | ~0.005 |
| Final PPL | ~1.01 |
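
The loss and perplexity figures are related by PPL = exp(loss), assuming the loss is a mean cross-entropy in nats per character. The sketch below checks that relation and also shows one common form of cosine learning-rate decay. The schedule details (no warmup, decay to zero over all 50,000 steps) are assumptions, since the table only says "cosine".

```python
import math


def perplexity(loss: float) -> float:
    # Perplexity is the exponential of the mean cross-entropy (in nats).
    return math.exp(loss)


def cosine_lr(step: int, total_steps: int = 50_000,
              peak_lr: float = 3e-4, min_lr: float = 0.0) -> float:
    # ASSUMPTION: plain cosine decay from peak_lr to min_lr with no warmup.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


print(f"{perplexity(0.005):.4f}")
for step in (0, 25_000, 50_000):
    print(step, f"{cosine_lr(step):.2e}")
```

A loss of 0.005 gives a perplexity of about 1.005, which matches the reported ~1.01 after rounding.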

Quick Start

```python
from inference import CogNetInference
ai = CogNetInference("cognet_best.pt", "tokenizer_v3.json")
print(ai.generate("Once upon a time"))
```

AICL Integration

CogNet powers AICL (Architecture Compilation Language) as its native AI engine for code generation, diagnosis, and repair.

Files

| File | Size | Description |
|---|---|---|
| cognet_best.pt | 152MB | FP32 checkpoint |
| cognet_fp16.pt | 77MB | FP16 checkpoint |
| tokenizer_v3.json | - | Char tokenizer (136 vocab) |
| config.json | - | Model config |
| cognet_model.py | - | Architecture source |
| inference.py | - | Inference script |
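
As a sanity check on the checkpoint sizes, the weights-only footprint can be estimated from the parameter count and the bytes per parameter. This sketch assumes the files hold little besides the weights; the helper name `checkpoint_mib` is hypothetical.

```python
N_PARAMS = 39_718_536  # parameter count stated in this README


def checkpoint_mib(n_params: int, bytes_per_param: int) -> float:
    # Weights-only estimate in MiB (1 MiB = 2**20 bytes).
    return n_params * bytes_per_param / 2**20


print(f"FP32: {checkpoint_mib(N_PARAMS, 4):.1f} MiB")
print(f"FP16: {checkpoint_mib(N_PARAMS, 2):.1f} MiB")
```

The estimates come out to roughly 151.5 MiB for FP32 and 75.8 MiB for FP16, close to the listed 152MB and 77MB. Any extra bytes could be non-weight state stored in the files, which this README does not specify.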

Roadmap

  • CogNet-40M (39.7M)
  • HuggingFace integration
  • AICL native engine
  • CogNet-1B (1B params)
  • ONNX export

MIT License. Built with PyTorch on RTX 5060 Ti via QuickPod.