```
 ██████╗██╗      ██████╗ ██╗  ██╗ █████╗ ██╗
██╔════╝██║     ██╔═══██╗██║ ██╔╝██╔══██╗██║
██║     ██║     ██║   ██║█████╔╝ ███████║██║
██║     ██║     ██║   ██║██╔═██╗ ██╔══██║██║
╚██████╗███████╗╚██████╔╝██║  ██╗██║  ██║██║
 ╚═════╝╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═╝╚═╝
```

# CLOKAI: The Spiking-KAN PCB Synthesis Engine

**Circuit Logic Oriented Knowledge AI**

[![Status](https://img.shields.io/badge/Status-Pre--Release%20Alpha-red?style=for-the-badge&logo=rocket)](https://github.com)
[![Architecture](https://img.shields.io/badge/Architecture-ClokArch%20System-blueviolet?style=for-the-badge&logo=buffer)](https://github.com)
[![Parameters](https://img.shields.io/badge/Parameters-~1.5B--1.8B-blue?style=for-the-badge&logo=brain)](https://github.com)
[![Training](https://img.shields.io/badge/Training-2×%20NVIDIA%20T4%20DDP-76b900?style=for-the-badge&logo=nvidia)](https://github.com)
[![Precision](https://img.shields.io/badge/Precision-FP16-orange?style=for-the-badge)](https://github.com)
[![License](https://img.shields.io/badge/License-Apache%202.0-green?style=for-the-badge)](https://github.com)

> *"Not just a language model. A logic engine that thinks in circuits."*

---

## ⚡ Overview

**CLOKAI** is an experimental heavyweight language model (~1.5B–1.8B parameters), purpose-engineered for **Electronic Design Automation (EDA)** and **PCB Logic Synthesis**. Where conventional LLMs predict tokens, CLOKAI is built to extract logic, combining the expressivity of neuromorphic computing with the mathematical precision of non-linear function approximation.

This is not a fine-tuned chatbot. This is a **ClokArch**: a domain-native intelligence forged at the intersection of three neural paradigms, designed to make PCB design as intuitive as a conversation.

| | |
|---|---|
| **Datasets** | `Open-Orca/SlimOrca` · `Abhishekcr448/Hinglish-Everyday-Conversations-1M` |
| **Languages** | English · Hindi (Hinglish) |
| **Task** | Text Generation → Netlist Synthesis · Hardware Debugging · EDA Reasoning |
| **Model Type** | `clokarch` (Custom Architecture) |

---

## 🧠 Model Architecture: *ClokArch*

CLOKAI is a **ClokArch**: a fusion of three architectures intended to overcome the limitations of standard transformer-based LLMs.

```
┌─────────────────────────────────────────────────────────┐
│                 CLOKAI ClokArch ENGINE                  │
│                                                         │
│    ┌───────────────────────────────────────────────┐    │
│    │  [1] KAN-Integrated Backbone                  │    │
│    │      Kolmogorov-Arnold Networks               │    │
│    │      Learnable Spline Activations             │    │
│    └───────────────────────────────────────────────┘    │
│                            ↓                            │
│    ┌───────────────────────────────────────────────┐    │
│    │  [2] Temporal Spiking Attention (TASA)        │    │
│    │      SNN Layers + Async Firing Emulation      │    │
│    │      Clock-Domain Temporal Processing         │    │
│    └───────────────────────────────────────────────┘    │
│                            ↓                            │
│    ┌───────────────────────────────────────────────┐    │
│    │  [3] Neuro-Symbolic Logic Verifier            │    │
│    │      KCL / KVL / Ohm's Law Validation         │    │
│    │      Latent-Space Constraint Enforcement      │    │
│    └───────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────┘
```

### 1. KAN-Integrated Backbone *(Kolmogorov-Arnold Networks)*

Standard Multi-Layer Perceptrons have been **surgically replaced** with KAN layers: networks whose activation functions are learnable and parameterized by B-splines. Instead of applying fixed activation curves at each neuron, every connection in CLOKAI's backbone learns its own univariate function during training.

> **Expert Insight:** This is intended to let CLOKAI **mathematically resolve** hardware logic and parametric circuit constraints, not merely predict text patterns associated with them. The goal is a model that derives component values rather than guessing them.

### 2. Temporal Spiking Attention (*TASA*)

Integrated Spiking Neural Network (SNN) layers emulate the brain's asynchronous firing mechanism at the attention level. The **Time-Aware Spiking Attention (TASA)** mechanism processes information in discrete temporal pulses rather than continuous dense activations.

> **Expert Insight:** TASA is designed to let CLOKAI reason about **high-frequency signal integrity** and **clock-domain logic** with explicit temporal structure, which is critical for designs where timing is not a suggestion but a constraint.

### 3. Neuro-Symbolic Logic Verifier

Embedded within CLOKAI's latent space is a **Symbolic Verifier**: a rule-enforcement layer that intercepts generated outputs and validates them against the immutable laws of electronics: Ohm's Law, Kirchhoff's Current Law (KCL), and Kirchhoff's Voltage Law (KVL).

> **Expert Insight:** This creates a **self-correcting synthesis loop**. CLOKAI doesn't just generate netlists; it generates netlists that are checked against *physical law verification* before they ever leave the model.
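
To make the kind of rule the Symbolic Verifier enforces more concrete, here is a minimal sketch of an Ohm's Law and KCL check on a resistor-only netlist. It is not CLOKAI's verifier, which is described as operating in latent space; it is a plain post-hoc check, and the `Resistor` record, the node names, and both function names are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Resistor:
    """Hypothetical netlist entry: one resistor with a claimed branch current."""
    name: str
    node_a: str
    node_b: str
    ohms: float
    amps: float  # claimed current flowing from node_a to node_b


def check_ohms_law(resistors, node_volts, tol=1e-6):
    """Return names of resistors whose claimed current violates V = I * R."""
    bad = []
    for r in resistors:
        v_drop = node_volts[r.node_a] - node_volts[r.node_b]
        if abs(v_drop - r.amps * r.ohms) > tol:
            bad.append(r.name)
    return bad


def check_kcl(resistors, internal_nodes, tol=1e-9):
    """Return internal nodes where branch currents do not sum to zero (KCL)."""
    net = {node: 0.0 for node in internal_nodes}
    for r in resistors:
        if r.node_a in net:
            net[r.node_a] -= r.amps  # current leaves node_a
        if r.node_b in net:
            net[r.node_b] += r.amps  # current enters node_b
    return [node for node, total in net.items() if abs(total) > tol]


# Illustrative data: a 5 V divider made of two 1 kΩ resistors.
volts = {"VCC": 5.0, "MID": 2.5, "GND": 0.0}
divider = [
    Resistor("R1", "VCC", "MID", 1000.0, 0.0025),
    Resistor("R2", "MID", "GND", 1000.0, 0.0025),
]
# Same circuit, but with an inconsistent current claimed for R2.
broken = [divider[0], replace(divider[1], amps=0.003)]

print(check_ohms_law(divider, volts), check_kcl(divider, ["MID"]))
print(check_ohms_law(broken, volts), check_kcl(broken, ["MID"]))
```

Only internal nodes are checked for KCL here because the supply and ground nodes exchange current with a source that this toy netlist does not model.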

---

## 🛠️ Key Capabilities

| Capability | Description |
|---|---|
| 🔌 **Autonomous Netlist Synthesis** | Translate natural language requirements into Altium/KiCad-compatible JSON netlists, with zero manual schematic entry |
| 🎯 **Component Optimization** | Infer optimal resistor, capacitor, and inductor values from hidden design constraints and circuit context |
| 🌐 **Hinglish Technical Reasoning** | Native-level comprehension and explanation of complex electronics engineering in English and Hinglish |
| 🔍 **Hardware Debugging** | Detect design-rule violations, potential short circuits, and logic conflicts through pure **Logical Inference**, with no simulation required |

---

## 📊 Technical Specifications

| Parameter | Specification |
|---|---|
| **Parameter Count** | ~1.5 Billion – 1.8 Billion |
| **Architecture** | ClokArch (Custom SNN-KAN Hybrid) |
| **Hidden Dimension** | 1024 |
| **Depth** | 16 Layers |
| **Training Precision** | FP16 with Gradient Checkpointing |
| **Tokenization** | Domain-Specific BPE (VCC, GND, GPIO, PWM, I²C, SPI optimized) |
| **Training Hardware** | 2× NVIDIA T4 GPUs (Distributed Data Parallel) |
| **Languages** | English, Hindi (Hinglish) |
| **License** | Apache 2.0 |

---

## 🚀 Training & Optimization: *The Founder's Secret*

CLOKAI was trained under a bespoke optimization regime on **2× NVIDIA T4 GPUs** in **Distributed Data Parallel (DDP)** mode. Every training decision was made to maximize logic extraction over pattern memorization.

### Entropy Maximization

The data loader employs **high-entropy shuffling** and deliberate **hardware-netlist variability injection**. The training distribution was engineered to be maximally non-repetitive, forcing the model to generalize circuit logic rather than overfit to specific design signatures.

### Warm Restart Schedule

A **Cosine Annealing with Warm Restarts** (SGDR) learning rate schedule was used to aggressively break loss plateaus.
Each restart resets the learning rate to escape local minima, progressively narrowing the exploration radius.

### Memory Architecture

Training a ~1.7B parameter ClokArch on constrained VRAM required surgical memory management:

```
Memory Optimization Stack:
┌──────────────────────────────────────────┐
│  FP16 Mixed Precision (Forward Pass)     │
│  Activation Checkpointing (Backward)     │
│  Bucketed Gradient Sync (DDP Layer)      │
│  Dynamic Loss Scaling (Stability)        │
└──────────────────────────────────────────┘
                     ↓
       Result: ~1.7B params on 2× T4
```

- **Activation Checkpointing**: recompute forward activations during backprop instead of storing them
- **Bucketed Gradient Views**: DDP gradient communication bucketed for optimal bandwidth utilization
- **FP16 Mixed Precision**: half-precision forward passes with FP32 master weights for numerical stability

---

## 🚀 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumes the weights are published under this repository id.
tokenizer = AutoTokenizer.from_pretrained("Ghosthets/CLOKAI")
model = AutoModelForCausalLM.from_pretrained(
    "Ghosthets/CLOKAI",
    torch_dtype="auto",
    device_map="auto",
    # `clokarch` is a custom model type, so loading it needs the modeling
    # code from the repository (assumption: the repo ships that code).
    trust_remote_code=True,
)

prompt = "Circuit design for LED with current limiting resistor at 5V:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 256 new tokens with mild repetition control.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## 📦 Training Data

| Dataset | Purpose |
|---|---|
| [`Open-Orca/SlimOrca`](https://huggingface.co/datasets/Open-Orca/SlimOrca) | General instruction-following and reasoning alignment |
| [`Abhishekcr448/Hinglish-Everyday-Conversations-1M`](https://huggingface.co/datasets/Abhishekcr448/Hinglish-Everyday-Conversations-1M) | Hinglish language comprehension and bilingual dialogue |

> Domain-specific EDA corpora (netlist datasets, schematic descriptions, hardware design documents) were additionally used during training.
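
Returning to the Warm Restart Schedule described under Training & Optimization, the sketch below shows how a cosine annealing schedule with warm restarts can be computed per step. The peak and floor learning rates, the first cycle length, and the cycle multiplier are hypothetical placeholders, since CLOKAI's actual hyperparameters are not stated here, and `sgdr_lr` is an illustrative name rather than part of the training code.

```python
import math


def sgdr_lr(step, lr_max=3e-4, lr_min=3e-6, first_cycle=1000, cycle_mult=2):
    """Learning rate at `step` under cosine annealing with warm restarts.

    All default values are placeholders chosen for illustration only.
    """
    cycle_len = first_cycle
    t = step
    # Walk forward through completed cycles; each one is `cycle_mult`
    # times longer than the previous one.
    while t >= cycle_len:
        t -= cycle_len
        cycle_len *= cycle_mult
    # Cosine decay from lr_max (t = 0) towards lr_min (end of the cycle).
    cos_term = 1 + math.cos(math.pi * t / cycle_len)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term


# With these placeholder settings, restarts occur at steps 0, 1000, 3000, 7000.
for step in (0, 500, 999, 1000, 2999, 3000):
    print(step, sgdr_lr(step))
```

In a PyTorch training loop the same behaviour is available through `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`, whose `T_0`, `T_mult` and `eta_min` arguments correspond to `first_cycle`, `cycle_mult` and `lr_min` in this sketch.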

---

## 🛡️ Pre-Release Status

```
╔══════════════════════════════════════════════════╗
║              ⚠ PRE-RELEASE ALPHA ⚠               ║
║                                                  ║
║  CLOKAI is currently in active development.      ║
║  Outputs should be verified before production    ║
║  hardware deployment.                            ║
╚══════════════════════════════════════════════════╝
```

CLOKAI is in **Pre-Release Alpha**. The architecture is stable; the mission is not yet complete. Current development priorities include expanding the training corpus, refining the Neuro-Symbolic Verifier's constraint ruleset, and optimizing inference latency for real-time PCB design workflows.

The ultimate objective: **redefine AI's role in the EDA industry**, making PCB design as natural and accessible as talking to a colleague.

---

## 🔭 Roadmap

- [ ] Expand domain-specific tokenizer vocabulary (VHDL, Verilog, SPICE)
- [ ] Release quantized GGUF/AWQ variants for edge deployment
- [ ] Public benchmark suite against baseline EDA-LLMs
- [ ] REST API + KiCad plugin integration
- [ ] Multilingual expansion (Tamil-English, Bangla-English)
- [ ] Full public release with model weights

---

## ⚠️ Limitations & Intended Use

**Intended Use:** CLOKAI is designed for electronics engineers, PCB designers, and EDA researchers working on hardware synthesis, component selection, and circuit debugging tasks.

**Current Limitations:**

- Pre-release alpha: outputs must be verified by a qualified engineer before physical hardware deployment
- Complex multi-layer board designs may require iterative prompting
- Symbolic Verifier covers fundamental laws; advanced RF/high-speed signal integrity rules are under active development

---

## 📄 License

This model is released under the **Apache 2.0 License**. See [LICENSE](LICENSE) for full terms.

Training data licenses apply per their respective sources:

- `Open-Orca/SlimOrca`: MIT License
- `Abhishekcr448/Hinglish-Everyday-Conversations-1M`: see dataset card

---

## 📬 Citation

If you use CLOKAI in your research or projects, please cite:

```bibtex
@misc{clokai2025,
  title        = {CLOKAI: The Spiking-KAN PCB Synthesis Engine},
  author       = {Ghosthets},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/Ghosthets/CLOKAI}},
  note         = {Pre-Release Alpha, ClokArch Architecture}
}
```

---

```
Made with @Ghosthets. Powered by ClokAI.
```

*CLOKAI: Where Neuromorphic Circuits Meet the Language of Design.*