<div align="center">

# CLOKAI: The Spiking-KAN PCB Synthesis Engine

**Circuit Logic Oriented Knowledge AI**

> *"Not just a language model. A logic engine that thinks in circuits."*

</div>

---
## Overview

**CLOKAI** is an experimental heavyweight language model (~1.5B to 1.8B parameters), purpose-engineered for **Electronic Design Automation (EDA)** and **PCB Logic Synthesis**. Where conventional LLMs predict tokens, CLOKAI aims to extract logic, combining the expressivity of neuromorphic computing with the precision of learnable non-linear function approximation.

This is not a fine-tuned chatbot. CLOKAI is built on **ClokArch**, a domain-native architecture that combines three neural paradigms (KAN layers, spiking attention, and a neuro-symbolic verifier) and is designed to make PCB design as intuitive as a conversation.

| Field | Value |
|---|---|
| **Datasets** | `Open-Orca/SlimOrca` · `Abhishekcr448/Hinglish-Everyday-Conversations-1M` |
| **Languages** | English · Hindi (Hinglish) |
| **Task** | Text Generation: Netlist Synthesis · Hardware Debugging · EDA Reasoning |
| **Model Type** | `clokarch` (Custom Architecture) |

---
## Model Architecture: *ClokArch*

CLOKAI is a **ClokArch**: a fusion of three architectures, stacked in the order shown here, intended to overcome limitations of standard transformer-based LLMs.

```
┌─────────────────────────────────────────────────────┐
│               CLOKAI ClokArch ENGINE                │
│                                                     │
│   ┌─────────────────────────────────────────────┐   │
│   │  [1] KAN-Integrated Backbone                │   │
│   │      Kolmogorov-Arnold Networks             │   │
│   │      Learnable Spline Activations           │   │
│   └─────────────────────────────────────────────┘   │
│                          ↓                          │
│   ┌─────────────────────────────────────────────┐   │
│   │  [2] Temporal Spiking Attention (TASA)      │   │
│   │      SNN Layers + Async Firing Emulation    │   │
│   │      Clock-Domain Temporal Processing       │   │
│   └─────────────────────────────────────────────┘   │
│                          ↓                          │
│   ┌─────────────────────────────────────────────┐   │
│   │  [3] Neuro-Symbolic Logic Verifier          │   │
│   │      KCL / KVL / Ohm's Law Validation       │   │
│   │      Latent-Space Constraint Enforcement    │   │
│   └─────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────┘
```
### 1. KAN-Integrated Backbone *(Kolmogorov-Arnold Networks)*

Standard Multi-Layer Perceptrons have been **replaced** with KAN layers: networks built on learnable activation functions defined by B-splines. Instead of applying a fixed activation curve at each neuron, every connection in CLOKAI's backbone learns its own univariate function during training.

> **Expert Insight:** This is intended to let CLOKAI **mathematically resolve** hardware logic and parametric circuit constraints, not merely predict text patterns associated with them. The goal is a model that derives component values rather than guessing them.
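
To make the idea of a learnable spline activation concrete, here is a minimal pure-Python sketch of a KAN-style layer. It is an illustration under stated assumptions, not CLOKAI's implementation: it uses degree-1 B-splines (piecewise-linear hat functions) for brevity, the names `spline_edge` and `kan_layer` are hypothetical, and the coefficients are set by hand where a real model would learn them by gradient descent.

```python
from bisect import bisect_right


def spline_edge(x, grid, coeffs):
    """Evaluate one edge function phi(x) = sum_i coeffs[i] * B_i(x).

    B_i are degree-1 B-splines (hat functions) centred on the knots in
    `grid`, so phi is the piecewise-linear curve through (grid[i], coeffs[i]).
    """
    # Clamp to the grid so inputs outside the knot range stay well-defined.
    x = min(max(x, grid[0]), grid[-1])
    # Index of the knot interval [grid[k], grid[k + 1]] that contains x.
    k = min(bisect_right(grid, x) - 1, len(grid) - 2)
    t = (x - grid[k]) / (grid[k + 1] - grid[k])
    # Only two hat functions are non-zero on this interval.
    return (1.0 - t) * coeffs[k] + t * coeffs[k + 1]


def kan_layer(inputs, grid, coeffs):
    """KAN layer: output j is the plain sum over inputs i of phi_ij(x_i).

    coeffs[j][i] holds the spline coefficients of the edge from input i
    to output j; there is no separate weight matrix or fixed activation.
    """
    return [
        sum(spline_edge(x, grid, edge) for x, edge in zip(inputs, row))
        for row in coeffs
    ]


grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
identity = list(grid)               # coefficients for phi(x) = x
absolute = [abs(g) for g in grid]   # coefficients for phi(x) = |x|

# One output fed by two inputs: y = x0 + |x1|
layer = [[identity, absolute]]
y = kan_layer([0.25, -0.75], grid, layer)
print(y)
```

Changing a coefficient list changes the shape of that one edge's function, which is the degree of freedom a KAN trains in place of a scalar weight followed by a fixed non-linearity.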

### 2. Temporal Spiking Attention (*TASA*)

Integrated Spiking Neural Network (SNN) layers emulate the brain's asynchronous firing mechanism at the attention level. The **Time-Aware Spiking Attention (TASA)** mechanism processes information in discrete temporal pulses rather than continuous dense activations.

> **Expert Insight:** TASA is designed to let CLOKAI process **high-frequency signal integrity** and **clock-domain logic** with temporal accuracy, which is critical for designs where timing is not a suggestion but a constraint.
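
The phrase "discrete temporal pulses" can be illustrated with the leaky integrate-and-fire (LIF) neuron commonly used in SNN layers. The sketch below is not the TASA mechanism itself, whose details are not given here; `lif_spikes`, the decay factor, and the threshold are assumptions chosen for the example.

```python
def lif_spikes(currents, decay=0.5, threshold=1.0):
    """Leaky integrate-and-fire neuron over discrete time steps.

    The membrane potential leaks by `decay` each step, integrates the
    input current, and emits a spike (1) when it reaches `threshold`,
    after which it resets to zero.
    """
    potential = 0.0
    spikes = []
    for current in currents:
        potential = decay * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0      # hard reset after firing
        else:
            spikes.append(0)
    return spikes


spikes = lif_spikes([0.6, 0.6, 0.6, 0.0, 1.2])
print(spikes)  # [0, 0, 1, 0, 1]
```

The output is a sparse binary train: the neuron stays silent while weak inputs accumulate and fires only when the integrated potential crosses the threshold, so the timing of a spike carries information.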

### 3. Neuro-Symbolic Logic Verifier

Embedded within CLOKAI's latent space is a **Symbolic Verifier**: a rule-enforcement layer that intercepts generated outputs and validates them against the immutable laws of electronics: Ohm's Law, Kirchhoff's Current Law (KCL), and Kirchhoff's Voltage Law (KVL).

> **Expert Insight:** This creates a **self-correcting synthesis loop**. CLOKAI doesn't just generate netlists; it generates netlists that are checked against *physical law verification* before they ever leave the model.
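
The three laws named above reduce to simple numeric checks. The following sketch shows what such checks might look like when applied to a candidate design outside the model; the function names, tolerances, and the `candidate` dictionary are hypothetical and do not describe the verifier's internal, latent-space implementation.

```python
import math


def ohm_ok(voltage, current, resistance, rel_tol=1e-6):
    """Ohm's law: V = I * R for a single resistor."""
    return math.isclose(voltage, current * resistance, rel_tol=rel_tol)


def kcl_ok(currents_into_node, abs_tol=1e-9):
    """KCL: currents entering a node sum to zero (outgoing are negative)."""
    return math.isclose(sum(currents_into_node), 0.0, abs_tol=abs_tol)


def kvl_ok(voltage_changes_around_loop, abs_tol=1e-9):
    """KVL: signed voltage changes around any closed loop sum to zero."""
    return math.isclose(sum(voltage_changes_around_loop), 0.0, abs_tol=abs_tol)


# Hypothetical candidate: 5 V source, 150 ohm resistor, LED dropping 2 V at 20 mA.
candidate = {"v_supply": 5.0, "v_led": 2.0, "r": 150.0, "i": 0.020}
v_resistor = candidate["v_supply"] - candidate["v_led"]

checks = {
    "ohm": ohm_ok(v_resistor, candidate["i"], candidate["r"]),
    # Series circuit: current entering the resistor/LED node equals current leaving.
    "kcl": kcl_ok([candidate["i"], -candidate["i"]]),
    # Loop: +5 V rise at the source, then drops across the resistor and the LED.
    "kvl": kvl_ok([candidate["v_supply"], -v_resistor, -candidate["v_led"]]),
}
print(checks)
```

A design that proposed a 100 ohm resistor for the same voltage and current would fail the `ohm` check, which is the kind of inconsistency a rule layer can reject.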

---

## Key Capabilities

| Capability | Description |
|---|---|
| **Autonomous Netlist Synthesis** | Translate natural language requirements into Altium/KiCad-compatible JSON netlists, with zero manual schematic entry |
| **Component Optimization** | Infer optimal resistor, capacitor, and inductor values from hidden design constraints and circuit context |
| **Hinglish Technical Reasoning** | Native-level comprehension and explanation of complex electronics engineering in English and Hinglish |
| **Hardware Debugging** | Detect design-rule violations, potential short circuits, and logic conflicts through pure **Logical Inference**, with no simulation required |
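
As an illustration of simulation-free debugging on a JSON netlist, the sketch below flags any net that ties a power pin directly to a ground pin. The netlist layout (`nets`, `name`, `pins`, and the `REF.PIN` naming) is invented for this example and is not CLOKAI's output schema or an Altium/KiCad format.

```python
import json

# Hypothetical netlist layout, invented for this example only.
netlist_json = """
{
  "nets": [
    {"name": "VCC",  "pins": ["U1.VCC", "C1.1", "R1.1"]},
    {"name": "GND",  "pins": ["U1.GND", "C1.2", "D1.K"]},
    {"name": "N001", "pins": ["R1.2", "D1.A"]},
    {"name": "N002", "pins": ["U2.VCC", "U2.GND"]}
  ]
}
"""


def find_supply_shorts(netlist, power=("VCC",), ground=("GND",)):
    """Return names of nets that tie a power pin directly to a ground pin."""
    shorted = []
    for net in netlist["nets"]:
        # "U2.VCC" -> "VCC": keep only the pin name after the reference designator.
        pin_names = {pin.split(".", 1)[1] for pin in net["pins"]}
        if pin_names & set(power) and pin_names & set(ground):
            shorted.append(net["name"])
    return shorted


shorts = find_supply_shorts(json.loads(netlist_json))
print(shorts)  # ['N002']
```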

---

## Technical Specifications

| Parameter | Specification |
|---|---|
| **Parameter Count** | ~1.5 billion to 1.8 billion |
| **Architecture** | ClokArch (Custom SNN-KAN Hybrid) |
| **Hidden Dimension** | 1024 |
| **Depth** | 16 Layers |
| **Training Precision** | FP16 with Gradient Checkpointing |
| **Tokenization** | Domain-Specific BPE (optimized for VCC, GND, GPIO, PWM, I²C, SPI) |
| **Training Hardware** | 2× NVIDIA T4 GPUs (Distributed Data Parallel) |
| **Languages** | English, Hindi (Hinglish) |
| **License** | Apache 2.0 |

---
## Training & Optimization: *The Founder's Secret*

CLOKAI was trained under a bespoke optimization regime on **2× NVIDIA T4 GPUs** in **Distributed Data Parallel (DDP)** mode. Every training decision was made to favor logic extraction over pattern memorization.

### Entropy Maximization

The data loader employs **high-entropy shuffling** and deliberate **hardware-netlist variability injection**, so that the training distribution repeats itself as little as possible. This pushes the model to generalize circuit logic rather than overfit to specific design signatures.

### Warm Restart Schedule

A **Cosine Annealing with Warm Restarts** (SGDR) learning rate schedule was used to break loss plateaus. Each restart resets the learning rate to help the optimizer escape local minima; within each cycle, the cosine decay progressively narrows the exploration radius.
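
The schedule can be written in a few lines. This is a sketch of the standard SGDR formula, not CLOKAI's training configuration: the peak and floor learning rates, the first cycle length, and the cycle multiplier are all assumed values, and `sgdr_lr` is a hypothetical helper name.

```python
import math


def sgdr_lr(step, lr_max=3e-4, lr_min=3e-6, first_cycle=100, cycle_mult=2):
    """Learning rate at `step` under cosine annealing with warm restarts.

    Cycle i lasts first_cycle * cycle_mult**i steps. Within a cycle the
    rate follows lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T)).
    """
    cycle_len, t = first_cycle, step
    while t >= cycle_len:          # walk forward to the cycle containing `step`
        t -= cycle_len
        cycle_len *= cycle_mult
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / cycle_len))
    return lr_min + (lr_max - lr_min) * cosine


# Steps 100 and 300 are restarts: the rate jumps back to lr_max.
for step in (0, 50, 99, 100, 200, 299, 300):
    print(step, f"{sgdr_lr(step):.2e}")
```

In PyTorch the same shape is available as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`, with `T_0`, `T_mult`, and `eta_min` playing the roles of `first_cycle`, `cycle_mult`, and `lr_min`.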

### Memory Architecture

Training a ~1.7B parameter ClokArch on constrained VRAM required careful memory management:

```
Memory Optimization Stack:
┌──────────────────────────────────────────┐
│  FP16 Mixed Precision      (Forward Pass)│
│  Activation Checkpointing  (Backward)    │
│  Bucketed Gradient Sync    (DDP Layer)   │
│  Dynamic Loss Scaling      (Stability)   │
└──────────────────────────────────────────┘
        ↓ Result: ~1.7B params on 2× T4
```

- **Activation Checkpointing**: recompute forward activations during backprop instead of storing them
- **Bucketed Gradient Views**: DDP gradient communication bucketed for optimal bandwidth utilization
- **FP16 Mixed Precision**: half-precision forward passes with FP32 master weights for numerical stability
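
The "Dynamic Loss Scaling" entry in the stack refers to a standard companion of FP16 training: the loss is multiplied by a scale factor so small gradients do not underflow, and the factor is adjusted whenever gradients overflow. A minimal sketch of that policy follows. The class name is hypothetical, and the default constants mirror those of PyTorch's `torch.cuda.amp.GradScaler`; whether CLOKAI used those values is an assumption.

```python
import math


class DynamicLossScaler:
    """Grow the loss scale while gradients stay finite; back off on overflow."""

    def __init__(self, init_scale=2.0 ** 16, growth=2.0, backoff=0.5, interval=2000):
        self.scale = init_scale
        self.growth, self.backoff, self.interval = growth, backoff, interval
        self.clean_steps = 0

    def update(self, grads):
        """Return True if the optimizer step should be applied."""
        if any(math.isinf(g) or math.isnan(g) for g in grads):
            self.scale *= self.backoff   # overflow: shrink scale, skip this step
            self.clean_steps = 0
            return False
        self.clean_steps += 1
        if self.clean_steps == self.interval:
            self.scale *= self.growth    # stable for a while: try a larger scale
            self.clean_steps = 0
        return True


# Small interval so the growth path is visible in four steps.
scaler = DynamicLossScaler(init_scale=1024.0, interval=2)
history = []
for grads in ([0.1, 0.2], [float("inf"), 0.3], [0.1, 0.1], [0.2, 0.1]):
    applied = scaler.update(grads)
    history.append((applied, scaler.scale))
print(history)
```

The second step overflows, so it is skipped and the scale halves; two clean steps later the scale doubles again.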

---

## Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# `clokarch` is a custom architecture, so loading it through the Auto classes
# is assumed to need the repository's own modeling code (trust_remote_code).
tokenizer = AutoTokenizer.from_pretrained("Ghosthets/CLOKAI", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Ghosthets/CLOKAI",
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place weights on available devices (needs `accelerate`)
    trust_remote_code=True,
)

prompt = "Circuit design for LED with current limiting resistor at 5V:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 256 new tokens with a mild repetition penalty.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
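
When checking the model's answer to the prompt above, it helps to have the reference arithmetic at hand. The sketch below computes the series resistor from R = (V_supply - V_forward) / I. The LED's forward voltage and current are assumed values typical of a red indicator LED, not figures from this model card.

```python
def led_series_resistor(v_supply, v_forward, i_led):
    """Series resistor for an LED: R = (V_supply - V_forward) / I_led."""
    if v_supply <= v_forward:
        raise ValueError("supply voltage must exceed the LED forward voltage")
    return (v_supply - v_forward) / i_led


# Assumed LED: 2.0 V forward drop at 20 mA.
r = led_series_resistor(v_supply=5.0, v_forward=2.0, i_led=0.020)
p = (5.0 - 2.0) * 0.020           # power dissipated in the resistor, in watts
print(f"R = {r:.0f} ohm, P = {p * 1000:.0f} mW")
```

With these assumptions the resistor comes out at 150 ohm dissipating 60 mW, so a generated design that proposes a very different value for the same LED deserves a second look.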

---

## Training Data

| Dataset | Purpose |
|---|---|
| [`Open-Orca/SlimOrca`](https://huggingface.co/datasets/Open-Orca/SlimOrca) | General instruction-following and reasoning alignment |
| [`Abhishekcr448/Hinglish-Everyday-Conversations-1M`](https://huggingface.co/datasets/Abhishekcr448/Hinglish-Everyday-Conversations-1M) | Hinglish language comprehension and bilingual dialogue |

> Domain-specific EDA corpora (netlist datasets, schematic descriptions, hardware design documents) were additionally used during training.

---
## Pre-Release Status

```
┌──────────────────────────────────────────────────┐
│                PRE-RELEASE ALPHA                 │
│                                                  │
│  CLOKAI is currently in active development.      │
│  Outputs should be verified before production    │
│  hardware deployment.                            │
└──────────────────────────────────────────────────┘
```

CLOKAI is in **Pre-Release Alpha**. The architecture is stable; the mission is not yet complete. Current development priorities include expanding the training corpus, refining the Neuro-Symbolic Verifier's constraint ruleset, and optimizing inference latency for real-time PCB design workflows.

The ultimate objective: **redefine AI's role in the EDA industry**, making PCB design as natural and accessible as talking to a colleague.

---
## Roadmap

- [ ] Expand domain-specific tokenizer vocabulary (VHDL, Verilog, SPICE)
- [ ] Release quantized GGUF/AWQ variants for edge deployment
- [ ] Public benchmark suite against baseline EDA-LLMs
- [ ] REST API + KiCad plugin integration
- [ ] Multilingual expansion (Tamil-English, Bangla-English)
- [ ] Full public release with model weights

---
## Limitations & Intended Use

**Intended Use:** CLOKAI is designed for electronics engineers, PCB designers, and EDA researchers working on hardware synthesis, component selection, and circuit debugging tasks.

**Current Limitations:**

- Pre-release alpha: outputs must be verified by a qualified engineer before physical hardware deployment
- Complex multi-layer board designs may require iterative prompting
- Symbolic Verifier covers fundamental laws; advanced RF/high-speed signal integrity rules are under active development

---
## License

This model is released under the **Apache 2.0 License**. See [LICENSE](LICENSE) for full terms.

Training data licenses apply per their respective sources:

- `Open-Orca/SlimOrca`: MIT License
- `Abhishekcr448/Hinglish-Everyday-Conversations-1M`: See dataset card

---
## Citation

If you use CLOKAI in your research or projects, please cite:

```bibtex
@misc{clokai2025,
  title        = {CLOKAI: The Spiking-KAN PCB Synthesis Engine},
  author       = {Ghosthets},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/Ghosthets/CLOKAI}},
  note         = {Pre-Release Alpha, ClokArch Architecture}
}
```

---
<div align="center">

```
Made with @Ghosthets. Powered by ClokAI.
```

*CLOKAI: Where Neuromorphic Circuits Meet the Language of Design.*

</div>