<div align="center">
# CLOKAI: The Spiking-KAN PCB Synthesis Engine
**Circuit Logic Oriented Knowledge AI**
> *"Not just a language model. A logic engine that thinks in circuits."*
</div>
---
## Overview
**CLOKAI** is an experimental heavyweight language model (~1.5B to 1.8B parameters), purpose-engineered for the frontier of **Electronic Design Automation (EDA)** and **PCB Logic Synthesis**. Where conventional LLMs predict tokens, CLOKAI extracts logic, combining the raw expressivity of Neuromorphic Computing with the mathematical precision of Non-linear Function Approximation.
This is not a fine-tuned chatbot. This is a **ClokArch**: a domain-native intelligence forged at the intersection of three revolutionary neural paradigms, designed to make PCB design as intuitive as a conversation.
| | |
|---|---|
| **Datasets** | `Open-Orca/SlimOrca` · `Abhishekcr448/Hinglish-Everyday-Conversations-1M` |
| **Languages** | English · Hindi (Hinglish) |
| **Task** | Text Generation: Netlist Synthesis · Hardware Debugging · EDA Reasoning |
| **Model Type** | `clokarch` (Custom Architecture) |
---
## Model Architecture: *ClokArch*
CLOKAI is a **ClokArch**: a three-architecture fusion that transcends the limitations of standard transformer-based LLMs.
```
┌─────────────────────────────────────────────────────────┐
│                 CLOKAI ClokArch ENGINE                  │
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │  [1] KAN-Integrated Backbone                      │  │
│  │      Kolmogorov-Arnold Networks                   │  │
│  │      Learnable Spline Activations                 │  │
│  └───────────────────────────────────────────────────┘  │
│                            ▼                            │
│  ┌───────────────────────────────────────────────────┐  │
│  │  [2] Temporal Spiking Attention (TASA)            │  │
│  │      SNN Layers + Async Firing Emulation          │  │
│  │      Clock-Domain Temporal Processing             │  │
│  └───────────────────────────────────────────────────┘  │
│                            ▼                            │
│  ┌───────────────────────────────────────────────────┐  │
│  │  [3] Neuro-Symbolic Logic Verifier                │  │
│  │      KCL / KVL / Ohm's Law Validation             │  │
│  │      Latent-Space Constraint Enforcement          │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘
```
### 1. KAN-Integrated Backbone *(Kolmogorov-Arnold Networks)*
Standard Multi-Layer Perceptrons have been **surgically replaced** with KAN layers: networks built on learnable activation functions defined by B-splines. Instead of fixed activation curves, every neuron in CLOKAI's backbone adapts its own mathematical function during training.
> **Expert Insight:** This grants CLOKAI the ability to **mathematically resolve** hardware logic and parametric circuit constraints, not merely predict text patterns associated with them. The model doesn't guess component values; it derives them.
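
To make the idea of a learnable spline activation concrete, here is a minimal pure-Python sketch. It is not CLOKAI's implementation: the class `SplineActivation`, the helper `kan_layer`, the uniform knot grid, and the coefficient values are all assumptions for illustration, and the sketch omits the residual base term that some KAN formulations add. It evaluates B-spline bases with the Cox-de Boor recursion and sums one spline per input to form each output.

```python
from typing import Callable, List, Sequence


def bspline_basis(i: int, k: int, t: float, knots: Sequence[float]) -> float:
    """Cox-de Boor recursion: i-th B-spline basis of degree k evaluated at t."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    if left_den != 0:
        left = (t - knots[i]) / left_den * bspline_basis(i, k - 1, t, knots)
    right = 0.0
    if right_den != 0:
        right = (knots[i + k + 1] - t) / right_den * bspline_basis(i + 1, k - 1, t, knots)
    return left + right


class SplineActivation:
    """Hypothetical learnable activation phi(x) = sum_i c_i * B_i(x) on [lo, hi)."""

    def __init__(self, coeffs: Sequence[float], degree: int = 3,
                 lo: float = -1.0, hi: float = 1.0) -> None:
        self.coeffs = list(coeffs)  # these would be the trainable parameters
        self.degree = degree
        n = len(self.coeffs)
        step = (hi - lo) / (n - degree)
        # Uniform knots extended past both ends so [lo, hi) is fully covered.
        self.knots = [lo + (j - degree) * step for j in range(n + degree + 1)]

    def __call__(self, x: float) -> float:
        return sum(c * bspline_basis(i, self.degree, x, self.knots)
                   for i, c in enumerate(self.coeffs))


def kan_layer(x: Sequence[float],
              edge_functions: List[List[Callable[[float], float]]]) -> List[float]:
    """edge_functions[j][i] maps input i to its contribution to output j."""
    return [sum(phi(xi) for phi, xi in zip(row, x)) for row in edge_functions]


# Two inputs, one output, each edge with its own (made-up) coefficients.
phi_a = SplineActivation([0.0, 0.2, -0.1, 0.4, 0.9, 1.0])
phi_b = SplineActivation([1.0, 0.5, 0.0, -0.5, -1.0, -1.5])
print(kan_layer([0.3, -0.5], [[phi_a, phi_b]]))
```

In a trained network the coefficients would be updated by gradient descent; here they are fixed so the evaluation path is easy to follow.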
### 2. Temporal Spiking Attention (*TASA*)
Integrated Spiking Neural Network (SNN) layers emulate the brain's asynchronous firing mechanism at the attention level. The **Time-Aware Spiking Attention (TASA)** mechanism processes information in discrete temporal pulses rather than continuous dense activations.
> **Expert Insight:** TASA enables CLOKAI to process **high-frequency signal integrity** and **clock-domain logic** with genuine temporal accuracy, which is critical for designs where timing is not a suggestion but a constraint.
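
The card does not specify which spiking neuron model TASA uses. As a generic illustration of "discrete temporal pulses", the sketch below simulates a leaky integrate-and-fire (LIF) neuron, a common SNN building block. The function name `lif_spike_train` and the `threshold`, `leak`, and `reset` values are assumptions, not CLOKAI parameters.

```python
from typing import List, Sequence


def lif_spike_train(currents: Sequence[float], threshold: float = 1.0,
                    leak: float = 0.9, reset: float = 0.0) -> List[int]:
    """Simulate a leaky integrate-and-fire neuron over discrete time steps.

    The membrane potential decays by `leak` each step, accumulates the input
    current, and emits a spike (1) when it reaches `threshold`.
    """
    v = 0.0
    spikes = []
    for i_t in currents:
        v = leak * v + i_t       # leaky integration
        if v >= threshold:
            spikes.append(1)     # fire
            v = reset            # hard reset after a spike
        else:
            spikes.append(0)
    return spikes


print(lif_spike_train([0.5, 0.5, 0.5, 0.5]))
```

The output is a sparse binary sequence rather than a dense activation, which is the property the section above attributes to spiking layers.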
### 3. Neuro-Symbolic Logic Verifier
Embedded within CLOKAI's latent space is a **Symbolic Verifier**: a rule-enforcement layer that intercepts generated outputs and validates them against the immutable laws of electronics: Ohm's Law, Kirchhoff's Current Law (KCL), and Kirchhoff's Voltage Law (KVL).
> **Expert Insight:** This creates a **self-correcting synthesis loop**. CLOKAI doesn't just generate netlists; it generates netlists that *pass physical law verification* before they ever leave the model.
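
The verifier itself operates inside the model and is not documented here. The same three laws can, however, be checked numerically on any proposed circuit after generation. The sketch below is one such post-hoc check under stated assumptions: the helper names, the tolerance, and the example series circuit (5 V source, 100 Ω and 150 Ω resistors) are illustrative and do not come from the card.

```python
from typing import Sequence


def kcl_holds(currents_into_node: Sequence[float], tol: float = 1e-9) -> bool:
    """KCL: signed currents entering a node sum to zero."""
    return abs(sum(currents_into_node)) <= tol


def kvl_holds(voltages_around_loop: Sequence[float], tol: float = 1e-9) -> bool:
    """KVL: signed voltage rises and drops around a closed loop sum to zero."""
    return abs(sum(voltages_around_loop)) <= tol


def ohm_holds(voltage: float, current: float, resistance: float,
              tol: float = 1e-9) -> bool:
    """Ohm's law: V = I * R for a resistor."""
    return abs(voltage - current * resistance) <= tol


# Assumed example: 5 V source driving R1 = 100 ohm and R2 = 150 ohm in series.
v_src, r1, r2 = 5.0, 100.0, 150.0
i = v_src / (r1 + r2)
v1, v2 = i * r1, i * r2

print(kcl_holds([i, -i]))            # same current enters and leaves the R1-R2 node
print(kvl_holds([v_src, -v1, -v2]))  # source rise minus the two drops
print(ohm_holds(v1, i, r1))
```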
---
## Key Capabilities
| Capability | Description |
|---|---|
| **Autonomous Netlist Synthesis** | Translate natural language requirements into Altium/KiCad-compatible JSON netlists with zero manual schematic entry |
| **Component Optimization** | Infer optimal resistor, capacitor, and inductor values from hidden design constraints and circuit context |
| **Hinglish Technical Reasoning** | Native-level comprehension and explanation of complex electronics engineering in English and Hinglish |
| **Hardware Debugging** | Detect design-rule violations, potential short circuits, and logic conflicts through pure **Logical Inference**, with no simulation required |
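
The card does not document the JSON netlist schema, so the example below uses a hypothetical one (`components`, `ref`, `pins`) purely to show what simulation-free structural checks on a generated netlist can look like. The two checks, a component with every pin on one net and a net touched by a single pin, are illustrative and are not CLOKAI's rule set.

```python
import json
from typing import Dict, List

# Hypothetical schema: each component maps pin names to net names.
NETLIST_JSON = """
{
  "components": [
    {"ref": "J1", "pins": {"1": "VCC", "2": "GND"}},
    {"ref": "R1", "pins": {"1": "VCC", "2": "LED_A"}},
    {"ref": "D1", "pins": {"A": "LED_A", "K": "GND"}}
  ]
}
"""


def find_shorted_components(netlist: Dict) -> List[str]:
    """Refs of multi-pin components whose pins all sit on the same net."""
    shorted = []
    for comp in netlist["components"]:
        nets = set(comp["pins"].values())
        if len(comp["pins"]) > 1 and len(nets) == 1:
            shorted.append(comp["ref"])
    return shorted


def find_floating_nets(netlist: Dict) -> List[str]:
    """Nets connected to exactly one pin, which usually indicates a dangling wire."""
    counts: Dict[str, int] = {}
    for comp in netlist["components"]:
        for net in comp["pins"].values():
            counts[net] = counts.get(net, 0) + 1
    return sorted(net for net, n in counts.items() if n == 1)


netlist = json.loads(NETLIST_JSON)
print(find_shorted_components(netlist), find_floating_nets(netlist))
```

Checks like these only catch structural mistakes; electrical correctness still needs review by an engineer or a simulator.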
---
## Technical Specifications
| Parameter | Specification |
|---|---|
| **Parameter Count** | ~1.5 to 1.8 Billion |
| **Architecture** | ClokArch (Custom SNN-KAN Hybrid) |
| **Hidden Dimension** | 1024 |
| **Depth** | 16 Layers |
| **Training Precision** | FP16 with Gradient Checkpointing |
| **Tokenization** | Domain-Specific BPE (VCC, GND, GPIO, PWM, I²C, SPI optimized) |
| **Training Hardware** | 2× NVIDIA T4 GPUs (Distributed Data Parallel) |
| **Languages** | English, Hindi (Hinglish) |
| **License** | Apache 2.0 |
---
## Training & Optimization: *The Founder's Secret*
CLOKAI was trained under a bespoke optimization regime on **2× NVIDIA T4 GPUs** in **Distributed Data Parallel (DDP)** mode. Every training decision was made to maximize logic extraction over pattern memorization.
### Entropy Maximization
The data loader employs **high-entropy shuffling** and deliberate **hardware-netlist variability injection**. The training distribution was engineered to be maximally non-repetitive, forcing the model to generalize circuit logic rather than overfit to specific design signatures.
### Warm Restart Schedule
A **Cosine Annealing with Warm Restarts** (SGDR) learning rate schedule was used to aggressively break loss plateaus. Each restart resets the learning rate to escape local minima, progressively narrowing the exploration radius.
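
The schedule named above follows a standard closed form: within each cycle the rate decays along a half cosine from a maximum to a minimum, then jumps back to the maximum at the restart. A minimal sketch is shown below. The values of `lr_max`, `lr_min`, `t_0`, and `t_mult` are placeholders, since the card does not state the ones used in training.

```python
import math


def sgdr_lr(step: int, lr_max: float = 3e-4, lr_min: float = 3e-6,
            t_0: int = 1000, t_mult: int = 2) -> float:
    """Cosine annealing with warm restarts.

    The first cycle lasts t_0 steps; each following cycle is t_mult times
    longer. The rate returns to lr_max at the start of every cycle.
    """
    t_i = t_0      # length of the current cycle
    t_cur = step   # steps elapsed inside the current cycle
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    cosine = math.cos(math.pi * t_cur / t_i)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


for s in (0, 500, 999, 1000, 2000, 2999, 3000):
    print(s, sgdr_lr(s))
```

In PyTorch the same schedule is available as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`, which takes the `T_0`, `T_mult`, and `eta_min` arguments.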
### Memory Architecture
Training a ~1.7B parameter ClokArch on constrained VRAM required surgical memory management:
```
Memory Optimization Stack:
┌──────────────────────────────────────────┐
│  FP16 Mixed Precision (Forward Pass)     │
│  Activation Checkpointing (Backward)     │
│  Bucketed Gradient Sync (DDP Layer)      │
│  Dynamic Loss Scaling (Stability)        │
└──────────────────────────────────────────┘
→ Result: ~1.7B params on 2× T4
```
- **Activation Checkpointing**: recompute forward activations during backprop instead of storing them
- **Bucketed Gradient Views**: DDP gradient communication bucketed for optimal bandwidth utilization
- **FP16 Mixed Precision**: half-precision forward passes with FP32 master weights for numerical stability
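
The stack diagram also lists dynamic loss scaling, which keeps small FP16 gradients from underflowing. The usual rule is simple: multiply the loss by a scale factor, shrink the factor and skip the step when gradients overflow, and grow it again after a run of clean steps. The class below is a framework-free sketch of that rule. `DynamicLossScaler` is a hypothetical name, and the default values are placeholders rather than CLOKAI's training settings.

```python
class DynamicLossScaler:
    """Minimal dynamic loss scaling rule for FP16 training (illustrative)."""

    def __init__(self, init_scale: float = 2.0 ** 16, growth_factor: float = 2.0,
                 backoff_factor: float = 0.5, growth_interval: int = 2000) -> None:
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0

    def update(self, found_overflow: bool) -> bool:
        """Adjust the scale; return True if the optimizer step should be applied."""
        if found_overflow:
            # Gradients contained inf/NaN: shrink the scale and skip this step.
            self.scale *= self.backoff_factor
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps >= self.growth_interval:
            # A long run without overflow: try a larger scale.
            self.scale *= self.growth_factor
            self._good_steps = 0
        return True


scaler = DynamicLossScaler(init_scale=1024.0, growth_interval=2)
for overflow in (False, False, True, False):
    applied = scaler.update(overflow)
    print(overflow, applied, scaler.scale)
```

In a real training loop the overflow flag comes from checking the unscaled gradients for non-finite values before the optimizer step.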
---
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Ghosthets/CLOKAI")
model = AutoModelForCausalLM.from_pretrained(
    "Ghosthets/CLOKAI",
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Circuit design for LED with current limiting resistor at 5V:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 256 new tokens with a mild repetition penalty.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
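
Since generated designs should be checked before use, it helps to have the reference calculation for the prompt above at hand. The sketch below computes the series resistor for an LED from Ohm's law, R = (V_supply - V_forward) / I. The forward voltage (2.0 V) and target current (20 mA) are assumed typical values for a small red LED and are not taken from the model or the card.

```python
def led_series_resistor(v_supply: float, v_forward: float, i_led: float) -> float:
    """Series resistance (ohms) that sets the LED current: R = (Vs - Vf) / I."""
    if i_led <= 0:
        raise ValueError("LED current must be positive")
    if v_supply <= v_forward:
        raise ValueError("supply voltage must exceed the LED forward voltage")
    return (v_supply - v_forward) / i_led


def resistor_power(resistance: float, current: float) -> float:
    """Power dissipated in the resistor (watts): P = I^2 * R."""
    return current * current * resistance


# Assumed operating point: 5 V supply, 2.0 V forward drop, 20 mA.
r = led_series_resistor(5.0, 2.0, 0.020)
p = resistor_power(r, 0.020)
print(f"R = {r:.1f} ohm, P = {p * 1000:.1f} mW")
```

Comparing the model's suggested resistor against this value is a quick sanity check on a generated answer.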
---
## Training Data
| Dataset | Purpose |
|---|---|
| [`Open-Orca/SlimOrca`](https://huggingface.co/datasets/Open-Orca/SlimOrca) | General instruction-following and reasoning alignment |
| [`Abhishekcr448/Hinglish-Everyday-Conversations-1M`](https://huggingface.co/datasets/Abhishekcr448/Hinglish-Everyday-Conversations-1M) | Hinglish language comprehension and bilingual dialogue |
> Domain-specific EDA corpora (netlist datasets, schematic descriptions, hardware design documents) were additionally used during training.
---
## Pre-Release Status
```
┌──────────────────────────────────────────────────┐
│                PRE-RELEASE ALPHA                 │
│                                                  │
│  CLOKAI is currently in active development.      │
│  Outputs should be verified before production    │
│  hardware deployment.                            │
└──────────────────────────────────────────────────┘
```
CLOKAI is in **Pre-Release Alpha**. The architecture is stable; the mission is not yet complete. Current development priorities include expanding the training corpus, refining the Neuro-Symbolic Verifier's constraint ruleset, and optimizing inference latency for real-time PCB design workflows.
The ultimate objective: **redefine AI's role in the EDA industry**, making PCB design as natural and accessible as talking to a colleague.
---
## Roadmap
- [ ] Expand domain-specific tokenizer vocabulary (VHDL, Verilog, SPICE)
- [ ] Release quantized GGUF/AWQ variants for edge deployment
- [ ] Public benchmark suite against baseline EDA-LLMs
- [ ] REST API + KiCad plugin integration
- [ ] Multilingual expansion (Tamil-English, Bangla-English)
- [ ] Full public release with model weights
---
## Limitations & Intended Use
**Intended Use:** CLOKAI is designed for electronics engineers, PCB designers, and EDA researchers working on hardware synthesis, component selection, and circuit debugging tasks.
**Current Limitations:**
- Pre-release alpha: outputs must be verified by a qualified engineer before physical hardware deployment
- Complex multi-layer board designs may require iterative prompting
- Symbolic Verifier covers fundamental laws; advanced RF/high-speed signal integrity rules are under active development
---
## License
This model is released under the **Apache 2.0 License**. See [LICENSE](LICENSE) for full terms.
Training data licenses apply per their respective sources:
- `Open-Orca/SlimOrca`: MIT License
- `Abhishekcr448/Hinglish-Everyday-Conversations-1M`: See dataset card
---
## Citation
If you use CLOKAI in your research or projects, please cite:
```bibtex
@misc{clokai2025,
  title        = {CLOKAI: The Spiking-KAN PCB Synthesis Engine},
  author       = {Ghosthets},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/Ghosthets/CLOKAI}},
  note         = {Pre-Release Alpha, ClokArch Architecture}
}
```
---
<div align="center">
```
Made with @Ghosthets. Powered by ClokAI.
```
*CLOKAI: Where Neuromorphic Circuits Meet the Language of Design.*
</div>