<div align="center">

```
 ██████╗██╗      ██████╗ ██╗  ██╗ █████╗ ██╗
██╔════╝██║     ██╔═══██╗██║ ██╔╝██╔══██╗██║
██║     ██║     ██║   ██║█████╔╝ ███████║██║
██║     ██║     ██║   ██║██╔═██╗ ██╔══██║██║
╚██████╗███████╗╚██████╔╝██║  ██╗██║  ██║██║
 ╚═════╝╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═╝╚═╝
```

# CLOKAI: The Spiking-KAN PCB Synthesis Engine
**Circuit Logic Oriented Knowledge AI**

[![Status](https://img.shields.io/badge/Status-Pre--Release%20Alpha-red?style=for-the-badge&logo=rocket)](https://github.com)
[![Architecture](https://img.shields.io/badge/Architecture-ClokArch%20System-blueviolet?style=for-the-badge&logo=buffer)](https://github.com)
[![Parameters](https://img.shields.io/badge/Parameters-~1.5B--1.8B-blue?style=for-the-badge&logo=brain)](https://github.com)
[![Training](https://img.shields.io/badge/Training-2×%20NVIDIA%20T4%20DDP-76b900?style=for-the-badge&logo=nvidia)](https://github.com)
[![Precision](https://img.shields.io/badge/Precision-FP16-orange?style=for-the-badge)](https://github.com)
[![License](https://img.shields.io/badge/License-Apache%202.0-green?style=for-the-badge)](https://github.com)

> *"Not just a language model. A logic engine that thinks in circuits."*

</div>

---

## ⚡ Overview

**CLOKAI** is an experimental heavyweight language model (~1.5B–1.8B parameters), purpose-engineered for the frontier of **Electronic Design Automation (EDA)** and **PCB Logic Synthesis**. Where conventional LLMs predict tokens, CLOKAI extracts logic, combining the raw expressivity of Neuromorphic Computing with the mathematical precision of Non-linear Function Approximation.

This is not a fine-tuned chatbot. This is a **ClokArch**: a domain-native intelligence forged at the intersection of three revolutionary neural paradigms, designed to make PCB design as intuitive as a conversation.

| | |
|---|---|
| **Datasets** | `Open-Orca/SlimOrca` · `Abhishekcr448/Hinglish-Everyday-Conversations-1M` |
| **Languages** | English · Hindi (Hinglish) |
| **Task** | Text Generation → Netlist Synthesis · Hardware Debugging · EDA Reasoning |
| **Model Type** | `clokarch` (Custom Architecture) |

---

## 🧠 Model Architecture: *ClokArch*

CLOKAI is a **ClokArch**: a three-architecture fusion that transcends the limitations of standard transformer-based LLMs.

```
┌────────────────────────────────────────────────────┐
│                 CLOKAI ClokArch ENGINE             │
│                                                    │
│   ┌───────────────────────────────────────────┐    │
│   │  [1] KAN-Integrated Backbone              │    │
│   │      Kolmogorov-Arnold Networks           │    │
│   │      Learnable Spline Activations         │    │
│   └───────────────────────────────────────────┘    │
│                         ↓                          │
│   ┌───────────────────────────────────────────┐    │
│   │  [2] Temporal Spiking Attention (TASA)    │    │
│   │      SNN Layers + Async Firing Emulation  │    │
│   │      Clock-Domain Temporal Processing     │    │
│   └───────────────────────────────────────────┘    │
│                         ↓                          │
│   ┌───────────────────────────────────────────┐    │
│   │  [3] Neuro-Symbolic Logic Verifier        │    │
│   │      KCL / KVL / Ohm's Law Validation     │    │
│   │      Latent-Space Constraint Enforcement  │    │
│   └───────────────────────────────────────────┘    │
└────────────────────────────────────────────────────┘
```

### 1. KAN-Integrated Backbone *(Kolmogorov-Arnold Networks)*

Standard Multi-Layer Perceptrons have been **surgically replaced** with KAN layers: networks built on learnable activation functions defined by B-splines. Instead of fixed activation curves applied at each neuron, every connection in CLOKAI's backbone learns its own univariate function during training.

> **Expert Insight:** This grants CLOKAI the ability to **mathematically resolve** hardware logic and parametric circuit constraints, not merely predict text patterns associated with them. The model doesn't guess component values; it derives them.
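
To make the idea of a learnable spline activation concrete, here is a minimal pure-Python sketch of a KAN-style layer. It is an illustration under stated assumptions, not CLOKAI's actual implementation: the names `bspline_basis`, `KANLayer`, and `KNOTS` are hypothetical, the knot grid and cubic degree are arbitrary choices, and the residual base activation used in some KAN formulations is omitted.

```python
import random


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions of the given degree at x (Cox-de Boor)."""
    # Degree 0: indicator of each half-open knot interval [t_i, t_{i+1}).
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        raised = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            raised.append(left + right)
        basis = raised
    return basis


class KANLayer:
    """Toy KAN layer: one learnable spline per (output, input) edge."""

    def __init__(self, in_dim, out_dim, knots, degree=3, seed=0):
        rng = random.Random(seed)
        n_basis = len(knots) - degree - 1
        self.knots, self.degree = knots, degree
        # coef[j][i] holds the spline coefficients of the edge input i -> output j.
        self.coef = [[[rng.uniform(-1.0, 1.0) for _ in range(n_basis)]
                      for _ in range(in_dim)] for _ in range(out_dim)]

    def forward(self, x):
        bases = [bspline_basis(xi, self.knots, self.degree) for xi in x]
        # y_j = sum_i phi_{j,i}(x_i), where each phi is a weighted sum of basis functions.
        return [sum(sum(c * b for c, b in zip(edge, bases[i]))
                    for i, edge in enumerate(row))
                for row in self.coef]


# Uniform knots from -2.5 to 2.5; the cubic basis sums to 1 on [-1.0, 1.0).
KNOTS = [0.5 * k - 2.5 for k in range(11)]
layer = KANLayer(in_dim=2, out_dim=3, knots=KNOTS)
print(layer.forward([0.3, -0.5]))
```

In a trained layer the coefficients in `coef` would be updated by gradient descent, which is what lets each edge reshape its own function instead of relying on a fixed nonlinearity.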

### 2. Temporal Spiking Attention: *TASA*

Integrated Spiking Neural Network (SNN) layers emulate the brain's asynchronous firing mechanism at the attention level. The **Time-Aware Spiking Attention (TASA)** mechanism processes information in discrete temporal pulses rather than continuous dense activations.

> **Expert Insight:** TASA enables CLOKAI to process **high-frequency signal integrity** and **clock-domain logic** with genuine temporal accuracy, which is critical for designs where timing is not a suggestion but a constraint.
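
The internals of TASA are not documented here, so the following is only a generic illustration of what "discrete temporal pulses" means in a spiking network: a leaky integrate-and-fire (LIF) neuron, a standard SNN building block. The function name `lif_spikes` and the `decay` and `threshold` values are assumptions chosen for the example.

```python
def lif_spikes(inputs, decay=0.9, threshold=1.0):
    """Run a leaky integrate-and-fire neuron over a sequence of input currents.

    Returns a list of 0/1 spikes, one per time step.
    """
    membrane = 0.0
    spikes = []
    for current in inputs:
        # Leak part of the stored potential, then integrate the new input.
        membrane = decay * membrane + current
        if membrane >= threshold:
            spikes.append(1)
            membrane = 0.0  # hard reset after firing
        else:
            spikes.append(0)
    return spikes


# With no leak (decay=1.0), two inputs of 0.5 are needed to reach the threshold.
print(lif_spikes([0.5, 0.5, 0.5, 0.5], decay=1.0))
```

The output is sparse and binary: the neuron stays silent until enough input has accumulated, which is the behavior that distinguishes spiking layers from dense continuous activations.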

### 3. Neuro-Symbolic Logic Verifier

Embedded within CLOKAI's latent space is a **Symbolic Verifier**: a rule-enforcement layer that intercepts generated outputs and validates them against the immutable laws of electronics: Ohm's Law, Kirchhoff's Current Law (KCL), and Kirchhoff's Voltage Law (KVL).

> **Expert Insight:** This creates a **self-correcting synthesis loop**. CLOKAI doesn't just generate netlists; it generates netlists that *pass physical law verification* before they ever leave the model.
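
As a rough picture of what such rule checks look like outside the model, the sketch below validates Ohm's Law, KCL, and KVL on plain numbers and works through the classic LED current-limiting calculation. The helper names and tolerances are hypothetical, and the LED values (2.0 V forward voltage, 20 mA) are example assumptions, not CLOKAI outputs.

```python
import math


def ohm_ok(voltage, current, resistance, rel_tol=1e-6):
    """Ohm's Law: V = I * R within a relative tolerance."""
    return math.isclose(voltage, current * resistance, rel_tol=rel_tol)


def kcl_ok(node_currents, abs_tol=1e-9):
    """KCL: signed currents entering a node must sum to zero."""
    return abs(sum(node_currents)) <= abs_tol


def kvl_ok(loop_voltages, abs_tol=1e-9):
    """KVL: signed voltage rises and drops around a closed loop must sum to zero."""
    return abs(sum(loop_voltages)) <= abs_tol


def led_resistor(v_supply, v_forward, i_led):
    """Series resistor for an LED: R = (V_supply - V_forward) / I_led."""
    return (v_supply - v_forward) / i_led


# Example: 5 V supply, LED with an assumed 2.0 V forward drop at 20 mA.
r = led_resistor(5.0, 2.0, 0.020)
print(f"R = {r:.0f} ohm")
print(ohm_ok(3.0, 0.020, r))       # the resistor drops the remaining 3 V
print(kvl_ok([5.0, -3.0, -2.0]))   # supply rise, resistor drop, LED drop
print(kcl_ok([0.020, -0.020]))     # current into the node equals current out
```

A candidate design that fails any of these checks (for example a loop whose voltages do not sum to zero) would be rejected or sent back for correction.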

---

## πŸ› οΈ Key Capabilities

| Capability | Description |
|---|---|
| πŸ”Œ **Autonomous Netlist Synthesis** | Translate natural language requirements into Altium/KiCad-compatible JSON netlists β€” zero manual schematic entry |
| 🎯 **Component Optimization** | Infer optimal resistor, capacitor, and inductor values from hidden design constraints and circuit context |
| 🌐 **Hinglish Technical Reasoning** | Native-level comprehension and explanation of complex electronics engineering in English and Hinglish |
| πŸ” **Hardware Debugging** | Detect design-rule violations, potential short circuits, and logic conflicts through pure **Logical Inference** β€” no simulation required |

---

## 📊 Technical Specifications

| Parameter | Specification |
|---|---|
| **Parameter Count** | ~1.5 Billion – 1.8 Billion |
| **Architecture** | ClokArch (Custom SNN-KAN Hybrid) |
| **Hidden Dimension** | 1024 |
| **Depth** | 16 Layers |
| **Training Precision** | FP16 with Gradient Checkpointing |
| **Tokenization** | Domain-Specific BPE (VCC, GND, GPIO, PWM, I²C, SPI optimized) |
| **Training Hardware** | 2× NVIDIA T4 GPUs (Distributed Data Parallel) |
| **Languages** | English, Hindi (Hinglish) |
| **License** | Apache 2.0 |

---

## 🚀 Training & Optimization: *The Founder's Secret*

CLOKAI was trained under a bespoke optimization regime on **2× NVIDIA T4 GPUs** in **Distributed Data Parallel (DDP)** mode. Every training decision was made to maximize logic extraction over pattern memorization.

### Entropy Maximization
The data loader employs **high-entropy shuffling** and deliberate **hardware-netlist variability injection**. The training distribution was engineered to be maximally non-repetitive, forcing the model to generalize circuit logic rather than overfit to specific design signatures.

### Warm Restart Schedule
A **Cosine Annealing with Warm Restarts** (SGDR) learning rate schedule was used to aggressively break loss plateaus. Each restart resets the learning rate to escape local minima, progressively narrowing the exploration radius.
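
For reference, the SGDR schedule can be written in a few lines. This is a sketch of the standard formula, not the training script used for CLOKAI: the function name `sgdr_lr` and the values of `lr_max`, `lr_min`, `t0`, and `t_mult` are assumptions, and the peak learning rate is held constant across restarts because the card does not say whether it decays.

```python
import math


def sgdr_lr(step, lr_max, lr_min, t0, t_mult=2):
    """Cosine annealing with warm restarts.

    t0 is the length of the first cycle in steps; each later cycle is
    t_mult times longer than the previous one.
    """
    cycle_len, t_cur = t0, step
    # Walk forward through completed cycles to find the position in the current one.
    while t_cur >= cycle_len:
        t_cur -= cycle_len
        cycle_len *= t_mult
    cosine = 1.0 + math.cos(math.pi * t_cur / cycle_len)
    return lr_min + 0.5 * (lr_max - lr_min) * cosine


# Cycles of length 10, 20, 40, ... so restarts occur at steps 10, 30, 70, ...
for s in (0, 5, 9, 10, 20, 29, 30):
    print(s, f"{sgdr_lr(s, lr_max=1e-3, lr_min=1e-5, t0=10):.6f}")
```

Within each cycle the rate falls smoothly from `lr_max` toward `lr_min`, then jumps back to `lr_max` at the restart. PyTorch ships an equivalent scheduler as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`.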

### Memory Architecture
Training a ~1.7B parameter ClokArch on constrained VRAM required surgical memory management:

```
Memory Optimization Stack:
┌───────────────────────────────────────┐
│  FP16 Mixed Precision (Forward Pass)  │
│  Activation Checkpointing (Backward)  │
│  Bucketed Gradient Sync (DDP Layer)   │
│  Dynamic Loss Scaling (Stability)     │
└───────────────────────────────────────┘
         ↓ Result: ~1.7B params on 2× T4
```

- **Activation Checkpointing**: recompute forward activations during backprop instead of storing them
- **Bucketed Gradient Views**: DDP gradient communication bucketed for optimal bandwidth utilization
- **FP16 Mixed Precision**: half-precision forward passes with FP32 master weights for numerical stability
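
The stack above also lists dynamic loss scaling, which keeps small FP16 gradients from underflowing. The sketch below shows the usual control logic in plain Python; the class name `DynamicLossScaler` is hypothetical, and the default constants mirror common framework defaults rather than anything confirmed for CLOKAI.

```python
import math


class DynamicLossScaler:
    """Grow the loss scale after a run of clean steps; shrink it on overflow."""

    def __init__(self, init_scale=2.0 ** 16, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self.clean_steps = 0

    @staticmethod
    def has_overflow(grads):
        """True if any gradient value is inf or NaN."""
        return any(math.isinf(g) or math.isnan(g) for g in grads)

    def update(self, grads):
        """Return True if the optimizer step should be applied, False if skipped."""
        if self.has_overflow(grads):
            self.scale *= self.backoff_factor
            self.clean_steps = 0
            return False
        self.clean_steps += 1
        if self.clean_steps >= self.growth_interval:
            self.scale *= self.growth_factor
            self.clean_steps = 0
        return True


scaler = DynamicLossScaler(init_scale=1024.0, growth_interval=2)
print(scaler.update([0.1, -0.2]), scaler.scale)           # clean step, scale unchanged
print(scaler.update([0.3, 0.0]), scaler.scale)            # second clean step, scale doubles
print(scaler.update([float("inf"), 0.1]), scaler.scale)   # overflow, step skipped, scale halves
```

In a real loop the loss is multiplied by `scale` before the backward pass and the gradients are divided by it before the optimizer step; PyTorch's `GradScaler` implements this pattern.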

---

## 🚀 Quick Start

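```python
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "Ghosthets/CLOKAI"

# `clokarch` is a custom architecture, so the Auto classes need the
# repository's own modeling code; trust_remote_code=True is assumed here.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",      # use the dtype stored in the checkpoint
    device_map="auto",       # automatic device placement (requires `accelerate`)
    trust_remote_code=True,
)

prompt = "Circuit design for LED with current limiting resistor at 5V:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampled decoding; lower the temperature for more deterministic output.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    repetition_penalty=1.1,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```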

---

## 📦 Training Data

| Dataset | Purpose |
|---|---|
| [`Open-Orca/SlimOrca`](https://huggingface.co/datasets/Open-Orca/SlimOrca) | General instruction-following and reasoning alignment |
| [`Abhishekcr448/Hinglish-Everyday-Conversations-1M`](https://huggingface.co/datasets/Abhishekcr448/Hinglish-Everyday-Conversations-1M) | Hinglish language comprehension and bilingual dialogue |

> Domain-specific EDA corpora (netlist datasets, schematic descriptions, hardware design documents) were additionally used during training.

---

## πŸ›‘οΈ Pre-Release Status

```
╔══════════════════════════════════════════════╗
║           ⚠  PRE-RELEASE ALPHA  ⚠            ║
║                                              ║
║  CLOKAI is currently in active development.  ║
║  Outputs should be verified before production║
║  hardware deployment.                        ║
╚══════════════════════════════════════════════╝
```

CLOKAI is in **Pre-Release Alpha**. The architecture is stable; the mission is not yet complete. Current development priorities include expanding the training corpus, refining the Neuro-Symbolic Verifier's constraint ruleset, and optimizing inference latency for real-time PCB design workflows.

The ultimate objective: **redefine AI's role in the EDA industry** by making PCB design as natural and accessible as talking to a colleague.

---

## 🔭 Roadmap

- [ ] Expand domain-specific tokenizer vocabulary (VHDL, Verilog, SPICE)
- [ ] Release quantized GGUF/AWQ variants for edge deployment
- [ ] Public benchmark suite against baseline EDA-LLMs
- [ ] REST API + KiCad plugin integration
- [ ] Multilingual expansion (Tamil-English, Bangla-English)
- [ ] Full public release with model weights

---

## ⚠️ Limitations & Intended Use

**Intended Use:** CLOKAI is designed for electronics engineers, PCB designers, and EDA researchers working on hardware synthesis, component selection, and circuit debugging tasks.

**Current Limitations:**
- Pre-release alpha: outputs must be verified by a qualified engineer before physical hardware deployment
- Complex multi-layer board designs may require iterative prompting
- Symbolic Verifier covers fundamental laws; advanced RF/high-speed signal integrity rules are under active development

---

## 📄 License

This model is released under the **Apache 2.0 License**. See [LICENSE](LICENSE) for full terms.

Training data licenses apply per their respective sources:
- `Open-Orca/SlimOrca`: MIT License
- `Abhishekcr448/Hinglish-Everyday-Conversations-1M`: see dataset card

---

## 📬 Citation

If you use CLOKAI in your research or projects, please cite:

```bibtex
@misc{clokai2025,
  title        = {CLOKAI: The Spiking-KAN PCB Synthesis Engine},
  author       = {Ghosthets},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/Ghosthets/CLOKAI}},
  note         = {Pre-Release Alpha, ClokArch Architecture}
}
```

---

<div align="center">

```
Made with @Ghosthets. Powered by ClokAI.
```

*CLOKAI: Where Neuromorphic Circuits Meet the Language of Design.*

</div>