File size: 4,875 Bytes
d5739f0
eb0ca58
 
 
 
 
d5739f0
eb0ca58
 
 
 
 
 
 
 
 
d5739f0
 
eb0ca58
d5739f0
eb0ca58
 
d5739f0
eb0ca58
 
 
 
 
 
 
 
 
 
 
d5739f0
eb0ca58
d5739f0
eb0ca58
 
 
 
 
 
 
 
 
 
 
d5739f0
 
eb0ca58
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
d5739f0
eb0ca58
d5739f0
eb0ca58
 
d5739f0
eb0ca58
 
d5739f0
eb0ca58
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
d5739f0
eb0ca58
d5739f0
eb0ca58
d5739f0
eb0ca58
d5739f0
eb0ca58
d5739f0
eb0ca58
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
---
language:
- en

license: mit

tags:
- PCB
- EDA
- KiCAD
- Hardware-Design
- Schematic-Generation
- LLM
- Circuit-Design

library_name: transformers
---

# SchGen

[![License](https://img.shields.io/badge/License-MIT-green.svg)]()
[![Model](https://img.shields.io/badge/Model-GPT--OSS--20B-blue)]()

**SchGen** is a large language model for **PCB schematic generation from natural-language requests**.

The model is supervised fine-tuned from **GPT-OSS-20B** using a custom dataset of approximately **8K paired user requests and schematic-generation code samples**.  
SchGen generates executable Python code that can be rendered into **KiCad schematic designs** using customized schematic APIs.

➡️ **Base Model:** GPT-OSS-20B  
➡️ **License:** MIT  
➡️ **Framework:** Transformers  
➡️ **Context Length:** 13,312 tokens  

---

## Overview

Printed circuit board (PCB) design is a critical but expertise-intensive process in embedded systems, IoT, robotics, and AI hardware.

SchGen explores whether large language models can assist hardware design by generating schematic construction code directly from natural-language descriptions.

The input is a user request describing a circuit design requirement, and the output is executable Python code that can generate a KiCad schematic using custom APIs.

Example input:

```text
I want a 1.8V regulated supply from VIN using an AP2112K LDO,
with a test point on the 1.8V rail and a solder-jumper-selectable LED indicator.
```

---

## 🔥 Key Features

- 🔌 **Natural Language to Schematic Code**  
  Generates executable Python schematic-generation code directly from user requests.

- 🧠 **KiCad-Oriented Design Flow**  
  Designed around custom Code-to-Schematic APIs for KiCad schematic construction.

- 📐 **Structured Hardware Generation**  
  Produces editable and programmatic schematic representations instead of images.

- 🛠️ **Research-Focused PCB Generation**  
  Intended for experimentation, benchmarking, and AI-assisted hardware prototyping.

---

## Model Details

| Item | Value |
|---|---|
| Base Model | GPT-OSS-20B |
| Parameters | 20B |
| Architecture | Supervised Fine-Tuned LLM |
| Input | Natural-language design requests |
| Output | Python schematic-generation code |
| Context Length | 13,312 |
| Training Hardware | 1× NVIDIA A100 |
| Training Time | ~21 hours |

---

## Usage

The recommended workflow is:

1. Provide a natural-language circuit request
2. Generate Python schematic-construction code
3. Execute the code to render a KiCad schematic
4. Verify outputs using ERC/DRC tools

The model is designed for integration into:

- EDA automation pipelines
- Hardware engineering copilots
- Synthetic schematic generation systems
- Research workflows for AI-assisted PCB design

---

## Evaluation

SchGen was evaluated using several schematic-generation metrics:

- **Valid Circuits**  
  Measures whether generated code executes successfully and produces valid schematics.

- **Spatial Violation**  
  Measures overlaps among symbols, labels, and wires.

- **Netlist Accuracy**  
  Measures connectivity correctness against ground-truth netlists.

SchGen outperforms several frontier LLM baselines on schematic generation tasks when all models are provided with the same schematic-generation APIs.

---

## Limitations

SchGen is an early-stage research system and currently focuses on:

- small and medium-scale schematic modules
- hobbyist and open-source hardware designs
- English-language requests

The model may underperform on:

- RF or high-frequency circuits
- industrial or enterprise hardware
- large multi-board systems
- safety-critical applications

Generated outputs should always undergo:

- Electrical Rule Checking (ERC)
- Design Rule Checking (DRC)
- human engineering review

SchGen is intended as an assistive tool rather than a fully autonomous hardware engineer.

---

## Technical Requirements

The model generates executable Python code and requires:

- Python environment
- KiCad installation
- Custom schematic-generation APIs

Inference was validated on:

- NVIDIA A100 GPUs
- 4-bit quantized configurations

---

## Dataset

SchGen was trained on a custom dataset of approximately 8K pairs of:

- natural-language hardware requests
- Python schematic-generation code

The dataset was synthesized through:

1. GPT-generated draft schematics
2. Human correction and annotation
3. LLM-generated user requests

The dataset is available at `https://huggingface.co/datasets/microsoft/SchGen_dataset`

---

## License

This project is licensed under the MIT License.

---

## Contact

This project was conducted by members of Microsoft Research.

For questions, feedback, or collaboration inquiries:

- ruichunma@microsoft.com

If issues or problematic behavior are identified, the repository may be updated with appropriate mitigations.