---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- code
- industrial-code
- verilog
- cuda
- triton
- chip-design
- cad
---

# InCoder-32B: Code Foundation Model for Industrial Scenarios

<div align="center">

[![HuggingFace](https://img.shields.io/badge/🤗-Model%20Hub-yellow)](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder)
[![GitHub](https://img.shields.io/badge/GitHub-Industrial--Coder-blue)](https://github.com/CSJianYang/Industrial-Coder)
[![arXiv](https://img.shields.io/badge/arXiv-2603.16790-red)](https://huggingface.co/papers/2603.16790)
[![License](https://img.shields.io/badge/License-Apache%202.0-green)](LICENSE)

</div>

## Model Summary

**InCoder-32B** (Industrial-Coder-32B) is the first 32B-parameter code foundation model purpose-built for industrial code intelligence. While general-purpose code LLMs excel at mainstream software tasks, they often struggle with the unique demands of industrial programming: hardware semantics, specialized language constructs, strict resource constraints, and domain-specific correctness verification.

Presented in the paper [InCoder-32B: Code Foundation Model for Industrial Scenarios](https://huggingface.co/papers/2603.16790), InCoder-32B unifies code intelligence across five industrial domains:

| Domain | Languages & Frameworks |
|---|---|
| 🔧 **Chip Design** | Verilog, SystemVerilog, RTL |
| ⚡ **GPU Kernel Optimization** | CUDA, Triton |
| 🖥️ **Embedded Systems** | C/C++, ARM Cortex-M4, STM32 |
| 🔨 **Compiler Optimization** | x86-64 ASM, C/C++, LLVM-IR |
| 📐 **3D Modeling / CAD** | CadQuery, OpenCascade, Python |

InCoder-32B achieves highly competitive performance on general tasks while establishing the strongest open-source baselines across all evaluated industrial domains.

---

## Key Results

### General Code Benchmarks

| Benchmark | InCoder-32B |
|---|---|
| SWE-bench Verified | **74.8%** |
| LiveCodeBench (Pass@1) | **49.14%** |
| BFCL v3 | **60.99%** |
| HumanEval+ | **89.6%** |
| MBPP+ | **78.3%** |
| BigCodeBench (Full) | **49.8%** |

### Industrial Code Benchmarks

| Benchmark | Domain | InCoder-32B | Best Competing Open-Weight Model |
|---|---|---|---|
| VeriScope Score | Chip Design | **80.7** | 83.2 (GLM-5) |
| CAD-Coder Compile | 3D Modeling | **82.0%** | 48.0% (Kimi-K2-Thinking) |
| KernelBench L1 | GPU Optimization | **22.2%** | 16.2% (GLM-5) |
| KernelBench L2 | GPU Optimization | **36.0%** | 28.0% |

> InCoder-32B leads all open-weight baselines on CAD-Coder and KernelBench (all three levels), and even surpasses proprietary models like Claude-Sonnet-4.6 on CAD-Coder IoU and KernelBench L1/L2/L3.

---

## Model Architecture

InCoder-32B adopts a standard decoder-only Transformer architecture with the following configuration:

| Hyperparameter | Value |
|---|---|
| Parameters | ~32B |
| Layers | 64 |
| Hidden Size | 5,120 |
| Max Context Length | 131,072 (128K) |
| Positional Encoding | RoPE (θ = 500,000) |
| Precision | BFloat16 |

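To see why the large RoPE base matters for the 128K window, here is a minimal sketch of the standard RoPE frequency schedule with θ = 500,000. The head dimension of 128 is an assumption for illustration (the card does not state it):

```python
import math

head_dim = 128  # assumed head dimension, not stated in the card
theta = 500_000.0  # RoPE base from the table above

# Standard RoPE inverse frequencies: theta^(-2i/d) for each rotary pair.
inv_freq = [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

# Full rotation period of the slowest-rotating pair, in token positions.
longest_wavelength = 2 * math.pi / inv_freq[-1]
print(f"longest RoPE wavelength: {longest_wavelength:,.0f} tokens")

# The slowest pair rotates less than one full cycle across the 128K window,
# so positions remain distinguishable over the whole context.
assert longest_wavelength > 131_072
```

With a small base like the original θ = 10,000, the same calculation yields a longest wavelength well under 131,072, which is why long-context models raise the base.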
---

## Training Pipeline: Code-Flow

InCoder-32B is trained through a three-stage **Code-Flow** pipeline:

### Stage 1: Pre-training & Annealing
- **Industrial Recall**: Data pipeline using rule-based filtering, FastText classifiers, and semantic retrieval for Verilog, CUDA, firmware C, and CadQuery.
- **Refinement**: OCR extraction from technical manuals, multi-level deduplication, and repository-level fork consolidation.
- **Training**: 15T total tokens using autoregressive LM + Fill-in-the-Middle (FIM) objectives.

### Stage 2: Mid-Training (Context Extension)
The context window is extended progressively from 8K to 128K tokens:
- **8K → 32K**: Targets file-level tasks such as completing RTL modules or kernel functions.
- **32K → 128K**: Unlocks long-context capabilities for extended debugging and cross-module projects.

### Stage 3: Post-Training
2.5M supervised fine-tuning (SFT) samples constructed from real industrial tasks, with execution-grounded verification using toolchains such as Icarus Verilog, `nvcc`, and Renode (an STM32 simulator).

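The execution-grounded idea can be pictured as a compile gate that rejects samples the toolchain cannot build. The sketch below is an illustrative reconstruction, not the authors' harness; `compile_gate` is a hypothetical helper that invokes Icarus Verilog if it is installed:

```python
# Hedged sketch of an execution-grounded compile gate (assumed workflow,
# not the actual Code-Flow verification harness).
import pathlib
import shutil
import subprocess
import tempfile
from typing import Optional

def compile_gate(verilog_src: str) -> Optional[bool]:
    """Return True/False for compile success, or None if iverilog is absent."""
    if shutil.which("iverilog") is None:
        return None  # toolchain unavailable; cannot verify
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "module.v"
        src.write_text(verilog_src)
        result = subprocess.run(
            ["iverilog", "-o", str(pathlib.Path(tmp) / "a.out"), str(src)],
            capture_output=True,
        )
        return result.returncode == 0

print(compile_gate("module counter(input clk); endmodule"))
```

A real harness would add simulation against reference testbenches (functional correctness), not just a compile check, and analogous gates for `nvcc` and Renode in the CUDA and firmware domains.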
---

## Usage

### Installation

```bash
pip install transformers accelerate
```

### Basic Inference

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Multilingual-Multimodal-NLP/IndustrialCoder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = """Write a synthesizable Verilog module for a UART transmitter (8N1 protocol).
The module should accept 8-bit parallel data and serialize it onto a TX line."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.2,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Deployment with vLLM

For production deployment, you can use vLLM to create an OpenAI-compatible API endpoint:

```bash
vllm serve Multilingual-Multimodal-NLP/IndustrialCoder --tensor-parallel-size 8
```

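Once the server is up, any OpenAI-compatible client can query it. A minimal sketch using only the standard library, assuming the default vLLM port 8000 on localhost (the prompt content is illustrative):

```python
import json
import urllib.request

# Chat-completions payload for the vLLM OpenAI-compatible endpoint.
payload = {
    "model": "Multilingual-Multimodal-NLP/IndustrialCoder",
    "messages": [
        {"role": "user", "content": "Write a Triton kernel for vector addition."}
    ],
    "temperature": 0.2,
    "max_tokens": 512,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
try:
    with urllib.request.urlopen(req, timeout=30) as resp:
        reply = json.loads(resp.read())
        print(reply["choices"][0]["message"]["content"])
except OSError as exc:  # server not running / unreachable
    print(f"Request failed: {exc}")
```

The official `openai` Python client works the same way by pointing `base_url` at `http://localhost:8000/v1`.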
### Fill-in-the-Middle (FIM)

InCoder-32B supports FIM completion for code infilling tasks:

```python
prefix = """// CUDA kernel for RMS Normalization
__global__ void rms_norm_kernel(float* output, const float* input,
                                const float* weight, int N, float eps) {
    int idx = blockIdx.x;
"""
suffix = """
    output[idx * N + tid] = normalized * weight[tid];
}"""

fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
inputs = tokenizer(fim_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Limitations & Disclaimers

Based on failure analysis, the model may struggle with:
- **API Knowledge**: Linker errors from undefined HAL/CMSIS functions in embedded C.
- **Functional Semantics**: Producing compilable but functionally incorrect RTL under complex logic scenarios.
- **Optimization**: Correct but sub-optimal GPU kernel performance.

Always review and test generated code in a sandboxed environment. Industrial code (RTL, embedded firmware) requires expert review before deployment.

---

## Citation

```bibtex
@article{yang2026incoder,
  title={InCoder-32B: Code Foundation Model for Industrial Scenarios},
  author={Yang, Jian and Zhang, Wei and Wu, Jiajun and Cheng, Junhang and Guo, Shawn
          and Wang, Haowen and Gu, Weicheng and Du, Yaxin and Li, Joseph and Xu, Fanglin
          and others},
  journal={arXiv preprint arXiv:2603.16790},
  year={2026}
}
```