---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- code
- industrial-code
- pretrained
- base-model
- verilog
- cuda
- triton
- chip-design
- cad
---
# InCoder-32B-Base: Code Foundation Model for Industrial Scenarios
<div align="center">

[🤗 Model](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder-Base) | [GitHub](https://github.com/CSJianYang/Industrial-Coder) | [Paper](https://huggingface.co/papers/2603.16790) | [License](LICENSE)

</div>
## Model Summary
**InCoder-32B-Base** is the pre-trained base model of the InCoder family, the first 32B-parameter code foundation model purpose-built for industrial code intelligence. This is the base (non-instruction-tuned) checkpoint, suitable for code completion, fill-in-the-middle (FIM), and further fine-tuning.
For the instruction-tuned variant, see [IndustrialCoder](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder). For the reasoning variant, see [IndustrialCoder-Thinking](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder-Thinking).
Presented in the paper [InCoder-32B: Code Foundation Model for Industrial Scenarios](https://huggingface.co/papers/2603.16790), InCoder-32B unifies code intelligence across five industrial domains:
| Domain | Languages & Frameworks |
|---|---|
| **Chip Design** | Verilog, SystemVerilog, RTL |
| **GPU Kernel Optimization** | CUDA, Triton |
| **Embedded Systems** | C/C++, ARM Cortex-M4, STM32 |
| **Compiler Optimization** | x86-64 ASM, C/C++, LLVM-IR |
| **3D Modeling / CAD** | CadQuery, OpenCascade, Python |
---
## Model Architecture
InCoder-32B-Base adopts a standard decoder-only Transformer architecture:
| Hyperparameter | Value |
|---|---|
| Parameters | ~32B |
| Layers | 64 |
| Hidden Size | 5,120 |
| Attention Heads | 40 (8 KV heads, GQA) |
| Max Context Length | 131,072 (128K) |
| Positional Encoding | RoPE (θ = 500,000) |
| Precision | BFloat16 |
| Vocabulary Size | 76,800 |
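The hyperparameters above can be sanity-checked against the ~32B total with a back-of-the-envelope count. This is only a sketch: the SwiGLU-style FFN and its intermediate size of 27,648 are assumptions, not published values, and exact layer shapes may differ.

```python
# Rough parameter estimate from the architecture table.
# The FFN intermediate size below is an assumption for illustration.
L = 64                 # layers
d = 5120               # hidden size
n_q, n_kv = 40, 8      # query heads / KV heads (GQA)
head_dim = d // n_q    # 128
vocab = 76_800

# Attention: Q and output projections are d x d; K and V are shrunk by GQA.
attn = d * d + d * d + 2 * (n_kv * head_dim) * d   # per layer
# A SwiGLU-style FFN (gate, up, down) with assumed intermediate size 27,648.
ffn_inter = 27_648
ffn = 3 * d * ffn_inter                            # per layer
embed = vocab * d                                  # embedding matrix, counted once

total = L * (attn + ffn) + embed
print(f"~{total / 1e9:.1f}B parameters")  # → ~31.6B parameters
```

The estimate lands within a few percent of the stated 32B, which is the expected level of agreement given the assumed FFN width.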
---
## Training Pipeline: Code-Flow
InCoder-32B-Base is trained through a two-stage **Code-Flow** pipeline:
### Stage 1 – Pre-training & Annealing
- **Industrial Recall**: Data pipeline using rule-based filtering, FastText classifiers, and semantic retrieval for Verilog, CUDA, firmware C, and CadQuery.
- **Refinement**: OCR extraction from technical manuals, multi-level deduplication, and repository-level fork consolidation.
- **Training**: 15T total tokens using Autoregressive LM + Fill-in-the-Middle (FIM) objectives on 4,096 GPUs.
### Stage 2 – Mid-Training (Context Extension)
Context window extended progressively from 8K to 128K tokens:
- **8K → 32K**: Targets file-level tasks like completing RTL modules or kernel functions.
- **32K → 128K**: Unlocks long-context capabilities for extended debugging and cross-module projects.
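The large RoPE base (θ = 500,000) from the architecture table is what makes this extension workable: the slowest-rotating frequency must have a wavelength well beyond the 128K window so distant positions remain distinguishable. A small sketch, assuming head dimension 128 (5,120 hidden / 40 heads) and the standard RoPE frequency layout:

```python
import math

# Sketch: why RoPE base theta = 500,000 supports a 128K context.
# head_dim is inferred from the architecture table; the model's
# internal frequency layout may differ in detail.
theta = 500_000.0
head_dim = 128

# Standard RoPE inverse frequencies: theta^(-2i/d) for i = 0 .. d/2 - 1.
inv_freq = [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

# Wavelength of the slowest-rotating pair of dimensions, in tokens.
longest_wavelength = 2 * math.pi / inv_freq[-1]
print(f"longest wavelength: {longest_wavelength:,.0f} tokens")
```

The longest wavelength comes out in the millions of tokens, comfortably above the 131,072-token window.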
---
## Usage
### Installation
```bash
pip install transformers accelerate
```
### Code Completion
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "Multilingual-Multimodal-NLP/IndustrialCoder-Base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16,
device_map="auto",
trust_remote_code=True,
)
prompt = """// Synthesizable Verilog: UART transmitter (8N1 protocol)
module uart_tx (
input wire clk,
input wire rst_n,
input wire [7:0] data_in,
input wire tx_start,
output reg tx,
output reg tx_busy
);
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=512,
temperature=0.2,
do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Fill-in-the-Middle (FIM)
InCoder-32B-Base supports FIM completion for code infilling tasks:
```python
prefix = """// CUDA kernel for RMS Normalization
__global__ void rms_norm_kernel(float* output, const float* input,
const float* weight, int N, float eps) {
int idx = blockIdx.x;
"""
suffix = """
output[idx * N + tid] = normalized * weight[tid];
}"""
fim_prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
inputs = tokenizer(fim_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
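After a FIM generation, the infilled middle still has to be spliced back between the prefix and suffix to get a complete file. A minimal helper sketch; it assumes the output is decoded with `skip_special_tokens=False` so the `<|fim_middle|>` marker survives (the exact special tokens should be confirmed against the model's tokenizer config):

```python
# Sketch: stitch a FIM generation back into a complete source file.
# Assumes decoding with skip_special_tokens=False so the marker is present.
def splice_fim(generated: str, prefix: str, suffix: str) -> str:
    """Extract the infilled middle and return prefix + middle + suffix."""
    marker = "<|fim_middle|>"
    # Everything after the marker is the generated middle; if the marker
    # is missing, fall back to treating the whole output as the middle.
    middle = generated.split(marker, 1)[-1]
    return prefix + middle + suffix

# Tiny placeholder example:
print(splice_fim("<|fim_prefix|>a<|fim_suffix|>c<|fim_middle|>b", "a", "c"))  # → abc
```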
### Deployment with vLLM
```bash
vllm serve Multilingual-Multimodal-NLP/IndustrialCoder-Base \
--tensor-parallel-size 4 --max-model-len 32768 --trust-remote-code
```
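Once the server is running, it exposes vLLM's OpenAI-compatible API. A minimal client sketch, assuming the default host and port (`localhost:8000`); the network call is left commented out so the snippet can be read without a live server:

```python
import json
import urllib.request

# Build a request for vLLM's OpenAI-compatible /v1/completions endpoint.
# Host and port are assumptions (vLLM defaults to http://localhost:8000).
payload = {
    "model": "Multilingual-Multimodal-NLP/IndustrialCoder-Base",
    "prompt": "// Verilog: parameterized synchronous FIFO\nmodule fifo #(",
    "max_tokens": 256,
    "temperature": 0.2,
}
body = json.dumps(payload).encode()

req = urllib.request.Request(
    "http://localhost:8000/v1/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# Uncomment with a running server:
# print(json.loads(urllib.request.urlopen(req).read())["choices"][0]["text"])
```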
---
## Fine-tuning
We provide an SFT framework in the [GitHub repository](https://github.com/CSJianYang/Industrial-Coder/tree/main/sft). See the README for data preparation and training instructions.
---
## Model Family
| Model | Type | HuggingFace |
|---|---|---|
| InCoder-32B-Base | Pre-trained | [🤗 IndustrialCoder-Base](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder-Base) |
| InCoder-32B | Instruct | [🤗 IndustrialCoder](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder) |
| InCoder-32B-Thinking | Reasoning | [🤗 IndustrialCoder-Thinking](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder-Thinking) |
| InCoder-32B-FP8 | FP8 Quantized | [🤗 IndustrialCoder-32B-FP8](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder-32B-FP8) |
| InCoder-32B-AWQ-INT4 | AWQ INT4 | [🤗 IndustrialCoder-32B-AWQ-INT4](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder-32B-AWQ-INT4) |
| InCoder-32B-GPTQ-INT4 | GPTQ INT4 | [🤗 IndustrialCoder-32B-GPTQ-INT4](https://huggingface.co/Multilingual-Multimodal-NLP/IndustrialCoder-32B-GPTQ-INT4) |
---
## Limitations & Disclaimers
This is a **base model**: it has not been instruction-tuned and does not follow conversational instructions. It is best suited for:
- Code completion and generation
- Fill-in-the-middle (FIM) tasks
- Further fine-tuning for downstream applications
Always review and test generated code in a sandboxed environment. Industrial code (RTL, embedded firmware, GPU kernels) requires expert review before deployment.
---
## Citation
```bibtex
@article{yang2026incoder,
title={InCoder-32B: Code Foundation Model for Industrial Scenarios},
author={Yang, Jian and Zhang, Wei and Wu, Jiajun and Cheng, Junhang and Guo, Shawn
and Wang, Haowen and Gu, Weicheng and Du, Yaxin and Li, Joseph and Xu, Fanglin
and others},
journal={arXiv preprint arXiv:2603.16790},
year={2026}
}
```