README.md · RnniaSnow/ST-Coder-14B at main

	---
	license: mit
	library_name: transformers
	base_model: Qwen/Qwen2.5-Coder-14B-Instruct
	datasets:
	- RnniaSnow/st-code-dataset
	language:
	- en
	pipeline_tag: text-generation
	tags:
	- code
	- plc
	- iec-61131-3
	- structured-text
	- industrial-automation
	- qwen
	- llama-factory
	---

	# ST-Coder-14B

	<div align="center">
	<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers_logo_name.png" width="400"/>
	</div>

	## 🤖 Model Description

	ST-Coder-14B is a specialized code generation model fine-tuned from Qwen2.5-Coder-14B-Instruct. It is optimized for industrial automation tasks, with a primary focus on the IEC 61131-3 Structured Text (ST) programming language.

	Unlike general-purpose coding models, ST-Coder-14B has been trained on high-quality, domain-specific data to understand:
	* PLC Logic: PID control, Motion Control, Safety logic, State Machines.
	* IEC 61131-3 Syntax: Correct usage of `FUNCTION_BLOCK`, `VAR_INPUT`, `VAR_OUTPUT`, and strict typing rules.
	* Industrial Protocols and Platforms: Modbus, TCP/IP socket handling in ST, and vendor-specific dialects (CODESYS, TwinCAT, Siemens SCL).

	## 💻 Quick Start

	### 1. Installation

	```bash
	pip install transformers torch accelerate
	```

	### 2. Inference with Transformers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load the model and tokenizer
	model_id = "RnniaSnow/ST-Coder-14B"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    torch_dtype="auto",  # use the dtype stored in the checkpoint
	    device_map="auto",   # place weights on available devices (requires accelerate)
	)

	# Prepare the prompt
	system_prompt = "You are an expert industrial automation engineer specializing in IEC 61131-3 Structured Text."
	user_prompt = "Write a Function Block for a 3-axis motion control system with error handling."

	messages = [
	    {"role": "system", "content": system_prompt},
	    {"role": "user", "content": user_prompt},
	]

	# Render the chat messages into a single prompt string
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)

	# Generate
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=2048,
	    do_sample=True,   # sampling must be enabled for temperature/top_p to apply
	    temperature=0.2,  # Low temperature is recommended for code generation
	    top_p=0.9,
	)

	# Drop the prompt tokens so only the newly generated reply is decoded
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	# Decode output
	output = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(output)
	```

	### 3. Usage with vLLM (Recommended for Production)

	```bash
	vllm serve RnniaSnow/ST-Coder-14B --tensor-parallel-size 1 --max-model-len 8192
	```
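
Once the server is running, it exposes an OpenAI-compatible HTTP API. The following is a minimal client sketch using only the Python standard library. It assumes the server is reachable on vLLM's default address `http://localhost:8000/v1`; the helper names `build_chat_request` and `chat`, and the `max_tokens` default, are illustrative choices rather than part of this model card. Keep prompt plus completion within the `--max-model-len` passed to the server.

```python
import json
import urllib.request

# Assumption: `vllm serve` is running locally on its default port (8000).
BASE_URL = "http://localhost:8000/v1"

SYSTEM_PROMPT = (
    "You are an expert industrial automation engineer "
    "specializing in IEC 61131-3 Structured Text."
)


def build_chat_request(user_prompt, max_tokens=1024):
    """Build an OpenAI-style chat completion payload for the served model."""
    return {
        "model": "RnniaSnow/ST-Coder-14B",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.2,  # same low-temperature setting as the Transformers example
        "top_p": 0.9,
        "max_tokens": max_tokens,
    }


def chat(user_prompt, base_url=BASE_URL):
    """POST the payload to /chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(user_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(chat("Write a Function Block for a debounced digital input."))` should print the model's reply.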

	## 🔧 Training Details

	This model was trained using [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) with the following configuration:

	* Base Model: Qwen/Qwen2.5-Coder-14B-Instruct
	* Finetuning Method: LoRA (target modules: `all`), with the adapter merged into the base weights
	* Precision: BF16
	* Context Window: 8192 tokens
	* Optimizer: Paged AdamW
	* Learning Rate Strategy: Cosine with warmup
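
For readers unfamiliar with the "cosine with warmup" strategy listed above, here is a small sketch of the usual shape of such a schedule: a linear ramp up to a peak, then a cosine decay. The peak learning rate, warmup length, and step counts below are hypothetical placeholders, since the actual values used for this model are not stated here, and the function `lr_at_step` is illustrative rather than LLaMA-Factory's implementation.

```python
import math


def lr_at_step(step, total_steps, peak_lr=1e-4, warmup_steps=100, min_lr=0.0):
    """Learning rate at `step` for linear warmup followed by cosine decay.

    All default values are hypothetical and chosen only for illustration.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1]
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine decay from peak_lr down to min_lr
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at_step(step, total_steps=1000))
```

The warmup phase avoids large updates while optimizer statistics are still unreliable, and the cosine phase lowers the rate smoothly toward the end of training.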

	The training data includes a mix of:

	1. Golden Samples: Verified ST code from real-world engineering projects.
	2. Synthetic Data: High-quality instruction-response pairs generated via DeepSeek-V3 distillation, focusing on edge cases and complex logic.

	## ⚠️ Disclaimer & Safety

	Industrial Control Systems (ICS) carry significant physical risks.

	* This model generates code based on statistical probabilities and does not guarantee functional correctness or safety.
	* Always verify, simulate, and test generated code in a safe environment before deploying to physical hardware (PLCs, robots, drives).
	* The authors assume no liability for any damage or injury resulting from the use of this model.
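
As one possible first step in such a verification workflow, generated ST can be screened for obviously unbalanced blocks before it ever reaches a compiler or simulator. The sketch below is a rough heuristic, not a parser: the keyword table is a small, non-exhaustive subset chosen for illustration, the function name `find_unbalanced_blocks` is hypothetical, and vendor dialects may differ. It does not replace compilation, simulation, or testing in the target IDE.

```python
import re

# Opening keyword -> required closing keyword (illustrative, non-exhaustive subset).
BLOCK_PAIRS = {
    "FUNCTION_BLOCK": "END_FUNCTION_BLOCK",
    "FUNCTION": "END_FUNCTION",
    "PROGRAM": "END_PROGRAM",
    "VAR": "END_VAR",
    "VAR_INPUT": "END_VAR",
    "VAR_OUTPUT": "END_VAR",
    "VAR_IN_OUT": "END_VAR",
    "IF": "END_IF",
    "CASE": "END_CASE",
    "FOR": "END_FOR",
    "WHILE": "END_WHILE",
    "REPEAT": "END_REPEAT",
}
CLOSERS = set(BLOCK_PAIRS.values())


def find_unbalanced_blocks(st_code):
    """Return a list of human-readable problems; an empty list means balanced."""
    # Remove (* ... *) comments, // comments, and quoted strings so that
    # keywords appearing inside them are ignored.
    cleaned = re.sub(r"\(\*.*?\*\)", " ", st_code, flags=re.DOTALL)
    cleaned = re.sub(r"//[^\n]*", " ", cleaned)
    cleaned = re.sub(r"'[^'\n]*'|\"[^\"\n]*\"", " ", cleaned)

    problems = []
    stack = []  # closing keywords we still expect, innermost last
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", cleaned):
        word = token.upper()  # ST keywords are case-insensitive
        if word in BLOCK_PAIRS:
            stack.append(BLOCK_PAIRS[word])
        elif word in CLOSERS:
            if word not in stack:
                problems.append(f"unexpected {word}")
                continue
            # Anything opened after the matching opener was never closed.
            while stack[-1] != word:
                problems.append(f"missing {stack.pop()}")
            stack.pop()
    # Whatever is left on the stack was opened but never closed.
    problems.extend(f"missing {closer}" for closer in reversed(stack))
    return problems
```

An empty result only says the block keywords pair up; it says nothing about typing, timing, or the safety of the logic itself.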

	## 📜 License

	This model is licensed under the [MIT License](https://opensource.org/licenses/MIT).