---
license: apache-2.0
language:
  - en
library_name: transformers
pipeline_tag: text-generation
base_model: ByteDance-Seed/Seed-Coder-8B-Reasoning
tags:
  - code
  - structural-engineering
  - openseespy
  - scientific-modeling
  - reinforcement-learning
  - grpo
  - autobm
---

# AutoBM-Seed-Coder-8B-R

Official model release for the paper *Rethinking Scientific Modeling: Toward Physically Consistent and Simulation-Executable Programmatic Generation*.

This model is trained from `ByteDance-Seed/Seed-Coder-8B-Reasoning` via the RLA-SPC two-stage alignment strategy:

- **Stage I — Domain Instruction Fine-Tuning (SFT)** on the CivilInstruct dataset (10,912 samples).
- **Stage II — Self-Play Constraint GRPO (SPC-GRPO)** with the Multi-Granularity Hybrid Reward (MGHR), combining format, AST, and OpenSeesPy execution rewards.

The resulting model generates executable, physically consistent OpenSeesPy structural modeling code from natural language building specifications.
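For context on "fundamental period" in the prompt and reward below: an OpenSeesPy script typically recovers it from an eigenvalue analysis via T₁ = 2π/√λ₁. A minimal sketch of that conversion in plain Python (the helper name and the placeholder eigenvalue are illustrative, not part of the model's output; a real script would obtain λ₁ from OpenSeesPy's `eigen()`):

```python
import math

def fundamental_period(eigenvalue: float) -> float:
    """Convert the smallest eigenvalue (rad^2/s^2) from an eigenvalue
    analysis into the fundamental period T1 = 2*pi/sqrt(lambda1), in seconds."""
    omega = math.sqrt(eigenvalue)  # circular frequency, rad/s
    return 2.0 * math.pi / omega

# Illustrative placeholder value: lambda1 = (2*pi)^2 corresponds to T1 = 1.0 s
lam = (2.0 * math.pi) ** 2
print(round(fundamental_period(lam), 3))
```

The grading in Stage II compares such a computed period against a reference value, so generated scripts must both run and produce physically plausible dynamics.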

## BMEval Results

| Model | Pass@1 | Pass@5 | Pass@5_period | Pass@5_compliance | Pass@5_strict | Overall Avg |
|---|---|---|---|---|---|---|
| Seed-Coder-8B-R (baseline) | 11.72 | 21.09 | 0.78 | 3.13 | 0.78 | 6.51 |
| AutoBM-Seed-Coder-8B-R (this model) | 64.18 | 97.28 | 78.05 | 92.47 | 77.14 | 81.95 |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "yongqiqng/AutoBM-Seed-Coder-8B-R"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = '''Generate OpenSeesPy code to model a 5-story reinforced concrete frame building:
- Floor height: 3.5 m
- Bay width: 6 m (3 bays in X, 2 bays in Y)
- Seismic intensity: 0.2g
Compute the fundamental period.'''

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=4096, temperature=0.6, top_p=0.95, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```

## Training Details

| Stage | Method | Data |
|---|---|---|
| Stage I | Supervised Fine-Tuning | CivilInstruct SFT (9,894 train + 202 val) |
| Stage II | SPC-GRPO with MGHR | CivilInstruct RL (455 train + 57 test) |

The MGHR reward function combines:

- `r_fmt` (Format, weight 0.05) — adherence to the `<think>...</think><answer>...</answer>` structure
- `r_ast` (AST, weight 0.25) — three-tiered OpenSeesPy API coverage
- `r_exec` (Execution, weight 0.70) — sandboxed OpenSeesPy execution + period error grading
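As a sketch of how the three components combine, using the weights listed above (the function names, the binary format check, and the sample component scores below are assumptions for illustration, not the paper's exact implementation):

```python
import re

# MGHR component weights as stated in the list above
W_FMT, W_AST, W_EXEC = 0.05, 0.25, 0.70

def format_reward(completion: str) -> float:
    """Binary check (an assumption) that the completion follows the
    <think>...</think><answer>...</answer> structure."""
    pattern = r"^<think>.*?</think>\s*<answer>.*?</answer>\s*$"
    return 1.0 if re.match(pattern, completion, re.DOTALL) else 0.0

def mghr(r_fmt: float, r_ast: float, r_exec: float) -> float:
    """Weighted sum of the format, AST, and execution rewards."""
    return W_FMT * r_fmt + W_AST * r_ast + W_EXEC * r_exec

# Illustrative component scores only
sample = "<think>plan the frame</think><answer>import openseespy ...</answer>"
print(mghr(format_reward(sample), r_ast=0.8, r_exec=1.0))
```

The heavy weight on `r_exec` means a completion that parses but fails to run in the OpenSeesPy sandbox forfeits most of its reward, which is what pushes the policy toward simulation-executable code.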

See the paper and training code for details.

## Related

## Citation

```bibtex
@article{jiang2026rethinking,
  title={Rethinking Scientific Modeling: Toward Physically Consistent and Simulation-Executable Programmatic Generation},
  author={Jiang, Yongqing and Wang, Jianze and Shen, Zhiqi and Lin, Zhenghong and Wang, Jiayuan and Yang, Yijian and Dai, Kaoshan and Luo, Haoran},
  journal={arXiv preprint arXiv:2602.07083},
  year={2026}
}
```

## License

Released under the Apache 2.0 License, consistent with the base Seed-Coder-8B-Reasoning model.