---
license: apache-2.0
language:
  - en
library_name: transformers
pipeline_tag: text-generation
base_model: ByteDance-Seed/Seed-Coder-8B-Reasoning
tags:
  - code
  - structural-engineering
  - openseespy
  - scientific-modeling
  - reinforcement-learning
  - grpo
  - autobm
---

# AutoBM-Seed-Coder-8B-R

Official model release for the paper *Rethinking Scientific Modeling: Toward Physically Consistent and Simulation-Executable Programmatic Generation*.

This model is trained from `ByteDance-Seed/Seed-Coder-8B-Reasoning` via the RLA-SPC two-stage alignment strategy:

- **Stage I — Domain Instruction Fine-Tuning (SFT)** on the CivilInstruct dataset (10,912 samples).
- **Stage II — Self-Play Constraint GRPO (SPC-GRPO)** with the Multi-Granularity Hybrid Reward (MGHR), combining format, AST, and OpenSeesPy execution rewards.

The resulting model generates executable, physically consistent OpenSeesPy structural modeling code from natural language building specifications.
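For context on "fundamental period" in the prompt and reward below: an OpenSeesPy script typically recovers it from an eigenvalue analysis via T₁ = 2π/√λ₁. A minimal sketch of that conversion in plain Python (the helper name and the placeholder eigenvalue are illustrative, not part of the model's output; a real script would obtain λ₁ from OpenSeesPy's `eigen()`):

```python
import math

def fundamental_period(eigenvalue: float) -> float:
    """Convert the smallest eigenvalue (rad^2/s^2) from an eigenvalue
    analysis into the fundamental period T1 = 2*pi/sqrt(lambda1), in seconds."""
    omega = math.sqrt(eigenvalue)  # circular frequency, rad/s
    return 2.0 * math.pi / omega

# Illustrative placeholder value: lambda1 = (2*pi)^2 corresponds to T1 = 1.0 s
lam = (2.0 * math.pi) ** 2
print(round(fundamental_period(lam), 3))
```

The grading in Stage II compares such a computed period against a reference value, so generated scripts must both run and produce physically plausible dynamics.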

## BMEval Results

| Model | Pass@1 | Pass@5 | Pass@5_period | Pass@5_compliance | Pass@5_strict | Overall Avg |
|---|---|---|---|---|---|---|
| Seed-Coder-8B-R (baseline) | 11.72 | 21.09 | 0.78 | 3.13 | 0.78 | 6.51 |
| AutoBM-Seed-Coder-8B-R (this model) | 64.18 | 97.28 | 78.05 | 92.47 | 77.14 | 81.95 |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "yongqiqng/AutoBM-Seed-Coder-8B-R"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = '''Generate OpenSeesPy code to model a 5-story reinforced concrete frame building:
- Floor height: 3.5 m
- Bay width: 6 m (3 bays in X, 2 bays in Y)
- Seismic intensity: 0.2g
Compute the fundamental period.'''

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=4096, temperature=0.6, top_p=0.95, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```

## Training Details

| Stage | Method | Data |
|---|---|---|
| Stage I | Supervised Fine-Tuning | CivilInstruct SFT (9,894 train + 202 val) |
| Stage II | SPC-GRPO with MGHR | CivilInstruct RL (455 train + 57 test) |

The MGHR reward function combines:

- `r_fmt` (Format, weight 0.05) — adherence to the `<think>...</think><answer>...</answer>` structure
- `r_ast` (AST, weight 0.25) — three-tiered OpenSeesPy API coverage
- `r_exec` (Execution, weight 0.70) — sandboxed OpenSeesPy execution + period error grading
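As a sketch of how the three components combine, using the weights listed above (the function names, the binary format check, and the sample component scores below are assumptions for illustration, not the paper's exact implementation):

```python
import re

# MGHR component weights as stated in the list above
W_FMT, W_AST, W_EXEC = 0.05, 0.25, 0.70

def format_reward(completion: str) -> float:
    """Binary check (an assumption) that the completion follows the
    <think>...</think><answer>...</answer> structure."""
    pattern = r"^<think>.*?</think>\s*<answer>.*?</answer>\s*$"
    return 1.0 if re.match(pattern, completion, re.DOTALL) else 0.0

def mghr(r_fmt: float, r_ast: float, r_exec: float) -> float:
    """Weighted sum of the format, AST, and execution rewards."""
    return W_FMT * r_fmt + W_AST * r_ast + W_EXEC * r_exec

# Illustrative component scores only
sample = "<think>plan the frame</think><answer>import openseespy ...</answer>"
print(mghr(format_reward(sample), r_ast=0.8, r_exec=1.0))
```

The heavy weight on `r_exec` means a completion that parses but fails to run in the OpenSeesPy sandbox forfeits most of its reward, which is what pushes the policy toward simulation-executable code.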

See the paper and training code for details.

## Related

## Citation

```bibtex
@article{jiang2026rethinking,
  title={Rethinking Scientific Modeling: Toward Physically Consistent and Simulation-Executable Programmatic Generation},
  author={Jiang, Yongqing and Wang, Jianze and Shen, Zhiqi and Lin, Zhenghong and Wang, Jiayuan and Yang, Yijian and Dai, Kaoshan and Luo, Haoran},
  journal={arXiv preprint arXiv:2602.07083},
  year={2026}
}
```

## License

Released under the Apache 2.0 License, consistent with the base Seed-Coder-8B-Reasoning model.