tphage committed on
Commit 7fcbf30 · verified · 1 Parent(s): dcc338f

Add model card

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -23,14 +23,14 @@ pipeline_tag: text-generation
 
 | Property | Value |
 |---|---|
-| Base model | `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` |
+| Base model | `tphage/DeepSeek-R1-Distill-Qwen-1.5B` |
 | Fine-tuning method | GRPO (RL) + LoRA (PEFT) |
 | LoRA rank / alpha | 32 / 128 |
 | LoRA dropout | 0.05 |
 | LoRA target modules | q, k, v, o, gate, up, down projections |
 | Training precision | bfloat16 |
 | Max sequence length | 2048 tokens (256 prompt + 1792 completion) |
-| Training dataset | `beamrl_train` (synthetic beam mechanics QA) |
+| Training dataset | `tphage/BeamRL-TrainData` (synthetic beam mechanics QA) |
 
 ### Reward Functions
 
@@ -47,7 +47,7 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 model = AutoModelForCausalLM.from_pretrained("tphage/BeamPERL", torch_dtype="bfloat16", device_map="auto")
 tokenizer = AutoTokenizer.from_pretrained("tphage/BeamPERL")
 
-prompt = "A simply supported beam of length 6 m carries a point load of 10 kN at its midspan. What are the reaction forces at the supports?"
+prompt = "Determine the reaction forces at the pin support (x=0.0*L) and the roller support (x=9.0*L) for a statically loaded beam with a length of 9*L, a point load of -13*P at x=3.0*L, and supports at x=0.0*L (pin) and x=9.0*L (roller)."
 
 messages = [{"role": "user", "content": prompt}]
 inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
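
Reviewer note: the new example prompt has a closed-form answer from static equilibrium, which can be handy when sanity-checking a generation. The sketch below is not part of this commit or of the model's reward functions. The helper name `simply_supported_reactions` is hypothetical, and it assumes that a negative load acts downward, that reactions are reported positive upward, and that lengths and loads are given as multiples of `L` and `P`.

```python
from fractions import Fraction


def simply_supported_reactions(length, load, position):
    """Reactions for a beam with a pin at x=0 and a roller at x=length.

    Hypothetical helper for illustration only.
    length, position: multiples of L; load: multiple of P (negative = downward).
    Returns (pin_reaction, roller_reaction), positive upward, as exact fractions.
    """
    length, load, position = Fraction(length), Fraction(load), Fraction(position)
    if length <= 0 or not 0 <= position <= length:
        raise ValueError("load must sit on a beam of positive length")
    # Moment balance about the pin: roller * length + load * position = 0
    roller = -load * position / length
    # Vertical force balance: pin + roller + load = 0
    pin = -load - roller
    return pin, roller


# Beam from the updated prompt: length 9*L, point load -13*P at x = 3*L
pin, roller = simply_supported_reactions(9, -13, 3)
print(pin, roller)  # 26/3 13/3
```

Under these assumptions the pin carries 26/3 P (about 8.67 P) and the roller carries 13/3 P (about 4.33 P), and the two sum to the applied 13 P.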