GangJiang committed on
Commit 608f50f · verified · 1 Parent(s): 2eed013

Update LLM-BEM-Engineer_Benchmark/README.md

LLM-BEM-Engineer_Benchmark/README.md CHANGED
@@ -101,14 +101,14 @@ Each file name ends with a numeric suffix indicating a **distinct robustness tes
 
  ### Ambiguity, User Intent, and Modeling Precision
 
- When users aim to obtain more specific or accurate building energy models, they are expected to explicitly provide key modeling information, such as: number of thermal zones per story, space type definitions, and system details. Providing such information improves modeling precision and helps reduce model hallucination. This mirrors human communication: even in expert-to-expert interactions, clear and explicit specifications are required to produce accurate technical outputs.
+ When users aim to obtain more specific or accurate building energy models, they are expected to explicitly provide key modeling information, such as: building geometry, number of thermal zones per story, space type definitions, and system details. Providing such information improves modeling precision and helps reduce model hallucination. This mirrors human communication: even in expert-to-expert interactions, clear and explicit specifications are required to produce accurate technical outputs.
 
  In cases where user intent is ambiguous or underspecified, the generated building model inevitably reflects the LLM’s own interpretation of the input. As a result, the output represents the closest plausible model inferred from the provided intent, rather than a uniquely determined solution.
 
- ### Multi-Round Robust Inference Mechanism
+ <!-- ### Multi-Round Robust Inference Mechanism
 
  Because ambiguous user intent must be interpreted by LLMs themselves, this benchmark incorporates a multi-round try to improve robustness as LLMs progressively better understand user intent, ensuring that the generated models converge toward the intended user requirements.
-
+ -->
 
  ## 🎯 Benchmark Objectives
 