Update LLM-BEM-Engineer_Benchmark/README.md
LLM-BEM-Engineer_Benchmark/README.md
CHANGED
@@ -101,14 +101,14 @@ Each file name ends with a numeric suffix indicating a **distinct robustness tes
 
 ### Ambiguity, User Intent, and Modeling Precision
 
-When users aim to obtain more specific or accurate building energy models, they are expected to explicitly provide key modeling information, such as: number of thermal zones per story, space type definitions, and system details. Providing such information improves modeling precision and helps reduce model hallucination. This mirrors human communication: even in expert-to-expert interactions, clear and explicit specifications are required to produce accurate technical outputs.
+When users aim to obtain more specific or accurate building energy models, they are expected to explicitly provide key modeling information, such as: building geometry, number of thermal zones per story, space type definitions, and system details. Providing such information improves modeling precision and helps reduce model hallucination. This mirrors human communication: even in expert-to-expert interactions, clear and explicit specifications are required to produce accurate technical outputs.
 
 In cases where user intent is ambiguous or underspecified, the generated building model inevitably reflects the LLM’s own interpretation of the input. As a result, the output represents the closest plausible model inferred from the provided intent, rather than a uniquely determined solution.
 
-### Multi-Round Robust Inference Mechanism
+<!-- ### Multi-Round Robust Inference Mechanism
 
 Because ambiguous user intent must be interpreted by LLMs themselves, this benchmark incorporates a multi-round try to improve robustness as LLMs progressively better understand user intent, ensuring that the generated models converge toward the intended user requirements.
-
+-->
 
 ## 🎯 Benchmark Objectives
 