GangJiang committed · Commit f9b8ede (verified) · 1 Parent(s): da18454

Update README.md

Files changed (1)
  1. README.md +18 -0
README.md CHANGED
@@ -110,6 +110,24 @@ This current platform is designed for engineers, architects, and researchers wor
  <p><em>LLM-BEM-Engineer for automated editing.</em></p>
  </div>

+ ## 📁 LLM-BEM-Engineer Benchmark Dataset
+
+ This benchmark dataset is designed to evaluate the capability of **Large Language Models (LLMs)** in generating **Building Energy Models (BEMs)** from natural language descriptions.
+
+ The benchmark focuses on two essential aspects of real-world applicability:
+
+ - **Scalability**: The ability of LLMs to handle a wide range of building configurations and system complexities.
+ - **Robustness**: The ability of LLMs to correctly infer user intent under noisy, ambiguous, or incomplete inputs.
+
+ The benchmark consists of **two complementary test sets**, each designed to evaluate a different capability of LLMs in automated building energy model generation.
+
+ | Dataset | Purpose | Description |
+ |-------|--------|-------------|
+ | `detailed_prompt_test` | Scalability benchmark | Well-specified and detailed building modeling prompts |
+ | `robust_prompt_test` | Robustness benchmark | Noisy and high-level user input prompts |
+
+ For details, please refer to the *LLM-BEM-Engineer Benchmark* folder in this repository.
+
  ## 🚀 Quick Start

  The following code snippet shows how to run LLM-BEM-Engineer.