### LLM deployment phase

Gen-HVAC supports an LLM + Digital Human-in-the-Loop (DHIL) layer that modulates preference/RTG targets and high-level constraints. For local LLM hosting, install Ollama, pull a quantized model, and launch the service.

On Linux/macOS you can install Ollama via `curl -fsSL https://ollama.com/install.sh | sh`, start the daemon with `ollama serve` (leave it running), and pull a recommended model: `ollama pull deepseek-r1:7b` (lightweight reasoning), `ollama pull llama3.1:8b` (strong general instruction following), `ollama pull qwen2.5:7b` (efficient general model), or `ollama pull mistral:instruct` (fast instruct model). If you want a slightly heavier but still practical model, try `ollama pull deepseek-r1:14b` or `ollama pull qwen2.5:14b`. In our testing we chose DeepSeek R1.

Once pulled, run `deepseek-r1:7b` with Ollama; then, in another terminal, point your Gen-HVAC LLM client at the default endpoint and run your integration from the `llm/` folder (e.g., `python -m llm.server --host 0.0.0.0 --port 8000` and `python -m llm.client --base_url http://localhost:xxxx --model deepseek-r1:7b`).

After the LLM endpoint is up, you can proceed to the inference-server step, which binds the persona/prompt layer to RTG conditioning and the control loop in one end-to-end pipeline.

### Inference

During inference, we deploy Gen-HVAC as a stateless HTTP microservice that loads the trained Decision Transformer checkpoint and normalization statistics at startup, maintains a short autoregressive context window internally, and returns multi-zone heating/cooling setpoints per control step.
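
The bounded autoregressive context can be sketched as a fixed-length history buffer. The class name, window length, and field layout below are illustrative assumptions, not the repository's actual implementation:

```python
from collections import deque

class RollingContext:
    """Illustrative fixed-length history for autoregressive decoding:
    keeps only the most recent K (rtg, obs, action) steps."""

    def __init__(self, max_len=20):
        # deque(maxlen=...) discards the oldest entry automatically.
        self.steps = deque(maxlen=max_len)

    def append(self, rtg, obs, action):
        self.steps.append((rtg, obs, action))

    def as_lists(self):
        # Split into parallel sequences for the Decision Transformer input.
        rtgs, obss, acts = zip(*self.steps) if self.steps else ((), (), ())
        return list(rtgs), list(obss), list(acts)
```

A `POST /reset` handler would then simply clear this buffer (`self.steps.clear()`) at an episode boundary.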

In our experiments, EnergyPlus/Sinergym executes inside the Docker container while the inference service runs on the host/server (CPU/GPU), so the simulator can stream observation vectors to `POST /predict` (payload: `{step, obs, info}`) and receive an action vector in the response, with `POST /reset` used to clear policy history at episode boundaries.

When enabled, the DHIL module queries a local Ollama endpoint and updates the comfort RTG target at a low frequency (e.g., every 4 steps).
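
A minimal simulator-side client for this loop might look like the sketch below. The base URL, the `action` response key, and the helper names are assumptions for illustration, not the repository's actual client:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed host/port of the inference service

def build_predict_payload(step, obs, info):
    """Assemble the {step, obs, info} payload for POST /predict."""
    return {"step": step, "obs": list(obs), "info": info}

def post_json(path, payload):
    """POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Per-episode sketch: clear policy history, then stream observations step by step.
# post_json("/reset", {})
# action = post_json("/predict", build_predict_payload(0, obs, info))["action"]
```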
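
One way the low-frequency DHIL update can work is to prompt the local model and extract a number from its free-form reply, keeping the current target as a fallback. The sketch below uses Ollama's standard `/api/generate` endpoint; the prompt, parsing rule, and function names are illustrative assumptions, not the actual DHIL logic:

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def parse_comfort_rtg(reply, fallback):
    """Extract the first number from a free-form LLM reply;
    keep the previous target if none is found."""
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    return float(match.group()) if match else fallback

def query_comfort_rtg(prompt, current_rtg, model="deepseek-r1:7b"):
    """Ask the local model for a new comfort RTG target (non-streaming)."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())["response"]
    return parse_comfort_rtg(reply, current_rtg)

# Low-frequency update sketch: query only every N control steps.
# if step % 4 == 0:
#     rtg = query_comfort_rtg("Occupant reports it is too warm; new comfort RTG?", rtg)
```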