Gen-HVAC committed · verified
Commit a665345 · 1 parent: 29012cb

Update README.md

Files changed (1): README.md (+8 −14)
 
### LLM deployment phase

Gen-HVAC supports an LLM + Digital Human-in-the-Loop (DHIL) layer that modulates preference/RTG targets and high-level constraints. For local LLM hosting, install Ollama, pull a quantized model, and launch the service.
 
On Linux/macOS you can install Ollama via `curl -fsSL https://ollama.com/install.sh | sh`, start the daemon with `ollama serve` (leave it running), and pull recommended models: `ollama pull deepseek-r1:7b` (lightweight reasoning), `ollama pull llama3.1:8b` (strong general instruction following), `ollama pull qwen2.5:7b` (efficient general model), or `ollama pull mistral:instruct` (fast instruct model). If you want a slightly heavier but still practical model, try `ollama pull deepseek-r1:14b` or `ollama pull qwen2.5:14b`. In our testing we chose DeepSeek R1.

Once pulled, sanity-check locally with `ollama run deepseek-r1:7b`, then in another terminal point your Gen-HVAC LLM client at the default endpoint and run the integration from the `llm/` folder (e.g., `python -m llm.server --host 0.0.0.0 --port 8000` and `python -m llm.client --base_url http://localhost:xxxx --model deepseek-r1:7b`). After the LLM endpoint is up, you can proceed to the inference-server step to bind the persona/prompt layer to RTG conditioning and the control loop in one end-to-end pipeline.
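If you prefer to verify the endpoint from code rather than the interactive `ollama run` prompt, the sketch below hits Ollama's REST API directly. It assumes the daemon's default address `http://localhost:11434`; the helper names are ours, not part of the Gen-HVAC repo.

```python
import json
import urllib.request

# Default address used by `ollama serve`; change it if you bound another port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def generate_payload(prompt: str, model: str = "deepseek-r1:7b") -> dict:
    """Request body for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt: str, model: str = "deepseek-r1:7b") -> str:
    """Send one generation request and return the model's text response."""
    data = json.dumps(generate_payload(prompt, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# With the daemon running: print(ask_ollama("Reply with the word READY"))
```

If this round-trip succeeds, the same endpoint is what the Gen-HVAC LLM client should be pointed at.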

### Inference

During inference, we deploy Gen-HVAC as a stateless HTTP microservice that loads the trained Decision Transformer checkpoint and normalization statistics at startup, maintains a short autoregressive context window internally, and returns multi-zone heating/cooling setpoints per control step.
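The per-episode state such a service keeps between calls can be sketched as follows. This is illustrative only: the class and field names are our own, and a real `/predict` handler would run the Decision Transformer over the buffered window instead of returning a placeholder.

```python
from collections import deque

class PolicyContext:
    """Sketch of the per-episode state behind a DT inference service.

    Holds the normalization statistics loaded at startup and the short
    autoregressive context window the model is conditioned on.
    """

    def __init__(self, obs_mean, obs_std, context_len=20, n_zones=2):
        self.obs_mean = obs_mean              # loaded with the checkpoint
        self.obs_std = obs_std
        self.ctx = deque(maxlen=context_len)  # short autoregressive window
        self.n_zones = n_zones

    def reset(self):
        """Handler body for POST /reset: clear history at episode boundaries."""
        self.ctx.clear()

    def predict(self, obs):
        """Handler body for POST /predict: normalize, buffer, act."""
        norm = [(o - m) / s for o, m, s in zip(obs, self.obs_mean, self.obs_std)]
        self.ctx.append(norm)
        # Placeholder: a real service feeds self.ctx (plus RTG tokens) through
        # the Decision Transformer and returns per-zone setpoints.
        return [0.0, 0.0] * self.n_zones  # heating/cooling pair per zone
```

The `deque(maxlen=...)` is what makes the context "short": old steps fall off automatically, so memory per episode stays bounded.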

In our experiments, EnergyPlus/Sinergym executes inside the Docker container, while the inference service runs on the host/server (CPU/GPU). The simulator streams observation vectors to `POST /predict` (payload: `{step, obs, info}`) and receives an action vector in the response; `POST /reset` clears the policy history at episode boundaries.

When enabled, the DHIL module queries a local Ollama endpoint and updates the comfort RTG target at a low frequency (e.g., every 4 steps).
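From the simulator side, the loop looks roughly like this. It is a sketch: `BASE_URL`, the helper names, and the `action` response key are our assumptions about the service, and the DHIL refresh is reduced to a step-period check.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical host-side inference service
DHIL_PERIOD = 4                     # refresh the comfort RTG every 4 steps

def predict_body(step, obs, info):
    """Payload shape expected by POST /predict, as described above."""
    return {"step": step, "obs": obs, "info": info}

def dhil_due(step, period=DHIL_PERIOD):
    """True on the low-frequency steps where DHIL re-queries the LLM."""
    return step % period == 0

def post(path, body):
    """Thin JSON POST helper for the /predict and /reset endpoints."""
    req = urllib.request.Request(
        BASE_URL + path, data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Control-loop sketch, run inside the Sinergym container:
# post("/reset", {})                       # clear policy history
# for step, (obs, info) in enumerate(env_steps):
#     if dhil_due(step):
#         ...                              # DHIL asks Ollama, updates RTG target
#     action = post("/predict", predict_body(step, obs, info))["action"]
```

Because the service is reached over plain HTTP, the container only needs network access to the host; no checkpoint or GPU state lives inside the simulator image.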
 
 
 
 
 