CreativeEngineer committed on
Commit
97fc141
·
1 Parent(s): 0908e18

feat: add low-fi PPO smoke workflow

Files changed (6)
  1. AGENTS.md +240 -0
  2. docs/P1_PPO_SMOKE_NOTE.md +61 -0
  3. pyproject.toml +4 -0
  4. training/README.md +6 -0
  5. training/ppo_smoke.py +339 -0
  6. uv.lock +288 -1
AGENTS.md CHANGED
@@ -149,3 +149,243 @@ A strong change in this repo usually does at least one of these:
149
  - makes the demo evidence easier to trust
150
 
151
  If a change does not help one of those, question whether it belongs in this hackathon repo.
152
+
153
+ ## **OpenEnv Hackathon Participant Guide**
154
+
155
+ Welcome to the [OpenEnv Hackathon](https://cerebralvalley.ai/e/open-env-hackathon), hacker! 👋 We’re thrilled to have you on board.
156
+
157
+ This guide is your all-in-one resource for the event, including schedule, rules, technical resources, problem statements, judging information, and more. Please read this carefully; most answers can be found here.
158
+
159
+ ## **1. Join the [PyTorch Discord Server](https://discord.gg/VBcf6VtfY6)**
160
+
161
+ - You’ll be given a Hackathon Participant role by an admin, which will give you access to the hackathon-specific channels.
162
+
163
+ - Here, you’ll be able to interact with hackers and sponsors, introduce yourselves, and form teams (for a maximum team size of **3**).
164
+
165
+ - If you don't receive your role within **24 hours of joining,** please ping @CV.
166
+
167
+ - Please submit your Discord username below so we can grant you the role
168
+
169
+ [linkEmbed]
170
+
171
+ ## **2. Location**
172
+
173
+ **|** Shack15 (1 Ferry Building, Suite 201, San Francisco, CA 94111)
174
+
175
+ - **Venue Access:** Shack15 is on the 2nd floor of the Ferry Building. Go up the Ferry Building elevator to the second floor, and turn left. Here you will see the main entrance to Shack15. 
176
+
177
+ - **Parking:** Parking near the Ferry Building is extremely limited. Consider parking farther out and taking Uber, Lyft, or Public Transportation. 
178
+
179
+ [youtube]
180
+
181
+ ## **3. WiFi Information**
182
+
183
+ - **Username:** SHACK15_Members
184
+
185
+ - **Password:** M3mb3r$4L!f3
186
+
187
+ ## **4. Hackathon Schedule**
188
+
189
+ **Saturday, March 7 (Outline)**
190
+
191
+ - **9:00 AM:** Doors Open • Breakfast Served • Team Formation
192
+
193
+ - **10:00 AM – 11:30 AM:** Kick-off presentations with Meta, Hugging Face, UC Berkeley, CoreWeave, OpenPipe, Unsloth AI, Fleet AI, Mercor, Scaler AI Labs, Snorkel AI, Patronus AI, Halluminate, and Scale AI
194
+
195
+ - **11:30 AM:** Hacking Begins
196
+
197
+ - **1:00 PM:** Lunch Served
198
+
199
+ - **6:00 PM:** Dinner Served
200
+
201
+ - **10:00 PM:** Doors Close • Re-entry not permitted
202
+
203
+ **Sunday, March 8 (Outline)**
204
+
205
+ - **9:00 AM:** Doors Open • Breakfast Served
206
+
207
+ - **1:00 PM:** Hacking stops • Submissions Due
208
+
209
+ - **1:15 PM:** First Round Judging Begins
210
+
211
+ - **2:00 PM:** Lunch Served
212
+
213
+ - **3:00 PM:** Final Round Judging Begins
214
+
215
+ - **4:00 PM:** Winners Announced and Closing
216
+
217
+ - **5:00 PM:** Doors Close
218
+
219
+ All presentation slides can be found here
220
+
221
+ [linkEmbed]
222
+
223
+ ## **5. Hackathon and Submission Rules**
224
+
225
+ To keep things fair and aligned with our goals, all teams must follow these rules:
226
+
227
+ - **Open Source:** Please ensure your repository is public.
228
+
229
+ - **New Work Only:** All projects must be started from scratch during the hackathon with no previous work.
230
+
231
+ - **Team Size:** Teams may have up to **3** members.
232
+
233
+ - **Banned Projects:** Projects will be disqualified if they violate legal, ethical, or platform policies, or if they use code, data, or assets you do not have the rights to.
234
+
235
+ - Your project **must** use OpenEnv (stable release 0.2.1) deployed on HF spaces
236
+
237
+ - You must show a minimal training script for your environment using Unsloth or HF TRL in Colab.
238
+
239
+ - You must upload a **one minute** demo video to YouTube talking about your submission.
240
+
241
+ ## **6. Hackathon Problem Statements**
242
+
243
+ Your project must address at least **one of the five** required problem statements.
244
+
245
+ - Some problem statements include **optional partner-sponsored sub-problem statements**, which are additional focus areas related to the main theme.
246
+
247
+ - Your project may align with **multiple partner sub-problem statements**, but you can only be **judged for a maximum of two**. Please **select up to two** when submitting.
248
+
249
+ - Projects that match these partner sub-problem statements are eligible for **extra partner prizes**, judged separately from the main track winners.
250
+
251
+ - Each partner sub-problem statement carries a prize of **$10,000 USD**.
252
+
253
+ **Statement 1: Multi-Agent Interactions**
254
+
255
+ Environments for this theme involve cooperation, competition, negotiation, and coalition formation. Learning from these environments will enable agents to model the beliefs and incentives of others in partially observable settings. This drives theory-of-mind reasoning and emergent strategic behavior.
256
+
257
+ - **Expected Outcome:** an environment that can be used to train multi-agent task handling in an LLM
258
+
259
+ - **Example Environments:** Market simulations, compute-allocation negotiations, collaborative puzzle worlds, mixed cooperative/competitive strategy games.
260
+
261
+ - **Partner Sub-Themes:**
262
+ - **Fleet AI:** Scalable Oversight: Environments that train oversight agents to monitor, analyze, and explain the behavior of other AI agents operating in complex, multi-agent settings.
263
+ - **Halluminate:** Multi-Actor Environments: Build a realistic environment where an agent interacts with and manages multiple actors (agents) to discover and achieve the task
264
+
265
+ **Statement 2: (Super) Long-Horizon Planning & Instruction Following**
266
+
267
+ You will build environments that require deep, multi-step reasoning with sparse or delayed rewards. The goal is for agents trained in these environments to decompose goals, track state over extended trajectories, and recover from early mistakes. The aim is to push beyond shallow next-token reasoning toward structured planning and durable internal representations.
268
+
269
+ - **Expected Outcome:** an environment that can capture and improve LLM behaviour on challenging long horizon tasks that need long running sessions beyond context memory limits. 
270
+
271
+ - **Example Environments:** Research-planning simulators, large-scale codebase refactoring tasks, strategic resource management worlds, long-horizon logistics optimization, extremely complicated long-horizon instruction following (e.g., 300 instructions scattered around).
272
+
273
+ - **Partner Sub-Themes:**
274
+ - **Mercor:** Make an environment with capped/uncapped rewards where frontier model rewards scale with token output.
275
+
276
+ - **Scale AI:** Environments for long horizon workflows for non-code use cases within a business setting: focusing on either Sales, Project management, or HR & IT.
277
+
278
+ **Statement 3: World Modeling**
279
+
280
+ - **Statement 3.1: Professional Tasks:** Here you will develop environments that require real interaction with tools, APIs, or dynamic systems where the model is expected to do real hard work instead of exploiting short-cuts to arrive at the desired outcome. Learning from these environments will enable agents to maintain consistent internal state, update beliefs based on outcomes, and orchestrate multi-step workflows. The goal is to strengthen causal reasoning and persistent world models.
281
+ - **Expected Outcome:** an environment that captures the nuances of a defined, partially observable world and improves LLM interaction with it
282
+
283
+ - **Example Environments:** Dynamic browser/API ecosystems, enterprise applications, scientific workflow loops (papers → code → experiments), economic simulations with feedback, tool-discovery benchmarks.
284
+
285
+ - **Partner Sub-Theme:**
286
+ - **Scaler AI Labs:** Multi-App RL Environment for Enterprise Workflows: Create RL environments to demonstrate complex workflows, business rule nuances etc in a large enterprise
287
+
288
+ - **Statement 3.2: Personalized Tasks:** Here you will develop an environment that offers realistic personalized task handling. Imagine replying to personal messages, resolving a dinner plan that clashes with work, or replying to tough emails. Think of any personal-assistant task.
289
+ - **Expected Outcome:** An environment that gives the model a realistic simulation of handling personal tasks, conflicts and managing them as delegations
290
+
291
+ - **Example Environments:** Executive Assistant Meeting Planner, Dinner and drive planning, email and message replying, etc
292
+
293
+ - **Partner Sub-Theme:**
294
+ - **Patronus AI:** Consumer Workflows with Schema Drift: Multi-step consumer workflow environments where the underlying data schemas, API contracts, and t&cs/policies/rules change.
295
+
296
+ **Statement 4: Self-Improvement**
297
+
298
+ The focus here is to create environments where agents can learn to generate new challenges, escalate difficulty, and improve through self-play or adaptive curricula. Rather than optimizing fixed tasks, the goal is for agents to learn to drive their own capability growth. The objective is recursive skill amplification.
299
+
300
+ - **Expected Outcome:** an environment for improving self-play of an LLM over a defined set of tasks
301
+
302
+ - **Example Environments:** Self-play negotiation arenas, auto-generated math/proof tasks, evolving coding competitions, adaptive RL curricula.
303
+
304
+ - **Partner Sub-Theme:**
305
+ - **Snorkel AI:** Simulated Experts-in-the-Loop: Environment that simulates interactions with real subject-matter experts, with changing requirements / preferences.
306
+
307
+ **Statement 5: Wild Card - Impress Us!**
308
+
309
+ We do not want to limit your focus if your idea doesn’t fit the boxes above. We want to, and WILL, reward out-of-the-box tasks. Please be creative, but remember to submit work that meaningfully adds value to LLM training on a certain task.
310
+
311
+ More details about each theme can be found here:
312
+
313
+ [linkEmbed]
314
+
315
+ ## **7. CV Hackathon Winners**
316
+
317
+ [linkEmbed]
318
+
319
+ ## **8. OpenEnv Provided Resources**
320
+
321
+ **Please read through the entire slideshow here. This includes:**
322
+
323
+ - OpenEnv Fundamentals, Architecture
324
+ - Local Dev, Docker, and HF Spaces Deployment
325
+ - OpenEnv in Practice
326
+ - Training (TRL & Unsloth)
327
+ - How-to-Access-Infrastructure (including GPU Request Form)
328
+
329
+ [linkEmbed]
330
+
331
+ ## **9. Partner Provided Resources**
332
+
333
+ - **Unsloth AI Resources**
334
+ - <https://unsloth.ai/docs/get-started/unsloth-notebooks#grpo-reasoning-rl-notebooks>
335
+ - **Mercor Resources**
336
+ - Dataset: <https://huggingface.co/datasets/mercor/apex-agents>
337
+ - Archipelago repo to run the eval: <https://github.com/Mercor-Intelligence/archipelago>
338
+ - APEX-Agents paper: <https://arxiv.org/abs/2601.14242>
339
+ - **Hugging Face Resources**
340
+ - **$30** in Compute and Inference Credits
341
+ - To claim your credits, set up a HF account here: <https://huggingface.co/join>
342
+ - Then, follow this link: <https://huggingface.co/openenv-community>
343
+ - You will be granted **$30** of compute and inference credits!
344
+ - **Northflank Resources**
345
+ - Each team gets an H100
346
+ - Northflank instructions
347
+
348
+ [linkEmbed]
349
+
350
+ - Join the NorthFlank discord channel for any questions
351
+ - Please fill out this form:
352
+
353
+ [linkEmbed]
354
+
355
+ - **Cursor Resources**
356
+ - **$50** in Cursor Credits, **apply below**
357
+
358
+ [linkEmbed]
359
+
360
+ ## **10. Judging & Submissions**
361
+
362
+ Judging will take place on **Sunday, March 8**. Judges will evaluate your **technical demos** in the categories below. _Show us what you have built_ to solve our problem statements. Please **do not** show us a presentation. We'll be checking to ensure your project was built **entirely during the event**; no previous work is allowed.
363
+
364
+ **|** **Teams should submit [here](https://cerebralvalley.ai/e/openenv-hackathon-sf/hackathon/submit) when they have completed hacking.** In the submission form, you will have to upload a **one minute** demo video on YouTube talking about your submission. You must also show a minimal training script for your environment using Unsloth or HF TRL in Colab.
365
+
366
+ **Please ensure your project uses** OpenEnv (stable release 0.2.1) deployed on HF Spaces.
367
+
368
+ [linkEmbed]
369
+
370
+ **Judging Criteria**
371
+
372
+ - **Environment Innovation (40%) -** Is the environment novel, creative, or challenging? Does it meaningfully test the agent’s behavior?
373
+ - **Storytelling (30%) -** Does the team clearly explain the problem, environment, and agent behavior? Is the demo engaging and easy to follow?
374
+ - **Training Script Showing Improvement in Rewards (20%) -** Does the demo provide observable evidence of training progress (reward curves, metrics, or before/after behavior)? 
375
+ - **Reward and Training Pipeline Setup (10%) -** Is the reward logic coherent, and does the pipeline produce meaningful improvement in the agent’s inference (how it acts in the environment)?
376
+
377
+ **Judging Process**
378
+
379
+ **|** Judging proceeds in two rounds:
380
+
381
+ - Hackers will be assigned groups of judges; \~3 minutes to pitch followed by 1-2 minutes of Q/A
382
+
383
+ - The top **six** teams in ranking will get to demo on stage to a panel of judges; \~3 minutes to pitch followed by 2-3 minutes for Q/A.
384
+
385
+ ## **11. Prizes**
386
+
387
+ - **1st Place:** $15,000 USD Cash
388
+
389
+ - **2nd Place:** $9,000 USD Cash
390
+
391
+ - **3rd Place:** $6,000 USD Cash
docs/P1_PPO_SMOKE_NOTE.md ADDED
@@ -0,0 +1,61 @@
1
+ # P1 PPO Smoke Note
2
+
3
+ This note records the first tiny low-fidelity PPO smoke pass on the repaired 4-knob `P1` environment.
4
+
5
+ ## Purpose
6
+
7
+ This run is diagnostic-only.
8
+
9
+ It exists to answer:
10
+
11
+ - can a small PPO policy interact with the low-fidelity environment without code-path failures
12
+ - does the reward surface produce a readable early failure mode
13
+ - what is the first obvious behavior problem before any broader training push
14
+
15
+ It does **not** validate the high-fidelity `submit` contract.
16
+
17
+ ## Command
18
+
19
+ ```bash
20
+ uv sync --extra training
21
+ uv run --extra training python training/ppo_smoke.py --total-timesteps 64 --eval-episodes 1
22
+ ```
23
+
24
+ ## Artifact
25
+
26
+ - ignored runtime artifact: `training/artifacts/ppo_smoke/ppo_smoke_20260308T062412Z.json`
27
+
28
+ ## Configuration
29
+
30
+ - training mode: low-fidelity only
31
+ - action space: 24 `run` actions + `restore_best`
32
+ - `submit`: intentionally excluded from the smoke loop
33
+ - total timesteps: `64`
34
+ - evaluation episodes: `1`
35
+ - device: `cpu`
36
+
37
+ ## Result
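The action-space size in the configuration above can be checked with a short sketch. It assumes the knob, direction, and magnitude names listed in `RUN_ACTION_SPECS` in `training/ppo_smoke.py`; the helper name `build_action_labels` is hypothetical and is not part of the repo.

```python
from itertools import product

# Names copied from RUN_ACTION_SPECS in training/ppo_smoke.py.
# The helper below is an illustrative assumption, not repo code.
PARAMETERS = ("aspect_ratio", "elongation", "rotational_transform", "triangularity_scale")
DIRECTIONS = ("increase", "decrease")
MAGNITUDES = ("small", "medium", "large")


def build_action_labels() -> list[str]:
    """Return one label per discrete action index, with restore_best last."""
    run_labels = [
        f"{parameter} {direction} {magnitude}"
        for parameter, direction, magnitude in product(PARAMETERS, DIRECTIONS, MAGNITUDES)
    ]
    return run_labels + ["restore_best"]


labels = build_action_labels()
print(len(labels))   # 4 knobs * 2 directions * 3 magnitudes + restore_best
print(labels[-1])    # the extra non-run action
```

Enumerating the product in this order reproduces the index layout of the discrete action space, so an action index logged in a trace can be mapped back to a readable label.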
38
+
39
+ - the smoke path executed successfully and wrote a trajectory artifact
40
+ - the trained policy did **not** reach feasibility in the evaluation episode
41
+ - summary metrics:
42
+ - `mean_eval_reward = -1.1`
43
+ - `constraint_satisfaction_rate = 0.0`
44
+
45
+ ## First failure mode
46
+
47
+ The policy collapsed to a repeated low-fidelity action:
48
+
49
+ - `aspect_ratio increase medium`
50
+
51
+ Observed behavior:
52
+
53
+ - the same action repeated for the full 6-step budget
54
+ - feasibility stayed near `0.050653`
55
+ - final reward was negative because the agent burned the budget without finding a repair path
56
+
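A repeated-action collapse like the one observed above could be flagged automatically from the recorded action labels. The sketch below is a minimal illustration: the function name `dominant_action` and the 0.9 threshold are assumptions, and the trace is a synthetic stand-in that mirrors the six repeated steps described in this note rather than data read from the artifact.

```python
from collections import Counter


def dominant_action(action_labels: list[str]) -> tuple[str, float]:
    """Return the most frequent action label and its share of all steps."""
    if not action_labels:
        raise ValueError("trace has no steps")
    label, count = Counter(action_labels).most_common(1)[0]
    return label, count / len(action_labels)


# Synthetic trace mirroring the behavior described above (6-step budget).
trace = ["aspect_ratio increase medium"] * 6

label, share = dominant_action(trace)
COLLAPSE_THRESHOLD = 0.9  # hypothetical cut-off, chosen only for illustration
collapsed = share >= COLLAPSE_THRESHOLD
print(label, share, collapsed)
```

A check of this kind is only a readability aid for smoke traces; it says nothing about why the policy settled on that action.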
57
+ This is useful smoke evidence because it shows:
58
+
59
+ - the PPO training path is wired correctly enough to produce trajectories
60
+ - the current low-fidelity surface still permits an obvious local-behavior failure
61
+ - the next step should remain paired high-fidelity fixture checks plus at least one submit-side manual trace, not a broader training push
pyproject.toml CHANGED
@@ -18,6 +18,10 @@ notebooks = [
      "ipykernel>=6.29.0",
      "jupyterlab>=4.3.0",
  ]
+ training = [
+     "gymnasium>=1.0.0",
+     "stable-baselines3>=2.5.0",
+ ]
  dev = [
      "pre-commit>=4.0.0",
      "pytest>=8.3.0",
training/README.md CHANGED
@@ -12,4 +12,10 @@ Training policy:
 
  - [ ] Northflank notebook artifacts saved
  - [ ] Colab notebook saved
+ - [x] tiny low-fi PPO smoke artifact saved
  - [ ] trained-policy evidence saved
+
+ ## Runnable paths
+
+ - install the training dependencies: `uv sync --extra training`
+ - tiny low-fi PPO smoke run: `uv run --extra training python training/ppo_smoke.py`
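Before launching the smoke run, it can help to confirm that the `training` extra is actually installed. The following is a small sketch using only the standard library; the import names `gymnasium` and `stable_baselines3` are the ones used by `training/ppo_smoke.py`, while the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Top-level import names used by training/ppo_smoke.py.
TRAINING_MODULES = ("gymnasium", "stable_baselines3")


def missing_modules(names: tuple[str, ...]) -> list[str]:
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(TRAINING_MODULES)
if missing:
    print("missing training dependencies:", ", ".join(missing))
    print("try: uv sync --extra training")
else:
    print("training dependencies found")
```

The check only looks up module specs and does not import the packages, so it stays fast even when the packages are large.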
training/ppo_smoke.py ADDED
@@ -0,0 +1,339 @@
+ from __future__ import annotations
+
+ import argparse
+ import json
+ from dataclasses import asdict, dataclass
+ from datetime import UTC, datetime
+ from pathlib import Path
+ from typing import Final
+
+ import gymnasium as gym
+ import numpy as np
+ from gymnasium import spaces
+ from stable_baselines3 import PPO
+
+ from fusion_lab.models import StellaratorAction, StellaratorObservation
+ from server.contract import RESET_SEEDS
+ from server.environment import BUDGET, StellaratorEnvironment
+
+ DEFAULT_OUTPUT_DIR: Final[Path] = Path("training/artifacts/ppo_smoke")
+ DEFAULT_TOTAL_TIMESTEPS: Final[int] = 128
+ DEFAULT_EVAL_EPISODES: Final[int] = 3
+
+ RUN_ACTION_SPECS: Final[tuple[tuple[str, str, str], ...]] = (
+     ("aspect_ratio", "increase", "small"),
+     ("aspect_ratio", "increase", "medium"),
+     ("aspect_ratio", "increase", "large"),
+     ("aspect_ratio", "decrease", "small"),
+     ("aspect_ratio", "decrease", "medium"),
+     ("aspect_ratio", "decrease", "large"),
+     ("elongation", "increase", "small"),
+     ("elongation", "increase", "medium"),
+     ("elongation", "increase", "large"),
+     ("elongation", "decrease", "small"),
+     ("elongation", "decrease", "medium"),
+     ("elongation", "decrease", "large"),
+     ("rotational_transform", "increase", "small"),
+     ("rotational_transform", "increase", "medium"),
+     ("rotational_transform", "increase", "large"),
+     ("rotational_transform", "decrease", "small"),
+     ("rotational_transform", "decrease", "medium"),
+     ("rotational_transform", "decrease", "large"),
+     ("triangularity_scale", "increase", "small"),
+     ("triangularity_scale", "increase", "medium"),
+     ("triangularity_scale", "increase", "large"),
+     ("triangularity_scale", "decrease", "small"),
+     ("triangularity_scale", "decrease", "medium"),
+     ("triangularity_scale", "decrease", "large"),
+ )
+ LOW_FI_ACTION_COUNT: Final[int] = len(RUN_ACTION_SPECS) + 1
+ LOW_FI_RESTORE_ACTION_INDEX: Final[int] = len(RUN_ACTION_SPECS)
+
+
+ @dataclass(frozen=True)
+ class TraceStep:
+     step: int
+     action_index: int
+     action_label: str
+     reward: float
+     score: float
+     feasibility: float
+     constraints_satisfied: bool
+     evaluation_failed: bool
+     budget_remaining: int
+     max_elongation: float
+     average_triangularity: float
+     edge_iota_over_nfp: float
+
+
+ @dataclass(frozen=True)
+ class EpisodeTrace:
+     episode: int
+     seed: int
+     total_reward: float
+     final_score: float
+     final_feasibility: float
+     constraints_satisfied: bool
+     evaluation_failed: bool
+     steps: list[TraceStep]
+
+
+ class LowFiSmokeEnv(gym.Env[np.ndarray, int]):
+     metadata = {"render_modes": []}
+
+     def __init__(self) -> None:
+         super().__init__()
+         self._env = StellaratorEnvironment()
+         self._seed = 0
+         self._episode_index = 0
+         self.observation_space = spaces.Box(
+             low=-np.inf,
+             high=np.inf,
+             shape=(12,),
+             dtype=np.float32,
+         )
+         self.action_space = spaces.Discrete(LOW_FI_ACTION_COUNT)
+
+     def reset(
+         self,
+         *,
+         seed: int | None = None,
+         options: dict[str, object] | None = None,
+     ) -> tuple[np.ndarray, dict[str, object]]:
+         super().reset(seed=seed)
+         self._seed = self._next_seed(seed)
+         obs = self._env.reset(seed=self._seed)
+         return self._encode_observation(obs), self._info(obs)
+
+     def _next_seed(self, seed: int | None) -> int:
+         if seed is not None:
+             self._episode_index = 0
+             return seed % len(RESET_SEEDS)
+         next_seed = self._episode_index % len(RESET_SEEDS)
+         self._episode_index += 1
+         return next_seed
+
+     def step(
+         self,
+         action: int,
+     ) -> tuple[np.ndarray, float, bool, bool, dict[str, object]]:
+         obs = self._env.step(self._decode_action(action))
+         return (
+             self._encode_observation(obs),
+             float(obs.reward or 0.0),
+             bool(obs.done),
+             False,
+             self._info(obs),
+         )
+
+     def _decode_action(self, action: int) -> StellaratorAction:
+         if action == LOW_FI_RESTORE_ACTION_INDEX:
+             return StellaratorAction(intent="restore_best")
+         parameter, direction, magnitude = RUN_ACTION_SPECS[action]
+         return StellaratorAction(
+             intent="run",
+             parameter=parameter,
+             direction=direction,
+             magnitude=magnitude,
+         )
+
+     def action_label(self, action: int) -> str:
+         if action == LOW_FI_RESTORE_ACTION_INDEX:
+             return "restore_best"
+         parameter, direction, magnitude = RUN_ACTION_SPECS[action]
+         return f"{parameter} {direction} {magnitude}"
+
+     def _encode_observation(self, obs: StellaratorObservation) -> np.ndarray:
+         budget_fraction = obs.budget_remaining / BUDGET
+         step_fraction = obs.step_number / BUDGET
+         return np.array(
+             [
+                 obs.max_elongation,
+                 obs.aspect_ratio,
+                 obs.average_triangularity,
+                 obs.edge_iota_over_nfp,
+                 obs.p1_score,
+                 obs.p1_feasibility,
+                 obs.vacuum_well,
+                 budget_fraction,
+                 step_fraction,
+                 obs.best_low_fidelity_score,
+                 obs.best_low_fidelity_feasibility,
+                 float(obs.constraints_satisfied) - float(obs.evaluation_failed),
+             ],
+             dtype=np.float32,
+         )
+
+     def _info(self, obs: StellaratorObservation) -> dict[str, object]:
+         return {
+             "diagnostics_text": obs.diagnostics_text,
+             "budget_remaining": obs.budget_remaining,
+             "constraints_satisfied": obs.constraints_satisfied,
+             "evaluation_failed": obs.evaluation_failed,
+             "p1_score": obs.p1_score,
+             "p1_feasibility": obs.p1_feasibility,
+         }
+
+
+ def parse_args() -> argparse.Namespace:
+     parser = argparse.ArgumentParser(
+         description=(
+             "Run a tiny low-fidelity PPO smoke pass against the repaired Fusion Design Lab "
+             "environment and save a small trajectory artifact."
+         )
+     )
+     parser.add_argument(
+         "--total-timesteps",
+         type=int,
+         default=DEFAULT_TOTAL_TIMESTEPS,
+         help=f"Total PPO timesteps for the smoke run (default: {DEFAULT_TOTAL_TIMESTEPS}).",
+     )
+     parser.add_argument(
+         "--eval-episodes",
+         type=int,
+         default=DEFAULT_EVAL_EPISODES,
+         help=f"Number of deterministic evaluation episodes to record (default: {DEFAULT_EVAL_EPISODES}).",
+     )
+     parser.add_argument(
+         "--seed",
+         type=int,
+         default=0,
+         help="Base seed for training and evaluation.",
+     )
+     parser.add_argument(
+         "--output-dir",
+         type=Path,
+         default=DEFAULT_OUTPUT_DIR,
+         help="Directory where the JSON artifact should be written.",
+     )
+     return parser.parse_args()
+
+
+ def build_model(env: LowFiSmokeEnv, seed: int) -> PPO:
+     return PPO(
+         policy="MlpPolicy",
+         env=env,
+         seed=seed,
+         verbose=0,
+         device="cpu",
+         n_steps=32,
+         batch_size=32,
+         n_epochs=4,
+         gamma=0.98,
+         learning_rate=3e-4,
+         ent_coef=0.01,
+     )
+
+
+ def evaluate_policy(model: PPO, *, eval_episodes: int, base_seed: int) -> list[EpisodeTrace]:
+     traces: list[EpisodeTrace] = []
+     for episode in range(eval_episodes):
+         env = LowFiSmokeEnv()
+         seed = base_seed + episode
+         obs, _ = env.reset(seed=seed)
+         done = False
+         total_reward = 0.0
+         steps: list[TraceStep] = []
+         step_index = 0
+         final_info: dict[str, object] = {}
+
+         while not done:
+             action, _ = model.predict(obs, deterministic=True)
+             action_index = int(action)
+             obs, reward, terminated, truncated, info = env.step(action_index)
+             done = terminated or truncated
+             total_reward += reward
+             step_index += 1
+             final_info = info
+             steps.append(
+                 TraceStep(
+                     step=step_index,
+                     action_index=action_index,
+                     action_label=env.action_label(action_index),
+                     reward=reward,
+                     score=float(info["p1_score"]),
+                     feasibility=float(info["p1_feasibility"]),
+                     constraints_satisfied=bool(info["constraints_satisfied"]),
+                     evaluation_failed=bool(info["evaluation_failed"]),
+                     budget_remaining=int(info["budget_remaining"]),
+                     max_elongation=float(obs[0]),
+                     average_triangularity=float(obs[2]),
+                     edge_iota_over_nfp=float(obs[3]),
+                 )
+             )
+
+         traces.append(
+             EpisodeTrace(
+                 episode=episode,
+                 seed=seed,
+                 total_reward=round(total_reward, 4),
+                 final_score=float(final_info["p1_score"]),
+                 final_feasibility=float(final_info["p1_feasibility"]),
+                 constraints_satisfied=bool(final_info["constraints_satisfied"]),
+                 evaluation_failed=bool(final_info["evaluation_failed"]),
+                 steps=steps,
+             )
+         )
+     return traces
+
+
+ def artifact_payload(
+     *,
+     total_timesteps: int,
+     eval_episodes: int,
+     seed: int,
+     traces: list[EpisodeTrace],
+ ) -> dict[str, object]:
+     mean_reward = sum(trace.total_reward for trace in traces) / max(len(traces), 1)
+     success_rate = sum(1 for trace in traces if trace.constraints_satisfied) / max(len(traces), 1)
+     return {
+         "created_at_utc": datetime.now(UTC).isoformat(),
+         "mode": "low_fidelity_ppo_smoke",
+         "total_timesteps": total_timesteps,
+         "eval_episodes": eval_episodes,
+         "seed": seed,
+         "train_reset_seed_indices": list(range(len(RESET_SEEDS))),
+         "action_space_size": LOW_FI_ACTION_COUNT,
+         "notes": (
+             "Diagnostic-only PPO smoke run. Submit is intentionally excluded here so the "
+             "smoke loop stays low-fidelity and fast. Training resets cycle through the "
+             "frozen low-fidelity reset seeds to surface positive repair signal sooner."
+         ),
+         "summary": {
+             "mean_eval_reward": round(mean_reward, 4),
+             "constraint_satisfaction_rate": round(success_rate, 4),
+         },
+         "episodes": [asdict(trace) for trace in traces],
+     }
+
+
+ def write_artifact(output_dir: Path, payload: dict[str, object]) -> Path:
+     output_dir.mkdir(parents=True, exist_ok=True)
+     timestamp = datetime.now(UTC).strftime("%Y%m%dT%H%M%SZ")
+     output_path = output_dir / f"ppo_smoke_{timestamp}.json"
+     output_path.write_text(json.dumps(payload, indent=2, sort_keys=True) + "\n")
+     return output_path
+
+
+ def main() -> None:
+     args = parse_args()
+     env = LowFiSmokeEnv()
+     model = build_model(env, seed=args.seed)
+     model.learn(total_timesteps=args.total_timesteps, progress_bar=False)
+     traces = evaluate_policy(
+         model,
+         eval_episodes=args.eval_episodes,
+         base_seed=args.seed,
+     )
+     payload = artifact_payload(
+         total_timesteps=args.total_timesteps,
+         eval_episodes=args.eval_episodes,
+         seed=args.seed,
+         traces=traces,
+     )
+     output_path = write_artifact(args.output_dir, payload)
+     print(output_path)
+
+
+ if __name__ == "__main__":
+     main()
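The reset-seed cycling in `LowFiSmokeEnv._next_seed` can be exercised in isolation. The sketch below restates that logic as a standalone class so it runs without the environment; `SeedCycler` is a hypothetical name, and `num_seeds=3` is an assumed stand-in for `len(RESET_SEEDS)`, whose real value is not shown in this diff.

```python
from __future__ import annotations


class SeedCycler:
    """Standalone restatement of the seed-index logic in LowFiSmokeEnv._next_seed."""

    def __init__(self, num_seeds: int) -> None:
        if num_seeds <= 0:
            raise ValueError("num_seeds must be positive")
        self._num_seeds = num_seeds
        self._episode_index = 0

    def next_seed(self, seed: int | None = None) -> int:
        if seed is not None:
            # An explicit seed restarts the cycle and is wrapped into range.
            self._episode_index = 0
            return seed % self._num_seeds
        # Without a seed, walk through the indices round-robin.
        next_seed = self._episode_index % self._num_seeds
        self._episode_index += 1
        return next_seed


cycler = SeedCycler(num_seeds=3)  # 3 is an assumed placeholder for len(RESET_SEEDS)
cycled = [cycler.next_seed() for _ in range(4)]
explicit = cycler.next_seed(7)
after_explicit = cycler.next_seed()
print(cycled, explicit, after_explicit)
```

One consequence worth noting: evaluation calls `reset(seed=base_seed + episode)`, so evaluation episodes select a reset index by modulo rather than by continuing the training-time cycle.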
uv.lock CHANGED
@@ -698,6 +698,15 @@ wheels = [
698
  { url = "https://files.pythonhosted.org/packages/98/78/01c019cdb5d6498122777c1a43056ebb3ebfeef2076d9d026bfe15583b2b/click-8.3.1-py3-none-any.whl", hash = "sha256:981153a64e25f12d547d3426c367a4857371575ee7ad18df2a6183ab0545b2a6", size = 108274, upload-time = "2025-11-15T20:45:41.139Z" },
699
  ]
700
 
701
  [[package]]
702
  name = "cma"
703
  version = "4.4.4"
@@ -909,6 +918,30 @@ wheels = [
909
  { url = "https://files.pythonhosted.org/packages/bc/58/6b3d24e6b9bc474a2dcdee65dfd1f008867015408a271562e4b690561a4d/cryptography-46.0.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:8456928655f856c6e1533ff59d5be76578a7157224dbd9ce6872f25055ab9ab7", size = 3407605, upload-time = "2026-02-10T19:18:29.233Z" },
910
  ]
911
 
912
  [[package]]
913
  name = "cycler"
914
  version = "0.12.1"
@@ -1191,6 +1224,15 @@ wheels = [
1191
  { url = "https://files.pythonhosted.org/packages/21/f2/4454eefc15cc326b46530d230c58cc0bb91a1e9797f2842b2a1720cbb233/f90nml-1.5.0-py2.py3-none-any.whl", hash = "sha256:bdf616dbe7e83619feb86d54358fb8d97038133bfd8f9ba9a01eeca5dc4691a7", size = 51994, upload-time = "2025-10-07T15:25:09.064Z" },
1192
  ]
1193
 
1194
  [[package]]
1195
  name = "fastapi"
1196
  version = "0.135.1"
@@ -1489,11 +1531,16 @@ notebooks = [
1489
  { name = "ipykernel" },
1490
  { name = "jupyterlab" },
1491
  ]
1492
 
1493
  [package.metadata]
1494
  requires-dist = [
1495
  { name = "constellaration" },
1496
  { name = "fastapi", specifier = ">=0.115.0" },
1497
  { name = "ipykernel", marker = "extra == 'notebooks'", specifier = ">=6.29.0" },
1498
  { name = "jupyterlab", marker = "extra == 'notebooks'", specifier = ">=4.3.0" },
1499
  { name = "numpy", specifier = ">=2.0.0" },
@@ -1502,9 +1549,10 @@ requires-dist = [
1502
  { name = "pydantic", specifier = ">=2.10.0" },
1503
  { name = "pytest", marker = "extra == 'dev'", specifier = ">=8.3.0" },
1504
  { name = "ruff", marker = "extra == 'dev'", specifier = ">=0.11.0" },
1505
  { name = "uvicorn", specifier = ">=0.34.0" },
1506
  ]
1507
- provides-extras = ["notebooks", "dev"]
1508
 
1509
  [[package]]
1510
  name = "graphemeu"
@@ -1515,6 +1563,21 @@ wheels = [
1515
  { url = "https://files.pythonhosted.org/packages/69/18/36503ea63e1ecd0a95590d7b6b8b7d227a1e4541a154e1612a231def1bdc/graphemeu-0.7.2-py3-none-any.whl", hash = "sha256:1444520f6899fd30114fc2a39f297d86d10fa0f23bf7579f772f8bc7efaa2542", size = 22670, upload-time = "2025-01-15T09:48:57.241Z" },
1516
  ]
1517
 
1518
  [[package]]
1519
  name = "h11"
1520
  version = "0.16.0"
@@ -3130,6 +3193,108 @@ dependencies = [
3130
  ]
3131
  sdist = { url = "https://files.pythonhosted.org/packages/1a/95/5b99a5798b366ab242fe0b2190f3814b9321eb98c6e1e9c6b599b2b4ce84/nvgpu-0.10.0.tar.gz", hash = "sha256:c415f757e0c375357f8904a6ea0cee084ab0ce97ed11e4840f2c8839196b3918", size = 8445, upload-time = "2023-03-30T03:17:01.622Z" }
3132
 
3133
  [[package]]
3134
  name = "nvidia-ml-py"
3135
  version = "13.590.48"
@@ -3139,6 +3304,38 @@ wheels = [
3139
  { url = "https://files.pythonhosted.org/packages/fd/72/fb2af0d259a651affdce65fd6a495f0e07a685a0136baf585c5065204ee7/nvidia_ml_py-13.590.48-py3-none-any.whl", hash = "sha256:fd43d30ee9cd0b7940f5f9f9220b68d42722975e3992b6c21d14144c48760e43", size = 50680, upload-time = "2026-01-22T01:14:55.281Z" },
3140
  ]
3141
 
3142
  [[package]]
3143
  name = "openai"
3144
  version = "2.26.0"
@@ -5031,6 +5228,23 @@ wheels = [
5031
  { url = "https://files.pythonhosted.org/packages/61/28/8cb142d3fe80c4a2d8af54ca0b003f47ce0ba920974e7990fa6e016402d1/sse_starlette-3.3.2-py3-none-any.whl", hash = "sha256:5c3ea3dad425c601236726af2f27689b74494643f57017cafcb6f8c9acfbb862", size = 14270, upload-time = "2026-02-28T11:24:32.984Z" },
5032
  ]
5033
 
5034
  [[package]]
5035
  name = "stack-data"
5036
  version = "0.6.3"
@@ -5198,6 +5412,66 @@ wheels = [
5198
  { url = "https://files.pythonhosted.org/packages/c7/18/c86eb8e0202e32dd3df50d43d7ff9854f8e0603945ff398974c1d91ac1ef/tomli_w-1.2.0-py3-none-any.whl", hash = "sha256:188306098d013b691fcadc011abd66727d3c414c571bb01b1a174ba8c983cf90", size = 6675, upload-time = "2025-01-15T12:07:22.074Z" },
5199
  ]
5200
 
5201
  [[package]]
5202
  name = "tornado"
5203
  version = "6.5.4"
@@ -5238,6 +5512,19 @@ wheels = [
5238
  { url = "https://files.pythonhosted.org/packages/00/c0/8f5d070730d7836adc9c9b6408dec68c6ced86b304a9b26a14df072a6e8c/traitlets-5.14.3-py3-none-any.whl", hash = "sha256:b74e89e397b1ed28cc831db7aea759ba6640cb3de13090ca145426688ff1ac4f", size = 85359, upload-time = "2024-04-19T11:11:46.763Z" },
5239
  ]
5240
 
5241
  [[package]]
5242
  name = "typer"
5243
  version = "0.24.1"
 
698
  { url = "https://files.pythonhosted.org/packages/98/78/01c019cdb5d6498122777c1a43056ebb3ebfeef2076d9d026bfe15583b2b/click-8.3.1-py3-none-any.whl", hash = "sha256:981153a64e25f12d547d3426c367a4857371575ee7ad18df2a6183ab0545b2a6", size = 108274, upload-time = "2025-11-15T20:45:41.139Z" },
699
  ]
700
 
701
+ [[package]]
702
+ name = "cloudpickle"
703
+ version = "3.1.2"
704
+ source = { registry = "https://pypi.org/simple" }
705
+ sdist = { url = "https://files.pythonhosted.org/packages/27/fb/576f067976d320f5f0114a8d9fa1215425441bb35627b1993e5afd8111e5/cloudpickle-3.1.2.tar.gz", hash = "sha256:7fda9eb655c9c230dab534f1983763de5835249750e85fbcef43aaa30a9a2414", size = 22330, upload-time = "2025-11-03T09:25:26.604Z" }
706
+ wheels = [
707
+ { url = "https://files.pythonhosted.org/packages/88/39/799be3f2f0f38cc727ee3b4f1445fe6d5e4133064ec2e4115069418a5bb6/cloudpickle-3.1.2-py3-none-any.whl", hash = "sha256:9acb47f6afd73f60dc1df93bb801b472f05ff42fa6c84167d25cb206be1fbf4a", size = 22228, upload-time = "2025-11-03T09:25:25.534Z" },
708
+ ]
709
+
710
  [[package]]
711
  name = "cma"
712
  version = "4.4.4"
 
918
  { url = "https://files.pythonhosted.org/packages/bc/58/6b3d24e6b9bc474a2dcdee65dfd1f008867015408a271562e4b690561a4d/cryptography-46.0.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:8456928655f856c6e1533ff59d5be76578a7157224dbd9ce6872f25055ab9ab7", size = 3407605, upload-time = "2026-02-10T19:18:29.233Z" },
919
  ]
920
 
921
+ [[package]]
922
+ name = "cuda-bindings"
923
+ version = "12.9.4"
924
+ source = { registry = "https://pypi.org/simple" }
925
+ dependencies = [
926
+ { name = "cuda-pathfinder", marker = "platform_machine != 'ARM64' or sys_platform != 'win32'" },
927
+ ]
928
+ wheels = [
929
+ { url = "https://files.pythonhosted.org/packages/45/e7/b47792cc2d01c7e1d37c32402182524774dadd2d26339bd224e0e913832e/cuda_bindings-12.9.4-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c912a3d9e6b6651853eed8eed96d6800d69c08e94052c292fec3f282c5a817c9", size = 12210593, upload-time = "2025-10-21T14:51:36.574Z" },
930
+ { url = "https://files.pythonhosted.org/packages/a9/c1/dabe88f52c3e3760d861401bb994df08f672ec893b8f7592dc91626adcf3/cuda_bindings-12.9.4-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fda147a344e8eaeca0c6ff113d2851ffca8f7dfc0a6c932374ee5c47caa649c8", size = 12151019, upload-time = "2025-10-21T14:51:43.167Z" },
931
+ { url = "https://files.pythonhosted.org/packages/63/56/e465c31dc9111be3441a9ba7df1941fe98f4aa6e71e8788a3fb4534ce24d/cuda_bindings-12.9.4-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:32bdc5a76906be4c61eb98f546a6786c5773a881f3b166486449b5d141e4a39f", size = 11906628, upload-time = "2025-10-21T14:51:49.905Z" },
932
+ { url = "https://files.pythonhosted.org/packages/a3/84/1e6be415e37478070aeeee5884c2022713c1ecc735e6d82d744de0252eee/cuda_bindings-12.9.4-cp313-cp313t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:56e0043c457a99ac473ddc926fe0dc4046694d99caef633e92601ab52cbe17eb", size = 11925991, upload-time = "2025-10-21T14:51:56.535Z" },
933
+ { url = "https://files.pythonhosted.org/packages/d1/af/6dfd8f2ed90b1d4719bc053ff8940e494640fe4212dc3dd72f383e4992da/cuda_bindings-12.9.4-cp314-cp314-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8b72ee72a9cc1b531db31eebaaee5c69a8ec3500e32c6933f2d3b15297b53686", size = 11922703, upload-time = "2025-10-21T14:52:03.585Z" },
934
+ { url = "https://files.pythonhosted.org/packages/6c/19/90ac264acc00f6df8a49378eedec9fd2db3061bf9263bf9f39fd3d8377c3/cuda_bindings-12.9.4-cp314-cp314t-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d80bffc357df9988dca279734bc9674c3934a654cab10cadeed27ce17d8635ee", size = 11924658, upload-time = "2025-10-21T14:52:10.411Z" },
935
+ ]
936
+
937
+ [[package]]
938
+ name = "cuda-pathfinder"
939
+ version = "1.4.1"
940
+ source = { registry = "https://pypi.org/simple" }
941
+ wheels = [
942
+ { url = "https://files.pythonhosted.org/packages/07/02/59a5bc738a09def0b49aea0e460bdf97f65206d0d041246147cf6207e69c/cuda_pathfinder-1.4.1-py3-none-any.whl", hash = "sha256:40793006082de88e0950753655e55558a446bed9a7d9d0bcb48b2506d50ed82a", size = 43903, upload-time = "2026-03-06T21:05:24.372Z" },
943
+ ]
944
+
945
  [[package]]
946
  name = "cycler"
947
  version = "0.12.1"
 
1224
  { url = "https://files.pythonhosted.org/packages/21/f2/4454eefc15cc326b46530d230c58cc0bb91a1e9797f2842b2a1720cbb233/f90nml-1.5.0-py2.py3-none-any.whl", hash = "sha256:bdf616dbe7e83619feb86d54358fb8d97038133bfd8f9ba9a01eeca5dc4691a7", size = 51994, upload-time = "2025-10-07T15:25:09.064Z" },
1225
  ]
1226
 
1227
+ [[package]]
1228
+ name = "farama-notifications"
1229
+ version = "0.0.4"
1230
+ source = { registry = "https://pypi.org/simple" }
1231
+ sdist = { url = "https://files.pythonhosted.org/packages/2e/2c/8384832b7a6b1fd6ba95bbdcae26e7137bb3eedc955c42fd5cdcc086cfbf/Farama-Notifications-0.0.4.tar.gz", hash = "sha256:13fceff2d14314cf80703c8266462ebf3733c7d165336eee998fc58e545efd18", size = 2131, upload-time = "2023-02-27T18:28:41.047Z" }
1232
+ wheels = [
1233
+ { url = "https://files.pythonhosted.org/packages/05/2c/ffc08c54c05cdce6fbed2aeebc46348dbe180c6d2c541c7af7ba0aa5f5f8/Farama_Notifications-0.0.4-py3-none-any.whl", hash = "sha256:14de931035a41961f7c056361dc7f980762a143d05791ef5794a751a2caf05ae", size = 2511, upload-time = "2023-02-27T18:28:39.447Z" },
1234
+ ]
1235
+
1236
  [[package]]
1237
  name = "fastapi"
1238
  version = "0.135.1"
 
1531
  { name = "ipykernel" },
1532
  { name = "jupyterlab" },
1533
  ]
1534
+ training = [
1535
+ { name = "gymnasium" },
1536
+ { name = "stable-baselines3" },
1537
+ ]
1538
 
1539
  [package.metadata]
1540
  requires-dist = [
1541
  { name = "constellaration" },
1542
  { name = "fastapi", specifier = ">=0.115.0" },
1543
+ { name = "gymnasium", marker = "extra == 'training'", specifier = ">=1.0.0" },
1544
  { name = "ipykernel", marker = "extra == 'notebooks'", specifier = ">=6.29.0" },
1545
  { name = "jupyterlab", marker = "extra == 'notebooks'", specifier = ">=4.3.0" },
1546
  { name = "numpy", specifier = ">=2.0.0" },
 
1549
  { name = "pydantic", specifier = ">=2.10.0" },
1550
  { name = "pytest", marker = "extra == 'dev'", specifier = ">=8.3.0" },
1551
  { name = "ruff", marker = "extra == 'dev'", specifier = ">=0.11.0" },
1552
+ { name = "stable-baselines3", marker = "extra == 'training'", specifier = ">=2.5.0" },
1553
  { name = "uvicorn", specifier = ">=0.34.0" },
1554
  ]
1555
+ provides-extras = ["notebooks", "training", "dev"]
1556
 
1557
  [[package]]
1558
  name = "graphemeu"
 
1563
  { url = "https://files.pythonhosted.org/packages/69/18/36503ea63e1ecd0a95590d7b6b8b7d227a1e4541a154e1612a231def1bdc/graphemeu-0.7.2-py3-none-any.whl", hash = "sha256:1444520f6899fd30114fc2a39f297d86d10fa0f23bf7579f772f8bc7efaa2542", size = 22670, upload-time = "2025-01-15T09:48:57.241Z" },
1564
  ]
1565
 
1566
+ [[package]]
1567
+ name = "gymnasium"
1568
+ version = "1.2.3"
1569
+ source = { registry = "https://pypi.org/simple" }
1570
+ dependencies = [
1571
+ { name = "cloudpickle" },
1572
+ { name = "farama-notifications" },
1573
+ { name = "numpy" },
1574
+ { name = "typing-extensions" },
1575
+ ]
1576
+ sdist = { url = "https://files.pythonhosted.org/packages/76/59/653a9417d98ed3e29ef9734ba52c3495f6c6823b8d5c0c75369f25111708/gymnasium-1.2.3.tar.gz", hash = "sha256:2b2cb5b5fbbbdf3afb9f38ca952cc48aa6aa3e26561400d940747fda3ad42509", size = 829230, upload-time = "2025-12-18T16:51:10.234Z" }
1577
+ wheels = [
1578
+ { url = "https://files.pythonhosted.org/packages/56/d3/ea5f088e3638dbab12e5c20d6559d5b3bdaeaa1f2af74e526e6815836285/gymnasium-1.2.3-py3-none-any.whl", hash = "sha256:e6314bba8f549c7fdcc8677f7cd786b64908af6e79b57ddaa5ce1825bffb5373", size = 952113, upload-time = "2025-12-18T16:51:08.445Z" },
1579
+ ]
1580
+
1581
  [[package]]
1582
  name = "h11"
1583
  version = "0.16.0"
 
3193
  ]
3194
  sdist = { url = "https://files.pythonhosted.org/packages/1a/95/5b99a5798b366ab242fe0b2190f3814b9321eb98c6e1e9c6b599b2b4ce84/nvgpu-0.10.0.tar.gz", hash = "sha256:c415f757e0c375357f8904a6ea0cee084ab0ce97ed11e4840f2c8839196b3918", size = 8445, upload-time = "2023-03-30T03:17:01.622Z" }
3195
 
3196
+ [[package]]
3197
+ name = "nvidia-cublas-cu12"
3198
+ version = "12.8.4.1"
3199
+ source = { registry = "https://pypi.org/simple" }
3200
+ wheels = [
3201
+ { url = "https://files.pythonhosted.org/packages/dc/61/e24b560ab2e2eaeb3c839129175fb330dfcfc29e5203196e5541a4c44682/nvidia_cublas_cu12-12.8.4.1-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:8ac4e771d5a348c551b2a426eda6193c19aa630236b418086020df5ba9667142", size = 594346921, upload-time = "2025-03-07T01:44:31.254Z" },
3202
+ ]
3203
+
3204
+ [[package]]
3205
+ name = "nvidia-cuda-cupti-cu12"
3206
+ version = "12.8.90"
3207
+ source = { registry = "https://pypi.org/simple" }
3208
+ wheels = [
3209
+ { url = "https://files.pythonhosted.org/packages/f8/02/2adcaa145158bf1a8295d83591d22e4103dbfd821bcaf6f3f53151ca4ffa/nvidia_cuda_cupti_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:ea0cb07ebda26bb9b29ba82cda34849e73c166c18162d3913575b0c9db9a6182", size = 10248621, upload-time = "2025-03-07T01:40:21.213Z" },
3210
+ ]
3211
+
3212
+ [[package]]
3213
+ name = "nvidia-cuda-nvrtc-cu12"
3214
+ version = "12.8.93"
3215
+ source = { registry = "https://pypi.org/simple" }
3216
+ wheels = [
3217
+ { url = "https://files.pythonhosted.org/packages/05/6b/32f747947df2da6994e999492ab306a903659555dddc0fbdeb9d71f75e52/nvidia_cuda_nvrtc_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl", hash = "sha256:a7756528852ef889772a84c6cd89d41dfa74667e24cca16bb31f8f061e3e9994", size = 88040029, upload-time = "2025-03-07T01:42:13.562Z" },
3218
+ ]
3219
+
3220
+ [[package]]
3221
+ name = "nvidia-cuda-runtime-cu12"
3222
+ version = "12.8.90"
3223
+ source = { registry = "https://pypi.org/simple" }
3224
+ wheels = [
3225
+ { url = "https://files.pythonhosted.org/packages/0d/9b/a997b638fcd068ad6e4d53b8551a7d30fe8b404d6f1804abf1df69838932/nvidia_cuda_runtime_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:adade8dcbd0edf427b7204d480d6066d33902cab2a4707dcfc48a2d0fd44ab90", size = 954765, upload-time = "2025-03-07T01:40:01.615Z" },
3226
+ ]
3227
+
3228
+ [[package]]
3229
+ name = "nvidia-cudnn-cu12"
3230
+ version = "9.10.2.21"
3231
+ source = { registry = "https://pypi.org/simple" }
3232
+ dependencies = [
3233
+ { name = "nvidia-cublas-cu12", marker = "platform_machine != 'ARM64' or sys_platform != 'win32'" },
3234
+ ]
3235
+ wheels = [
3236
+ { url = "https://files.pythonhosted.org/packages/ba/51/e123d997aa098c61d029f76663dedbfb9bc8dcf8c60cbd6adbe42f76d049/nvidia_cudnn_cu12-9.10.2.21-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:949452be657fa16687d0930933f032835951ef0892b37d2d53824d1a84dc97a8", size = 706758467, upload-time = "2025-06-06T21:54:08.597Z" },
3237
+ ]
3238
+
3239
+ [[package]]
3240
+ name = "nvidia-cufft-cu12"
3241
+ version = "11.3.3.83"
3242
+ source = { registry = "https://pypi.org/simple" }
3243
+ dependencies = [
3244
+ { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'ARM64' or sys_platform != 'win32'" },
3245
+ ]
3246
+ wheels = [
3247
+ { url = "https://files.pythonhosted.org/packages/1f/13/ee4e00f30e676b66ae65b4f08cb5bcbb8392c03f54f2d5413ea99a5d1c80/nvidia_cufft_cu12-11.3.3.83-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d2dd21ec0b88cf61b62e6b43564355e5222e4a3fb394cac0db101f2dd0d4f74", size = 193118695, upload-time = "2025-03-07T01:45:27.821Z" },
3248
+ ]
3249
+
3250
+ [[package]]
3251
+ name = "nvidia-cufile-cu12"
3252
+ version = "1.13.1.3"
3253
+ source = { registry = "https://pypi.org/simple" }
3254
+ wheels = [
3255
+ { url = "https://files.pythonhosted.org/packages/bb/fe/1bcba1dfbfb8d01be8d93f07bfc502c93fa23afa6fd5ab3fc7c1df71038a/nvidia_cufile_cu12-1.13.1.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1d069003be650e131b21c932ec3d8969c1715379251f8d23a1860554b1cb24fc", size = 1197834, upload-time = "2025-03-07T01:45:50.723Z" },
3256
+ ]
3257
+
3258
+ [[package]]
3259
+ name = "nvidia-curand-cu12"
3260
+ version = "10.3.9.90"
3261
+ source = { registry = "https://pypi.org/simple" }
3262
+ wheels = [
3263
+ { url = "https://files.pythonhosted.org/packages/fb/aa/6584b56dc84ebe9cf93226a5cde4d99080c8e90ab40f0c27bda7a0f29aa1/nvidia_curand_cu12-10.3.9.90-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:b32331d4f4df5d6eefa0554c565b626c7216f87a06a4f56fab27c3b68a830ec9", size = 63619976, upload-time = "2025-03-07T01:46:23.323Z" },
3264
+ ]
3265
+
3266
+ [[package]]
3267
+ name = "nvidia-cusolver-cu12"
3268
+ version = "11.7.3.90"
3269
+ source = { registry = "https://pypi.org/simple" }
3270
+ dependencies = [
3271
+ { name = "nvidia-cublas-cu12", marker = "platform_machine != 'ARM64' or sys_platform != 'win32'" },
3272
+ { name = "nvidia-cusparse-cu12", marker = "platform_machine != 'ARM64' or sys_platform != 'win32'" },
3273
+ { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'ARM64' or sys_platform != 'win32'" },
3274
+ ]
3275
+ wheels = [
3276
+ { url = "https://files.pythonhosted.org/packages/85/48/9a13d2975803e8cf2777d5ed57b87a0b6ca2cc795f9a4f59796a910bfb80/nvidia_cusolver_cu12-11.7.3.90-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:4376c11ad263152bd50ea295c05370360776f8c3427b30991df774f9fb26c450", size = 267506905, upload-time = "2025-03-07T01:47:16.273Z" },
3277
+ ]
3278
+
3279
+ [[package]]
3280
+ name = "nvidia-cusparse-cu12"
3281
+ version = "12.5.8.93"
3282
+ source = { registry = "https://pypi.org/simple" }
3283
+ dependencies = [
3284
+ { name = "nvidia-nvjitlink-cu12", marker = "platform_machine != 'ARM64' or sys_platform != 'win32'" },
3285
+ ]
3286
+ wheels = [
3287
+ { url = "https://files.pythonhosted.org/packages/c2/f5/e1854cb2f2bcd4280c44736c93550cc300ff4b8c95ebe370d0aa7d2b473d/nvidia_cusparse_cu12-12.5.8.93-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1ec05d76bbbd8b61b06a80e1eaf8cf4959c3d4ce8e711b65ebd0443bb0ebb13b", size = 288216466, upload-time = "2025-03-07T01:48:13.779Z" },
3288
+ ]
3289
+
3290
+ [[package]]
3291
+ name = "nvidia-cusparselt-cu12"
3292
+ version = "0.7.1"
3293
+ source = { registry = "https://pypi.org/simple" }
3294
+ wheels = [
3295
+ { url = "https://files.pythonhosted.org/packages/56/79/12978b96bd44274fe38b5dde5cfb660b1d114f70a65ef962bcbbed99b549/nvidia_cusparselt_cu12-0.7.1-py3-none-manylinux2014_x86_64.whl", hash = "sha256:f1bb701d6b930d5a7cea44c19ceb973311500847f81b634d802b7b539dc55623", size = 287193691, upload-time = "2025-02-26T00:15:44.104Z" },
3296
+ ]
3297
+
3298
  [[package]]
3299
  name = "nvidia-ml-py"
3300
  version = "13.590.48"
 
3304
  { url = "https://files.pythonhosted.org/packages/fd/72/fb2af0d259a651affdce65fd6a495f0e07a685a0136baf585c5065204ee7/nvidia_ml_py-13.590.48-py3-none-any.whl", hash = "sha256:fd43d30ee9cd0b7940f5f9f9220b68d42722975e3992b6c21d14144c48760e43", size = 50680, upload-time = "2026-01-22T01:14:55.281Z" },
3305
  ]
3306
 
3307
+ [[package]]
3308
+ name = "nvidia-nccl-cu12"
3309
+ version = "2.27.5"
3310
+ source = { registry = "https://pypi.org/simple" }
3311
+ wheels = [
3312
+ { url = "https://files.pythonhosted.org/packages/6e/89/f7a07dc961b60645dbbf42e80f2bc85ade7feb9a491b11a1e973aa00071f/nvidia_nccl_cu12-2.27.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:ad730cf15cb5d25fe849c6e6ca9eb5b76db16a80f13f425ac68d8e2e55624457", size = 322348229, upload-time = "2025-06-26T04:11:28.385Z" },
3313
+ ]
3314
+
3315
+ [[package]]
3316
+ name = "nvidia-nvjitlink-cu12"
3317
+ version = "12.8.93"
3318
+ source = { registry = "https://pypi.org/simple" }
3319
+ wheels = [
3320
+ { url = "https://files.pythonhosted.org/packages/f6/74/86a07f1d0f42998ca31312f998bd3b9a7eff7f52378f4f270c8679c77fb9/nvidia_nvjitlink_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl", hash = "sha256:81ff63371a7ebd6e6451970684f916be2eab07321b73c9d244dc2b4da7f73b88", size = 39254836, upload-time = "2025-03-07T01:49:55.661Z" },
3321
+ ]
3322
+
3323
+ [[package]]
3324
+ name = "nvidia-nvshmem-cu12"
3325
+ version = "3.4.5"
3326
+ source = { registry = "https://pypi.org/simple" }
3327
+ wheels = [
3328
+ { url = "https://files.pythonhosted.org/packages/b5/09/6ea3ea725f82e1e76684f0708bbedd871fc96da89945adeba65c3835a64c/nvidia_nvshmem_cu12-3.4.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:042f2500f24c021db8a06c5eec2539027d57460e1c1a762055a6554f72c369bd", size = 139103095, upload-time = "2025-09-06T00:32:31.266Z" },
3329
+ ]
3330
+
3331
+ [[package]]
3332
+ name = "nvidia-nvtx-cu12"
3333
+ version = "12.8.90"
3334
+ source = { registry = "https://pypi.org/simple" }
3335
+ wheels = [
3336
+ { url = "https://files.pythonhosted.org/packages/a2/eb/86626c1bbc2edb86323022371c39aa48df6fd8b0a1647bc274577f72e90b/nvidia_nvtx_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:5b17e2001cc0d751a5bc2c6ec6d26ad95913324a4adb86788c944f8ce9ba441f", size = 89954, upload-time = "2025-03-07T01:42:44.131Z" },
3337
+ ]
3338
+
3339
  [[package]]
3340
  name = "openai"
3341
  version = "2.26.0"
 
5228
  { url = "https://files.pythonhosted.org/packages/61/28/8cb142d3fe80c4a2d8af54ca0b003f47ce0ba920974e7990fa6e016402d1/sse_starlette-3.3.2-py3-none-any.whl", hash = "sha256:5c3ea3dad425c601236726af2f27689b74494643f57017cafcb6f8c9acfbb862", size = 14270, upload-time = "2026-02-28T11:24:32.984Z" },
5229
  ]
5230
 
5231
+ [[package]]
5232
+ name = "stable-baselines3"
5233
+ version = "2.7.1"
5234
+ source = { registry = "https://pypi.org/simple" }
5235
+ dependencies = [
5236
+ { name = "cloudpickle" },
5237
+ { name = "gymnasium" },
5238
+ { name = "matplotlib" },
5239
+ { name = "numpy" },
5240
+ { name = "pandas" },
5241
+ { name = "torch" },
5242
+ ]
5243
+ sdist = { url = "https://files.pythonhosted.org/packages/c9/42/f284c28272422262a99cdf35ecd2e283fded2f75327e6d5e82a9f6d6fe62/stable_baselines3-2.7.1.tar.gz", hash = "sha256:cd90d12d9ee0d9584053f12215c1682b313be4e3a8d8007739319799c3d2c071", size = 220719, upload-time = "2025-12-05T11:22:03.691Z" }
5244
+ wheels = [
5245
+ { url = "https://files.pythonhosted.org/packages/df/cc/a3038d3833f329dcd03b2dce8b778e4b41044caff88b48429473b8629623/stable_baselines3-2.7.1-py3-none-any.whl", hash = "sha256:b017e76dfe5ca0ce6eabb29e79c42e8c7e125d5862bfcd43ce04ec19732348d0", size = 188039, upload-time = "2025-12-05T11:22:00.819Z" },
5246
+ ]
5247
+
5248
  [[package]]
5249
  name = "stack-data"
5250
  version = "0.6.3"
 
5412
  { url = "https://files.pythonhosted.org/packages/c7/18/c86eb8e0202e32dd3df50d43d7ff9854f8e0603945ff398974c1d91ac1ef/tomli_w-1.2.0-py3-none-any.whl", hash = "sha256:188306098d013b691fcadc011abd66727d3c414c571bb01b1a174ba8c983cf90", size = 6675, upload-time = "2025-01-15T12:07:22.074Z" },
5413
  ]
5414
 
5415
+ [[package]]
5416
+ name = "torch"
5417
+ version = "2.10.0"
5418
+ source = { registry = "https://pypi.org/simple" }
5419
+ dependencies = [
5420
+ { name = "cuda-bindings", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5421
+ { name = "filelock" },
5422
+ { name = "fsspec" },
5423
+ { name = "jinja2" },
5424
+ { name = "networkx" },
5425
+ { name = "nvidia-cublas-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5426
+ { name = "nvidia-cuda-cupti-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5427
+ { name = "nvidia-cuda-nvrtc-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5428
+ { name = "nvidia-cuda-runtime-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5429
+ { name = "nvidia-cudnn-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5430
+ { name = "nvidia-cufft-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5431
+ { name = "nvidia-cufile-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5432
+ { name = "nvidia-curand-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5433
+ { name = "nvidia-cusolver-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5434
+ { name = "nvidia-cusparse-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5435
+ { name = "nvidia-cusparselt-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5436
+ { name = "nvidia-nccl-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5437
+ { name = "nvidia-nvjitlink-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5438
+ { name = "nvidia-nvshmem-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5439
+ { name = "nvidia-nvtx-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5440
+ { name = "setuptools", marker = "python_full_version >= '3.12'" },
5441
+ { name = "sympy" },
5442
+ { name = "triton", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
5443
+ { name = "typing-extensions" },
5444
+ ]
5445
+ wheels = [
5446
+ { url = "https://files.pythonhosted.org/packages/0f/8b/4b61d6e13f7108f36910df9ab4b58fd389cc2520d54d81b88660804aad99/torch-2.10.0-2-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:418997cb02d0a0f1497cf6a09f63166f9f5df9f3e16c8a716ab76a72127c714f", size = 79423467, upload-time = "2026-02-10T21:44:48.711Z" },
5447
+ { url = "https://files.pythonhosted.org/packages/d3/54/a2ba279afcca44bbd320d4e73675b282fcee3d81400ea1b53934efca6462/torch-2.10.0-2-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:13ec4add8c3faaed8d13e0574f5cd4a323c11655546f91fbe6afa77b57423574", size = 79498202, upload-time = "2026-02-10T21:44:52.603Z" },
5448
+ { url = "https://files.pythonhosted.org/packages/ec/23/2c9fe0c9c27f7f6cb865abcea8a4568f29f00acaeadfc6a37f6801f84cb4/torch-2.10.0-2-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:e521c9f030a3774ed770a9c011751fb47c4d12029a3d6522116e48431f2ff89e", size = 79498254, upload-time = "2026-02-10T21:44:44.095Z" },
5449
+ { url = "https://files.pythonhosted.org/packages/78/89/f5554b13ebd71e05c0b002f95148033e730d3f7067f67423026cc9c69410/torch-2.10.0-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:3282d9febd1e4e476630a099692b44fdc214ee9bf8ee5377732d9d9dfe5712e4", size = 145992610, upload-time = "2026-01-21T16:25:26.327Z" },
5450
+ { url = "https://files.pythonhosted.org/packages/ae/30/a3a2120621bf9c17779b169fc17e3dc29b230c29d0f8222f499f5e159aa8/torch-2.10.0-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:a2f9edd8dbc99f62bc4dfb78af7bf89499bca3d753423ac1b4e06592e467b763", size = 915607863, upload-time = "2026-01-21T16:25:06.696Z" },
5451
+ { url = "https://files.pythonhosted.org/packages/6f/3d/c87b33c5f260a2a8ad68da7147e105f05868c281c63d65ed85aa4da98c66/torch-2.10.0-cp311-cp311-win_amd64.whl", hash = "sha256:29b7009dba4b7a1c960260fc8ac85022c784250af43af9fb0ebafc9883782ebd", size = 113723116, upload-time = "2026-01-21T16:25:21.916Z" },
5452
+ { url = "https://files.pythonhosted.org/packages/61/d8/15b9d9d3a6b0c01b883787bd056acbe5cc321090d4b216d3ea89a8fcfdf3/torch-2.10.0-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:b7bd80f3477b830dd166c707c5b0b82a898e7b16f59a7d9d42778dd058272e8b", size = 79423461, upload-time = "2026-01-21T16:24:50.266Z" },
5453
+ { url = "https://files.pythonhosted.org/packages/cc/af/758e242e9102e9988969b5e621d41f36b8f258bb4a099109b7a4b4b50ea4/torch-2.10.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:5fd4117d89ffd47e3dcc71e71a22efac24828ad781c7e46aaaf56bf7f2796acf", size = 145996088, upload-time = "2026-01-21T16:24:44.171Z" },
5454
+ { url = "https://files.pythonhosted.org/packages/23/8e/3c74db5e53bff7ed9e34c8123e6a8bfef718b2450c35eefab85bb4a7e270/torch-2.10.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:787124e7db3b379d4f1ed54dd12ae7c741c16a4d29b49c0226a89bea50923ffb", size = 915711952, upload-time = "2026-01-21T16:23:53.503Z" },
5455
+ { url = "https://files.pythonhosted.org/packages/6e/01/624c4324ca01f66ae4c7cd1b74eb16fb52596dce66dbe51eff95ef9e7a4c/torch-2.10.0-cp312-cp312-win_amd64.whl", hash = "sha256:2c66c61f44c5f903046cc696d088e21062644cbe541c7f1c4eaae88b2ad23547", size = 113757972, upload-time = "2026-01-21T16:24:39.516Z" },
5456
+ { url = "https://files.pythonhosted.org/packages/c9/5c/dee910b87c4d5c0fcb41b50839ae04df87c1cfc663cf1b5fca7ea565eeaa/torch-2.10.0-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:6d3707a61863d1c4d6ebba7be4ca320f42b869ee657e9b2c21c736bf17000294", size = 79498198, upload-time = "2026-01-21T16:24:34.704Z" },
5457
+ { url = "https://files.pythonhosted.org/packages/c9/6f/f2e91e34e3fcba2e3fc8d8f74e7d6c22e74e480bbd1db7bc8900fdf3e95c/torch-2.10.0-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:5c4d217b14741e40776dd7074d9006fd28b8a97ef5654db959d8635b2fe5f29b", size = 146004247, upload-time = "2026-01-21T16:24:29.335Z" },
5458
+ { url = "https://files.pythonhosted.org/packages/98/fb/5160261aeb5e1ee12ee95fe599d0541f7c976c3701d607d8fc29e623229f/torch-2.10.0-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:6b71486353fce0f9714ca0c9ef1c850a2ae766b409808acd58e9678a3edb7738", size = 915716445, upload-time = "2026-01-21T16:22:45.353Z" },
5459
+ { url = "https://files.pythonhosted.org/packages/6a/16/502fb1b41e6d868e8deb5b0e3ae926bbb36dab8ceb0d1b769b266ad7b0c3/torch-2.10.0-cp313-cp313-win_amd64.whl", hash = "sha256:c2ee399c644dc92ef7bc0d4f7e74b5360c37cdbe7c5ba11318dda49ffac2bc57", size = 113757050, upload-time = "2026-01-21T16:24:19.204Z" },
5460
+ { url = "https://files.pythonhosted.org/packages/1a/0b/39929b148f4824bc3ad6f9f72a29d4ad865bcf7ebfc2fa67584773e083d2/torch-2.10.0-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:3202429f58309b9fa96a614885eace4b7995729f44beb54d3e4a47773649d382", size = 79851305, upload-time = "2026-01-21T16:24:09.209Z" },
5461
+ { url = "https://files.pythonhosted.org/packages/d8/14/21fbce63bc452381ba5f74a2c0a959fdf5ad5803ccc0c654e752e0dbe91a/torch-2.10.0-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:aae1b29cd68e50a9397f5ee897b9c24742e9e306f88a807a27d617f07adb3bd8", size = 146005472, upload-time = "2026-01-21T16:22:29.022Z" },
5462
+ { url = "https://files.pythonhosted.org/packages/54/fd/b207d1c525cb570ef47f3e9f836b154685011fce11a2f444ba8a4084d042/torch-2.10.0-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:6021db85958db2f07ec94e1bc77212721ba4920c12a18dc552d2ae36a3eb163f", size = 915612644, upload-time = "2026-01-21T16:21:47.019Z" },
5463
+ { url = "https://files.pythonhosted.org/packages/36/53/0197f868c75f1050b199fe58f9bf3bf3aecac9b4e85cc9c964383d745403/torch-2.10.0-cp313-cp313t-win_amd64.whl", hash = "sha256:ff43db38af76fda183156153983c9a096fc4c78d0cd1e07b14a2314c7f01c2c8", size = 113997015, upload-time = "2026-01-21T16:23:00.767Z" },
5464
+ { url = "https://files.pythonhosted.org/packages/0e/13/e76b4d9c160e89fff48bf16b449ea324bda84745d2ab30294c37c2434c0d/torch-2.10.0-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:cdf2a523d699b70d613243211ecaac14fe9c5df8a0b0a9c02add60fb2a413e0f", size = 79498248, upload-time = "2026-01-21T16:23:09.315Z" },
5465
+ { url = "https://files.pythonhosted.org/packages/4f/93/716b5ac0155f1be70ed81bacc21269c3ece8dba0c249b9994094110bfc51/torch-2.10.0-cp314-cp314-macosx_14_0_arm64.whl", hash = "sha256:bf0d9ff448b0218e0433aeb198805192346c4fd659c852370d5cc245f602a06a", size = 79464992, upload-time = "2026-01-21T16:23:05.162Z" },
5466
+ { url = "https://files.pythonhosted.org/packages/69/2b/51e663ff190c9d16d4a8271203b71bc73a16aa7619b9f271a69b9d4a936b/torch-2.10.0-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:233aed0659a2503b831d8a67e9da66a62c996204c0bba4f4c442ccc0c68a3f60", size = 146018567, upload-time = "2026-01-21T16:22:23.393Z" },
5467
+ { url = "https://files.pythonhosted.org/packages/5e/cd/4b95ef7f293b927c283db0b136c42be91c8ec6845c44de0238c8c23bdc80/torch-2.10.0-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:682497e16bdfa6efeec8cde66531bc8d1fbbbb4d8788ec6173c089ed3cc2bfe5", size = 915721646, upload-time = "2026-01-21T16:21:16.983Z" },
5468
+ { url = "https://files.pythonhosted.org/packages/56/97/078a007208f8056d88ae43198833469e61a0a355abc0b070edd2c085eb9a/torch-2.10.0-cp314-cp314-win_amd64.whl", hash = "sha256:6528f13d2a8593a1a412ea07a99812495bec07e9224c28b2a25c0a30c7da025c", size = 113752373, upload-time = "2026-01-21T16:22:13.471Z" },
5469
+ { url = "https://files.pythonhosted.org/packages/d8/94/71994e7d0d5238393df9732fdab607e37e2b56d26a746cb59fdb415f8966/torch-2.10.0-cp314-cp314t-macosx_14_0_arm64.whl", hash = "sha256:f5ab4ba32383061be0fb74bda772d470140a12c1c3b58a0cfbf3dae94d164c28", size = 79850324, upload-time = "2026-01-21T16:22:09.494Z" },
5470
+ { url = "https://files.pythonhosted.org/packages/e2/65/1a05346b418ea8ccd10360eef4b3e0ce688fba544e76edec26913a8d0ee0/torch-2.10.0-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:716b01a176c2a5659c98f6b01bf868244abdd896526f1c692712ab36dbaf9b63", size = 146006482, upload-time = "2026-01-21T16:22:18.42Z" },
5471
+ { url = "https://files.pythonhosted.org/packages/1d/b9/5f6f9d9e859fc3235f60578fa64f52c9c6e9b4327f0fe0defb6de5c0de31/torch-2.10.0-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:d8f5912ba938233f86361e891789595ff35ca4b4e2ac8fe3670895e5976731d6", size = 915613050, upload-time = "2026-01-21T16:20:49.035Z" },
+ { url = "https://files.pythonhosted.org/packages/66/4d/35352043ee0eaffdeff154fad67cd4a31dbed7ff8e3be1cc4549717d6d51/torch-2.10.0-cp314-cp314t-win_amd64.whl", hash = "sha256:71283a373f0ee2c89e0f0d5f446039bdabe8dbc3c9ccf35f0f784908b0acd185", size = 113995816, upload-time = "2026-01-21T16:22:05.312Z" },
+ ]
+
  [[package]]
  name = "tornado"
  version = "6.5.4"
 
  { url = "https://files.pythonhosted.org/packages/00/c0/8f5d070730d7836adc9c9b6408dec68c6ced86b304a9b26a14df072a6e8c/traitlets-5.14.3-py3-none-any.whl", hash = "sha256:b74e89e397b1ed28cc831db7aea759ba6640cb3de13090ca145426688ff1ac4f", size = 85359, upload-time = "2024-04-19T11:11:46.763Z" },
  ]
 
+ [[package]]
+ name = "triton"
+ version = "3.6.0"
+ source = { registry = "https://pypi.org/simple" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/e0/12/b05ba554d2c623bffa59922b94b0775673de251f468a9609bc9e45de95e9/triton-3.6.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e8e323d608e3a9bfcc2d9efcc90ceefb764a82b99dea12a86d643c72539ad5d3", size = 188214640, upload-time = "2026-01-20T16:00:35.869Z" },
+ { url = "https://files.pythonhosted.org/packages/ab/a8/cdf8b3e4c98132f965f88c2313a4b493266832ad47fb52f23d14d4f86bb5/triton-3.6.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:74caf5e34b66d9f3a429af689c1c7128daba1d8208df60e81106b115c00d6fca", size = 188266850, upload-time = "2026-01-20T16:00:43.041Z" },
+ { url = "https://files.pythonhosted.org/packages/f9/0b/37d991d8c130ce81a8728ae3c25b6e60935838e9be1b58791f5997b24a54/triton-3.6.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:10c7f76c6e72d2ef08df639e3d0d30729112f47a56b0c81672edc05ee5116ac9", size = 188289450, upload-time = "2026-01-20T16:00:49.136Z" },
+ { url = "https://files.pythonhosted.org/packages/35/f8/9c66bfc55361ec6d0e4040a0337fb5924ceb23de4648b8a81ae9d33b2b38/triton-3.6.0-cp313-cp313t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d002e07d7180fd65e622134fbd980c9a3d4211fb85224b56a0a0efbd422ab72f", size = 188400296, upload-time = "2026-01-20T16:00:56.042Z" },
+ { url = "https://files.pythonhosted.org/packages/df/3d/9e7eee57b37c80cec63322c0231bb6da3cfe535a91d7a4d64896fcb89357/triton-3.6.0-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a17a5d5985f0ac494ed8a8e54568f092f7057ef60e1b0fa09d3fd1512064e803", size = 188273063, upload-time = "2026-01-20T16:01:07.278Z" },
+ { url = "https://files.pythonhosted.org/packages/f6/56/6113c23ff46c00aae423333eb58b3e60bdfe9179d542781955a5e1514cb3/triton-3.6.0-cp314-cp314t-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:46bd1c1af4b6704e554cad2eeb3b0a6513a980d470ccfa63189737340c7746a7", size = 188397994, upload-time = "2026-01-20T16:01:14.236Z" },
+ ]
+
  [[package]]
  name = "typer"
  version = "0.24.1"