LUNDECHEN/space-mining-ppo


LUNDECHEN committed on Sep 3, 2025

Commit

1bcfe99

verified ·

1 Parent(s): 90c2003

Upload README.md with huggingface_hub

Files changed (1)

README.md +69 -0

README.md ADDED

	@@ -0,0 +1,69 @@

+# SpaceMining PPO Agent
+A PPO agent trained on the SpaceMining Gymnasium environment. This repository includes the final Stable-Baselines3 checkpoint, configuration, and evaluation metrics.
+## Model Description
+- Algorithm: PPO (Stable-Baselines3)
+- Environment: SpaceMining (Gymnasium)
+- Action Space: Box(3,) — thrust x, thrust y, mine toggle
+- Observation Space: Box(53,) — agent state, nearby asteroids (up to 15), mothership relative position
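To make the three-component action layout listed above concrete, here is a minimal sketch that unpacks an action vector into named fields. The `MiningAction` type, the `decode_action` helper, and the rule that the mine toggle is "on" when its value exceeds `0.0` are illustrative assumptions, not part of the `space_mining` package; the environment may interpret the third component differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MiningAction:
    """Hypothetical named view of a Box(3,) action: thrust x, thrust y, mine toggle."""
    thrust_x: float
    thrust_y: float
    mine: bool


def decode_action(action: Sequence[float], mine_threshold: float = 0.0) -> MiningAction:
    """Split a length-3 action vector into named components.

    Assumption: the toggle counts as active when its value is above
    `mine_threshold`. The real environment may use another rule.
    """
    if len(action) != 3:
        raise ValueError(f"expected 3 action components, got {len(action)}")
    thrust_x, thrust_y, mine_signal = (float(v) for v in action)
    return MiningAction(thrust_x, thrust_y, mine_signal > mine_threshold)


if __name__ == "__main__":
    print(decode_action([0.5, -0.25, 0.9]))
```

The helper also accepts a NumPy array, so it can be applied to the action returned by `model.predict` when inspecting what the policy is doing.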
+## Quickstart
+```python
+from huggingface_hub import hf_hub_download
+from stable_baselines3 import PPO
+from space_mining import make_env
+ckpt_path = hf_hub_download(repo_id="LUNDECHEN/space-mining-ppo", filename="final_model.zip")
+model = PPO.load(ckpt_path)
+env = make_env(render_mode='rgb_array')
+obs, _ = env.reset()
+for _ in range(300):
+    # SB3 `predict` returns an `(action, state)` tuple.
+    # The state is only meaningful for recurrent policies, so it is discarded here.
+    action, _state = model.predict(obs, deterministic=True)
+    obs, reward, terminated, truncated, info = env.step(action)
+    if terminated or truncated:
+        break
+env.close()
+```
+## Training Configuration
+- See `hyperparams.json` (algorithm hyperparameters)
+- See `env_config.json` (environment parameters)
+- See `training_args.json` (timesteps, device, versions)
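The three configuration files listed above can be read into plain dictionaries. The sketch below assumes they are ordinary JSON documents stored at the repository root; the `load_configs` and `hub_resolver` helpers are hypothetical names, and no particular keys are assumed because the file contents are not shown here.

```python
import json
from typing import Callable, Dict

REPO_ID = "LUNDECHEN/space-mining-ppo"
CONFIG_FILES = ("hyperparams.json", "env_config.json", "training_args.json")


def hub_resolver(filename: str) -> str:
    """Download one file from the Hub and return its local cache path."""
    # Imported lazily so the loader can also be used offline with local files.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=filename)


def load_configs(resolve: Callable[[str], str] = hub_resolver) -> Dict[str, dict]:
    """Parse each configuration file into a dict, keyed by file name.

    `resolve` maps a file name to a readable local path.
    """
    configs = {}
    for name in CONFIG_FILES:
        with open(resolve(name), "r", encoding="utf-8") as fh:
            configs[name] = json.load(fh)
    return configs


if __name__ == "__main__":
    for name, cfg in load_configs().items():
        print(name, sorted(cfg))
```

Passing a custom `resolve` function (for example, one that joins the name onto a local checkout directory) avoids any network access.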
+## Evaluation
+- See `evaluation.json`
+| Metric        | Value |
+|---------------|-------|
+| mean_reward   | 1037.7470 |
+| std_reward    | 1449.5437 |
+| episodes      | 100 |
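Because the reported standard deviation is larger than the mean, it can help to see how precisely 100 episodes pin down the mean reward. The sketch below derives the standard error and a rough 95% interval from the three table values. It assumes a normal approximation, which may be crude if episode rewards are heavily skewed; `summarize` is a hypothetical helper name.

```python
import math

# Values copied from the evaluation table above.
MEAN_REWARD = 1037.7470
STD_REWARD = 1449.5437
EPISODES = 100


def summarize(mean: float, std: float, n: int, z: float = 1.96) -> dict:
    """Standard error and normal-approximation interval for the mean reward."""
    std_error = std / math.sqrt(n)
    return {
        "std_error": std_error,
        "ci_low": mean - z * std_error,
        "ci_high": mean + z * std_error,
        # Ratio of spread to mean; values above 1 indicate high episode-to-episode variance.
        "coeff_of_variation": std / mean,
    }


if __name__ == "__main__":
    for key, value in summarize(MEAN_REWARD, STD_REWARD, EPISODES).items():
        print(f"{key}: {value:.4f}")
```

With these inputs the standard error is about 145, so the mean reward estimate itself carries an uncertainty of a few hundred reward units.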
+## Agent Behavior
+![Agent in action](agent_long.gif)
+## License
+- MIT
+## Authors
+- Xinning Zhu (zhuxinning@shu.edu.cn)
+- Lunde Chen (lundechen@shu.edu.cn)
+## Training Details
+- **Training Steps**: 5,000,000
+- **Device**: cpu
+- **Model Type**: best
+- **GitHub Run**: [17421809264](https://github.com/reveurmichael/space_mining/actions/runs/17421809264)