	# SpaceMining PPO Agent

	A PPO agent trained on the SpaceMining Gymnasium environment. This repository includes the final Stable-Baselines3 checkpoint, configuration, and evaluation metrics.

	## Model Description

	- Algorithm: PPO (Stable-Baselines3)
	- Environment: SpaceMining (Gymnasium)
	- Action Space: Box(3,) — thrust x, thrust y, mine toggle
	- Observation Space: Box(53,) — agent state, nearby asteroids (up to 15), mothership relative position
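
Before wiring the checkpoint into a custom loop, it can help to confirm that the environment exposes the shapes listed above. The following is a minimal sketch: `check_spaces` is a hypothetical helper (not part of `space_mining` or Stable-Baselines3), and it assumes only that the environment has Gymnasium-style `action_space` and `observation_space` attributes with a `shape`.

```python
from types import SimpleNamespace

# Shapes as documented in this model card.
EXPECTED_ACTION_SHAPE = (3,)        # thrust x, thrust y, mine toggle
EXPECTED_OBSERVATION_SHAPE = (53,)  # agent state, nearby asteroids, mothership position


def check_spaces(env):
    """Return a list of mismatch messages; an empty list means the shapes match.

    Hypothetical helper: accepts any object exposing Gymnasium-style
    `action_space.shape` and `observation_space.shape`.
    """
    problems = []
    action_shape = tuple(env.action_space.shape)
    observation_shape = tuple(env.observation_space.shape)
    if action_shape != EXPECTED_ACTION_SHAPE:
        problems.append(f"action shape {action_shape} != {EXPECTED_ACTION_SHAPE}")
    if observation_shape != EXPECTED_OBSERVATION_SHAPE:
        problems.append(
            f"observation shape {observation_shape} != {EXPECTED_OBSERVATION_SHAPE}"
        )
    return problems


# Stand-in object so the sketch runs without the environment installed.
fake_env = SimpleNamespace(
    action_space=SimpleNamespace(shape=(3,)),
    observation_space=SimpleNamespace(shape=(53,)),
)
print(check_spaces(fake_env))
```

With the real environment, the equivalent call would be `check_spaces(make_env(render_mode="rgb_array"))`, using the `make_env` import shown in the Quickstart.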

	## Quickstart

	```python
	from huggingface_hub import hf_hub_download
	from stable_baselines3 import PPO
	from space_mining import make_env

	# Download the checkpoint from the Hub and load it.
	ckpt_path = hf_hub_download(repo_id="LUNDECHEN/space-mining-ppo", filename="final_model.zip")
	model = PPO.load(ckpt_path)

	env = make_env(render_mode="rgb_array")
	obs, _ = env.reset()
	for _ in range(300):
	    # SB3 `predict` returns `(action, state)`; take the action and tolerate other return shapes.
	    prediction = model.predict(obs, deterministic=True)
	    action = prediction[0] if isinstance(prediction, (tuple, list)) else prediction
	    obs, reward, terminated, truncated, info = env.step(action)
	    if terminated or truncated:
	        break
	env.close()
	```

	## Training Configuration

	- See `hyperparams.json` (algorithm hyperparameters)
	- See `env_config.json` (environment parameters)
	- See `training_args.json` (timesteps, device, versions)
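
The configuration files are JSON, so they can be inspected without Stable-Baselines3. The sketch below assumes the files have already been downloaded into a local directory (for example with `hf_hub_download`, as in the Quickstart). `load_configs` is a hypothetical helper, and no particular keys are assumed because this card does not list them.

```python
import json
from pathlib import Path

# File names as listed in this model card.
CONFIG_FILES = ("hyperparams.json", "env_config.json", "training_args.json")


def load_configs(directory):
    """Load each config file found in `directory`; missing files are skipped."""
    configs = {}
    for name in CONFIG_FILES:
        path = Path(directory) / name
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                configs[name] = json.load(handle)
    return configs


if __name__ == "__main__":
    for name, content in load_configs(".").items():
        # Show only the top-level keys, since the schema is not documented here.
        keys = sorted(content) if isinstance(content, dict) else type(content).__name__
        print(name, keys)
```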

	## Evaluation

	- See `evaluation.json`

	| Metric      | Value     |
	|-------------|-----------|
	| mean_reward | 1037.7470 |
	| std_reward  | 1449.5437 |
	| episodes    | 100       |
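
Because `std_reward` exceeds `mean_reward`, per-episode returns vary widely, so the uncertainty of the mean is worth a quick check. This is a rough sketch, assuming the 100 episodes are independent and that `std_reward` is the per-episode standard deviation (the card does not say whether it is a sample or population estimate). The 1.96 factor is the usual normal approximation for a 95% interval.

```python
import math

# Values copied from the evaluation table.
mean_reward = 1037.7470
std_reward = 1449.5437
episodes = 100

# Standard error of the mean under the independence assumption.
standard_error = std_reward / math.sqrt(episodes)

# Normal-approximation 95% confidence interval for the mean reward.
half_width = 1.96 * standard_error
ci_low = mean_reward - half_width
ci_high = mean_reward + half_width

print(f"standard error: {standard_error:.2f}")
print(f"approx. 95% CI for the mean: [{ci_low:.2f}, {ci_high:.2f}]")
```

Under these assumptions the standard error is about 145, which puts the interval at roughly 754 to 1322.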

	## Agent Behavior

	![Agent in action](agent_long.gif)

	## License

	- MIT

	## Authors

	- Xinning Zhu (zhuxinning@shu.edu.cn)
	- Lunde Chen (lundechen@shu.edu.cn)


	## Training Details

	- Training Steps: 5,000,000
	- Device: cpu
	- Model Type: best
	- GitHub Run: [17421809264](https://github.com/reveurmichael/space_mining/actions/runs/17421809264)