---
tags:
- reinforcement-learning
- minecraft
- stable-baselines3
- PPO
- deep-reinforcement-learning
library_name: stable-baselines3
model-index:
- name: minecraft-learning-distributed_470k
  results: []
---

# minecraft-learning-distributed_470k

A Minecraft RL agent trained with PPO (Proximal Policy Optimization) using Stable-Baselines3.

This agent was trained to gather resources in Minecraft.

## Training Details

| Metric | Value |
|--------|-------|
| **Total Steps** | 483,923 |
| **Episodes** | 56 |
| **Mean Reward** | 0.64 |
| **Best Reward** | 26.20 |
| **Reward Scheme** | gathering |
| **Learning Rate** | 0.0003 |

## Hardware

- **Training:** NVIDIA RTX 5090 (32GB VRAM)
- **Environment:** NVIDIA Jetson Orin AGX (64GB RAM)
- **LLM Server:** NVIDIA DGX Spark - GPT-OSS-20B (vLLM)

## Architecture

- **Algorithm:** PPO (Proximal Policy Optimization)
- **Policy:** MLP with [512, 512] hidden layers
- **Observation Space:** 82 dimensions (position, velocity, vitals, hotbar, craftable flags)
- **Action Space:** 37 discrete actions (movement, mining, crafting, inventory)
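
As a rough size check on the policy described above, the sketch below counts trainable parameters for an MLP with two 512-unit hidden layers, 82 inputs, and 37 action logits. It assumes the actor and critic are separate networks of identical shape, which is how recent Stable-Baselines3 releases interpret a list-valued `net_arch`; the helper name `mlp_params` is illustrative and not part of any library.

```python
def mlp_params(sizes):
    """Count weights and biases in a fully connected stack of layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


OBS_DIM, N_ACTIONS, HIDDEN = 82, 37, [512, 512]

# Assumption: actor and critic do not share hidden layers.
actor = mlp_params([OBS_DIM, *HIDDEN, N_ACTIONS])  # hidden layers + action logits
critic = mlp_params([OBS_DIM, *HIDDEN, 1])         # hidden layers + scalar value
total = actor + critic

print(f"actor={actor:,} critic={critic:,} total={total:,}")
```

Under that assumption the model has roughly 630k parameters, small enough to run inference comfortably on CPU.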

## Usage

```python
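from huggingface_hub import hf_hub_download
from stable_baselines3 import PPO

# Download the checkpoint; hf_hub_download returns the local file path
model_path = hf_hub_download(
    repo_id='cahlen/minecraft-learning-distributed_470k',
    filename='model.zip',
    local_dir='./models'
)

# Load the trained policy
model = PPO.load(model_path)

# Run inference. `env` must be your own instance of the custom Minecraft
# environment (not included here); Gymnasium's reset() returns (obs, info).
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)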
```

## Environment Setup

This model was trained on a custom Minecraft environment using:
- [Mineflayer](https://github.com/PrismarineJS/mineflayer) for bot control
- Custom Gymnasium wrapper for RL interface
- Vision features extracted from game data (not computer vision)
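
The custom wrapper itself is not included here, so the following is only a dependency-free sketch of the interface shape such an environment exposes. The class `StubMinecraftEnv`, its zero-filled observations, and its random rewards are all hypothetical placeholders; a real wrapper would subclass `gymnasium.Env`, declare its observation and action spaces, and exchange data with a Mineflayer bot.

```python
import random

OBS_DIM = 82    # observation size stated in the Architecture section
N_ACTIONS = 37  # number of discrete actions stated in the Architecture section


class StubMinecraftEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step contract."""

    def __init__(self, max_steps=100, seed=0):
        self.max_steps = max_steps
        self._rng = random.Random(seed)
        self._t = 0

    def reset(self, seed=None, options=None):
        if seed is not None:
            self._rng = random.Random(seed)
        self._t = 0
        obs = [0.0] * OBS_DIM  # placeholder for real game-state features
        return obs, {}         # Gymnasium convention: (observation, info)

    def step(self, action):
        if not 0 <= action < N_ACTIONS:
            raise ValueError(f"action must be in [0, {N_ACTIONS})")
        self._t += 1
        obs = [0.0] * OBS_DIM
        reward = self._rng.random() * 0.1      # placeholder, not the real reward
        terminated = False                     # never set in this stub
        truncated = self._t >= self.max_steps  # time-limit cutoff
        return obs, reward, terminated, truncated, {}


env = StubMinecraftEnv(max_steps=3)
obs, info = env.reset()
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(0)
    done = terminated or truncated
```

The five-value return from `step` keeps true episode endings (`terminated`) distinct from time-limit cutoffs (`truncated`), which is the distinction Gymnasium makes.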

## Training Configuration

```python
PPO(
    "MlpPolicy",
    env,
    learning_rate=1e-3,
    n_steps=256,
    batch_size=256,
    n_epochs=15,
    gamma=0.99,
    gae_lambda=0.95,
    ent_coef=0.02,
    clip_range=0.2,
    policy_kwargs={"net_arch": [512, 512]},
)
```
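
To see what these settings imply for each policy update, the sketch below works through the arithmetic. The number of parallel environments is not stated in this card, so `n_envs` is an assumed parameter, and `ppo_update_schedule` is an illustrative helper rather than a Stable-Baselines3 function.

```python
def ppo_update_schedule(n_steps, batch_size, n_epochs, n_envs=1):
    """Derive per-rollout update counts from PPO hyperparameters."""
    rollout_size = n_steps * n_envs  # transitions collected before each update
    # Ceiling division: a leftover partial minibatch still triggers a step.
    minibatches_per_epoch = -(-rollout_size // batch_size)
    return {
        "rollout_size": rollout_size,
        "minibatches_per_epoch": minibatches_per_epoch,
        "gradient_steps_per_rollout": minibatches_per_epoch * n_epochs,
    }


# Values from the configuration above; n_envs=1 is an assumption.
schedule = ppo_update_schedule(n_steps=256, batch_size=256, n_epochs=15, n_envs=1)
print(schedule)
```

With a single environment, `batch_size` equals the rollout size, so each epoch is one full-batch gradient step and every rollout is reused for 15 updates.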

## License

MIT

## Citation

If you use this model, please cite:

```bibtex
@misc{minecraft_learning_distributed_470k,
  author = {cahlen},
  title = {minecraft-learning-distributed_470k},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/cahlen/minecraft-learning-distributed_470k}}
}
```