---
tags:
- reinforcement-learning
- minecraft
- stable-baselines3
- PPO
- deep-reinforcement-learning
library_name: stable-baselines3
model-index:
- name: minecraft-learning-distributed_470k
  results: []
---

# minecraft-learning-distributed_470k

A Minecraft RL agent trained with PPO (Proximal Policy Optimization) using Stable-Baselines3. The agent was trained to gather resources in Minecraft.

## Training Details

| Metric | Value |
|--------|-------|
| **Total Steps** | 483,923 |
| **Episodes** | 56 |
| **Mean Reward** | 0.64 |
| **Best Reward** | 26.20 |
| **Reward Scheme** | gathering |
| **Learning Rate** | 0.0003 |

## Hardware

- **Training:** NVIDIA RTX 5090 (32GB VRAM)
- **Environment:** NVIDIA Jetson AGX Orin (64GB RAM)
- **LLM Server:** NVIDIA DGX Spark - GPT-OSS-20B (vLLM)

## Architecture

- **Algorithm:** PPO (Proximal Policy Optimization)
- **Policy:** MLP with [512, 512] hidden layers
- **Observation Space:** 82 dimensions (position, velocity, vitals, hotbar, craftable flags)
- **Action Space:** 37 discrete actions (movement, mining, crafting, inventory)

## Usage

```python
from huggingface_hub import hf_hub_download
from stable_baselines3 import PPO

# Download the checkpoint; hf_hub_download returns the local file path
model_path = hf_hub_download(
    repo_id='cahlen/minecraft-learning-distributed_470k',
    filename='model.zip',
    local_dir='./models'
)

# Load the trained policy
model = PPO.load(model_path)

# Run inference. `env` is the custom Minecraft Gymnasium environment,
# which is not part of this download and must be created separately.
obs, _ = env.reset()  # Gymnasium's reset() returns (obs, info)
action, _ = model.predict(obs, deterministic=True)
```

## Environment Setup

This model was trained on a custom Minecraft environment built from:

- [Mineflayer](https://github.com/PrismarineJS/mineflayer) for bot control
- A custom Gymnasium wrapper for the RL interface
- Vision features extracted from game data (not computer vision)

## Training Configuration

```python
from stable_baselines3 import PPO

# `env` is the custom Minecraft Gymnasium environment described above
PPO(
    "MlpPolicy",
    env,
    learning_rate=1e-3,
    n_steps=256,
    batch_size=256,
    n_epochs=15,
    gamma=0.99,
    gae_lambda=0.95,
    ent_coef=0.02,
    clip_range=0.2,
    policy_kwargs={"net_arch": [512, 512]},
)
```

## License

MIT

## Citation

If you use this model, please cite:

```bibtex
@misc{minecraft_learning_distributed_470k,
  author = {cahlen},
  title = {minecraft-learning-distributed_470k},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/cahlen/minecraft-learning-distributed_470k}}
}
```
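
As a supplement to the Usage section: the inference snippet there relies on an `env` object that this card does not ship. The sketch below shows one way to smoke-test an episode loop without a Minecraft server. Everything in it is an assumption made for illustration: `StubMinecraftEnv`, `run_episode`, and `random_predict` are hypothetical names, the stub only borrows the 82-dimensional observation and 37 discrete actions stated in the Architecture section, and it emits random observations with zero reward. It is not the author's environment.

```python
import numpy as np

OBS_DIM = 82      # observation size stated in this card
N_ACTIONS = 37    # discrete action count stated in this card


class StubMinecraftEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step signatures.

    NOT the author's environment: observations are random noise and the
    reward is always zero. It exists only so the loop below can run.
    """

    def __init__(self, max_steps=50, seed=0):
        self.max_steps = max_steps
        self.rng = np.random.default_rng(seed)
        self.t = 0

    def _obs(self):
        return self.rng.standard_normal(OBS_DIM).astype(np.float32)

    def reset(self):
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        assert 0 <= int(action) < N_ACTIONS
        self.t += 1
        truncated = self.t >= self.max_steps
        # Gymnasium order: obs, reward, terminated, truncated, info
        return self._obs(), 0.0, False, truncated, {}


def run_episode(env, predict):
    """Roll out one episode; `predict` has the call shape of `model.predict`."""
    obs, _ = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action, _ = predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _ = env.step(action)
        total_reward += float(reward)
        steps += 1
        done = terminated or truncated
    return total_reward, steps


def random_predict(obs, deterministic=True):
    """Placeholder policy; swap in `model.predict` after `PPO.load(...)`."""
    return np.random.randint(N_ACTIONS), None


total_reward, steps = run_episode(StubMinecraftEnv(), random_predict)
print(total_reward, steps)
```

Once the checkpoint is loaded, passing `model.predict` in place of `random_predict` exercises the real policy on the stub's observations. The resulting actions are meaningless because the inputs are noise, but the call confirms that the observation shape matches what the checkpoint expects.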