TD3 Agent for LunarLander-v3

This is a trained TD3 (Twin Delayed Deep Deterministic Policy Gradient) agent for the LunarLander-v3 environment.

Model Details

Algorithm: Twin Delayed Deep Deterministic Policy Gradient (TD3)
Environment: LunarLander-v3
Framework: PyTorch
Device: cuda

Training Information

Total Timesteps: 1,000,000
Buffer Size: 10,000
Batch Size: 256
Learning Rates:
- Actor: 1e-4
- Critic: 1e-3
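
To make the buffer and batch settings above concrete, here is a minimal sketch of a fixed-capacity replay buffer. The `ReplayBuffer` class and its method names are hypothetical stand-ins and not the code used in training; only the two constants come from this card. With a capacity of 10,000 and 1,000,000 training timesteps, a first-in-first-out buffer like this one would hold at most the most recent 1% of transitions at any time.

```python
import random
from collections import deque

BUFFER_SIZE = 10_000  # "Buffer Size" from this card
BATCH_SIZE = 256      # "Batch Size" from this card


class ReplayBuffer:
    """Hypothetical FIFO replay buffer; the training code may differ."""

    def __init__(self, capacity):
        # A deque with maxlen silently evicts the oldest item when full.
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw distinct indices uniformly, then regroup the fields by column.
        indices = random.sample(range(len(self.storage)), batch_size)
        batch = [self.storage[i] for i in indices]
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.storage)


# Fill past capacity with dummy transitions to show the eviction behaviour.
buffer = ReplayBuffer(BUFFER_SIZE)
for t in range(12_000):
    buffer.add(t, 0, 0.0, t + 1, False)

states, actions, rewards, next_states, dones = buffer.sample(BATCH_SIZE)
```

After the loop, the buffer holds only the transitions from steps 2,000 to 11,999, and each sampled batch contains 256 of them.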

Evaluation Results

Average Reward: 284.91 ± 16.67
Min Reward: 245.55
Max Reward: 323.18
Evaluation Episodes: 100
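
Statistics like those above could be gathered with a loop along the following lines. This is a sketch under assumptions, not the evaluation script used for this card: `evaluate` is a hypothetical helper, `policy` is any callable that maps a state to an action, `env` is any environment following the Gymnasium `reset`/`step` interface, and the value after "±" is assumed to be a population standard deviation.

```python
import statistics


def evaluate(policy, env, episodes=100):
    """Run `episodes` full episodes and summarize the episode returns."""
    returns = []
    for _ in range(episodes):
        state, _ = env.reset()
        total, done = 0.0, False
        while not done:
            state, reward, terminated, truncated, _ = env.step(policy(state))
            total += reward
            done = terminated or truncated
        returns.append(total)
    return {
        "mean": statistics.fmean(returns),
        "std": statistics.pstdev(returns),  # assumption: population std
        "min": min(returns),
        "max": max(returns),
        "episodes": episodes,
    }
```

With the agent from the usage example, a call might look like `evaluate(lambda s: agent.select_action(s, eval_mode=True), env)`.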

How to Use

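import gymnasium as gym
import torch

# TD3Agent is defined in td3_agent.py, which must be importable
from td3_agent import TD3Agent

# Load the checkpoint (a dict that holds the actor's state dict)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint = torch.load("pytorch_model.pth", map_location=device)

# Create the environment and agent, then restore the trained actor weights.
# LunarLander-v3 has an 8-dimensional observation and, by default, 4 discrete actions.
env = gym.make("LunarLander-v3")
agent = TD3Agent(obs_dim=8, action_dim=4, max_action=1.0, hyperparameters={}, device=device)
agent.actor.load_state_dict(checkpoint["actor_state_dict"])

# Run one evaluation episode and report its total reward
state, _ = env.reset()
episode_reward = 0.0
done = False
while not done:
    action = agent.select_action(state, eval_mode=True)
    state, reward, terminated, truncated, _ = env.step(action)
    episode_reward += reward
    done = terminated or truncated
env.close()
print(f"Episode reward: {episode_reward:.2f}")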

Credits

Trained by Basem Elgalfy as part of Assignment 4 in Reinforcement Learning.

