FDF Canopy: MountainCarContinuous-v0 (Score: 97.10)

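This repository contains the frozen PyTorch model weights for a custom architecture that reaches a self-reported 100-episode holdout average of 97.10 on MountainCarContinuous-v0, a score the author describes as exceeding the environment's theoretical ceiling.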

Methodology

Instead of reward shaping or augmented logic, this agent uses an unmodified PPO core wrapped in an FDF governor. During the Play Phase, the governor injects a large curiosity penalty (Beta 5.0) based purely on neural topological stagnation. The penalty then decays to 0.0, so that during the Mastery Phase the policy learns the raw physics of the environment without it.

Governor Beta: 5.0
Schedule: 50% Play / 30% Decay
Hardware Bias: Apple Silicon (MPS)
Holdout Average (100-Episodes): 97.10
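
The schedule above can be read as a coefficient that stays at Beta for the first 50% of training and then decays over the next 30%. The sketch below is one possible reading, not the author's implementation: the function name governor_beta, the linear shape of the decay, and the assumption that the final 20% of training runs at 0.0 are all hypothetical and are not stated in this card.

```python
def governor_beta(step: int, total_steps: int, beta_max: float = 5.0,
                  play_frac: float = 0.5, decay_frac: float = 0.3) -> float:
    """Hypothetical governor coefficient at a given training step.

    Assumptions (not confirmed by the model card):
    - the decay is linear,
    - whatever remains after Play + Decay runs with the coefficient at 0.0.
    """
    progress = step / total_steps
    if progress <= play_frac:
        # Play Phase: full curiosity penalty.
        return beta_max
    decay_end = play_frac + decay_frac
    if progress >= decay_end:
        # After the decay window: penalty switched off.
        return 0.0
    # Decay window: fraction of the window still remaining, from 1 down to 0.
    remaining = (decay_end - progress) / decay_frac
    return max(0.0, beta_max * remaining)
```

With these assumptions, a 1,000,000-step run would hold the coefficient at 5.0 until step 500,000 and reach 0.0 at step 800,000.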

How to Evaluate

Download the fdf_seed_24_golden_mountaincar.pt file.
Download the fdf_inference.py script to instantiate the custom PyTorch ContinuousPPOMLP.

Run the evaluation natively using:

python fdf_inference.py --model_path fdf_seed_24_golden_mountaincar.pt
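
For readers who want to reproduce the 100-episode statistics independently, the following is a minimal sketch of an evaluation loop. It is not the contents of fdf_inference.py, which are not shown here. The function name evaluate and the policy callable are hypothetical; the environment is assumed to follow the Gymnasium API, where reset returns (observation, info) and step returns (observation, reward, terminated, truncated, info).

```python
import statistics
from typing import Callable, Tuple


def evaluate(env, policy: Callable, episodes: int = 100,
             seed: int = 0) -> Tuple[float, float]:
    """Return (mean, population std) of undiscounted episode returns.

    `env` is assumed to be Gymnasium-style; `policy` maps an observation
    to an action. Both are placeholders for the real environment and the
    loaded model.
    """
    returns = []
    for episode in range(episodes):
        # A different seed per episode gives varied start states.
        obs, _info = env.reset(seed=seed + episode)
        total, done = 0.0, False
        while not done:
            obs, reward, terminated, truncated, _info = env.step(policy(obs))
            total += float(reward)
            done = terminated or truncated
        returns.append(total)
    return statistics.fmean(returns), statistics.pstdev(returns)
```

With Gymnasium installed, env could be created with gymnasium.make("MountainCarContinuous-v0"), and policy would wrap a deterministic forward pass of the loaded model. Whether the reported std_reward is a population or a sample standard deviation is not stated, so small differences in that figure are to be expected.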

Evaluation results (self-reported)

mean_reward on MountainCarContinuous-v0: 97.100
std_reward on MountainCarContinuous-v0: 0.650