FDF Canopy: MountainCarContinuous-v0 (Score: 97.10)
This repository contains the frozen PyTorch model weights for a custom architecture that reaches a self-reported 100-episode holdout average of 97.10 on MountainCarContinuous-v0, above the environment's registered reward threshold of 90.
Methodology
Instead of reward shaping or augmented logic, this agent uses an unmodified PPO core wrapped in an FDF governor. During the Play Phase, the governor injects a large curiosity penalty (Beta = 5.0) based purely on neural topological stagnation. The penalty then decays smoothly to 0.0, so that during the Mastery Phase the policy learns from the raw environment physics alone.
- Governor Beta: 5.0
- Schedule: 50% Play / 30% Decay
- Hardware Bias: Apple Silicon (MPS)
- Holdout Average (100-Episodes): 97.10
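The schedule above can be sketched as a function of training progress. This is a minimal illustration, not the governor's actual code: the name governor_beta and its parameters are hypothetical, the linear decay shape is an assumption (the text only says the penalty decays to 0.0), and the final 20% of training is assumed to be the Mastery Phase with Beta held at 0.0.

```python
def governor_beta(step, total_steps, beta_max=5.0, play_frac=0.5, decay_frac=0.3):
    """Hypothetical Beta schedule: hold, decay linearly, then stay at zero.

    Assumptions (not confirmed by the repository): linear decay, and the
    remaining 1 - play_frac - decay_frac of training runs with Beta = 0.0.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")

    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)

    # Play Phase: full curiosity penalty weight.
    if progress < play_frac:
        return beta_max

    # Decay Phase: weight falls linearly from beta_max to 0.0.
    if progress < play_frac + decay_frac:
        remaining = 1.0 - (progress - play_frac) / decay_frac
        return beta_max * max(0.0, remaining)

    # Mastery Phase (assumed): no penalty, raw environment reward only.
    return 0.0
```

With the defaults, a 1,000-step run would hold Beta at 5.0 for the first 500 steps, pass through roughly 2.5 at step 650, and stay at 0.0 from step 800 onward.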
How to Evaluate
- Download the fdf_seed_24_golden_mountaincar.pt file.
- Download the fdf_inference.py script to instantiate the custom PyTorch ContinuousPPOMLP.
- Run the evaluation natively using:
python fdf_inference.py --model_path fdf_seed_24_golden_mountaincar.pt
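For readers who want to see what a 100-episode holdout measurement involves, here is a minimal sketch of such a loop. It is not the contents of fdf_inference.py: the function name evaluate_holdout is hypothetical, the environment is assumed to follow the Gymnasium reset/step API, and the use of a population standard deviation is an assumption about how std_reward is computed.

```python
import statistics


def evaluate_holdout(env, policy, episodes=100):
    """Run `episodes` rollouts and return (mean_reward, std_reward).

    Assumes a Gymnasium-style env: reset() -> (obs, info) and
    step(action) -> (obs, reward, terminated, truncated, info).
    `policy` is any callable mapping an observation to an action.
    """
    returns = []
    for _ in range(episodes):
        obs, _info = env.reset()
        done = False
        total = 0.0
        while not done:
            obs, reward, terminated, truncated, _info = env.step(policy(obs))
            total += float(reward)
            # An episode ends on success (terminated) or time limit (truncated).
            done = terminated or truncated
        returns.append(total)

    # Population standard deviation (an assumption, matching NumPy's default).
    return statistics.fmean(returns), statistics.pstdev(returns)
```

With Gymnasium installed, the environment could be created with gymnasium.make("MountainCarContinuous-v0"), and policy would wrap the loaded model's deterministic action output. How the checkpoint is loaded depends on fdf_inference.py and is not shown here.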
Evaluation results
- mean_reward on MountainCarContinuous-v0 (self-reported): 97.100
- std_reward on MountainCarContinuous-v0 (self-reported): 0.650