---
tags:
- reinforcement-learning
- pytorch
- custom-implementation
- MountainCarContinuous-v0
pipeline_tag: reinforcement-learning
library_name: PyTorch
model-index:
- name: FDF-MountainCarContinuous-v0
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: MountainCarContinuous-v0
      type: MountainCarContinuous-v0
    metrics:
    - type: mean_reward
      value: 97.1
      name: mean_reward
    - type: std_reward
      value: 0.65
      name: std_reward
---

# FDF Canopy: MountainCarContinuous-v0 (Score: 97.10)
This repository contains the frozen PyTorch model weights for a custom architecture that solves MountainCarContinuous-v0 with a 97.10 mean holdout reward.
## Methodology

Rather than reward shaping or augmented logic, this agent uses an unmodified PPO core wrapped in an FDF governor. During the Play Phase, the governor injects a large curiosity penalty (β = 5.0) based purely on neural topological stagnation; the penalty then decays to 0.0, allowing raw physics compilation during the Mastery Phase.
- Governor beta: 5.0
- Schedule: 50% Play / 30% Decay (remaining 20% Mastery)
- Hardware: Apple Silicon (MPS)
- Holdout average (100 episodes): 97.10
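The schedule above implies a piecewise beta: held at its maximum for the first 50% of training, annealed linearly to 0.0 over the next 30%, then zero for the final Mastery stretch. A minimal sketch of such a schedule follows; the function name `fdf_beta` and the linear-decay shape are assumptions for illustration, not taken from the released training code.

```python
def fdf_beta(progress, beta_max=5.0, play_frac=0.5, decay_frac=0.3):
    """Hypothetical FDF governor schedule for the curiosity-penalty coefficient.

    progress: fraction of training completed, in [0, 1].
    Play Phase (first 50%):  beta held at beta_max (5.0 in the card).
    Decay Phase (next 30%):  beta anneals linearly to 0.0 (linear shape assumed).
    Mastery Phase (final 20%): beta stays at 0.0.
    """
    if progress < play_frac:
        return beta_max
    if progress < play_frac + decay_frac:
        return beta_max * (1.0 - (progress - play_frac) / decay_frac)
    return 0.0
```

For example, `fdf_beta(0.25)` returns the full 5.0 penalty, while any `progress` past 0.8 returns 0.0.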
## How to Evaluate

- Download the `fdf_seed_24_golden_mountaincar.pt` file.
- Download the `fdf_inference.py` script, which instantiates the custom PyTorch `ContinuousPPOMLP`.
- Run the evaluation natively:

```bash
python fdf_inference.py --model_path fdf_seed_24_golden_mountaincar.pt
```