Margin Play — Trained Checkpoints
Website: marginplay.app · Code: github.com/aiacontext/marginplay
Pretrained MADDPG weights for the Margin Play multi-agent reinforcement learning simulation of oil exploration in the Brazilian Equatorial Margin.
Contents
This repository holds the final policy weights of the sweep_v6_6sc_10k training run (10,000 episodes per scenario, 6 scenarios, 6 agents).
Scenarios (6)
| Key | Description |
|---|---|
| referencia | Reference baseline |
| otimista | Optimistic price/discovery assumptions |
| pessimista | Pessimistic price/discovery assumptions |
| choque_brent | Brent oil price shock |
| ma_prospero | Prosperous Maranhão variant |
| sem_lei12858 | Without Law 12,858 (royalty redistribution removed) |
Agents (6)
operadora, anp, ibama, gov_federal, gov_estadual, comunidade.
Files
For each scenario × agent there is:
- {scenario}_actor_{agent}.npz: Actor (policy) MLP weights
- {scenario}_critic_{agent}.npz: Critic (Q-function) MLP weights
Plus per-scenario episode logs and one sweep-wide summary:
- {scenario}_episodes.parquet: Step-level state/action/reward log
- sweep_summary.parquet: Aggregated metrics across the sweep
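The naming scheme above can be expanded into a full list of expected file names, which is handy for checking that a download is complete. The sketch below only applies the patterns and the scenario and agent keys listed in this card; it has not been checked against the actual repository listing, and the helper names `weight_files` and `all_files` are illustrative.

```python
from itertools import product

# Scenario and agent keys as listed in this card.
SCENARIOS = [
    "referencia", "otimista", "pessimista",
    "choque_brent", "ma_prospero", "sem_lei12858",
]
AGENTS = [
    "operadora", "anp", "ibama",
    "gov_federal", "gov_estadual", "comunidade",
]
ROLES = ["actor", "critic"]


def weight_files():
    """Expected .npz names: one actor and one critic per scenario x agent."""
    return [
        f"{scenario}_{role}_{agent}.npz"
        for scenario, agent, role in product(SCENARIOS, AGENTS, ROLES)
    ]


def all_files():
    """Weight files, per-scenario episode logs, and the sweep summary."""
    logs = [f"{scenario}_episodes.parquet" for scenario in SCENARIOS]
    return weight_files() + logs + ["sweep_summary.parquet"]


if __name__ == "__main__":
    for name in all_files():
        print(name)
```

Under these patterns the sweep yields 72 weight files (6 scenarios × 6 agents × 2 networks), 6 episode logs, and 1 summary.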
Usage
pip install huggingface-hub
hf download aiacontext/marginplay --local-dir models/
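If only one scenario is needed, the same download can be done from Python with `snapshot_download` from `huggingface_hub`, filtered by file name. This is a minimal sketch: the glob patterns are derived from the naming scheme in the Files section, and the helper names `scenario_patterns` and `download_scenario` are illustrative rather than part of the Margin Play codebase.

```python
def scenario_patterns(scenario):
    """Glob patterns matching every file that belongs to one scenario."""
    return [
        f"{scenario}_actor_*.npz",
        f"{scenario}_critic_*.npz",
        f"{scenario}_episodes.parquet",
    ]


def download_scenario(scenario, local_dir="models"):
    """Download one scenario's files from the Hub into local_dir."""
    # Imported lazily so the pattern helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="aiacontext/marginplay",
        allow_patterns=scenario_patterns(scenario),
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Requires network access; fetches only the reference scenario.
    print(download_scenario("referencia"))
```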
Then load with the Margin Play codebase:
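import numpy as np
# Actor, SPECS and state_dim_global come from the Margin Play codebase (github.com/aiacontext/marginplay).
from agents.networks import Actor
from agents.definitions import SPECS, state_dim_global
# Load the operadora actor weights for the referencia scenario.
weights = np.load("models/referencia_actor_operadora.npz")
# Build an actor with matching dimensions, then load the saved arrays into it.
actor = Actor(state_dim=state_dim_global(), action_dim=SPECS["operadora"].action_dim)
actor.load_weights(list(weights.items()))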
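To look inside a checkpoint without installing the Margin Play codebase, the `.npz` archive can be read with NumPy alone. The sketch below lists each stored array with its shape and dtype and counts parameters. The array names inside the real checkpoints are not documented in this card, so the demo writes a small stand-in file with hypothetical keys (`w0`, `b0`) instead of assuming them.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Map each array name in an .npz archive to its (shape, dtype)."""
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


def parameter_count(path):
    """Total number of scalar values stored in the archive."""
    with np.load(path) as archive:
        return sum(int(archive[name].size) for name in archive.files)


def write_demo_checkpoint(directory):
    """Write a stand-in checkpoint; the keys w0 and b0 are hypothetical."""
    path = os.path.join(directory, "demo_actor.npz")
    np.savez(
        path,
        w0=np.zeros((8, 4), dtype=np.float32),
        b0=np.zeros(8, dtype=np.float32),
    )
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = write_demo_checkpoint(tmp)
        for name, (shape, dtype) in summarize_npz(demo).items():
            print(name, shape, dtype)
        print("parameters:", parameter_count(demo))
```

Pointing `summarize_npz` at a downloaded file such as models/referencia_actor_operadora.npz shows the layer layout the actor expects.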
With explore=False the policy is deterministic: the same scenario seed and intervention log produce reproducible trajectories.
Training
- Algorithm: MADDPG (Multi-Agent DDPG) with target networks and OU exploration noise
- Framework: MLX on Apple Silicon
- Episodes per scenario: 10,000
- See training scripts for full configuration
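For readers unfamiliar with the two ingredients named in the list above, here is a generic NumPy sketch of Ornstein-Uhlenbeck exploration noise and a soft target-network update, plus an `explore` switch like the one mentioned under Usage. It is not the Margin Play implementation: the hyperparameters (`theta`, `sigma`, `tau`), the [-1, 1] action range, and the soft-update rule (targets could also be updated by hard copy) are all assumptions, and the actual values live in the training scripts.

```python
import numpy as np


class OUNoise:
    """Temporally correlated noise that reverts toward zero."""

    def __init__(self, size, theta=0.15, sigma=0.2, dt=1.0, seed=0):
        # theta, sigma and dt are assumed defaults, not the project's values.
        self.theta, self.sigma, self.dt = theta, sigma, dt
        self.rng = np.random.default_rng(seed)
        self.state = np.zeros(size)

    def reset(self):
        self.state = np.zeros_like(self.state)

    def sample(self):
        # Euler step of dx = -theta * x * dt + sigma * sqrt(dt) * N(0, 1).
        drift = -self.theta * self.state * self.dt
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(
            self.state.shape
        )
        self.state = self.state + drift + diffusion
        return self.state


def act(policy_output, noise, explore):
    """Add OU noise only when exploring; clip to an assumed [-1, 1] range."""
    action = policy_output + noise.sample() if explore else policy_output
    return np.clip(action, -1.0, 1.0)


def soft_update(target, online, tau=0.01):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return {k: (1.0 - tau) * target[k] + tau * online[k] for k in target}


if __name__ == "__main__":
    noise = OUNoise(size=2)
    mean_action = np.array([0.3, -0.5])
    print("deterministic:", act(mean_action, noise, explore=False))
    print("exploring:    ", act(mean_action, noise, explore=True))
```

With `explore=False` the noise process is never sampled, so the action depends only on the policy output, which is what makes evaluation runs repeatable.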
Citation
If you use these weights or the Margin Play simulator in academic work, please cite:
@unpublished{leitaofilho2026marginplay,
title = {Margin Play: A Multi-Agent System for Public Policy Analysis
in the Brazilian Equatorial Margin},
author = {Leit{\~a}o Filho, Antonio de Sousa and
Lima, Fabr{\'\i}cio Saul and
Santos, Selby Mykael Lima dos and
Sousa, Rejani Bandeira Vieira and
Jesus, Lu{\'\i}s Jorge Mesquita de and
Silva, Dennys Correia da and
Barros Filho, Allan Kardec Duailibe},
year = {2026},
note = {Manuscript in preparation},
}
Plain text:
Leitão Filho, A. S., Lima, F. S., Santos, S. M. L. dos, Sousa, R. B. V., Jesus, L. J. M. de, Silva, D. C. da, & Barros Filho, A. K. D. (2026). Margin Play: A Multi-Agent System for Public Policy Analysis in the Brazilian Equatorial Margin. Manuscript in preparation.
Corresponding author: Antonio de Sousa Leitão Filho — antonio@aiacontext.com — ORCID 0009-0002-1705-3611.
License
MIT.