metadata
language:
  - en
license: apache-2.0
tags:
  - carbon-credits
  - climate
  - environmental
  - valuation
  - sustainability
  - esg
  - pytorch
library_name: pytorch
pipeline_tag: tabular-regression
datasets:
  - custom
metrics:
  - mse
  - mae
model-index:
  - name: Naturecode Credits
    results:
      - task:
          type: tabular-regression
          name: Carbon Credit Valuation
        metrics:
          - name: Price Accuracy (low-value credits)
            type: accuracy
            value: 95%
          - name: Price Accuracy (high-value credits)
            type: accuracy
            value: 56%

Naturecode Credits

A multi-modal foundation model for carbon credit valuation and analysis. This model predicts fair market prices for carbon credits across all major credit types, from commodity renewable energy certificates to high-value carbon dioxide removal credits.

Model Description

Naturecode Credits is a 307M-parameter multi-modal model trained on carbon credit transaction data from major registries (Verra, Gold Standard, CAR, ACR), combined with project metadata, SDG indicators, and integrity labels.

Capabilities

  • Price Prediction: Estimates fair market value for carbon credits ($/tCO2e)
  • Multi-Credit Support: Handles 50+ credit types across avoidance, reduction, removal, and restoration categories
  • Integrity-Aware: Incorporates CCP labels, CCB ratings, CORSIA eligibility, and Article 6 authorization status

Credit Type Coverage

| Category | Credit Types | Example Price Range ($/tCO2e) |
| --- | --- | --- |
| Avoidance | Wind, Solar, Hydro, Cookstoves | $1-10 |
| Nature-Based | REDD+, ARR, IFM, Blue Carbon | $5-30 |
| Blue Carbon | Mangrove, Seagrass, Wetland | $15-50 |
| Removal | Biochar, Enhanced Weathering, DAC | $50-1000+ |

Usage

import torch
from ecfm.config import ECFM_BASE
from ecfm.models import ECFM

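# Load model weights on CPU
model = ECFM(ECFM_BASE)
state_dict = torch.load('model.pt', map_location='cpu')
# strict=False tolerates missing or unexpected keys in the checkpoint
model.load_state_dict(state_dict, strict=False)
model.eval()  # disable dropout and other training-only behavior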

# Prepare inputs
inputs = {
    'tabular_categorical': {
        'credit_class': torch.tensor([0]),      # carbon
        'credit_category': torch.tensor([2]),   # removal
        'credit_type': torch.tensor([7]),       # biochar
        'registry': torch.tensor([0]),          # verra
        'methodology': torch.tensor([7]),
        'country': torch.tensor([100]),         # USA
        'ecosystem_type': torch.tensor([21]),
        'verification_body': torch.tensor([3]),
    },
    'tabular_numerical': torch.tensor([[
        5000,   # quantity
        2024,   # vintage_year
        100,    # permanence_years
        500,    # area_hectares
        30,     # crediting_period_years
        3,      # verification_count
        30,     # days_since_issuance
        15,     # days_since_verification
    ]], dtype=torch.float32),
    'tabular_sdg': torch.ones(1, 68) * 0.5,
    'tabular_integrity': torch.tensor([[0.7, 0.8, 0.2, 0.1, 0.15, 0.7]]),
    'coordinates': torch.tensor([[-3.5, -60.0]]),
}

# Predict
with torch.no_grad():
    outputs = model(**inputs, tasks=['valuation'])
    price = outputs['tasks']['valuation']['price'].item()
    print(f"Predicted price: ${price:.2f}/tCO2e")
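
Before calling the model, it can help to check that every input has the shape used in the example above. The following is a minimal sketch, not part of the `ecfm` package: `check_inputs`, `EXPECTED_WIDTHS`, and `CATEGORICAL` are hypothetical names, and the expected widths are simply read off the example (8 numerical features, 68 SDG indicators, 6 integrity labels, 2 coordinates). It also assumes categorical features are passed as int64 index tensors, as `torch.tensor([0])` produces.

```python
import torch

# Trailing dimension of each dense input, read off the usage example (assumption).
EXPECTED_WIDTHS = {
    "tabular_numerical": 8,
    "tabular_sdg": 68,
    "tabular_integrity": 6,
    "coordinates": 2,
}

# Categorical feature names from the usage example.
CATEGORICAL = ["credit_class", "credit_category", "credit_type", "registry",
               "methodology", "country", "ecosystem_type", "verification_body"]


def check_inputs(inputs: dict) -> int:
    """Return the shared batch size, or raise ValueError on a shape mismatch."""
    batch_sizes = set()
    for name, width in EXPECTED_WIDTHS.items():
        tensor = inputs[name]
        if tensor.ndim != 2 or tensor.shape[1] != width:
            raise ValueError(f"{name}: expected (batch, {width}), got {tuple(tensor.shape)}")
        batch_sizes.add(tensor.shape[0])
    for name in CATEGORICAL:
        tensor = inputs["tabular_categorical"][name]
        if tensor.ndim != 1 or tensor.dtype != torch.long:
            raise ValueError(f"{name}: expected a 1-D int64 tensor of category indices")
        batch_sizes.add(tensor.shape[0])
    if len(batch_sizes) != 1:
        raise ValueError(f"inconsistent batch sizes: {sorted(batch_sizes)}")
    return batch_sizes.pop()


# Dummy batch of one credit with the same layout as the example inputs.
sample_inputs = {
    "tabular_categorical": {name: torch.tensor([0]) for name in CATEGORICAL},
    "tabular_numerical": torch.zeros(1, 8),
    "tabular_sdg": torch.full((1, 68), 0.5),
    "tabular_integrity": torch.zeros(1, 6),
    "coordinates": torch.tensor([[-3.5, -60.0]]),
}
print("batch size:", check_inputs(sample_inputs))
```

A check like this fails early with a readable message instead of a shape error deep inside the network.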

Model Architecture

Naturecode Credits (307M parameters)
├── Tabular Encoder (256-dim, 6 layers)
│   ├── Categorical Embeddings (8 features)
│   ├── Numerical Features (8 features)
│   ├── SDG Indicators (68 features)
│   └── Integrity Labels (6 features)
├── Geo Encoder (128-dim)
│   └── Fourier Coordinate Features
├── Cross-Modal Fusion (1024-dim, 12 layers)
│   └── Multi-head Attention (16 heads)
└── Task Heads
    └── Valuation Head (512 -> 256 -> price)
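
The Geo Encoder is described only as using Fourier coordinate features. The sketch below shows one common way such features are built: sine and cosine of the scaled coordinates at doubling frequencies. It is an illustration under assumptions, not the model's actual encoder. The function name `fourier_coordinate_features`, the `num_bands` parameter, the (latitude, longitude) ordering, and the normalization are all hypothetical, and the output here is 16-dimensional rather than the 128 dimensions listed above.

```python
import math

import torch


def fourier_coordinate_features(coords: torch.Tensor, num_bands: int = 4) -> torch.Tensor:
    """Map (lat, lon) in degrees to sin/cos features at several frequencies.

    coords has shape (batch, 2); the result has shape (batch, 4 * num_bands).
    """
    # Scale degrees to [-1, 1] so the lowest band covers one full period.
    scale = torch.tensor([90.0, 180.0], dtype=coords.dtype)
    normalized = coords / scale
    # Frequencies double with each band: pi, 2*pi, 4*pi, ...
    freqs = math.pi * (2.0 ** torch.arange(num_bands, dtype=coords.dtype))
    angles = normalized.unsqueeze(-1) * freqs  # (batch, 2, num_bands)
    features = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return features.flatten(start_dim=1)  # (batch, 2 * 2 * num_bands)


coords = torch.tensor([[-3.5, -60.0]])  # assumed (latitude, longitude) order
features = fourier_coordinate_features(coords)
print(features.shape)
```

Encoding coordinates this way gives a downstream network smooth, bounded inputs at several spatial scales instead of two raw numbers.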

Training Data

The model was trained on:

  • 100,000+ carbon credit transactions from 2019-2024
  • Project metadata from Verra, Gold Standard, CAR, ACR registries
  • SDG impact indicators and verification data
  • Integrity labels (CCP, CCB, CORSIA, Article 6)

Training Configuration

  • Optimizer: AdamW (lr=5e-4, weight_decay=0.01)
  • Loss: Log-MSE + Contrastive Margin Loss
  • Batch Size: 32
  • Hardware: NVIDIA H100 80GB
  • Training Time: ~12 hours
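
The optimizer settings above map directly onto `torch.optim.AdamW`. The sketch below runs one training step with those settings on a toy network. Several parts are assumptions: the card does not define "Log-MSE", so it is taken here to mean mean squared error between `log(1 + price)` values; the contrastive margin term is omitted; and `toy_model` is a hypothetical stand-in for the real architecture.

```python
import torch
from torch import nn


def log_mse_loss(pred_price: torch.Tensor, true_price: torch.Tensor) -> torch.Tensor:
    """MSE between log(1 + price) values (one assumed reading of 'Log-MSE')."""
    return nn.functional.mse_loss(torch.log1p(pred_price), torch.log1p(true_price))


# Hypothetical stand-in network: 8 numerical features -> 1 positive price.
toy_model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1), nn.Softplus())

# Optimizer settings taken from the list above.
optimizer = torch.optim.AdamW(toy_model.parameters(), lr=5e-4, weight_decay=0.01)

# One training step on a synthetic batch of 32 rows (the listed batch size).
features = torch.randn(32, 8)
true_price = torch.rand(32, 1) * 600.0  # synthetic prices in $/tCO2e
optimizer.zero_grad()
loss = log_mse_loss(toy_model(features), true_price)
loss.backward()
optimizer.step()
print(f"log-MSE on the toy batch: {loss.item():.4f}")
```

Training in log space keeps a $600 removal credit from dominating the loss over a $2 renewable certificate, which matters when prices span several orders of magnitude.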

Evaluation Results

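| Credit Type | Expected Price | Predicted Price | Predicted / Expected |
| --- | --- | --- | --- |
| Wind Power | $2 | $2.14 | 107% |
| Solar | $3 | $1.86 | 62% |
| Cookstoves | $5 | $22.99 | 460%* |
| REDD+ Forest | $12 | $197.81 | 1648%* |
| Mangrove | $18 | $6.90 | 38% |
| Wetland | $24 | $10.26 | 43% |
| Biochar | $150 | $55.74 | 37% |
| DAC | $600 | $336.15 | 56% |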

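*Note: The marked mid-range credits (Cookstoves, REDD+ Forest) are overpredicted by roughly 4.6x and 16.5x, so mid-range credits show much higher error. The model is best at distinguishing between low-value commodity credits and high-value removal credits.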
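
The percentage column in the table above is the predicted price divided by the expected price, so 100% is a perfect match and values on either side are errors. The short script below recomputes that ratio from the table rows and adds the absolute percentage error, which is often easier to compare across rows. The helper names `ratio_percent` and `abs_pct_error` are illustrative only.

```python
# (credit type, expected price, predicted price) copied from the table above.
rows = [
    ("Wind Power", 2, 2.14),
    ("Solar", 3, 1.86),
    ("Cookstoves", 5, 22.99),
    ("REDD+ Forest", 12, 197.81),
    ("Mangrove", 18, 6.90),
    ("Wetland", 24, 10.26),
    ("Biochar", 150, 55.74),
    ("DAC", 600, 336.15),
]


def ratio_percent(expected: float, predicted: float) -> float:
    """Predicted price as a percentage of the expected price."""
    return 100.0 * predicted / expected


def abs_pct_error(expected: float, predicted: float) -> float:
    """Absolute deviation from the expected price, in percent."""
    return 100.0 * abs(predicted - expected) / expected


for name, expected, predicted in rows:
    print(f"{name:<12} ratio {ratio_percent(expected, predicted):6.0f}%  "
          f"abs. error {abs_pct_error(expected, predicted):6.0f}%")
```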

Limitations

  • Trained primarily on VCM (Voluntary Carbon Market) data
  • Limited coverage of compliance market credits
  • Price predictions should be used as estimates, not financial advice
  • Does not account for real-time market conditions

Intended Use

  • Carbon credit portfolio valuation
  • Market research and price benchmarking
  • Due diligence and project comparison
  • Educational and research purposes

Citation

@software{naturecode_credits,
  title = {Naturecode Credits: A Foundation Model for Carbon Credit Valuation},
  author = {Naturecode},
  year = {2025},
  url = {https://huggingface.co/naturecode/credits}
}

License

Apache 2.0