metadata

language:
  - en
license: apache-2.0
tags:
  - carbon-credits
  - climate
  - environmental
  - valuation
  - sustainability
  - esg
  - pytorch
library_name: pytorch
pipeline_tag: tabular-regression
datasets:
  - custom
metrics:
  - mse
  - mae
model-index:
  - name: Naturecode Credits
    results:
      - task:
          type: tabular-regression
          name: Carbon Credit Valuation
        metrics:
          - name: Price Accuracy (low-value credits)
            type: accuracy
            value: 95%
          - name: Price Accuracy (high-value credits)
            type: accuracy
            value: 56%

Naturecode Credits

A multi-modal foundation model for carbon credit valuation and analysis. It predicts fair market prices ($/tCO2e) for carbon credits ranging from commodity renewable energy certificates to high-value carbon dioxide removal credits.

Model Description

Naturecode Credits is a 307M-parameter multi-modal model trained on carbon credit transaction data from four major registries (Verra, Gold Standard, CAR, ACR), combined with project metadata, SDG indicators, and integrity labels.

Capabilities

Price Prediction: Estimates fair market value for carbon credits ($/tCO2e)
Multi-Credit Support: Handles 50+ credit types across avoidance, reduction, removal, and restoration categories
Integrity-Aware: Incorporates CCP labels, CCB ratings, CORSIA eligibility, and Article 6 authorization status

Credit Type Coverage

Category	Credit Types	Example Price Range ($/tCO2e)
Avoidance	Wind, Solar, Hydro, Cookstoves	$1-10
Nature-Based	REDD+, ARR, IFM, Blue Carbon	$5-30
Blue Carbon	Mangrove, Seagrass, Wetland	$15-50
Removal	Biochar, Enhanced Weathering, DAC	$50-1000+
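
The example ranges above can double as a rough sanity check on a predicted price. The sketch below is illustrative only: the dictionary and the `within_example_range` helper are hypothetical names, not part of the `ecfm` package, and the open-ended "$1000+" bound is modeled as infinity.

```python
import math

# Example ranges copied from the table above, in $/tCO2e.
# The table gives "$50-1000+" for Removal, so the upper bound is left open.
EXAMPLE_PRICE_RANGES = {
    "Avoidance": (1.0, 10.0),
    "Nature-Based": (5.0, 30.0),
    "Blue Carbon": (15.0, 50.0),
    "Removal": (50.0, math.inf),
}


def within_example_range(category: str, price: float) -> bool:
    """Return True if price lies inside the example range for category."""
    low, high = EXAMPLE_PRICE_RANGES[category]
    return low <= price <= high


# A $2.14 avoidance credit is inside its range; a $197.81 nature-based one is not.
print(within_example_range("Avoidance", 2.14))
print(within_example_range("Nature-Based", 197.81))
```

A prediction outside its range is not necessarily wrong, since the ranges are examples rather than hard limits, but it is a cheap signal that a result deserves a second look.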

Usage

import torch
from ecfm.config import ECFM_BASE
from ecfm.models import ECFM

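# Load model weights on CPU
model = ECFM(ECFM_BASE)
state_dict = torch.load('model.pt', map_location='cpu')
# strict=False ignores missing or unexpected keys instead of raising
model.load_state_dict(state_dict, strict=False)
model.eval()  # disable dropout and other training-only behavior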

# Prepare inputs
inputs = {
    'tabular_categorical': {
        'credit_class': torch.tensor([0]),      # carbon
        'credit_category': torch.tensor([2]),   # removal
        'credit_type': torch.tensor([7]),       # biochar
        'registry': torch.tensor([0]),          # verra
        'methodology': torch.tensor([7]),
        'country': torch.tensor([100]),         # USA
        'ecosystem_type': torch.tensor([21]),
        'verification_body': torch.tensor([3]),
    },
    'tabular_numerical': torch.tensor([[
        5000,   # quantity
        2024,   # vintage_year
        100,    # permanence_years
        500,    # area_hectares
        30,     # crediting_period_years
        3,      # verification_count
        30,     # days_since_issuance
        15,     # days_since_verification
    ]], dtype=torch.float32),
    'tabular_sdg': torch.ones(1, 68) * 0.5,   # 68 SDG indicators
    'tabular_integrity': torch.tensor([[0.7, 0.8, 0.2, 0.1, 0.15, 0.7]]),  # 6 integrity labels
    'coordinates': torch.tensor([[-3.5, -60.0]]),
}

# Predict
with torch.no_grad():
    outputs = model(**inputs, tasks=['valuation'])
    price = outputs['tasks']['valuation']['price'].item()
    print(f"Predicted price: ${price:.2f}/tCO2e")
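
The numerical features in the snippet above are positional, so a swapped pair of values would go unnoticed. One way to guard against that is to build the row from named values. This is a minimal sketch: `NUMERICAL_FEATURES` and `numerical_row` are hypothetical helpers, and the feature order is taken from the comments in the snippet, not from the `ecfm` source.

```python
# Order taken from the comments in the usage snippet above.
NUMERICAL_FEATURES = (
    "quantity",
    "vintage_year",
    "permanence_years",
    "area_hectares",
    "crediting_period_years",
    "verification_count",
    "days_since_issuance",
    "days_since_verification",
)


def numerical_row(values: dict) -> list:
    """Return the feature values as floats, in the order the snippet uses."""
    missing = [name for name in NUMERICAL_FEATURES if name not in values]
    unknown = [name for name in values if name not in NUMERICAL_FEATURES]
    if missing or unknown:
        raise ValueError(f"missing={missing}, unknown={unknown}")
    return [float(values[name]) for name in NUMERICAL_FEATURES]


# Keys can be given in any order; the helper restores the expected one.
row = numerical_row({
    "vintage_year": 2024,
    "quantity": 5000,
    "permanence_years": 100,
    "area_hectares": 500,
    "crediting_period_years": 30,
    "verification_count": 3,
    "days_since_issuance": 30,
    "days_since_verification": 15,
})
print(row)
```

Wrapping the result as `torch.tensor([row], dtype=torch.float32)` would then give the `(1, 8)` tensor passed as `'tabular_numerical'`.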

Model Architecture

Naturecode Credits (307M parameters)
├── Tabular Encoder (256-dim, 6 layers)
│   ├── Categorical Embeddings (8 features)
│   ├── Numerical Features (8 features)
│   ├── SDG Indicators (68 features)
│   └── Integrity Labels (6 features)
├── Geo Encoder (128-dim)
│   └── Fourier Coordinate Features
├── Cross-Modal Fusion (1024-dim, 12 layers)
│   └── Multi-head Attention (16 heads)
└── Task Heads
    └── Valuation Head (512 -> 256 -> price)
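
The Geo Encoder's Fourier coordinate features are not specified further here. As a rough illustration of the general technique, the sketch below maps a coordinate pair to sine and cosine values at doubling frequencies. The number of frequencies, the normalization, and the (latitude, longitude) ordering are all assumptions, and `fourier_features` is a hypothetical function, not the model's implementation.

```python
import math


def fourier_features(lat_deg: float, lon_deg: float, num_frequencies: int = 4) -> list:
    """Encode a coordinate pair as sin/cos features at doubling frequencies.

    num_frequencies and the normalization are illustrative assumptions,
    not values taken from the model.
    """
    # Scale both coordinates to [-1, 1] so one frequency set fits both.
    scaled = (lat_deg / 90.0, lon_deg / 180.0)
    features = []
    for k in range(num_frequencies):
        freq = (2.0 ** k) * math.pi
        for value in scaled:
            features.append(math.sin(freq * value))
            features.append(math.cos(freq * value))
    return features


# Coordinates from the usage snippet, assumed to be (latitude, longitude).
feats = fourier_features(-3.5, -60.0)
print(len(feats))  # 2 coordinates * 2 functions * 4 frequencies
```

Low frequencies capture coarse position and higher ones capture finer detail, which gives a downstream network more to work with than two raw numbers.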

Training Data

The model was trained on:

100,000+ carbon credit transactions from 2019-2024
Project metadata from Verra, Gold Standard, CAR, ACR registries
SDG impact indicators and verification data
Integrity labels (CCP, CCB, CORSIA, Article 6)

Training Configuration

Optimizer: AdamW (lr=5e-4, weight_decay=0.01)
Loss: Log-MSE + Contrastive Margin Loss
Batch Size: 32
Hardware: NVIDIA H100 80GB
Training Time: ~12 hours
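
The exact form of the Log-MSE term is not given. A common reading is mean squared error on log-transformed prices, which penalizes relative error rather than absolute error. That matters when targets span $1 to $1000+. The sketch below assumes that reading and strictly positive prices; `log_mse` is a hypothetical name. The Contrastive Margin Loss term is not sketched because its pairing scheme is not described.

```python
import math


def log_mse(predicted: list, target: list) -> float:
    """Mean squared error between natural logs of predicted and target prices.

    Assumes every price is strictly positive.
    """
    errors = [
        (math.log(p) - math.log(t)) ** 2
        for p, t in zip(predicted, target)
    ]
    return sum(errors) / len(errors)


# Both predictions are off by a factor of two, so both get the same penalty,
# even though the absolute errors are $2 and $600.
print(log_mse([4.0], [2.0]))
print(log_mse([1200.0], [600.0]))
```

With plain MSE the second example would dominate the first by several orders of magnitude, so cheap credit types would contribute almost nothing to the gradient.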

Evaluation Results

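Credit Type	Expected Price	Predicted Price	Predicted / Expected
Wind Power	$2	$2.14	107%
Solar	$3	$1.86	62%
Cookstoves	$5	$22.99	460%*
REDD+ Forest	$12	$197.81	1648%*
Mangrove	$18	$6.90	38%
Wetland	$24	$10.26	43%
Biochar	$150	$55.74	37%
DAC	$600	$336.15	56%

*Note: The last column is the predicted price as a percentage of the expected price, so 100% is an exact match. Rows marked with an asterisk are overpredicted by more than 4x. The model separates the cheapest commodity credits (wind, solar) from the most expensive removal credits (DAC), but mid-range predictions are unreliable: REDD+ Forest is predicted above Biochar, and Cookstoves above Mangrove and Wetland.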
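
The last column of the table can be reproduced from the two price columns, which also makes it easy to compute a symmetric error measure. This is a small sketch using only the figures in the table; the helper names are hypothetical.

```python
# (credit type, expected price, predicted price) copied from the table above.
RESULTS = [
    ("Wind Power", 2, 2.14),
    ("Solar", 3, 1.86),
    ("Cookstoves", 5, 22.99),
    ("REDD+ Forest", 12, 197.81),
    ("Mangrove", 18, 6.90),
    ("Wetland", 24, 10.26),
    ("Biochar", 150, 55.74),
    ("DAC", 600, 336.15),
]


def ratio_percent(expected: float, predicted: float) -> int:
    """Predicted price as a rounded percentage of the expected price."""
    return round(100 * predicted / expected)


def absolute_percentage_error(expected: float, predicted: float) -> float:
    """Absolute error as a percentage of the expected price."""
    return 100 * abs(predicted - expected) / expected


for name, expected, predicted in RESULTS:
    ratio = ratio_percent(expected, predicted)
    ape = absolute_percentage_error(expected, predicted)
    print(f"{name:<13} ratio={ratio:>5}%  abs. error={ape:7.1f}%")
```

The ratio shows the direction of the miss (above or below 100%), while the absolute percentage error makes over- and underpredictions directly comparable.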

Limitations

Trained primarily on VCM (Voluntary Carbon Market) data
Limited coverage of compliance market credits
Price predictions should be used as estimates, not financial advice
Does not account for real-time market conditions

Intended Use

Carbon credit portfolio valuation
Market research and price benchmarking
Due diligence and project comparison
Educational and research purposes

Citation

@software{naturecode_credits,
  title = {Naturecode Credits: A Foundation Model for Carbon Credit Valuation},
  author = {Naturecode},
  year = {2025},
  url = {https://huggingface.co/naturecode/credits}
}

License

Apache 2.0