Energy Intelligence Multitask Model

QuantBridge / energy-intelligence-multitask

A single DistilBERT model with a shared encoder and two task heads for energy and financial news analysis. A single forward pass returns both named entities and topic labels.

| Head | Task | Output shape |
|------|------|--------------|
| NER | Named entity recognition (BIO) | (batch, seq_len, 19) |
| CLS | Multi-label topic classification | (batch, 10) |

Architecture

Input headline
      |
BertTokenizer  (do_lower_case=True, max_length=128)
      |
DistilBERT encoder  (6 layers · 768 dim · 12 heads · ~67M params)
[weights from QuantBridge/energy-intelligence-multitask-custom-ner]
      |
      +──────────────────────────────────────────┐
      |                                          |
  all token hidden states                   [CLS] hidden state
      |                                          |
  Dropout(0.1)                         Linear(768→768) + ReLU
      |                                     Dropout(0.2)
  Linear(768→19)                        Linear(768→10)
      |                                          |
  NER logits                             CLS logits
  argmax → BIO entity tags           sigmoid → topic probabilities
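The two-head layout above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the repository's actual modeling code: the encoder is replaced by an `nn.Identity` placeholder and the layer names are illustrative, but the head shapes, dropout rates, and pooling match the diagram.

```python
import torch
import torch.nn as nn

class TwoHeadSketch(nn.Module):
    """Illustrative shared-encoder / two-head layout (not the real model)."""
    def __init__(self, hidden=768, n_ner=19, n_cls=10):
        super().__init__()
        self.encoder = nn.Identity()               # stand-in for DistilBERT
        self.ner_dropout = nn.Dropout(0.1)
        self.ner_classifier = nn.Linear(hidden, n_ner)
        self.pre_classifier = nn.Linear(hidden, hidden)
        self.cls_dropout = nn.Dropout(0.2)
        self.cls_classifier = nn.Linear(hidden, n_cls)

    def forward(self, hidden_states):
        h = self.encoder(hidden_states)            # (batch, seq_len, 768)
        ner_logits = self.ner_classifier(self.ner_dropout(h))
        pooled = torch.relu(self.pre_classifier(h[:, 0]))   # [CLS] position
        cls_logits = self.cls_classifier(self.cls_dropout(pooled))
        return ner_logits, cls_logits

m = TwoHeadSketch().eval()
ner, cls = m(torch.zeros(2, 128, 768))
print(tuple(ner.shape), tuple(cls.shape))   # (2, 128, 19) (2, 10)
```

Both heads read the same encoder output, which is what lets one forward pass serve both tasks.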

NER Label Space: 19 BIO tags

| Entity type | Example extractions from test set |
|-------------|-----------------------------------|
| COMPANY | ExxonMobil, Gazprom, Maersk, Shell, Chevron, BP, Equinor |
| ORGANIZATION | OPEC+, US Treasury, Federal Reserve, IMF, FERC, IAEA |
| COUNTRY | Saudi Arabia, Russia, China, Iran, Venezuela, Germany |
| COMMODITY | crude oil, natural gas, LNG, methane, aluminum, hydrogen |
| LOCATION | Strait of Hormuz, Red Sea, Gulf of Mexico, North Sea, Kollsnes |
| MARKET | S&P 500, Brent, WTI |
| EVENT | Hurricane Ida, Houthi attacks |
| PERSON | Elon Musk, Jerome Powell |
| INFRASTRUCTURE | pipelines, refineries, terminals |

Each type uses standard BIO tagging: B-<TYPE> starts a span, I-<TYPE> continues it, O marks non-entities.
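A standalone sketch of the BIO span-merging rule: B- opens a span, I- extends it, anything else closes it. This is a plain-Python illustration (the full decoder in the Usage section below additionally strips `##` WordPiece markers); token and tag values here are made up for the example.

```python
def decode_bio(tokens, tags):
    """Merge parallel BIO tags into entity spans."""
    entities, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append(current)
            current = {"text": token, "type": tag[2:]}   # start a new span
        elif tag.startswith("I-") and current:
            current["text"] += " " + token               # continue the span
        else:                                            # O tag closes any span
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return entities

tokens = ["Russia", "cuts", "natural", "gas", "flows"]
tags   = ["B-COUNTRY", "O", "B-COMMODITY", "I-COMMODITY", "O"]
spans = decode_bio(tokens, tags)
# [{'text': 'Russia', 'type': 'COUNTRY'}, {'text': 'natural gas', 'type': 'COMMODITY'}]
```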


Classification Label Space: 10 topic labels

| Label | Description | Avg score (test set) |
|-------|-------------|----------------------|
| macro | GDP, inflation, central bank policy | 0.323 |
| politics | Government policy, sanctions, diplomacy | 0.307 |
| business | Corporate earnings, M&A, operations | 0.219 |
| technology | Tech, innovation, clean-tech | 0.155 |
| energy | Oil, gas, renewables, power grid | 0.070 |
| trade | Tariffs, import/export, agreements | 0.046 |
| shipping | Maritime logistics, ports | 0.038 |
| stocks | Equity markets, share prices | 0.015 |
| regulation | Compliance, legislation, rules | 0.013 |
| risk | Crises, geopolitical tension | 0.013 |

Note on classification scores: The classification head was trained on AG News, Reuters, and Kaggle corpora, which are dominated by general business and macro content. Domain-specific labels (energy, shipping, risk, regulation, stocks) therefore score lower. The relative ranking of scores remains semantically meaningful even when raw values are low. See Limitations.


Test Results

Evaluated on 40 real-world energy & financial news headlines across 10 domain groups (ENERGY, GEOPOLITICAL, SHIPPING, TRADE, MACRO, CORPORATE, REGULATION, TECHNOLOGY, STOCKS, RISK).

NER Results

| Metric | Value |
|--------|-------|
| Total entities detected | 86 across 40 headlines |
| Average entities per headline | 2.1 |
| Entity types fired | 7 / 9 |

Entity type frequency:

| Entity type | Detections | Example extractions |
|-------------|------------|---------------------|
| COMMODITY | 20 | oil production, crude, LNG, natural gas, aluminum, methane, hydrogen |
| COUNTRY | 19 | Saudi Arabia, Russia, China, Iran, Venezuela, Poland, Bulgaria, UK |
| ORGANIZATION | 15 | OPEC+, US Treasury, Federal Reserve, IMF, G7, FERC, IAEA, SEC |
| COMPANY | 15 | ExxonMobil, Gazprom, Maersk, Shell, Chevron, Equinor, BP, Tesla, Vestas |
| LOCATION | 14 | Kollsnes, Strait of Hormuz, Red Sea, Panama Canal, North Sea, Gulf of Mexico |
| EVENT | 2 | Hurricane Ida, Houthi (attacks) |
| MARKET | 1 | S&P 500 |
| PERSON | 0 | not fired on this test set |
| INFRASTRUCTURE | 0 | not fired on this test set |

Key NER observations:

  • COMMODITY is the top entity type: the model reliably extracts energy goods (oil, crude, LNG, natural gas, hydrogen) and commodities (aluminum, solar panels)
  • COUNTRY and ORGANIZATION fire consistently across all domain groups
  • COMPANY detection is accurate: correctly identifies both energy majors (ExxonMobil, Shell, BP) and non-energy companies (Tesla, Maersk, Vestas)
  • LOCATION captures geopolitically important hotspots correctly (Red Sea, Strait of Hormuz, Gulf of Mexico, North Sea)
  • MARKET fires on "S&P 500" but misses "Brent" and "WTI", likely a tokenisation artefact where these names are split into sub-words by the BertTokenizer
  • PERSON and INFRASTRUCTURE did not fire on this specific test set; these types are present in the model's label vocabulary and will activate on appropriate inputs

Classification Results (threshold = 0.20)

Label activation frequency across 40 headlines:

| Label | Active headlines | % | Avg score |
|-------|------------------|---|-----------|
| macro | 14 / 40 | 35% | 0.323 |
| politics | 9 / 40 | 22% | 0.307 |
| business | 1 / 40 | 2% | 0.219 |
| technology | 0 / 40 | 0% | 0.155 |
| energy | 0 / 40 | 0% | 0.070 |
| trade | 0 / 40 | 0% | 0.046 |
| shipping | 0 / 40 | 0% | 0.038 |
| stocks | 0 / 40 | 0% | 0.015 |
| regulation | 0 / 40 | 0% | 0.013 |
| risk | 0 / 40 | 0% | 0.013 |

Domain-group heatmap (>>> = group average score ≥ 0.35):

| Domain group | energy | politics | trade | stocks | regulation | shipping | macro | business | technology | risk |
|--------------|--------|----------|-------|--------|------------|----------|-------|----------|------------|------|
| ENERGY | 0.09 | 0.28 | 0.07 | 0.02 | 0.02 | 0.05 | >>> | 0.26 | 0.17 | 0.02 |
| GEOPOLITICAL | 0.06 | 0.30 | 0.04 | 0.01 | 0.01 | 0.03 | 0.30 | 0.19 | 0.12 | 0.01 |
| SHIPPING | 0.07 | 0.31 | 0.04 | 0.01 | 0.01 | 0.05 | 0.28 | 0.23 | 0.14 | 0.01 |
| TRADE | 0.06 | 0.30 | 0.05 | 0.01 | 0.01 | 0.04 | 0.31 | 0.23 | 0.17 | 0.01 |
| MACRO | 0.07 | 0.30 | 0.06 | 0.02 | 0.02 | 0.04 | >>> | 0.22 | 0.17 | 0.02 |
| CORPORATE | 0.09 | 0.33 | 0.05 | 0.02 | 0.02 | 0.04 | >>> | 0.23 | 0.16 | 0.02 |
| REGULATION | 0.04 | 0.26 | 0.03 | 0.01 | 0.01 | 0.02 | 0.32 | 0.26 | 0.18 | 0.01 |
| TECHNOLOGY | 0.07 | >>> | 0.04 | 0.01 | 0.01 | 0.04 | 0.28 | 0.17 | 0.14 | 0.01 |
| STOCKS | 0.08 | 0.30 | 0.04 | 0.01 | 0.01 | 0.03 | 0.32 | 0.21 | 0.16 | 0.01 |
| RISK | 0.08 | 0.32 | 0.04 | 0.01 | 0.01 | 0.04 | 0.34 | 0.19 | 0.14 | 0.01 |

Key classification observations:

  • macro is the dominant label across all domain groups, a direct consequence of training data composition (AG News World category and Kaggle both map heavily to macro)
  • politics fires on TECHNOLOGY and GEOPOLITICAL groups, which is semantically reasonable (government energy policy, sanctions)
  • Domain-specific labels (energy, shipping, risk, regulation, stocks) score consistently low; these categories are underrepresented in training data
  • The score ranking is meaningful: for ENERGY headlines, energy consistently ranks above trade, shipping, and regulation even when below threshold, so the model has learned the correct relative associations
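Because absolute scores on domain labels are depressed, ranking-based selection can be more useful than a fixed threshold. A small illustrative helper (the function name, `k`, and `floor` values are suggestions, not part of the model's API; the scores below are made up for the example):

```python
def top_topics(scores, k=3, floor=0.05):
    """Keep the k highest-scoring labels above a small noise floor."""
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [(label, s) for label, s in ranked[:k] if s >= floor]

# Illustrative scores for an ENERGY headline
scores = {"macro": 0.26, "politics": 0.28, "energy": 0.09,
          "trade": 0.07, "stocks": 0.02}
top = top_topics(scores)
# [('politics', 0.28), ('macro', 0.26), ('energy', 0.09)]
```

With this approach, energy surfaces for energy headlines even though its raw score would never clear a 0.20 threshold.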

Usage

Important: This model uses custom architecture files. Always pass trust_remote_code=True.

Installation

pip install transformers torch

Full inference: NER + Classification

import torch
import numpy as np
from transformers import AutoTokenizer, AutoConfig, AutoModel

MODEL_ID = "QuantBridge/energy-intelligence-multitask"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model     = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)
model.eval()

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def predict(text: str, cls_threshold: float = 0.20):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    inputs.pop("token_type_ids", None)   # DistilBERT does not use these

    with torch.no_grad():
        output = model(**inputs)

    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

    # ── Named Entity Recognition ──────────────────────────────────────────
    ner_id2label = {int(k): v for k, v in model.config.ner_id2label.items()}
    tag_ids = output.ner_logits[0].argmax(-1).tolist()

    entities = []
    current = None
    for token, tag_id in zip(tokens, tag_ids):
        if token in ("[CLS]", "[SEP]", "[PAD]"):
            if current: entities.append(current); current = None
            continue
        tag = ner_id2label[tag_id]
        if tag.startswith("B-"):
            if current: entities.append(current)
            current = {"text": token.replace("##", ""), "type": tag[2:]}
        elif tag.startswith("I-") and current:
            current["text"] += token[2:] if token.startswith("##") else f" {token}"
        else:
            if current: entities.append(current); current = None
    if current: entities.append(current)

    # ── Topic Classification ──────────────────────────────────────────────
    cls_id2label = {int(k): v for k, v in model.config.cls_id2label.items()}
    probs  = sigmoid(output.cls_logits[0].numpy())
    topics = {cls_id2label[i]: float(probs[i]) for i in range(len(probs))}
    active = {lbl: p for lbl, p in topics.items() if p >= cls_threshold}

    return entities, active

# Example
headline = "Russia cuts natural gas flows to Poland and Bulgaria following payment dispute"
entities, topics = predict(headline)

print("Entities found:")
for e in entities:
    print(f"  [{e['type']}]  {e['text']}")

print("\nActive topic labels:")
for topic, score in sorted(topics.items(), key=lambda x: -x[1]):
    print(f"  {topic}: {score:.3f}")

Expected output:

Entities found:
  [COUNTRY]   Russia
  [COUNTRY]   Poland
  [COUNTRY]   Bulgaria
  [COMMODITY] natural gas

Active topic labels:
  politics: 0.362
  macro: 0.357

NER only: decode all entity spans

def get_entities(text: str) -> list[dict]:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    inputs.pop("token_type_ids", None)
    with torch.no_grad():
        output = model(**inputs)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    ner_id2label = {int(k): v for k, v in model.config.ner_id2label.items()}
    tag_ids = output.ner_logits[0].argmax(-1).tolist()
    # ... (decode as shown above)

Classification only: get all label scores

def get_topic_scores(text: str) -> dict[str, float]:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    inputs.pop("token_type_ids", None)
    with torch.no_grad():
        output = model(**inputs)
    cls_id2label = {int(k): v for k, v in model.config.cls_id2label.items()}
    probs = sigmoid(output.cls_logits[0].numpy())
    return {cls_id2label[i]: float(probs[i]) for i in range(len(probs))}

Training Details

Encoder

Transferred from QuantBridge/energy-intelligence-multitask-custom-ner, a DistilBERT fine-tuned on energy and financial news for BIO entity recognition.

NER Head

Weights transferred directly from the NER backbone (classifier.* → ner_classifier.*). No additional NER training was performed.

Classification Head

Trained separately from scratch on a merged corpus:

| Source | HF / NLTK id | Categories used | Mapped to |
|--------|--------------|-----------------|-----------|
| AG News | ag_news | World (0), Business (2), Sci/Tech (3) | macro, business, technology |
| Reuters-21578 | nltk.corpus.reuters | crude, gas, ship, trade, money-fx, interest, earn, acq | energy, shipping, trade, macro, business |
| Kaggle News Category | rmisra/news-category-dataset | POLITICS, BUSINESS, TECH, WORLD NEWS | politics, business, technology, macro |

Training split: 80% train / 10% validation / 10% test, seed 42.
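One way the 80/10/10 split with seed 42 could be reproduced; this is a sketch, not necessarily the authors' exact splitting code:

```python
import random

def split_80_10_10(items, seed=42):
    """Shuffle with a fixed seed, then slice into 80% / 10% / 10%."""
    rng = random.Random(seed)          # local RNG so global state is untouched
    items = list(items)
    rng.shuffle(items)
    n_train = int(0.8 * len(items))
    n_val = int(0.1 * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_80_10_10(range(100))
print(len(train), len(val), len(test))   # 80 10 10
```

Fixing the seed makes the split deterministic across runs, which is what allows the reported test numbers to be reproduced.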

Hyperparameters:

| Parameter | Value |
|-----------|-------|
| Epochs | 10 |
| Train batch size | 32 |
| Learning rate | 2e-5 |
| Warmup steps | 500 |
| Weight decay | 0.01 |
| Max sequence length | 128 tokens |
| Loss | BCEWithLogitsLoss |
| Checkpoint selection | best micro-F1 on validation set |
| Hardware | NVIDIA T4 16 GB |
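For intuition, the multi-label loss applies a sigmoid and binary cross-entropy per label, then averages. The worked example below is a naive pure-Python mirror of what `BCEWithLogitsLoss` computes (omitting its log-sum-exp numerical stabilisation); the four labels and logit values are invented for illustration:

```python
import math

def bce_with_logits(logits, targets):
    """Per-label sigmoid + binary cross-entropy, averaged over labels."""
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1 / (1 + math.exp(-z))                       # sigmoid
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(logits)

# One headline whose true labels are {energy, politics} over 4 example labels
logits  = [2.0, -1.0, 1.5, -2.0]    # energy, trade, politics, stocks
targets = [1.0,  0.0, 1.0,  0.0]
loss = bce_with_logits(logits, targets)   # ≈ 0.192
```

Each label is an independent binary decision, which is why the model can emit several topics at once.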

Model Files

energy-intelligence-multitask/
  configuration_energy_multitask.py   # EnergyMultitaskConfig (DistilBertConfig subclass)
  modeling_energy_multitask.py        # EnergyMultitaskModel  (two-head architecture)
  config.json                         # Serialised config with auto_map
  model.safetensors                   # Combined weights (~256 MB)
  tokenizer.json                      # Fast tokenizer
  tokenizer_config.json               # Tokenizer settings

Limitations

  • English only: trained exclusively on English-language news text.
  • Classification data bias: training corpora (AG News, Kaggle) are dominated by business and macro content. Domain-specific labels (energy, shipping, risk, regulation, stocks) score lower across the board and may not cross common thresholds even when semantically correct. A recommended threshold for this model is 0.20 rather than the default 0.50.
  • NER on headlines: the NER head was fine-tuned on short news headlines; performance may be lower on long-form documents.
  • Max length: inputs are truncated to 128 tokens. Longer texts should be chunked.
  • PERSON / INFRASTRUCTURE: these entity types exist in the label vocabulary but fired less frequently on financial news headlines than COMPANY, COUNTRY, and COMMODITY.
  • Not for trading: this model is intended as an intelligence tagging layer, not for real-time trading or financial decision-making.
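A simple way to chunk long texts before inference, sketched at the word level (the real budget is 128 WordPiece tokens, so the window leaves headroom; the function name and the `max_len`/`stride` values are illustrative, not part of this model's API):

```python
def chunk_words(words, max_len=120, stride=30):
    """Split a word list into overlapping windows of at most max_len words."""
    if len(words) <= max_len:
        return [words]
    chunks, start = [], 0
    while start < len(words):
        chunks.append(words[start:start + max_len])
        if start + max_len >= len(words):
            break
        start += max_len - stride    # overlap so entities spanning a boundary survive
    return chunks

words = ["w%d" % i for i in range(250)]
chunks = chunk_words(words)
print([len(c) for c in chunks])   # [120, 120, 70]
```

Each chunk is then joined and passed through `predict` separately; the overlap reduces the chance of losing an entity that straddles a chunk boundary.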

Intended Use

This model is the tagging layer in an energy intelligence pipeline:

Raw news headline
       ↓
EnergyMultitaskModel (this model)
       ↓
  entities  ──────────────────→  who / what / where
  topic labels  ──────────────→  energy / risk / trade / macro / ...
       ↓
Structured intelligence signal for downstream analysis

Citation

@misc{quantbridge2025multitask,
  author       = {QuantBridge},
  title        = {Energy Intelligence Multitask Model (NER + Classification)},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/QuantBridge/energy-intelligence-multitask}},
}