Model ID: Quantbridge/distilbert-energy-intelligence-multitask-v2
A domain-specific fine-tuned DistilBERT model for Named Entity Recognition across energy markets, financial instruments, geopolitics, corporate events, and technology. This is a broad-coverage multitask NER model designed for intelligence extraction from financial news and market commentary.
The model recognises 59 entity types (119 BIO labels: a B- and an I- tag for each type, plus O) spanning multiple intelligence domains.
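As a quick sanity check on the label count, the sketch below expands a list of entity types into BIO tags. The three-type subset is illustrative and the tag order is an assumption; the authoritative mapping is whatever the model's id2label config contains.

```python
def bio_labels(entity_types):
    """Expand entity types into BIO tags: O first, then B-/I- for each type."""
    labels = ["O"]
    for etype in entity_types:
        labels.append(f"B-{etype}")
        labels.append(f"I-{etype}")
    return labels

# Illustrative subset only; the real label order lives in model.config.id2label.
subset = ["COMMODITY", "CENTRAL_BANK", "SANCTION"]
labels = bio_labels(subset)
print(len(labels))  # 2 * 3 + 1 = 7
```

Applied to all 59 types, the same expansion gives 2 * 59 + 1 = 119 labels.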
Entity Taxonomy
Financial Instruments & Markets
| Label | Description |
| --- | --- |
| EQUITY | Stocks and equity instruments |
| DERIVATIVE | Futures, options, swaps |
| CURRENCY | FX pairs and currencies |
| FIXED_INCOME | Bonds, treasuries, notes |
| ASSET_CLASS | Broad asset class references |
| INDEX | Market indices (S&P 500, FTSE, etc.) |
| COMMODITY | Physical commodities (oil, gas, metals) |
| TRADING_HUB | Price benchmarks and trading hubs |
Financial Institutions
| Label | Description |
| --- | --- |
| FINANCIAL_INSTITUTION | Banks, brokerages, investment firms |
| CENTRAL_BANK | Central banks (Fed, ECB, BoE) |
| HEDGE_FUND | Hedge funds and asset managers |
| RATING_AGENCY | Credit rating agencies |
| EXCHANGE | Stock and commodity exchanges |
Macro & Policy
| Label | Description |
| --- | --- |
| MACRO_INDICATOR | GDP, inflation, unemployment figures |
| MONETARY_POLICY | Interest rate decisions, QE programmes |
| FISCAL_POLICY | Government spending, tax policy |
| TRADE_POLICY | Tariffs, trade agreements, WTO actions |
| ECONOMIC_BLOC | G7, G20, EU, ASEAN, etc. |
Energy Domain
| Label | Description |
| --- | --- |
| ENERGY_COMPANY | Oil majors, utilities, renewable firms |
| ENERGY_SOURCE | Oil, gas, coal, solar, nuclear, etc. |
| PIPELINE | Energy pipelines and transmission lines |
| REFINERY | Oil refineries and processing plants |
| ENERGY_POLICY | OPEC decisions, energy legislation |
| ENERGY_TRANSITION | Decarbonisation, net-zero, EV, hydrogen |
| GRID | Power grids and electricity networks |
Geopolitical
| Label | Description |
| --- | --- |
| GEOPOLITICAL_EVENT | Summits, elections, geopolitical shifts |
| SANCTION | Economic sanctions and embargoes |
| TREATY | International agreements and accords |
| CONFLICT_ZONE | Active or historic conflict regions |
| DIPLOMATIC_ACTION | Diplomatic moves, expulsions, negotiations |
| COUNTRY | Nation states |
| REGION | Geographic regions (Middle East, EU, etc.) |
| CITY | Cities and urban locations |
Corporate Events
| Label | Description |
| --- | --- |
| COMPANY | General companies |
| M_AND_A | Mergers and acquisitions |
| IPO | Initial public offerings |
| EARNINGS_EVENT | Quarterly earnings, revenue reports |
| EXECUTIVE | Named C-suite executives |
| CORPORATE_ACTION | Dividends, buybacks, restructuring |
Infrastructure & Supply Chain
| Label | Description |
| --- | --- |
| INFRA | Physical infrastructure (general) |
| SUPPLY_CHAIN | Supply chain disruptions and logistics |
| SHIPPING_VESSEL | Named ships and tankers |
| PORT | Ports and maritime hubs |
Risk & Events
| Label | Description |
| --- | --- |
| EVENT | General newsworthy events |
| RISK_FACTOR | Risk factors and vulnerabilities |
| NATURAL_DISASTER | Hurricanes, earthquakes, floods |
| CYBER_EVENT | Cyber attacks and digital incidents |
| DISRUPTION | Supply or market disruptions |
Technology
| Label | Description |
| --- | --- |
| TECH_COMPANY | Technology companies |
| AI_MODEL | AI systems and models |
| SEMICONDUCTOR | Chips and semiconductor companies |
| TECH_REGULATION | Technology regulation and policy |
People & Organizations
| Label | Description |
| --- | --- |
| PERSON | Named individuals |
| THINK_TANK | Policy research organizations |
| NEWS_SOURCE | Media and news outlets |
| REGULATORY_BODY | Government regulators (SEC, FCA, etc.) |
| ORG | General organizations |
Usage
from transformers import pipeline

# Build a token-classification pipeline; "simple" aggregation merges
# word pieces into whole entity spans.
ner = pipeline(
    "token-classification",
    model="Quantbridge/distilbert-energy-intelligence-multitask-v2",
    aggregation_strategy="simple",
)

text = (
    "The Federal Reserve held interest rates steady as Brent crude fell below $75 "
    "following OPEC+ production cuts and renewed sanctions on Russian energy exports."
)

results = ner(text)
for entity in results:
    # Columns: entity text, predicted entity type, confidence score
    print(f"{entity['word']:<35}{entity['entity_group']:<25}{entity['score']:.3f}")
Example output:
Federal Reserve CENTRAL_BANK 0.961
Brent TRADING_HUB 0.954
OPEC+ REGULATORY_BODY 0.947
Russian energy exports SANCTION 0.932
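Downstream code often needs the entities bucketed by type rather than as a flat list. The following is a minimal sketch of one way to do that, assuming the result dictionaries carry the `word`, `entity_group` and `score` keys used in the print loop; `group_entities` and its `min_score` parameter are hypothetical helpers, not part of the model or of transformers.

```python
from collections import defaultdict

def group_entities(results, min_score=0.0):
    """Group pipeline results by entity_group, dropping low-confidence spans."""
    grouped = defaultdict(list)
    for ent in results:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)

# Sample mirrors the example output above; a real call would pass ner(text).
sample = [
    {"word": "Federal Reserve", "entity_group": "CENTRAL_BANK", "score": 0.961},
    {"word": "Brent", "entity_group": "TRADING_HUB", "score": 0.954},
    {"word": "OPEC+", "entity_group": "REGULATORY_BODY", "score": 0.947},
    {"word": "Russian energy exports", "entity_group": "SANCTION", "score": 0.932},
]
print(group_entities(sample, min_score=0.95))
# {'CENTRAL_BANK': ['Federal Reserve'], 'TRADING_HUB': ['Brent']}
```

The threshold of 0.95 is arbitrary here; a suitable cut-off depends on how the downstream system trades precision against recall.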
Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

model_name = "Quantbridge/distilbert-energy-intelligence-multitask-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

text = "Goldman Sachs cut its oil price forecast after OPEC+ agreed to extend output cuts."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Highest-scoring label id for each token of the single input sequence
predicted_ids = outputs.logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

for token, label_id in zip(tokens, predicted_ids):
    label = model.config.id2label[label_id.item()]
    # Skip non-entity tokens and special tokens such as [CLS] and [SEP]
    if label != "O" and not token.startswith("["):
        # removeprefix drops only the WordPiece "##" marker (Python 3.9+)
        print(f"{token.removeprefix('##'):<25}{label}")
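The loop above prints one line per word piece. To recover whole entity spans without the pipeline, the pieces and their BIO tags have to be merged. Below is a minimal sketch of that merge; `merge_bio` is a hypothetical helper, and the token and label lists are made-up inputs for illustration, not actual model output.

```python
def merge_bio(tokens, labels):
    """Merge WordPiece tokens and BIO tags into (text, entity_type) spans."""
    spans = []
    words, current_type = [], None

    def flush():
        nonlocal words, current_type
        if words:
            spans.append((" ".join(words), current_type))
        words, current_type = [], None

    for token, label in zip(tokens, labels):
        if token.startswith("[") and token.endswith("]"):
            continue  # special tokens such as [CLS] and [SEP]
        if token.startswith("##"):
            if words:  # glue a continuation piece onto the open entity word
                words[-1] += token[2:]
            continue
        if label == "O":
            flush()
            continue
        prefix, etype = label.split("-", 1)
        if prefix == "B" or etype != current_type:
            flush()  # a new entity starts here
            current_type = etype
        words.append(token)
    flush()
    return spans

# Hypothetical tokens and labels, for illustration only.
tokens = ["[CLS]", "gold", "##man", "sachs", "cut", "oil", "forecast", "[SEP]"]
labels = ["O", "B-FINANCIAL_INSTITUTION", "I-FINANCIAL_INSTITUTION",
          "I-FINANCIAL_INSTITUTION", "O", "B-COMMODITY", "O", "O"]
spans = merge_bio(tokens, labels)
print(spans)
# [('goldman sachs', 'FINANCIAL_INSTITUTION'), ('oil', 'COMMODITY')]
```

Joining words with single spaces is a simplification: punctuation and original casing are not restored. When exact character offsets matter, the pipeline with an aggregation strategy is likely the better choice.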
Model Details
| Property | Value |
| --- | --- |
| Base architecture | distilbert-base-uncased |
| Architecture type | DistilBertForTokenClassification |
| Entity types | 59 types (119 BIO labels) |
| Hidden dimension | 768 |
| Attention heads | 12 |
| Layers | 6 |
| Vocabulary size | 30,522 |
| Max sequence length | 512 tokens |
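Because of the 512-token limit, longer documents such as filings or transcripts need to be split before inference. The sketch below shows one common approach, overlapping windows over a list of token ids. `chunk_ids`, the stride of 64 and the two reserved positions for [CLS] and [SEP] are assumptions chosen for illustration.

```python
def chunk_ids(token_ids, max_len=512, stride=64, num_special=2):
    """Split token ids into overlapping windows that fit the model limit.

    num_special reserves room for the [CLS] and [SEP] tokens added later.
    """
    if not token_ids:
        return []
    window = max_len - num_special
    if not 0 <= stride < window:
        raise ValueError("stride must be non-negative and smaller than the window")
    step = window - stride  # consecutive windows share `stride` tokens
    chunks, start = [], 0
    while True:
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break
        start += step
    return chunks

ids = list(range(1000))  # stand-in for real token ids
chunks = chunk_ids(ids)
print([len(c) for c in chunks])  # [510, 510, 108]
```

The overlap gives entities that straddle a window boundary a chance to appear whole in at least one window. Hugging Face tokenizers can also produce such windows themselves through the `stride` and `return_overflowing_tokens` arguments, which may be preferable in practice.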
Intended Use
This model is designed for financial and energy intelligence extraction, meaning automated NER over news feeds, earnings transcripts, regulatory filings, and geopolitical reports. It is a base model suitable for:
Structured data extraction from unstructured financial news
Entity linking and knowledge graph population
Signal detection for trading and risk systems
Geopolitical risk monitoring
Out-of-scope use
General-purpose NER on non-financial text
Languages other than English
Documents with heavy technical jargon outside the financial/energy domain
Limitations
English-only
Optimised for news-style formal writing; may underperform on social media or informal text
The 59-type taxonomy includes overlapping categories, so predictions for ambiguous entities may vary (e.g. an energy firm could be tagged COMPANY or ENERGY_COMPANY)
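One way to reduce the impact of overlapping categories downstream is to collapse fine-grained labels into a coarser parent before further processing. The sketch below assumes a hand-written mapping; `PARENT_LABEL` and `coarsen` are hypothetical and not part of the model or its config.

```python
# Hypothetical coarse mapping, chosen for illustration only.
PARENT_LABEL = {
    "ENERGY_COMPANY": "COMPANY",
    "TECH_COMPANY": "COMPANY",
    "HEDGE_FUND": "FINANCIAL_INSTITUTION",
}

def coarsen(entities, mapping=PARENT_LABEL):
    """Return copies of the entities with entity_group mapped to its parent label."""
    return [
        {**ent, "entity_group": mapping.get(ent["entity_group"], ent["entity_group"])}
        for ent in entities
    ]

# Made-up predictions showing the same kind of entity under two labels.
predictions = [
    {"word": "shell", "entity_group": "ENERGY_COMPANY", "score": 0.91},
    {"word": "shell", "entity_group": "COMPANY", "score": 0.88},
]
coarse = coarsen(predictions)
print({ent["entity_group"] for ent in coarse})  # {'COMPANY'}
```

This trades label detail for consistency, so it suits use cases such as entity counting or linking more than fine-grained classification.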