DistilBERT Energy Intelligence Multitask NER, v2
Model ID: Quantbridge/distilbert-energy-intelligence-multitask-v2
A domain-specific fine-tuned DistilBERT model for Named Entity Recognition across energy markets, financial instruments, geopolitics, corporate events, and technology. This is a broad-coverage multitask NER model designed for intelligence extraction from financial news and market commentary.
The model recognises 59 entity types spanning multiple intelligence domains. These are encoded as 119 BIO labels: a B- and an I- tag for each type, plus the shared O tag.
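The label count follows directly from the BIO scheme. The snippet below is a minimal sketch of that expansion using three types from the taxonomy; the helper name `build_bio_labels` and the label ordering are assumptions for illustration, and the model's actual mapping is the one stored in `model.config.id2label`.

```python
# Sketch: expand entity types into a BIO label set.
# `build_bio_labels` is a hypothetical helper, not part of the model repo.
def build_bio_labels(entity_types):
    """Return ["O", "B-X", "I-X", ...] for the given entity types."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")  # first token of an entity
        labels.append(f"I-{entity_type}")  # any following token of the same entity
    return labels


# Three types taken from the taxonomy; the full model uses 59.
sample_types = ["COMMODITY", "CENTRAL_BANK", "SANCTION"]
sample_labels = build_bio_labels(sample_types)
print(sample_labels)

# Two tags per type plus the shared "O" tag, so 59 types give 2 * 59 + 1 = 119.
assert len(sample_labels) == 2 * len(sample_types) + 1
```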
Entity Taxonomy
Financial Instruments & Markets
| Label | Description |
| --- | --- |
| EQUITY | Stocks and equity instruments |
| DERIVATIVE | Futures, options, swaps |
| CURRENCY | FX pairs and currencies |
| FIXED_INCOME | Bonds, treasuries, notes |
| ASSET_CLASS | Broad asset class references |
| INDEX | Market indices (S&P 500, FTSE, etc.) |
| COMMODITY | Physical commodities (oil, gas, metals) |
| TRADING_HUB | Price benchmarks and trading hubs |
Financial Institutions
| Label | Description |
| --- | --- |
| FINANCIAL_INSTITUTION | Banks, brokerages, investment firms |
| CENTRAL_BANK | Central banks (Fed, ECB, BoE) |
| HEDGE_FUND | Hedge funds and asset managers |
| RATING_AGENCY | Credit rating agencies |
| EXCHANGE | Stock and commodity exchanges |
Macro & Policy
| Label | Description |
| --- | --- |
| MACRO_INDICATOR | GDP, inflation, unemployment figures |
| MONETARY_POLICY | Interest rate decisions, QE programmes |
| FISCAL_POLICY | Government spending, tax policy |
| TRADE_POLICY | Tariffs, trade agreements, WTO actions |
| ECONOMIC_BLOC | G7, G20, EU, ASEAN, etc. |
Energy Domain
| Label | Description |
| --- | --- |
| ENERGY_COMPANY | Oil majors, utilities, renewable firms |
| ENERGY_SOURCE | Oil, gas, coal, solar, nuclear, etc. |
| PIPELINE | Energy pipelines and transmission lines |
| REFINERY | Oil refineries and processing plants |
| ENERGY_POLICY | OPEC decisions, energy legislation |
| ENERGY_TRANSITION | Decarbonisation, net-zero, EV, hydrogen |
| GRID | Power grids and electricity networks |
Geopolitical
| Label | Description |
| --- | --- |
| GEOPOLITICAL_EVENT | Summits, elections, geopolitical shifts |
| SANCTION | Economic sanctions and embargoes |
| TREATY | International agreements and accords |
| CONFLICT_ZONE | Active or historic conflict regions |
| DIPLOMATIC_ACTION | Diplomatic moves, expulsions, negotiations |
| COUNTRY | Nation states |
| REGION | Geographic regions (Middle East, EU, etc.) |
| CITY | Cities and urban locations |
Corporate Events
| Label | Description |
| --- | --- |
| COMPANY | General companies |
| M_AND_A | Mergers and acquisitions |
| IPO | Initial public offerings |
| EARNINGS_EVENT | Quarterly earnings, revenue reports |
| EXECUTIVE | Named C-suite executives |
| CORPORATE_ACTION | Dividends, buybacks, restructuring |
Infrastructure & Supply Chain
| Label | Description |
| --- | --- |
| INFRA | Physical infrastructure (general) |
| SUPPLY_CHAIN | Supply chain disruptions and logistics |
| SHIPPING_VESSEL | Named ships and tankers |
| PORT | Ports and maritime hubs |
Risk & Events
| Label | Description |
| --- | --- |
| EVENT | General newsworthy events |
| RISK_FACTOR | Risk factors and vulnerabilities |
| NATURAL_DISASTER | Hurricanes, earthquakes, floods |
| CYBER_EVENT | Cyber attacks and digital incidents |
| DISRUPTION | Supply or market disruptions |
Technology
| Label | Description |
| --- | --- |
| TECH_COMPANY | Technology companies |
| AI_MODEL | AI systems and models |
| SEMICONDUCTOR | Chips and semiconductor companies |
| TECH_REGULATION | Technology regulation and policy |
People & Organizations
| Label | Description |
| --- | --- |
| PERSON | Named individuals |
| THINK_TANK | Policy research organizations |
| NEWS_SOURCE | Media and news outlets |
| REGULATORY_BODY | Government regulators (SEC, FCA, etc.) |
| ORG | General organizations |
Usage
from transformers import pipeline

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans
ner = pipeline(
    "token-classification",
    model="Quantbridge/distilbert-energy-intelligence-multitask-v2",
    aggregation_strategy="simple",
)

text = (
    "The Federal Reserve held interest rates steady as Brent crude fell below $75 "
    "following OPEC+ production cuts and renewed sanctions on Russian energy exports."
)

results = ner(text)

# One row per entity: surface text, entity type, confidence score
for entity in results:
    print(f"{entity['word']:<35} {entity['entity_group']:<25} {entity['score']:.3f}")
Example output:
Federal Reserve                     CENTRAL_BANK              0.961
Brent                               TRADING_HUB               0.954
OPEC+                               REGULATORY_BODY           0.947
Russian energy exports              SANCTION                  0.932
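For downstream use it is often convenient to drop low-confidence spans and group the rest by entity type. The following is a minimal sketch, not part of the model's API: `group_entities` is a hypothetical helper, the 0.9 default threshold is an arbitrary assumption, and `sample_results` is a stand-in for `ner(text)` with values copied from the example output above.

```python
from collections import defaultdict


def group_entities(results, min_score=0.9):
    """Group pipeline results by entity type, dropping low-confidence spans.

    `results` is a list of dicts with "word", "entity_group" and "score" keys,
    the shape returned by the pipeline when aggregation_strategy="simple".
    """
    grouped = defaultdict(list)
    for entity in results:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Stand-in for `ner(text)`: values copied from the example output above.
sample_results = [
    {"word": "Federal Reserve", "entity_group": "CENTRAL_BANK", "score": 0.961},
    {"word": "Brent", "entity_group": "TRADING_HUB", "score": 0.954},
    {"word": "OPEC+", "entity_group": "REGULATORY_BODY", "score": 0.947},
    {"word": "Russian energy exports", "entity_group": "SANCTION", "score": 0.932},
]

print(group_entities(sample_results, min_score=0.95))
# -> {'CENTRAL_BANK': ['Federal Reserve'], 'TRADING_HUB': ['Brent']}
```

A suitable threshold depends on the application; it is worth tuning against a labelled sample rather than taking the value used here.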
Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

model_name = "Quantbridge/distilbert-energy-intelligence-multitask-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

text = "Goldman Sachs cut its oil price forecast after OPEC+ agreed to extend output cuts."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Highest-scoring label id for each token of the single input sequence
predicted_ids = outputs.logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

for token, label_id in zip(tokens, predicted_ids):
    label = model.config.id2label[label_id.item()]
    # Skip non-entity tokens and special tokens such as [CLS] and [SEP]
    if label != "O" and token not in tokenizer.all_special_tokens:
        # Drop the WordPiece continuation marker before printing
        word_piece = token[2:] if token.startswith("##") else token
        print(f"{word_piece:<25} {label}")
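The loop above prints one line per sub-word token. To recover whole entities, the WordPiece pieces and BIO tags need to be merged into spans. The sketch below shows one way to do that in plain Python; `merge_bio` is a hypothetical helper, and the sample tokens and labels are hand-written for illustration, not actual model output.

```python
def merge_bio(tokens, labels):
    """Merge WordPiece tokens and BIO tags into (text, entity_type) spans."""
    spans = []
    current_words, current_type = [], None

    def flush():
        # Close the span that is currently open, if any.
        nonlocal current_words, current_type
        if current_words:
            spans.append((" ".join(current_words), current_type))
        current_words, current_type = [], None

    for token, label in zip(tokens, labels):
        if token in ("[CLS]", "[SEP]", "[PAD]"):
            continue
        if token.startswith("##"):
            # Continuation piece: glue it onto the previous word of an open span.
            if current_words:
                current_words[-1] += token[2:]
            continue
        if label == "O":
            flush()
            continue
        prefix, _, entity_type = label.partition("-")
        # A B- tag, or an I- tag of a different type, starts a new span.
        if prefix == "B" or entity_type != current_type:
            flush()
            current_type = entity_type
        current_words.append(token)
    flush()
    return spans


# Hand-written example; the real tokenizer may split these words differently.
tokens = ["[CLS]", "gold", "##man", "sachs", "cut", "oil", "forecast", "[SEP]"]
labels = ["O", "B-FINANCIAL_INSTITUTION", "I-FINANCIAL_INSTITUTION",
          "I-FINANCIAL_INSTITUTION", "O", "B-COMMODITY", "O", "O"]

print(merge_bio(tokens, labels))
# -> [('goldman sachs', 'FINANCIAL_INSTITUTION'), ('oil', 'COMMODITY')]
```

Joining words with a single space does not reproduce the original text exactly (for example around punctuation). Slicing the input string with character offsets, which fast tokenizers return when called with `return_offsets_mapping=True`, is likely a more faithful approach.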
Model Details
| Property | Value |
| --- | --- |
| Base architecture | distilbert-base-uncased |
| Architecture type | DistilBertForTokenClassification |
| Entity types | 59 types (119 BIO labels) |
| Hidden dimension | 768 |
| Attention heads | 12 |
| Layers | 6 |
| Vocabulary size | 30,522 |
| Max sequence length | 512 tokens |
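Inputs longer than the 512-token limit, such as earnings transcripts or filings, have to be split before inference. The following is a rough sketch of overlapping windows over a list of token ids; `window_token_ids` is a hypothetical helper, and the 128-token overlap and the two reserved positions for `[CLS]` and `[SEP]` are assumptions.

```python
def window_token_ids(token_ids, max_len=512, stride=128, num_special=2):
    """Split token ids into overlapping windows that fit the model.

    max_len      model limit, including special tokens
    stride       number of tokens shared by consecutive windows (assumed value)
    num_special  positions reserved for [CLS] and [SEP]
    """
    body = max_len - num_special  # tokens of real text per window
    if not 0 <= stride < body:
        raise ValueError("stride must be smaller than the window body")
    step = body - stride
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + body])
        if start + body >= len(token_ids):
            break  # this window already reaches the end of the input
        start += step
    return windows


# 1,000 dummy token ids stand in for a tokenised long document.
windows = window_token_ids(list(range(1000)))
print([len(w) for w in windows])
# -> [510, 510, 236]
```

Hugging Face tokenizers can also produce such windows directly when called with `truncation=True`, `max_length=512`, `stride=...` and `return_overflowing_tokens=True`. Entities predicted twice in the overlap region then need de-duplicating.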
Intended Use
This model is designed for financial and energy intelligence extraction: automated NER over news feeds, earnings transcripts, regulatory filings, and geopolitical reports. It is a base model suitable for:
- Structured data extraction from unstructured financial news
- Entity linking and knowledge graph population
- Signal detection for trading and risk systems
- Geopolitical risk monitoring
Out-of-scope use
- General-purpose NER on non-financial text
- Languages other than English
- Documents with heavy technical jargon outside the financial/energy domain
Limitations
- English-only
- Optimised for news-style formal writing; may underperform on social media or informal text
- The 59-type taxonomy contains overlapping categories, so ambiguous entities may receive inconsistent labels (e.g. an energy firm that fits both COMPANY and ENERGY_COMPANY)
- BIO scheme does not support nested entities
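One way to soften the taxonomy-overlap limitation downstream is to map fine-grained labels onto a coarser parent before comparing or de-duplicating entities. The sketch below illustrates the idea; the `COARSE_LABEL` mapping and the `coarsen` helper are assumptions rather than part of the model, and the sample entities are hand-written, not model output.

```python
# Hypothetical parent mapping between labels from the taxonomy.
# Which labels count as "the same thing" is an application-level choice.
COARSE_LABEL = {
    "ENERGY_COMPANY": "COMPANY",
    "TECH_COMPANY": "COMPANY",
    "EXECUTIVE": "PERSON",
}


def coarsen(entities, mapping=COARSE_LABEL):
    """Return copies of the entities with an added "coarse_group" key."""
    return [
        {**entity, "coarse_group": mapping.get(entity["entity_group"],
                                                entity["entity_group"])}
        for entity in entities
    ]


# Hand-written example: the same firm tagged differently in two sentences.
sample = [
    {"word": "Shell", "entity_group": "ENERGY_COMPANY"},
    {"word": "Shell", "entity_group": "COMPANY"},
]

unique = {(e["word"], e["coarse_group"]) for e in coarsen(sample)}
print(unique)
# -> {('Shell', 'COMPANY')}
```

The fine-grained label stays available under `entity_group`, so no information is lost by adding the coarse view.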
License
Apache 2.0. See LICENSE.