# swik-heuristic-v1 (v0.1)
Deterministic keyword-based financial sentiment classifier. Fast, interpretable, no GPU, no API key. A baseline for domain-specific financial news sentiment.
This is the Layer 1 model in swik's two-layer inference pipeline. It processes every request before any LLM call, serving both as a fast path for high-confidence cases and as a fallback when the API is unavailable.
## What it does
Two-pass classification:
- Inversion check: matches asset-specific inversion phrases (e.g., "production cut" → BULLISH for OIL)
- Keyword scan: matches generic bullish/bearish keyword lists

If neither pass fires, the label is neutral.
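The two passes above can be sketched as follows. This is an illustrative toy, not the actual swik implementation: the phrase lists and the `classify` function name are placeholders, and the real keyword lists and inversion catalog are much larger.

```python
# Toy sketch of the two-pass heuristic. Phrase lists here are
# illustrative placeholders, not the real swik keyword lists.
BULLISH = {"surge", "rally", "gain"}
BEARISH = {"crash", "plunge", "drop"}
# Asset-specific overrides: (phrase, security) -> label
INVERSIONS = {("production cut", "OIL"): "bullish"}

def classify(text: str, security: str = "") -> str:
    lowered = text.lower()
    # Pass 1: inversion check; asset-specific phrases win over generic keywords
    for (phrase, asset), label in INVERSIONS.items():
        if asset == security and phrase in lowered:
            return label
    # Pass 2: generic keyword scan
    if any(word in lowered for word in BULLISH):
        return "bullish"
    if any(word in lowered for word in BEARISH):
        return "bearish"
    # Neither pass fired
    return "neutral"
```

With this sketch, `classify("OPEC announces production cut", "OIL")` returns `"bullish"` via the inversion override, while the same headline without a security falls through both passes to `"neutral"`.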
## Keyword Lists
Bullish (14 terms): cut, surge, rally, record high, growth, beat, upgrade, rise, gain, boost, strong, exceed, recovery, rebound
Bearish (13 terms): crash, plunge, drop, fall, miss, downgrade, warning, decline, loss, weak, below, cut guidance, layoff
Inversions: Asset-specific phrase overrides from the swik inversion catalog (125 active entries). Published separately as a dataset.
## Usage

```python
from inference import SwikHeuristicV1

model = SwikHeuristicV1()

# Basic usage
result = model.predict("Oil surges after OPEC production cut")
# {'label': 'bullish', 'magnitude': 0.72, 'confidence': 0.45, 'method': 'keyword'}

# With inversion catalog
inversions = [
    {"phrase": "coal power", "direction": "BULLISH", "variants": ["coal-fired power"]},
    {"phrase": "production cut", "direction": "BULLISH"},
]
model_with_inv = SwikHeuristicV1(known_inversions=inversions)
result = model_with_inv.predict("Coal power demand rises as gas prices surge", security="NATGAS")
# {'label': 'bullish', ..., 'inversion_applied': 'coal power', 'method': 'inversion'}
```
## Benchmark Results
Evaluated on matched corpus: inference_log vs community_labels_legacy (text_hash join), 2026-03-08 to 2026-03-29.
| Metric | heuristic-v1 | haiku-4-5 (baseline) | haiku-4-5 (variant B) |
|---|---|---|---|
| Accuracy | 98.88% | 39.6% | 46.0% |
| F1 macro | 0.981 | 0.309 | 0.456 |
| Neutral F1 | 0.992 | 0.506 | n/a |
| Bullish F1 | 0.970 | 0.231 | n/a |
| Bearish F1 | 0.981 | 0.189 | n/a |
| n (pairs) | 13,966 | 16,141 | 200 (test set) |
⚠️ Important: These benchmarks are measured against AI-generated labels (Claude Haiku), not human ground truth. The high heuristic accuracy reflects agreement with the labeling model, not necessarily alignment with human judgment. Human-label benchmarks are pending.
⚠️ Known dataset bias: The companion labeled dataset is OIL-dominant: OIL accounts for ~56% of all labeled records. Model performance on other securities (especially low-volume ones) may be significantly lower than the aggregate numbers suggest. Evaluate per-security before deploying.
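Given the OIL skew, a per-security breakdown is worth computing before relying on the aggregate numbers. A minimal sketch, assuming each evaluation row carries a security tag alongside the predicted and reference labels (the row shape here is a hypothetical, not the actual swik log schema):

```python
from collections import defaultdict

def per_security_accuracy(rows):
    """rows: iterable of (security, predicted_label, reference_label).
    Returns {security: accuracy}, so aggregate numbers cannot hide
    weak performance on low-volume securities."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for security, pred, ref in rows:
        total[security] += 1
        if pred == ref:
            correct[security] += 1
    return {s: correct[s] / total[s] for s in total}
```

Running this over the matched corpus and sorting by `total[s]` ascending surfaces exactly the low-volume securities the warning above is about.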
## Confidence Calibration
The heuristic outputs a fixed confidence of 0.45 for all predictions. This is intentional: unlike the Haiku baseline, which is anti-calibrated (higher confidence correlates with a higher error rate), the heuristic makes no claim about certainty. Use it as a deterministic rule engine, not a probabilistic model.
## Known Failure Modes
Ambiguous generic terms: Words like "cut" appear in both bullish contexts (supply cuts → bullish for oil) and neutral ones (budget cuts, interest rate cuts). Without the inversion catalog, these will be mislabeled.
Multi-entity headlines: "Oil falls as dollar rises" → the heuristic detects "falls" (bearish) but may assign it to the wrong security if entity filtering is weak.
Negation blindness: "Oil did NOT surge" → misclassified as bullish. No negation handling.
Language and spelling: English only. Abbreviations and misspellings are not handled.
Context window: Heuristic has no memory of prior sentences. Each text is classified in isolation.
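The negation failure above can be reproduced with any bag-of-keywords scanner. This toy version (not the actual swik code; the keyword set is a placeholder) shows why "did NOT surge" still scores bullish:

```python
# Toy keyword scan demonstrating negation blindness: the scanner only
# checks for keyword presence, so negators like "not" are invisible.
BULLISH = {"surge", "rally", "gain"}

def keyword_label(text: str) -> str:
    words = text.lower().split()
    return "bullish" if any(w in BULLISH for w in words) else "neutral"

print(keyword_label("Oil did NOT surge"))  # bullish, because "surge" is present
```

Any fix would need a dependency-aware or window-based negation check, which is out of scope for a deterministic Layer 1.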
## Model Weights
This model has no neural network weights. It is a deterministic rule-based system (keyword lists + inversion catalog).
- No fine-tuning. No LoRA adapter. No PyTorch/TensorFlow required.
- Labels in the companion dataset were generated by Claude Haiku (claude-haiku-4-5 via API), not by a local model.
- A LoRA fine-tuned adapter is planned once the community label corpus reaches sufficient size and multi-labeler consensus.
## Architecture Context

This model is Layer 1 in swik's inference pipeline:

```
Text Input
    ↓
[heuristic-v1]   ← this model
    ↓ layer1_score
if security ∈ {OIL, NATGAS, LNG, GOLD, EURUSD}: use heuristic output
else if relevance < threshold:                  use heuristic output
else:
    ↓
[claude-haiku-4-5 + inversion catalog]   ← Layer 2
    ↓
Final prediction
```
For OIL, NATGAS, LNG, GOLD, EURUSD: the heuristic is the final model (accuracy ~99% on these). For other securities: heuristic pre-screens, Haiku runs if relevance passes.
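The routing rule can be written as a plain function. A minimal sketch, assuming a relevance score in [0, 1] and a callable for the Layer 2 model; the threshold value and the `llm_fn` hook are placeholders, not the production swik pipeline:

```python
# Securities for which the heuristic output is final (from the pipeline above).
HEURISTIC_FINAL = {"OIL", "NATGAS", "LNG", "GOLD", "EURUSD"}
RELEVANCE_THRESHOLD = 0.5  # placeholder value, not the production setting

def route(security, relevance, heuristic_result, llm_fn):
    """Layer-1/Layer-2 routing: the heuristic answer is final for the
    five covered securities, or when relevance falls below threshold;
    otherwise defer to the Layer 2 LLM call."""
    if security in HEURISTIC_FINAL or relevance < RELEVANCE_THRESHOLD:
        return heuristic_result
    return llm_fn()  # Layer 2: claude-haiku-4-5 + inversion catalog
```

Keeping the routing predicate this small is what makes the fallback behavior easy to reason about when the API is down: `llm_fn` simply never fires.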
## Training Data
Not trained. Deterministic rule-based system. Keyword lists were derived from:
- Manual curation of financial news vocabulary
- Error analysis on the swik inference corpus
- Cross-validated against community labels
## Dataset
Labels used for benchmarking: polibert/swik-sentiment-labels
## License
CC BY 4.0
## Citation

```bibtex
@misc{swik_heuristic_v1_2026,
  title={swik-heuristic-v1: Domain-Specific Financial Sentiment Classifier},
  author={swik Community},
  year={2026},
  url={https://huggingface.co/polibert/swik-heuristic-v1},
  license={CC BY 4.0}
}
```
## Links
- Platform: swik.io
- Dataset: polibert/swik-sentiment-labels
- Contribute labels: swik.io/contribute/label