swik-heuristic-v1 (v0.1)

Deterministic keyword-based financial sentiment classifier. Fast, interpretable, no GPU, no API key. A baseline for domain-specific financial news sentiment.

This is the Layer 1 model in swik's two-layer inference pipeline. It processes every request before any LLM call, serving both as a fast path for high-confidence cases and as a fallback when the API is unavailable.

What it does

Two-pass classification:

  1. Inversion check: matches asset-specific inversion phrases (e.g., "production cut" → BULLISH for OIL)
  2. Keyword scan: matches generic bullish/bearish keyword lists

If neither pass fires, the label is neutral.
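The two-pass flow above can be sketched as follows. This is illustrative pseudocode under assumed internals, not the actual `SwikHeuristicV1` source, and the keyword lists are abbreviated:

```python
# Illustrative sketch of the two-pass classifier; abbreviated keyword lists.
BULLISH = {"surge", "rally", "growth", "beat", "upgrade"}
BEARISH = {"crash", "plunge", "drop", "miss", "downgrade"}

def classify(text, inversions=None):
    lowered = text.lower()
    # Pass 1: asset-specific inversion phrases override everything.
    for inv in inversions or []:
        phrases = [inv["phrase"]] + inv.get("variants", [])
        if any(p in lowered for p in phrases):
            return {"label": inv["direction"].lower(),
                    "method": "inversion",
                    "inversion_applied": inv["phrase"]}
    # Pass 2: generic keyword scan (bullish checked first).
    if any(k in lowered for k in BULLISH):
        return {"label": "bullish", "method": "keyword"}
    if any(k in lowered for k in BEARISH):
        return {"label": "bearish", "method": "keyword"}
    # Neither pass fired: default to neutral.
    return {"label": "neutral", "method": "none"}
```

Note that the inversion pass runs first, so a catalog hit short-circuits the keyword scan entirely.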

Keyword Lists

Bullish (14 terms): cut, surge, rally, record high, growth, beat, upgrade, rise, gain, boost, strong, exceed, recovery, rebound

Bearish (13 terms): crash, plunge, drop, fall, miss, downgrade, warning, decline, loss, weak, below, cut guidance, layoff

Inversions: Asset-specific phrase overrides from the swik inversion catalog (125 active entries). Published separately as a dataset.
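Since each catalog entry carries an optional `variants` list, a natural preprocessing step is to flatten entries into a single phrase-to-direction lookup. A minimal sketch (the helper name is hypothetical, not part of the published package):

```python
# Hypothetical helper: expand inversion catalog entries (phrase + variants)
# into a flat phrase -> direction lookup for fast substring matching.
def build_inversion_index(entries):
    index = {}
    for entry in entries:
        direction = entry["direction"]
        for phrase in [entry["phrase"]] + entry.get("variants", []):
            index[phrase.lower()] = direction
    return index

catalog = [
    {"phrase": "coal power", "direction": "BULLISH",
     "variants": ["coal-fired power"]},
    {"phrase": "production cut", "direction": "BULLISH"},
]
index = build_inversion_index(catalog)
```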

Usage

from inference import SwikHeuristicV1

model = SwikHeuristicV1()

# Basic usage
result = model.predict("Oil surges after OPEC production cut")
# {'label': 'bullish', 'magnitude': 0.72, 'confidence': 0.45, 'method': 'keyword'}

# With inversion catalog
inversions = [
    {"phrase": "coal power", "direction": "BULLISH", "variants": ["coal-fired power"]},
    {"phrase": "production cut", "direction": "BULLISH"},
]
model_with_inv = SwikHeuristicV1(known_inversions=inversions)
result = model_with_inv.predict("Coal power demand rises as gas prices surge", security="NATGAS")
# {'label': 'bullish', ..., 'inversion_applied': 'coal power', 'method': 'inversion'}

Benchmark Results

Evaluated on the matched corpus (inference_log joined to community_labels_legacy on text_hash), 2026-03-08 to 2026-03-29.

Metric       heuristic-v1   haiku-4-5 (baseline)   haiku-4-5 (variant B)
Accuracy     98.88%         39.6%                  46.0%
F1 macro     0.981          0.309                  0.456
Neutral F1   0.992          0.506                  n/a
Bullish F1   0.970          0.231                  n/a
Bearish F1   0.981          0.189                  n/a
n (pairs)    13,966         16,141                 200 (test set)

⚠️ Important: These benchmarks are measured against AI-generated labels (Claude Haiku), not human ground truth. The high heuristic accuracy reflects agreement with the labeling model, not necessarily alignment with human judgment. Human-label benchmarks are pending.

⚠️ Known dataset bias: The companion labeled dataset is OIL-dominant: OIL accounts for ~56% of all labeled records. Model performance on other securities (especially low-volume ones) may be significantly lower than the aggregate numbers suggest. Evaluate per-security before deploying.
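A per-security evaluation along these lines can be done with a text_hash join and a groupby. The column names below are assumptions about the corpus schema, and the rows are toy data, not real results:

```python
import pandas as pd

# Hypothetical schema; real corpus columns may differ. Toy data for shape only.
preds = pd.DataFrame({
    "text_hash": ["a1", "b2", "c3", "d4"],
    "security":  ["OIL", "OIL", "NATGAS", "GOLD"],
    "label":     ["bullish", "neutral", "bearish", "bullish"],
})
labels = pd.DataFrame({
    "text_hash": ["a1", "b2", "c3", "d4"],
    "label":     ["bullish", "neutral", "bullish", "bullish"],
})

# Join predictions to labels on text_hash, then score per security.
merged = preds.merge(labels, on="text_hash", suffixes=("_pred", "_true"))
merged["correct"] = merged["label_pred"] == merged["label_true"]
per_security = merged.groupby("security")["correct"].agg(["mean", "size"])
print(per_security)
```

The `size` column matters as much as the accuracy: a 100% score on a 3-row security is not evidence of anything.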

Confidence Calibration

The heuristic outputs a fixed confidence of 0.45 for all predictions. This is intentional. Unlike the Haiku baseline, which is anti-calibrated (higher confidence → higher error rate), the heuristic makes no claim about certainty. Use it as a deterministic rule engine, not a probabilistic model.

Known Failure Modes

  1. Ambiguous generic terms: Words like "cut" appear in both bullish (supply cuts → oil bullish) and neutral contexts (budget cuts, interest rate cuts). Without the inversion catalog, these will be mislabeled.

  2. Multi-entity headlines: "Oil falls as dollar rises": the heuristic detects "falls" (bearish) but may assign it to the wrong security if entity filtering is weak.

  3. Negation blindness: "Oil did NOT surge" → misclassified as bullish. No negation handling.

  4. Language and spelling: English only. Abbreviations and misspellings not handled.

  5. Context window: Heuristic has no memory of prior sentences. Each text is classified in isolation.
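Failure mode 3 is easy to reproduce with any plain substring keyword scan (illustrative code, not the library's implementation):

```python
# Illustrative reproduction of failure mode 3: a substring keyword scan
# has no notion of negation, so "did NOT surge" still matches "surge".
BULLISH = {"surge", "rally", "gain"}

def keyword_label(text):
    lowered = text.lower()
    return "bullish" if any(k in lowered for k in BULLISH) else "neutral"

print(keyword_label("Oil did NOT surge"))  # still labeled bullish
```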

Model Weights

This model has no neural network weights. It is a deterministic rule-based system (keyword lists + inversion catalog).

  • No fine-tuning. No LoRA adapter. No PyTorch/TensorFlow required.
  • Labels in the companion dataset were generated by Claude Haiku (claude-haiku-4-5 via API), not by a local model.
  • A LoRA fine-tuned adapter is planned once the community label corpus reaches sufficient size and multi-labeler consensus.

Architecture Context

This model is Layer 1 in swik's inference pipeline:

Text Input
    ↓
[heuristic-v1]  ← this model
    ↓ layer1_score
if security ∈ [OIL, NATGAS, LNG, GOLD, EURUSD]: use heuristic output
else if relevance < threshold: use heuristic output
else:
    ↓
[claude-haiku-4-5 + inversion catalog]  ← Layer 2
    ↓
Final prediction

For OIL, NATGAS, LNG, GOLD, EURUSD: the heuristic is the final model (accuracy ~99% on these). For other securities: heuristic pre-screens, Haiku runs if relevance passes.
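The routing above can be sketched as a small dispatcher. Function names, the relevance threshold value, and the stub models are assumptions for illustration only:

```python
# Hypothetical routing sketch for the two-layer pipeline described above.
HEURISTIC_FINAL = {"OIL", "NATGAS", "LNG", "GOLD", "EURUSD"}
RELEVANCE_THRESHOLD = 0.5  # assumed value, not stated in the model card

def route(text, security, relevance, heuristic, llm):
    layer1 = heuristic(text, security)   # Layer 1 always runs first
    if security in HEURISTIC_FINAL:
        return layer1                    # heuristic is the final model
    if relevance < RELEVANCE_THRESHOLD:
        return layer1                    # pre-screen: skip the LLM call
    return llm(text, security)           # Layer 2: Haiku + inversion catalog

# Example with stub models:
heuristic = lambda t, s: {"label": "bullish", "method": "keyword"}
llm = lambda t, s: {"label": "neutral", "method": "llm"}
print(route("Oil surges", "OIL", 0.9, heuristic, llm)["method"])   # keyword
print(route("Fed minutes", "SPX", 0.9, heuristic, llm)["method"])  # llm
```

One consequence of this design: for the five listed securities the LLM is never invoked, so Layer 2 outages cannot affect them.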

Training Data

Not trained. Deterministic rule-based system. Keyword lists were derived from:

  • Manual curation of financial news vocabulary
  • Error analysis on the swik inference corpus
  • Cross-validated against community labels

Dataset

Labels used for benchmarking: polibert/swik-sentiment-labels

License

CC BY 4.0

Citation

@misc{swik_heuristic_v1_2026,
  title={swik-heuristic-v1: Domain-Specific Financial Sentiment Classifier},
  author={swik Community},
  year={2026},
  url={https://huggingface.co/polibert/swik-heuristic-v1},
  license={CC BY 4.0}
}
