polibert committed
Commit f244575 · verified · 1 Parent(s): 18b61af

Initial release: swik-heuristic-v1 keyword classifier with benchmarks

Files changed (2):
  1. README.md +166 -0
  2. inference.py +107 -0
README.md ADDED
@@ -0,0 +1,166 @@
---
license: cc-by-4.0
language:
- en
tags:
- text-classification
- sentiment-analysis
- financial-sentiment
- finance
- commodities
- domain-specific
- rule-based
- interpretable
pretty_name: swik Heuristic Sentiment v1
---

# swik-heuristic-v1 (v0.1)

**Deterministic keyword-based financial sentiment classifier.**
Fast and interpretable; no GPU, no API key. A baseline for domain-specific financial news sentiment.

This is the Layer 1 model in swik's two-layer inference pipeline. It processes every request before
any LLM call — both as a fast path for high-confidence cases and as a fallback when the API is unavailable.

## What it does

Two-pass classification:

1. **Inversion check** — matches asset-specific inversion phrases (e.g., "production cut" → BULLISH for OIL)
2. **Keyword scan** — matches generic bullish/bearish keyword lists

If neither pass fires, the label is `neutral`.
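
As a minimal, standalone sketch of that two-pass order (trimmed keyword lists for brevity; the real lists and scoring live in `inference.py`):

```python
# Trimmed keyword lists; the shipped lists are longer.
BULLISH = ["cut", "surge", "rally"]
BEARISH = ["crash", "plunge", "drop"]

def classify(text, inversions=()):
    t = text.lower()
    # Pass 1: asset-specific inversion phrases take priority
    for phrase, direction in inversions:
        if phrase in t:
            return direction
    # Pass 2: generic keyword scan (substring match)
    if any(k in t for k in BULLISH):
        return "bullish"
    if any(k in t for k in BEARISH):
        return "bearish"
    # Neither pass fired
    return "neutral"

print(classify("Oil plunges on demand fears"))   # bearish
print(classify("Markets unchanged today"))       # neutral
print(classify("OPEC announces production cut",
               inversions=[("production cut", "bullish")]))  # bullish (via inversion)
```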

## Keyword Lists

**Bullish (14 terms):** cut, surge, rally, record high, growth, beat, upgrade, rise, gain, boost, strong, exceed, recovery, rebound

**Bearish (13 terms):** crash, plunge, drop, fall, miss, downgrade, warning, decline, loss, weak, below, cut guidance, layoff

Matching is case-insensitive substring containment, and the bullish list is scanned first, so "cut guidance" (bearish) is shadowed by "cut" (bullish) and can never fire.

**Inversions:** asset-specific phrase overrides from the [swik inversion catalog](https://swik.io/inversions) (125 active entries), published separately as a dataset.

## Usage

```python
from inference import SwikHeuristicV1

model = SwikHeuristicV1()

# Basic usage
result = model.predict("Oil surges after OPEC production cut")
# {'label': 'bullish', 'magnitude': 0.72, 'confidence': 0.45, 'method': 'keyword'}

# With inversion catalog
inversions = [
    {"phrase": "coal power", "direction": "BULLISH", "variants": ["coal-fired power"]},
    {"phrase": "production cut", "direction": "BULLISH"},
]
model_with_inv = SwikHeuristicV1(known_inversions=inversions)
result = model_with_inv.predict("Coal power demand rises as gas prices surge", security="NATGAS")
# {'label': 'bullish', ..., 'inversion_applied': 'coal power', 'method': 'inversion'}
```

## Benchmark Results

Evaluated on the matched corpus: inference_log vs community_labels_legacy (text_hash join), 2026-03-08 to 2026-03-29.

| Metric | heuristic-v1 | haiku-4-5 (baseline) | haiku-4-5 (variant B) |
|--------|--------------|----------------------|-----------------------|
| **Accuracy** | **98.88%** | 39.6% | 46.0% |
| **F1 macro** | **0.981** | 0.309 | 0.456 |
| Neutral F1 | 0.992 | 0.506 | — |
| Bullish F1 | 0.970 | 0.231 | — |
| Bearish F1 | 0.981 | 0.189 | — |
| n (pairs) | 13,966 | 16,141 | 200 (test set) |

> ⚠️ **Important:** These benchmarks are measured against AI-generated labels (Claude Haiku), not
> human ground truth. The high heuristic accuracy reflects agreement with the labeling model, not
> necessarily alignment with human judgment. Human-label benchmarks are pending.
>
> ⚠️ **Known dataset bias:** The companion labeled dataset is OIL-dominant — OIL accounts for ~56% of all
> labeled records. Model performance on other securities (especially low-volume ones) may be significantly
> lower than the aggregate numbers suggest. Evaluate per-security before deploying.
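
One way to act on that advice is to group the matched prediction/label pairs by security and report accuracy per group. The record layout below (`security`, `pred`, `label`) is illustrative, not the published dataset schema:

```python
from collections import defaultdict

# Toy matched pairs; field names are illustrative, not the dataset schema.
pairs = [
    {"security": "OIL",    "pred": "bullish", "label": "bullish"},
    {"security": "OIL",    "pred": "neutral", "label": "neutral"},
    {"security": "NATGAS", "pred": "bearish", "label": "neutral"},
]

hits, totals = defaultdict(int), defaultdict(int)
for p in pairs:
    totals[p["security"]] += 1
    hits[p["security"]] += p["pred"] == p["label"]

# Per-security accuracy exposes weak segments hidden by the aggregate number
for sec in sorted(totals):
    print(f"{sec}: {hits[sec]}/{totals[sec]} = {hits[sec] / totals[sec]:.0%}")
```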

## Confidence Calibration

The heuristic outputs a fixed confidence of `0.45` for every prediction. This is intentional —
unlike the Haiku baseline (which is anti-calibrated: higher confidence correlates with a higher error rate),
the heuristic makes no claim about certainty. Use it as a deterministic rule engine, not a
probabilistic model.

## Known Failure Modes

1. **Ambiguous generic terms**: words like "cut" appear in both bullish contexts (supply cuts → oil bullish)
   and neutral ones (budget cuts, interest-rate cuts). Without the inversion catalog, these
   will be mislabeled.

2. **Multi-entity headlines**: "Oil falls as dollar rises" — the heuristic detects "falls" (bearish)
   but may assign it to the wrong security if entity filtering is weak.

3. **Negation blindness**: "Oil did NOT surge" is misclassified as bullish; there is no negation handling.

4. **Language and spelling**: English only. Abbreviations and misspellings are not handled.

5. **No context window**: the heuristic has no memory of prior sentences; each text is classified in
   isolation.
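
One more pitfall, implicit in the implementation: matching is plain substring containment on lowercased text, so keywords can fire inside unrelated words:

```python
text = "Trade execution delayed; blossoming interest against the proposal".lower()

# Each check mirrors the `kw in text_lower` test used by the classifier.
print("cut" in text)   # True: matches inside "execution"
print("loss" in text)  # True: matches inside "blossoming"
print("gain" in text)  # True: matches inside "against"
```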

## Model Weights

**This model has no neural-network weights.** It is a deterministic rule-based system (keyword lists + inversion catalog).

- No fine-tuning, no LoRA adapter, no PyTorch/TensorFlow required.
- Labels in the companion dataset were generated by Claude Haiku (claude-haiku-4-5 via API), not by a local model.
- A LoRA fine-tuned adapter is planned once the community label corpus reaches sufficient size and multi-labeler consensus.

## Architecture Context

This model is Layer 1 in swik's inference pipeline:

```
Text Input
    ↓
[heuristic-v1]                          ← this model
    ↓ layer1_score
if security ∈ [OIL, NATGAS, LNG, GOLD, EURUSD]: use heuristic output
else if relevance < threshold:                  use heuristic output
else:
    ↓
[claude-haiku-4-5 + inversion catalog]  ← Layer 2
    ↓
Final prediction
```

For OIL, NATGAS, LNG, GOLD, EURUSD the heuristic is the final model (accuracy ~99% on these).
For other securities the heuristic pre-screens, and Haiku runs if relevance passes the threshold.
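
The routing above can be sketched as follows. This is a reconstruction from the diagram, not swik's actual dispatcher, and the threshold value is a placeholder:

```python
HEURISTIC_FINAL = {"OIL", "NATGAS", "LNG", "GOLD", "EURUSD"}
RELEVANCE_THRESHOLD = 0.6  # placeholder; the real threshold is not published

def route(security, heuristic_result, call_layer2):
    """Pick the final prediction per the two-layer routing rules."""
    # Layer 1 is final for the whitelisted securities
    if security in HEURISTIC_FINAL:
        return heuristic_result
    # Low-relevance texts are answered by the heuristic, skipping the LLM
    if heuristic_result["relevance"] < RELEVANCE_THRESHOLD:
        return heuristic_result
    # Everything else escalates to Layer 2 (the LLM + inversion catalog)
    return call_layer2()

final = route("OIL", {"label": "bullish", "relevance": 0.75}, lambda: {"label": "llm"})
print(final["label"])  # bullish: OIL stays on the heuristic path
```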

## Training Data

Not trained. Deterministic rule-based system. The keyword lists were derived from:
- Manual curation of financial news vocabulary
- Error analysis on the swik inference corpus
- Cross-validation against community labels

## Dataset

Labels used for benchmarking: [polibert/swik-sentiment-labels](https://huggingface.co/datasets/polibert/swik-sentiment-labels)

## License

CC BY 4.0

## Citation

```bibtex
@misc{swik_heuristic_v1_2026,
  title={swik-heuristic-v1: Domain-Specific Financial Sentiment Classifier},
  author={swik Community},
  year={2026},
  url={https://huggingface.co/polibert/swik-heuristic-v1},
  license={CC BY 4.0}
}
```

## Links

- Platform: [swik.io](https://swik.io)
- Dataset: [polibert/swik-sentiment-labels](https://huggingface.co/datasets/polibert/swik-sentiment-labels)
- Contribute labels: [swik.io/contribute/label](https://swik.io/contribute/label)
inference.py ADDED
@@ -0,0 +1,107 @@
#!/usr/bin/env python3
"""
swik-heuristic-v1 — deterministic keyword-based financial sentiment classifier.

A fast, interpretable baseline for domain-specific financial news sentiment.
No GPU required. No API calls. Runs in microseconds.

Usage:
    from inference import SwikHeuristicV1
    model = SwikHeuristicV1()
    result = model.predict("OPEC agrees to production cuts", security="OIL")
    # {"label": "bullish", "magnitude": 0.72, "confidence": 0.45, "method": "keyword"}

For inversion-aware inference, pass known_inversions (list of dicts with phrase/direction).
"""

from typing import Optional

BULLISH_KEYWORDS = [
    "cut", "surge", "rally", "record high", "growth", "beat", "upgrade",
    "rise", "gain", "boost", "strong", "exceed", "recovery", "rebound",
]
BEARISH_KEYWORDS = [
    "crash", "plunge", "drop", "fall", "miss", "downgrade", "warning",
    "decline", "loss", "weak", "below", "cut guidance", "layoff",
]

LABEL_MAP = {"bullish": 0, "bearish": 1, "neutral": 2, "irrelevant": 3}
LABEL_NAMES = ["bullish", "bearish", "neutral", "irrelevant"]


class SwikHeuristicV1:
    """
    Two-pass keyword classifier:
        Pass 1: Check known inversions (asset-specific phrase overrides)
        Pass 2: Check generic bullish/bearish keyword lists
        Default: neutral

    Accuracy: 98.88% on matched inference corpus vs AI labels (n=13,966).
    Note: measured against AI-generated labels, not human ground truth.
    """

    def __init__(self, known_inversions: Optional[list] = None):
        """
        known_inversions: list of dicts with keys:
            phrase (str), direction (str: BULLISH|BEARISH|NEUTRAL),
            variants (list[str], optional), confidence (float, optional)
        """
        self.known_inversions = known_inversions or []

    def predict(self, text: str, security: Optional[str] = None,
                key_entities: Optional[list] = None) -> dict:
        # `security` is accepted for pipeline API compatibility but is not
        # used by the heuristic itself; routing on security happens upstream.
        text_lower = text.lower()
        direction = "neutral"
        magnitude = 0.4
        relevance = 0.5
        inversion_applied = None

        # Pass 1: known inversions (highest priority)
        for inv in self.known_inversions:
            phrase = inv["phrase"].lower()
            variants = [v.lower() for v in inv.get("variants", [])]
            if phrase in text_lower or any(v in text_lower for v in variants):
                direction = inv["direction"].lower()
                magnitude = float(inv.get("confidence", 0.7))
                relevance = 0.85
                inversion_applied = inv["phrase"]
                break

        # Pass 2: generic keywords. Bullish is checked first, so a text
        # containing "cut guidance" (bearish list) is shadowed by the
        # substring "cut" (bullish list) and labeled bullish.
        if not inversion_applied:
            if any(kw in text_lower for kw in BULLISH_KEYWORDS):
                direction = "bullish"
                magnitude = 0.72
                relevance = 0.75
            elif any(kw in text_lower for kw in BEARISH_KEYWORDS):
                direction = "bearish"
                magnitude = 0.68
                relevance = 0.75

        # Boost relevance if key entities are mentioned
        if key_entities:
            for entity in key_entities:
                if entity.lower() in text_lower:
                    relevance = min(1.0, relevance + 0.15)
                    break

        return {
            "label": direction,
            "label_id": LABEL_MAP.get(direction, 2),
            "magnitude": round(magnitude, 2),
            "relevance": round(relevance, 2),
            "confidence": 0.45,  # heuristic confidence is always 0.45
            "inversion_applied": inversion_applied,
            "method": "inversion" if inversion_applied else ("keyword" if direction != "neutral" else "default"),
        }

    def predict_batch(self, texts: list, security: Optional[str] = None,
                      key_entities: Optional[list] = None) -> list:
        return [self.predict(t, security, key_entities) for t in texts]


if __name__ == "__main__":
    import sys

    model = SwikHeuristicV1()
    text = " ".join(sys.argv[1:]) if len(sys.argv) > 1 else "OPEC agrees to production cuts, oil surges"
    result = model.predict(text)
    print(f"Text: {text}")
    print(f"Label: {result['label']} (id={result['label_id']})")
    print(f"Magnitude: {result['magnitude']} | Relevance: {result['relevance']} | Confidence: {result['confidence']}")
    print(f"Method: {result['method']}")