---
license: apache-2.0
language: en
library_name: transformers
tags:
  - finance
  - news
  - macro
  - financial-news
  - text-classification
pipeline_tag: text-classification
---

# binomial-shannon-2

A **financial news characterizer with two modes**: it reads *ticker-tagged* company news the way [binomial-shannon-1](https://huggingface.co/BinomialTechnologies/binomial-shannon-1) does (19 structured features), and reads *macro* news (central banks, inflation, rates, FX, commodities, geopolitics) with a dedicated 35-output macro head bank. A built-in router selects the right head set per article. Inference takes roughly 15-30 ms on CPU.

## Quick start

```python
from transformers import AutoTokenizer, AutoModel

tok   = AutoTokenizer.from_pretrained("BinomialTechnologies/binomial-shannon-2")
model = AutoModel.from_pretrained("BinomialTechnologies/binomial-shannon-2",
                                   trust_remote_code=True)

inputs = tok("[FEED: reuters] [SITE: reuters.com] [DATE: 2026-03-18]\n\n"
             "TITLE: Fed holds rates, signals two cuts later this year\n\nBODY: ...",
             return_tensors="pt", truncation=True, max_length=1024)
out = model.predict(**inputs)

out["mode_prob"]            # [P(ticker), P(macro)]
out["topic_prob"]          # 18-way macro topic distribution
out["directional_read"]    # signed macro read in [-1, +1]
out["hawkish_dovish_prob"] # 5-way, meaningful on monetary-policy / rates articles
```
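
The input string in the example above follows a bracketed header plus `TITLE:` / `BODY:` layout. Below is a minimal sketch of a helper that assembles such a string, assuming the layout shown in the quick start is the full expected format. `format_article` is a hypothetical name and not part of the model's API.

```python
def format_article(feed: str, site: str, date: str, title: str, body: str) -> str:
    """Assemble an input string in the layout used by the quick-start example.

    Hypothetical helper: field order and separators are copied from that
    example, not from a documented specification.
    """
    header = f"[FEED: {feed}] [SITE: {site}] [DATE: {date}]"
    return f"{header}\n\nTITLE: {title}\n\nBODY: {body}"


text = format_article(
    feed="reuters",
    site="reuters.com",
    date="2026-03-18",
    title="Fed holds rates, signals two cuts later this year",
    body="...",
)
print(text)
```

The resulting `text` can be passed to the tokenizer in place of the hand-written string.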

## What it outputs

**Ticker mode (19 features, identical to shannon-1):** event type (10 binary), tone, implied_direction, novelty, claim_type (4), specificity, materiality_if_true.

**Macro mode (35 features):**

| Head | Type | Meaning |
|---|---|---|
| `topic` | softmax (18) | monetary_policy / fiscal_policy / inflation / growth / labor / rates_fixed_income / equities_markets / fx_currency / energy / commodities / credit_banking / crypto / mergers_acquisitions / trade_policy / geopolitics / single_company / technicals / other |
| `directional_read` | [-1, +1] | net read for risk assets implied by the article |
| `severity` | softmax (5) | noise / minor / notable / major / crisis |
| `novelty` | softmax (3) | rehash / commentary / breaking |
| `claim_type` | softmax (4) | fact / opinion / rumor / forecast |
| `hawkish_dovish` | softmax (5) | dovish → hawkish; meaningful on monetary-policy / rates articles |

Every macro head is either a softmax or a signed scalar. For a softmax head, take the argmax for a label, the probability-weighted score for a continuous summary, or the entropy as an uncertainty measure.
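
The three summaries can be sketched in plain Python as follows. This is illustrative only: the probabilities are made up, and scoring the classes as 0 to K-1 for the weighted score is an assumption, since the card does not specify the class weights.

```python
import math

SEVERITY_LABELS = ["noise", "minor", "notable", "major", "crisis"]


def summarize(probs, labels):
    """Return (argmax label, expected ordinal score, entropy in nats)."""
    top = max(range(len(probs)), key=lambda i: probs[i])
    # Assumption: classes are ordinal and scored 0..K-1.
    score = sum(i * p for i, p in enumerate(probs))
    # Zero-probability classes contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return labels[top], score, entropy


# Made-up probabilities for illustration, not model output.
label, score, entropy = summarize([0.05, 0.10, 0.50, 0.30, 0.05], SEVERITY_LABELS)
print(label, score, entropy)
```

Entropy ranges from 0 (all mass on one class) to `log(K)` (uniform), so dividing by `math.log(len(probs))` gives a value in [0, 1] that is comparable across heads of different sizes.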

## Eval

Held-out forward-temporal test set (Oct 2025 – May 2026, never seen during training). Numbers come from a reproducible harness run over all 15,805 macro test articles plus a seeded sample of 10,000 ticker articles.

### Ticker heads (parity with shannon-1)

| Event-flag macro F1 | implied_direction | tone | claim acc |
|---|---|---|---|
| 0.79 | 0.854 | 0.834 | 89.5% |

The ticker bank matches the standalone shannon-1 model: shannon-2 is a strict superset, adding macro without regressing ticker quality.

### Macro heads (n=15,805)

| Head | Metric | Value |
|---|---|---|
| topic (18-way) | accuracy | **0.814** |
| directional_read | Pearson vs panel | **+0.783** |
| severity (5-way) | accuracy | 0.708 |
| novelty (3-way) | accuracy | 0.648 |
| claim_type (4-way) | accuracy | 0.785 |
| hawkish_dovish (5-way) | accuracy | 0.616 (n=1,650) |
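
For readers who want to reproduce a correlation like the `directional_read` row on their own data, here is a minimal sketch of the Pearson coefficient in plain Python. The values are toy numbers, not drawn from the evaluation set, and the evaluation harness itself is not shown in this card.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


# Toy values for illustration only.
predicted = [0.6, -0.2, 0.1, -0.8, 0.4]
reference = [0.5, -0.1, 0.0, -0.9, 0.7]
r = pearson(predicted, reference)
print(r)
```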

Per-topic F1 (selected):

| Topic | F1 | Support |
|---|---|---|
| commodities | 0.94 | 2,061 |
| equities_markets | 0.88 | 3,613 |
| fx_currency | 0.88 | 3,861 |
| monetary_policy | 0.79 | 1,662 |
| inflation | 0.70 | 526 |
| geopolitics | 0.48 | 346 |
| technicals | 0.25 | 517 |

Strongest on high-volume market topics (commodities, FX, equities, monetary policy); weakest on technicals and geopolitics, which are lower-support and more heterogeneous.

**Routing.** Ticker and macro articles arrive on structurally distinct feeds (per-company news vs. macro wires), so the router separates the two modes essentially perfectly. Treat it as a convenience for serving mixed streams, not as a hard classification result.
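
When serving a mixed stream, the router output can be used to decide which head bank to read. The following is a sketch under assumptions: it takes the `[P(ticker), P(macro)]` ordering from the quick start, and both `select_head_bank` and its `threshold` parameter are hypothetical, not documented model settings.

```python
def select_head_bank(mode_prob, threshold=0.5):
    """Pick a head bank from a pair [P(ticker), P(macro)].

    `threshold` is an illustrative parameter, not a documented setting.
    """
    p_ticker, p_macro = mode_prob
    return "macro" if p_macro >= threshold else "ticker"


# Made-up router outputs for illustration.
print(select_head_bank([0.02, 0.98]))
print(select_head_bank([0.90, 0.10]))
```

If `out["mode_prob"]` is a tensor, it would first need converting to plain floats, for example with `.tolist()` on a single row.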

## Architecture

A specialized ~150M-parameter encoder shared across a 2-way router and two head banks (ticker + macro), each a 3-layer MLP over a CLS+masked-mean pooled representation.

- ~150M encoder params + lightweight head banks
- 4096-token context (1024 default at inference)
- bf16 GPU / fp32 CPU
- ~15-30 ms CPU
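
To make "CLS+masked-mean pooled representation" concrete, here is a dependency-free sketch. It is not the model's implementation: concatenating the two vectors, and including the CLS token in the mean, are both assumptions, since the card does not say how the two views are combined.

```python
def cls_masked_mean_pool(hidden, mask):
    """Concatenate the CLS vector with the mean of non-padding token vectors.

    hidden: list of token vectors (one list of floats per token), CLS first.
    mask:   list of 0/1 flags, 1 for real tokens and 0 for padding.
    """
    if len(hidden) != len(mask):
        raise ValueError("hidden and mask must have the same length")
    kept = [vec for vec, m in zip(hidden, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(hidden[0])
    mean = [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]
    # Assumption: the two views are concatenated, giving a 2 * dim vector.
    return list(hidden[0]) + mean


hidden = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]  # third token is padding
mask = [1, 1, 0]
pooled = cls_masked_mean_pool(hidden, mask)
print(pooled)
```

The mask keeps padding tokens out of the average, so the pooled vector does not depend on how far a batch was padded.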

## How it was trained

- **Corpus**: ticker-tagged company news + press releases and a macro news corpus (2018-2026)
- **Labels**: distilled from a frontier reasoning model against per-mode rubrics (separate ticker and macro labeling specs)
- **Split**: forward temporal; train on ≤2025-09-30, test on 2025-10 → 2026-05
- **Compute**: trained from the base encoder on a single B200
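
The forward temporal split described above can be sketched as a simple date cutoff. The record layout (`date` and `title` fields) is hypothetical; only the 2025-09-30 cutoff comes from this card.

```python
from datetime import date

TRAIN_CUTOFF = date(2025, 9, 30)  # last training day, per the split above


def temporal_split(articles):
    """Split articles into (train, test) by publication date."""
    train = [a for a in articles if a["date"] <= TRAIN_CUTOFF]
    test = [a for a in articles if a["date"] > TRAIN_CUTOFF]
    return train, test


# Toy records; the field names are hypothetical.
articles = [
    {"date": date(2024, 6, 1), "title": "a"},
    {"date": date(2025, 9, 30), "title": "b"},
    {"date": date(2025, 10, 1), "title": "c"},
]
train, test = temporal_split(articles)
print(len(train), len(test))
```

Splitting by time rather than at random keeps every test article later than every training article, so the evaluation resembles real forward use.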

## Caveats

- **Trained against frontier-LLM labels.** Eval correlations are partly imitation; treat the outputs as structured features, not ground truth.
- **Macro corpus is English-language wire news**, weighted toward 2024-2026.
- **`hawkish_dovish` only fires meaningfully on monetary-policy / rates articles** (it is loss-masked elsewhere during training).
- **Tier 2**: research preview. Don't use the outputs as standalone trading signals; combine with your own pipelines.
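
One way to respect the `hawkish_dovish` caveat at inference time is to read that head only when the predicted topic makes it meaningful. This is a sketch: mapping "monetary-policy / rates articles" to the two topic labels below is an assumption, and `hawkish_dovish_or_none` is a hypothetical helper.

```python
# Assumption: "monetary-policy / rates articles" corresponds to these two
# labels from the macro topic head.
POLICY_TOPICS = {"monetary_policy", "rates_fixed_income"}


def hawkish_dovish_or_none(top_topic, hawkish_dovish_prob):
    """Return the 5-way distribution only when the topic makes it meaningful."""
    if top_topic in POLICY_TOPICS:
        return hawkish_dovish_prob
    return None


# Made-up distribution for illustration, not model output.
probs = [0.05, 0.15, 0.40, 0.30, 0.10]
print(hawkish_dovish_or_none("monetary_policy", probs))
print(hawkish_dovish_or_none("commodities", probs))
```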

## License

Apache 2.0, like the rest of the Binomial AI Research model zoo.

## Citation

```bibtex
@misc{binomial-shannon-2-2026,
  title  = {binomial-shannon-2: A dual-mode financial news characterizer (ticker + macro)},
  author = {Binomial AI Research},
  year   = {2026},
  url    = {https://huggingface.co/BinomialTechnologies/binomial-shannon-2}
}
```