Upload folder using huggingface_hub

- .gitattributes +1 -0
- QUALITY_SCORE_ARCHITECTURE.md +164 -0
- data/quality_scores.jsonl +3 -0
- log.log +2 -2
- scripts/compute_quality_score.py +545 -0
.gitattributes
CHANGED

@@ -35,3 +35,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 log.log filter=lfs diff=lfs merge=lfs -text
 store/74c/74c70007-cccd-4669-bfd4-e25f8348ad8c/all_1_35_2/primary.cidx filter=lfs diff=lfs merge=lfs -text
+data/quality_scores.jsonl filter=lfs diff=lfs merge=lfs -text
QUALITY_SCORE_ARCHITECTURE.md
ADDED

@@ -0,0 +1,164 @@
# Token Quality / Health Score (q) - Architecture

This document defines the "quality/health" scalar `q` used by Apollo.

## 1) What problem this solves

We want a single number that captures **how healthy / organic vs. controlled** a token looks, so that a downstream trading policy (e.g., an RL agent) can treat it as a **risk/health input**.

Key points:
- This is **not model confidence**.
- `q` is computed **offline** from a token's **full lifetime** (for labels / training targets).
- At **inference**, the model predicts `q` from **partial observations**.
- We avoid hard thresholds and raw-scale features (USD, SOL, counts) by using **within-regime distributions**.

## 2) Core idea (distribution-first, not rules-first)

Raw totals (fees, volume, holders) are mostly **scale** and are extremely heavy-tailed. Using them directly:
- makes the signal unstable across regimes,
- makes it sensitive to market-wide shifts,
- and invites hand-tuned weights ("human bias").

Instead, we map each metric to a **percentile** within a comparable peer group, then aggregate.
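As a toy sketch of this mapping (the volumes are invented; the real pipeline lives in `scripts/compute_quality_score.py` below), a 2,000x spread in raw volume collapses into evenly spaced percentiles:

```python
import math

# Hypothetical lifetime volumes (USD) for five peer tokens: heavy-tailed.
volumes = [1_200, 3_500, 9_000, 55_000, 2_400_000]

def percentile(v, ref):
    """Rank-based percentile p = (rank - 0.5) / n within the peer group."""
    rank = sorted(ref).index(v) + 1  # 1-based rank; values here are unique
    return (rank - 0.5) / len(ref)

logs = [math.log1p(v) for v in volumes]
for v, lv in zip(volumes, logs):
    print(f"{v:>10,} -> p = {percentile(lv, logs):.2f}")
# p runs 0.10, 0.30, 0.50, 0.70, 0.90 regardless of the raw scale.
```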
## 3) Return bucketing (why it is required)

The dataset is highly imbalanced: most tokens die early (<2-3x), while a tiny tail produces 10x-1000x outcomes.

If you compute percentiles globally:
- 100x tokens will always dominate the "good" percentiles for scale metrics,
- and "quality" will collapse into "return magnitude".

So we compute distributions **within return regimes**.

### 3.1 Bucket definition (example)

Let `R_max` be the token's lifetime max return multiple (e.g., ATH / launch).

Use coarse buckets for the bulk and finer buckets for the tail, e.g.:
- B0: `R_max < 3`
- B1: `3 <= R_max < 10`
- B2: `10 <= R_max < 20`
- B3: `20 <= R_max < 100`
- B4: `100 <= R_max < 10_000`

Notes:
- If a bucket has too few samples, merge it with a neighbor.
- For the extreme tail you can also replace fixed buckets with **quantile buckets** on `log(R_max)` to keep sample counts stable.
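The bucketing rule above can be sketched as follows (the edges are the example values from 3.1; the lower edge of B0 is assumed to be 0, since returns are positive multiples):

```python
# Example bucket edges for R_max from section 3.1 (upper edge exclusive).
EDGES = [0, 3, 10, 20, 100, 10_000]  # B0 .. B4

def bucket_id(r_max: float) -> int:
    """Return the bucket index for a lifetime max-return multiple, or -1 if out of range."""
    for i in range(len(EDGES) - 1):
        if EDGES[i] <= r_max < EDGES[i + 1]:
            return i
    return -1

print(bucket_id(1.4))     # -> 0  (B0: died early)
print(bucket_id(42.0))    # -> 3  (B3: 20x-100x)
print(bucket_id(50_000))  # -> -1 (outside the modeled range)
```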
Interpretation (important):
- `q` is **relative within the bucket**.
- The "best garbage" can have high `q` in B0.
- A 100x token can have low `q` in B4 if it looks worst vs. other 100x+ tokens.

This is intentional: return and quality are different axes.

## 4) Feature set and sign conventions

We want `q` to increase for "healthy/organic" structure and decrease for "controlled/manipulated" structure.

All features below are evaluated **within the token's return bucket**.

### 4.1 Scale / activity (high is usually better within-bucket)

Use log transforms for stability before percentiles:
- `log1p(total_volume_usd)`
- `log1p(total_fees_sol)`
- `log1p(unique_holders)`
- `log1p(time_to_ath_sec)` (optional; see note below)

Ratio features (less pure scale):
- `fees_per_volume = total_fees_sol / (total_volume_usd + eps)`
- `fees_per_trade = total_fees_sol / (n_trades + eps)` (if `n_trades` exists)
- `holders_per_trade = unique_holders / (n_trades + eps)` (if `n_trades` exists)
- `holders_per_volume = unique_holders / (total_volume_usd + eps)`

Rationale:
- Fees and fee-per-* help separate "real urgency / competition" from "cheap wash".
- Holders and holders-per-* help separate broad participation from concentrated looping.
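A sketch of the ratio features (argument names follow the bullets above; the input values are invented, and `EPS` guards the denominators as described in section 7):

```python
EPS = 1e-9

def ratio_features(total_fees_sol, total_volume_usd, unique_holders, n_trades):
    """Ratio features from section 4.1; EPS avoids divide-by-zero on dead tokens."""
    return {
        "fees_per_volume": total_fees_sol / (total_volume_usd + EPS),
        "fees_per_trade": total_fees_sol / (n_trades + EPS),
        "holders_per_trade": unique_holders / (n_trades + EPS),
        "holders_per_volume": unique_holders / (total_volume_usd + EPS),
    }

feats = ratio_features(total_fees_sol=12.5, total_volume_usd=50_000,
                       unique_holders=800, n_trades=4_000)
print(feats["holders_per_trade"])  # ~0.2: one distinct holder per five trades
```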
### 4.2 Manipulation / control (high is worse; flip sign)

These are typically "the higher, the less healthy":
- `snipers_pct_supply_top70`
- `bundled_pct_supply`
- `dev_hold_pct_supply`
- `insiders_pct_supply`

We treat exceptions as rare; the model can learn edge cases from context, but the label should reflect the dominant interpretation.

### 4.3 Time-to-ATH note

`time_to_ath_sec` can behave differently across return buckets.
- In high-return buckets, very short times can look like a single spike / control.
- In low-return buckets, many tokens have near-zero times because they never move.

Include it only if it improves downstream behavior; keep it **bucket-relative** either way.

## 5) Turning raw metrics into a signed scalar

We want a single `q` in `[-1, +1]` with direction:
- `+1` = looks healthiest vs. peers in the same return bucket
- `-1` = looks most unhealthy vs. peers in the same return bucket

### 5.1 Within-bucket percentile (ECDF)

For each feature value `x_i`:
- compute percentile `p_i = ECDF_b(x_i)` using only tokens in bucket `b`
- `p_i` is in `[0, 1]`

Implementation detail:
- Use a rank-based ECDF with a small offset to avoid exact 0/1 if desired:
  - `p_i = (rank(x_i) - 0.5) / n`
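A self-contained sketch of this midrank ECDF (ties share the average of their ranks; the repo script applies the same rule per feature and per bucket):

```python
def midrank_percentiles(values):
    """p = (midrank - 0.5) / n; tied values share the average of their 1-based ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    p = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1                              # extend over the tie run
        midrank = 0.5 * ((i + 1) + (j + 1))     # average of 1-based ranks i..j
        for k in range(i, j + 1):
            p[order[k]] = (midrank - 0.5) / n
        i = j + 1
    return p

print(midrank_percentiles([5.0, 1.0, 5.0, 9.0]))  # -> [0.5, 0.125, 0.5, 0.875]
```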
### 5.2 Signed percentile

Convert to a signed value:
- `s_i = 2 * p_i - 1` (now `s_i` is in `[-1, +1]`)

If "high is bad" for that feature, flip it:
- `s_i := -s_i`

This gives direction and magnitude in a single number.

### 5.3 Aggregate without hand weights

To avoid hand-tuned weights, use a symmetric aggregator:
- `q_raw = mean_i(s_i)`

Optional robustness:
- clip each `s_i` to `[-0.99, 0.99]` before averaging (limits extreme leverage)
- use a trimmed mean (drop the top/bottom k% of `s_i`) if a single metric can be noisy
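Steps 5.2-5.3 as one small function (a sketch; the percentiles and the good/bad flags come from the earlier steps):

```python
def q_raw(percentiles, higher_is_better, clip=0.99):
    """Average of signed, optionally flipped, clipped percentiles (sections 5.2-5.3)."""
    s_vals = []
    for p, good in zip(percentiles, higher_is_better):
        s = 2.0 * p - 1.0                      # [0, 1] -> [-1, +1]
        if not good:
            s = -s                             # "high is bad" features get flipped
        s_vals.append(max(-clip, min(clip, s)))
    return sum(s_vals) / len(s_vals)

# Token in the 80th percentile on volume (good) and 90th on sniper share (bad):
print(q_raw([0.8, 0.9], [True, False]))  # ~ (0.6 - 0.8) / 2 = -0.1
```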
### 5.4 Optional: re-rank the aggregate (final calibration)

If you want the final `q` to be strictly comparable across time / retrains and more uniform within each bucket:
- `q = 2 * ECDF_b(q_raw) - 1`

This keeps the "relative within bucket" meaning while stabilizing scale.
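The calibration step as a sketch (ties are ignored here for brevity; with ties you would reuse the midrank rule from 5.1):

```python
def rerank(q_raw_vals):
    """q = 2 * ECDF_b(q_raw) - 1, using p = (rank - 0.5) / n within the bucket."""
    n = len(q_raw_vals)
    order = sorted(range(n), key=lambda i: q_raw_vals[i])
    q = [0.0] * n
    for rank0, i in enumerate(order):
        p = (rank0 + 0.5) / n        # (rank - 0.5) / n with 1-based rank
        q[i] = 2.0 * p - 1.0
    return q

# Whatever scale q_raw had, the output is uniform in (-1, +1) within the bucket:
print(rerank([-0.42, 0.03, 0.31, 0.77]))  # -> [-0.75, -0.25, 0.25, 0.75]
```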
## 6) Training vs inference (how it is used)

Offline labeling (training target):
1) Compute `R_max` from the full lifetime.
2) Assign return bucket `b`.
3) Compute all chosen metrics from the full lifetime.
4) Convert metrics -> signed percentiles -> `q`.

Inference (model output):
- The model only sees partial history and must predict the *final* `q` (computed above).
- The trading policy uses predicted return signals + predicted `q` to decide position sizing / risk.

## 7) Practical notes

- Use `eps` (e.g., `1e-9`) in denominators to avoid divide-by-zero.
- If a metric is missing for a token, drop it from the mean for that token (or impute with the bucket median).
- When bucket sample counts drift, prefer merging buckets rather than letting the ECDF get noisy.
- Recompute distributions on the same "source-of-truth" dataset used for training (not ad-hoc caches).

## 8) Summary

`q` is a **return-regime-relative**, **distribution-normalized**, **signed** health score:
- It is not a threshold classifier.
- It avoids raw-scale dependence and hand weighting.
- It cleanly separates "made money" (return) from "looks healthy" (quality).
data/quality_scores.jsonl
ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:484a4722dcc9d5bf4928e0926be256df7abecfe74cfbf7f75b04aeab91c2ca23
+size 11849315
log.log
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:a7e8559fa0dfc6a9356d4078d582a479a5a3cbf8a3348183b3baf336ef73db25
+size 2302
scripts/compute_quality_score.py
ADDED

@@ -0,0 +1,545 @@
import os
import sys
import json
import math
import argparse
from typing import Dict, List, Tuple

from clickhouse_driver import Client as ClickHouseClient

# Add parent directory to path so `models` is importable when run as a script
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from models.vocabulary import RETURN_THRESHOLDS

CLICKHOUSE_HOST = os.getenv("CLICKHOUSE_HOST", "localhost")
CLICKHOUSE_PORT = int(os.getenv("CLICKHOUSE_PORT", 9000))
CLICKHOUSE_USER = os.getenv("CLICKHOUSE_USER", "default")
CLICKHOUSE_PASSWORD = os.getenv("CLICKHOUSE_PASSWORD", "")
CLICKHOUSE_DATABASE = os.getenv("CLICKHOUSE_DATABASE", "default")

LAUNCH_PRICE_USD = 0.000004
EPS = 1e-9


def get_client():
    return ClickHouseClient(
        host=CLICKHOUSE_HOST,
        port=CLICKHOUSE_PORT,
        user=CLICKHOUSE_USER,
        password=CLICKHOUSE_PASSWORD,
        database=CLICKHOUSE_DATABASE,
    )


def _midrank_percentiles(items: List[Tuple[str, float]]) -> Dict[str, float]:
    """
    Compute midrank percentiles for a list of (token, value).
    Returns p in (0, 1) via (rank - 0.5) / n. Ties get the same midrank.
    """
    if not items:
        return {}
    items_sorted = sorted(items, key=lambda x: x[1])
    n = len(items_sorted)
    out = {}
    i = 0
    while i < n:
        j = i
        v = items_sorted[i][1]
        while j + 1 < n and items_sorted[j + 1][1] == v:
            j += 1
        # midrank is the average of 1-based ranks i..j
        rank_lo = i + 1
        rank_hi = j + 1
        midrank = 0.5 * (rank_lo + rank_hi)
        p = (midrank - 0.5) / n
        for k in range(i, j + 1):
            out[items_sorted[k][0]] = p
        i = j + 1
    return out


def _bucket_id(ret_val: float) -> int:
    for i in range(len(RETURN_THRESHOLDS) - 1):
        lower = RETURN_THRESHOLDS[i]
        upper = RETURN_THRESHOLDS[i + 1]
        if lower <= ret_val < upper:
            return i
    return -1


def fetch_token_metrics(client) -> List[dict]:
    """
    Fetch the lifetime metrics needed for quality scoring.
    Returns a list of dicts keyed by token_address.
    """
    query = f"""
    WITH
    trade_agg AS (
        SELECT
            base_address,
            sum(priority_fee + coin_creator_fee) AS fees_sol,
            sum(total_usd) AS volume_usd,
            count() AS n_trades,
            min(timestamp) AS t0,
            argMax(timestamp, price_usd) AS t_ath
        FROM trades
        GROUP BY base_address
    ),
    ret_agg AS (
        SELECT
            token_address,
            (argMax(ath_price_usd, updated_at) / {LAUNCH_PRICE_USD}) AS ret,
            argMax(unique_holders, updated_at) AS unique_holders
        FROM token_metrics
        GROUP BY token_address
    ),
    snipers AS (
        SELECT
            m.base_address AS token_address,
            (m.val / t.total_supply * 100) AS snipers_pct
        FROM (
            SELECT
                base_address,
                sumIf(base_amount, buyer_rank <= 70) AS val
            FROM (
                SELECT
                    base_address,
                    base_amount,
                    dense_rank() OVER (PARTITION BY base_address ORDER BY min_slot, min_idx) AS buyer_rank
                FROM (
                    SELECT
                        base_address,
                        maker,
                        min(slot) AS min_slot,
                        min(transaction_index) AS min_idx,
                        sum(base_amount) AS base_amount
                    FROM trades
                    WHERE trade_type = 0
                    GROUP BY base_address, maker
                )
            )
            GROUP BY base_address
        ) m
        JOIN (
            SELECT token_address, argMax(total_supply, updated_at) AS total_supply
            FROM tokens
            GROUP BY token_address
        ) t ON m.base_address = t.token_address
        WHERE t.total_supply > 0
    ),
    bundled AS (
        SELECT
            m.base_address AS token_address,
            (m.val / t.total_supply * 100) AS bundled_pct
        FROM (
            SELECT
                t.base_address,
                sum(t.base_amount) AS val
            FROM trades t
            JOIN (
                SELECT base_address, min(slot) AS min_slot
                FROM trades
                GROUP BY base_address
            ) m ON t.base_address = m.base_address AND t.slot = m.min_slot
            WHERE t.trade_type = 0
            GROUP BY t.base_address
        ) m
        JOIN (
            SELECT token_address, argMax(total_supply, updated_at) AS total_supply
            FROM tokens
            GROUP BY token_address
        ) t ON m.base_address = t.token_address
        WHERE t.total_supply > 0
    ),
    dev_hold AS (
        SELECT
            t.token_address AS token_address,
            (wh.current_balance / (t.total_supply / pow(10, t.decimals)) * 100) AS dev_hold_pct
        FROM (
            SELECT
                token_address,
                argMax(creator_address, updated_at) AS creator_address,
                argMax(total_supply, updated_at) AS total_supply,
                argMax(decimals, updated_at) AS decimals
            FROM tokens
            GROUP BY token_address
        ) t
        JOIN (
            SELECT mint_address, wallet_address, argMax(current_balance, updated_at) AS current_balance
            FROM wallet_holdings
            GROUP BY mint_address, wallet_address
        ) wh ON t.token_address = wh.mint_address AND t.creator_address = wh.wallet_address
        WHERE t.total_supply > 0
    ),
    insiders AS (
        SELECT
            wh.mint_address AS token_address,
            (sum(wh.current_balance) / (t.total_supply / pow(10, t.decimals)) * 100) AS insiders_pct
        FROM (
            SELECT mint_address, wallet_address, argMax(current_balance, updated_at) AS current_balance
            FROM wallet_holdings
            GROUP BY mint_address, wallet_address
        ) wh
        JOIN (
            SELECT
                wallet_address,
                argMax(total_buys_count, updated_at) AS buys,
                argMax(transfers_in_count, updated_at) AS transfers,
                argMax(spl_transfers_in_count, updated_at) AS spl_transfers
            FROM wallet_profile_metrics
            GROUP BY wallet_address
        ) wpm ON wh.wallet_address = wpm.wallet_address
        JOIN (
            SELECT token_address, argMax(total_supply, updated_at) AS total_supply, argMax(decimals, updated_at) AS decimals
            FROM tokens
            GROUP BY token_address
        ) t ON wh.mint_address = t.token_address
        WHERE wpm.buys = 0 AND (wpm.transfers > 0 OR wpm.spl_transfers > 0) AND t.total_supply > 0
        GROUP BY wh.mint_address, t.total_supply, t.decimals
    )
    SELECT
        r.token_address,
        r.ret,
        r.unique_holders,
        f.fees_sol,
        f.volume_usd,
        f.n_trades,
        (f.t_ath - f.t0) AS time_to_ath_sec,
        s.snipers_pct,
        b.bundled_pct,
        d.dev_hold_pct,
        i.insiders_pct
    FROM ret_agg r
    LEFT JOIN trade_agg f ON r.token_address = f.base_address
    LEFT JOIN snipers s ON r.token_address = s.token_address
    LEFT JOIN bundled b ON r.token_address = b.token_address
    LEFT JOIN dev_hold d ON r.token_address = d.token_address
    LEFT JOIN insiders i ON r.token_address = i.token_address
    """
    rows = client.execute(query)
    cols = [
        "token_address",
        "ret",
        "unique_holders",
        "fees_sol",
        "volume_usd",
        "n_trades",
        "time_to_ath_sec",
        "snipers_pct",
        "bundled_pct",
        "dev_hold_pct",
        "insiders_pct",
    ]
    return [dict(zip(cols, r)) for r in rows]


def _compute_quality_scores(
    client,
    max_ret: float = 10000.0,
    rerank: bool = True,
    with_debug: bool = False,
):
    data = fetch_token_metrics(client)

    # Feature spec: (name, getter, positive_when_high)
    feature_defs = [
        ("fees_log", lambda d: math.log1p(d["fees_sol"]) if d["fees_sol"] is not None else None, True),
        ("volume_log", lambda d: math.log1p(d["volume_usd"]) if d["volume_usd"] is not None else None, True),
        ("holders_log", lambda d: math.log1p(d["unique_holders"]) if d["unique_holders"] is not None else None, True),
        ("time_to_ath_log", lambda d: math.log1p(d["time_to_ath_sec"]) if d["time_to_ath_sec"] is not None else None, True),
        ("fees_per_volume", lambda d: (d["fees_sol"] / (d["volume_usd"] + EPS)) if d["fees_sol"] is not None and d["volume_usd"] is not None else None, True),
        ("fees_per_trade", lambda d: (d["fees_sol"] / (d["n_trades"] + EPS)) if d["fees_sol"] is not None and d["n_trades"] is not None else None, True),
        ("holders_per_trade", lambda d: (d["unique_holders"] / (d["n_trades"] + EPS)) if d["unique_holders"] is not None and d["n_trades"] is not None else None, True),
        ("holders_per_volume", lambda d: (d["unique_holders"] / (d["volume_usd"] + EPS)) if d["unique_holders"] is not None and d["volume_usd"] is not None else None, True),
        ("snipers_pct", lambda d: d["snipers_pct"], False),
        ("bundled_pct", lambda d: d["bundled_pct"], False),
        ("dev_hold_pct", lambda d: d["dev_hold_pct"], False),
        ("insiders_pct", lambda d: d["insiders_pct"], False),
    ]

    raw_metrics = ["snipers_pct", "bundled_pct", "dev_hold_pct", "insiders_pct"]

    debug = None
    if with_debug:
        debug = {
            "q_raw": [],
            "feature_pairs": {f[0]: [] for f in feature_defs},
            "raw_pairs": {m: [] for m in raw_metrics},
        }

    # Group tokens into return buckets
    buckets: Dict[int, List[dict]] = {}
    for d in data:
        ret_val = d.get("ret")
        if ret_val is None or ret_val <= 0 or ret_val > max_ret:
            continue
        b = _bucket_id(ret_val)
        if b == -1:
            continue
        d["bucket_id"] = b
        buckets.setdefault(b, []).append(d)

    # Compute percentiles per bucket + feature
    token_scores = []
    for b, items in buckets.items():
        # Precompute within-bucket percentiles per feature
        feature_percentiles: Dict[str, Dict[str, float]] = {}
        for fname, fget, _pos in feature_defs:
            vals = []
            for d in items:
                v = fget(d)
                if v is None or (isinstance(v, float) and (math.isnan(v) or math.isinf(v))):
                    continue
                vals.append((d["token_address"], v))
            feature_percentiles[fname] = _midrank_percentiles(vals)

        # Compute q_raw for each token
        q_raw_map = {}
        for d in items:
            s_vals = []
            s_map = {}
            for fname, _fget, pos in feature_defs:
                p = feature_percentiles[fname].get(d["token_address"])
                if p is None:
                    continue
                s = 2.0 * p - 1.0
                if not pos:
                    s = -s
                # Clip to limit the leverage of any single feature
                s = max(-0.99, min(0.99, s))
                s_vals.append(s)
                s_map[fname] = s
            if not s_vals:
                continue
            q_raw = sum(s_vals) / len(s_vals)
            q_raw_map[d["token_address"]] = q_raw
            if with_debug:
                debug["q_raw"].append(q_raw)
                for fname, s in s_map.items():
                    debug["feature_pairs"][fname].append((q_raw, s))
                for metric in raw_metrics:
                    raw_val = d.get(metric)
                    if raw_val is None:
                        continue
                    debug["raw_pairs"][metric].append((q_raw, raw_val))

        # Optional re-rank within the bucket (section 5.4 of the architecture doc)
        q_p = _midrank_percentiles(list(q_raw_map.items())) if rerank else None
        for d in items:
            t = d["token_address"]
            if t not in q_raw_map:
                continue
            token_scores.append(
                {
                    "token_address": t,
                    "bucket_id": b,
                    "ret": d["ret"],
                    "q_raw": q_raw_map[t],
                    "q": (2.0 * q_p[t] - 1.0) if rerank else q_raw_map[t],
                }
            )

    if with_debug:
        return token_scores, debug
    return token_scores


def compute_quality_scores(
    client,
    max_ret: float = 10000.0,
    rerank: bool = True,
) -> List[dict]:
    return _compute_quality_scores(client, max_ret=max_ret, rerank=rerank, with_debug=False)


def write_jsonl(path: str, rows: List[dict]) -> None:
    parent = os.path.dirname(path)
    if parent:  # guard: makedirs("") raises for bare filenames
        os.makedirs(parent, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        for r in rows:
            f.write(json.dumps(r) + "\n")


def _percentile(sorted_vals: List[float], p: float) -> float:
    """Linear-interpolation percentile over an already-sorted list."""
    if not sorted_vals:
        return float("nan")
    n = len(sorted_vals)
    if n == 1:
        return sorted_vals[0]
    pos = p * (n - 1)
    lo = int(math.floor(pos))
    hi = int(math.ceil(pos))
    if lo == hi:
        return sorted_vals[lo]
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def _summary_stats(vals: List[float]) -> Dict[str, float]:
    if not vals:
        return {}
    vals_sorted = sorted(vals)
    return {
        "mean": sum(vals_sorted) / len(vals_sorted),
        "min": vals_sorted[0],
        "max": vals_sorted[-1],
        "p10": _percentile(vals_sorted, 0.10),
        "p50": _percentile(vals_sorted, 0.50),
        "p90": _percentile(vals_sorted, 0.90),
        "p99": _percentile(vals_sorted, 0.99),
    }


def _pearson_corr(xs: List[float], ys: List[float]) -> float:
    if not xs or not ys or len(xs) != len(ys) or len(xs) < 2:
        return float("nan")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = 0.0
    den_x = 0.0
    den_y = 0.0
    for i in range(n):
        dx = xs[i] - mean_x
        dy = ys[i] - mean_y
        num += dx * dy
        den_x += dx * dx
        den_y += dy * dy
    denom = math.sqrt(den_x * den_y)
    if denom == 0.0:
        return float("nan")
    return num / denom


def _bucket_label(b: int) -> str:
    lower = RETURN_THRESHOLDS[b]
    upper = RETURN_THRESHOLDS[b + 1] if b + 1 < len(RETURN_THRESHOLDS) else None
    if upper is None:
        return f">= {lower}x"
    return f"{lower}x - {upper}x"
| 443 |
+
|
| 444 |
+
|
| 445 |
+
def print_summary(scores: List[dict]) -> None:
|
| 446 |
+
print("=== QUALITY SCORE SUMMARY ===")
|
| 447 |
+
print(f"Total tokens scored: {len(scores)}")
|
| 448 |
+
if not scores:
|
| 449 |
+
return
|
| 450 |
+
|
| 451 |
+
overall_q = [s["q"] for s in scores if "q" in s]
|
| 452 |
+
overall_q_raw = [s["q_raw"] for s in scores if "q_raw" in s]
|
| 453 |
+
for name, series in [("q", overall_q), ("q_raw", overall_q_raw)]:
|
| 454 |
+
stats = _summary_stats(series)
|
| 455 |
+
if not stats:
|
| 456 |
+
continue
|
| 457 |
+
print(f"\nOverall {name}:")
|
| 458 |
+
print(f" Mean: {stats['mean']:.4f} | Min: {stats['min']:.4f} | Max: {stats['max']:.4f}")
|
| 459 |
+
print(f" Q: p10={stats['p10']:.2f} p50={stats['p50']:.2f} p90={stats['p90']:.2f} p99={stats['p99']:.2f}")
|
| 460 |
+
|
| 461 |
+
# Per-bucket summaries
|
| 462 |
+
buckets: Dict[int, List[dict]] = {}
|
| 463 |
+
for s in scores:
|
| 464 |
+
buckets.setdefault(s["bucket_id"], []).append(s)
|
| 465 |
+
|
| 466 |
+
for b in sorted(buckets.keys()):
|
| 467 |
+
items = buckets[b]
|
| 468 |
+
q_vals = [i["q"] for i in items if "q" in i]
|
| 469 |
+
q_raw_vals = [i["q_raw"] for i in items if "q_raw" in i]
|
| 470 |
+
print(f"\nSEGMENT: {b}. {_bucket_label(b)}")
|
| 471 |
+
print(f"Tokens in segment: {len(items)}")
|
| 472 |
+
stats_q = _summary_stats(q_vals)
|
| 473 |
+
stats_q_raw = _summary_stats(q_raw_vals)
|
| 474 |
+
if stats_q:
|
| 475 |
+
print(" q:")
|
| 476 |
+
print(f" Mean: {stats_q['mean']:.4f} | Min: {stats_q['min']:.4f} | Max: {stats_q['max']:.4f}")
|
| 477 |
+
print(f" Q: p10={stats_q['p10']:.2f} p50={stats_q['p50']:.2f} p90={stats_q['p90']:.2f} p99={stats_q['p99']:.2f}")
|
| 478 |
+
if stats_q_raw:
|
| 479 |
+
print(" q_raw:")
|
| 480 |
+
print(f" Mean: {stats_q_raw['mean']:.4f} | Min: {stats_q_raw['min']:.4f} | Max: {stats_q_raw['max']:.4f}")
|
| 481 |
+
print(f" Q: p10={stats_q_raw['p10']:.2f} p50={stats_q_raw['p50']:.2f} p90={stats_q_raw['p90']:.2f} p99={stats_q_raw['p99']:.2f}")
|
| 482 |
+
|
| 483 |
+
|
| 484 |
+
def print_diagnostics(debug: dict) -> None:
|
| 485 |
+
if not debug:
|
| 486 |
+
return
|
| 487 |
+
q_raw_vals = debug.get("q_raw", [])
|
| 488 |
+
if not q_raw_vals:
|
| 489 |
+
return
|
| 490 |
+
print("\n=== QUALITY SCORE DIAGNOSTICS ===")
|
| 491 |
+
|
| 492 |
+
feature_pairs = debug.get("feature_pairs", {})
|
| 493 |
+
if feature_pairs:
|
| 494 |
+
print("Correlation with q_raw (signed features):")
|
| 495 |
+
for fname in sorted(feature_pairs.keys()):
|
| 496 |
+
pairs = feature_pairs[fname]
|
| 497 |
+
xs = [p[0] for p in pairs]
|
| 498 |
+
ys = [p[1] for p in pairs]
|
| 499 |
+
corr = _pearson_corr(xs, ys)
|
| 500 |
+
print(f" {fname}: {corr:.4f} (n={len(pairs)})")
|
| 501 |
+
|
| 502 |
+
raw_pairs = debug.get("raw_pairs", {})
|
| 503 |
+
if raw_pairs:
|
| 504 |
+
q_sorted = sorted(q_raw_vals)
|
| 505 |
+
p10 = _percentile(q_sorted, 0.10)
|
| 506 |
+
p90 = _percentile(q_sorted, 0.90)
|
| 507 |
+
print("\nTop/bottom decile raw means (by q_raw):")
|
| 508 |
+
for metric in sorted(raw_pairs.keys()):
|
| 509 |
+
pairs = raw_pairs[metric]
|
| 510 |
+
lows = [v for q, v in pairs if q <= p10]
|
| 511 |
+
highs = [v for q, v in pairs if q >= p90]
|
| 512 |
+
if not lows or not highs:
|
| 513 |
+
continue
|
| 514 |
+
low_mean = sum(lows) / len(lows)
|
| 515 |
+
high_mean = sum(highs) / len(highs)
|
| 516 |
+
print(f" {metric}: bottom_mean={low_mean:.4f} top_mean={high_mean:.4f} (n_low={len(lows)}, n_high={len(highs)})")
|
| 517 |
+
|
| 518 |
+
|
| 519 |
+
def main():
|
| 520 |
+
parser = argparse.ArgumentParser(description="Compute token quality/health score.")
|
| 521 |
+
parser.add_argument("--max-ret", type=float, default=10000.0, help="Max return to include")
|
| 522 |
+
parser.add_argument("--no-rerank", action="store_true", help="Disable final rerank within bucket")
|
| 523 |
+
parser.add_argument("--no-summary", action="store_true", help="Disable summary logging")
|
| 524 |
+
parser.add_argument("--no-diagnostics", action="store_true", help="Disable diagnostics logging")
|
| 525 |
+
args = parser.parse_args()
|
| 526 |
+
|
| 527 |
+
client = get_client()
|
| 528 |
+
if args.no_diagnostics:
|
| 529 |
+
scores = compute_quality_scores(client, max_ret=args.max_ret, rerank=not args.no_rerank)
|
| 530 |
+
debug = None
|
| 531 |
+
else:
|
| 532 |
+
scores, debug = _compute_quality_scores(
|
| 533 |
+
client,
|
| 534 |
+
max_ret=args.max_ret,
|
| 535 |
+
rerank=not args.no_rerank,
|
| 536 |
+
with_debug=True,
|
| 537 |
+
)
|
| 538 |
+
if not args.no_summary:
|
| 539 |
+
print_summary(scores)
|
| 540 |
+
if not args.no_diagnostics:
|
| 541 |
+
print_diagnostics(debug)
|
| 542 |
+
|
| 543 |
+
|
| 544 |
+
if __name__ == "__main__":
|
| 545 |
+
main()
|