upload cortexa-marketing-feedback v1
- README.md +66 -0
- config.json +20 -0
- student_int8.onnx +3 -0
- tokenizer.json +120 -0
README.md
ADDED
@@ -0,0 +1,66 @@
---
language:
- en
license: other
license_name: pleius-internal
tags:
- onnx
- conditional-text-generation
- ad-feedback
- distillation
- creator-tools
---

# cortexa-marketing-feedback (distilled student)

A ~4.4M-parameter conditional decoder distilled from
`M725/cortexa-marketing-scorer` outputs. It takes CLIP-ViT-B/32 vision
features (768-d) plus the 4 Marketing pillar scores (or a "no-scores"
sentinel in fast mode) and emits a creator-vernacular phrase chain:

```
"scroll stopping | clear cta | thumb stopping"
"forgettable | looks clean | low contrast text"
"lazy design | model looks fake | low contrast"
```

The student produces the *feedback callout* shown on the result
screen for paid users: plain-language pros and cons that sit alongside
the scorer's numeric output.

## Files

| file | purpose |
|---|---|
| `student_int8.onnx` | TinyTransformer decoder, 4 layers / 256-dim / 4 heads, INT8 dynamic-quantized. 6.9 MiB (7,226,461 bytes). |
| `tokenizer.json` | Whole-phrase tokenizer (115 entries, including the specials `<pad>`, `<bos>`, `<eos>`, `<sep>`). |
| `config.json` | Encoder dim, pillar names, vocab size, special-token ids. Read by the TS/JS runtime to shape inputs. |

## Inference shape

```
inputs:
  encoder_feats   (1, 768)   float32  # mean-pooled CLIP-ViT-B/32 vision output
  scores          (1, 4)     float32  # [universal_appeal, demographic_appeal, audience_drive, engagement] in [0,1]
  scores_present  (1,)       float32  # 1.0 anchored, 0.0 fast-mode
  input_ids       (1, T)     int64    # decoder context
outputs:
  logits          (1, T, V)  float32
```
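
A minimal sketch of how a caller might assemble these four inputs for a batch of one. The input names and shapes come from the listing above; the helper name `build_feed`, the use of plain nested lists, and zero-filling `scores` in fast mode are assumptions made for illustration, since this README only states that `scores_present` is 0.0 in that mode.

```python
from typing import Mapping, Optional, Sequence

ENCODER_DIM = 768  # encoder_dim in config.json
PILLARS = ["universal_appeal", "demographic_appeal", "audience_drive", "engagement"]
BOS_ID = 1  # bos_id in config.json


def build_feed(encoder_feats: Sequence[float],
               scores: Optional[Mapping[str, float]] = None,
               context_ids: Sequence[int] = (BOS_ID,)) -> dict:
    """Return nested lists shaped like the four model inputs (batch size 1)."""
    if len(encoder_feats) != ENCODER_DIM:
        raise ValueError(f"expected {ENCODER_DIM} features, got {len(encoder_feats)}")
    if scores is None:
        # Fast mode. Zero-filling the scores is an assumption, not a documented
        # sentinel value; only scores_present = 0.0 is stated in the README.
        score_row, present = [0.0] * len(PILLARS), 0.0
    else:
        # Anchored mode: order the scores by the pillar order in config.json.
        score_row = [float(scores[name]) for name in PILLARS]
        if not all(0.0 <= s <= 1.0 for s in score_row):
            raise ValueError("pillar scores must lie in [0, 1]")
        present = 1.0
    return {
        "encoder_feats": [[float(x) for x in encoder_feats]],  # (1, 768) float32
        "scores": [score_row],                                 # (1, 4)   float32
        "scores_present": [present],                           # (1,)     float32
        "input_ids": [list(context_ids)],                      # (1, T)   int64
    }
```

Each entry would still need converting to an array of the listed dtype before being handed to an ONNX runtime.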

Greedy decoding works, but **temperature 0.8 + top-k 20 + SEP-veto** is the
recommended sampling config when running on more than one input: it
prevents the "forgettable | forgettable | forgettable" collapse that
greedy decoding produced with the v0 model.
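
The sampling step could look roughly like the sketch below, which applies temperature scaling and top-k filtering to the logits of the last position. This README does not define "SEP-veto"; the version here (never emit `<sep>` directly after `<sep>`) is a guess, and the function name `sample_next` is hypothetical.

```python
import math
import random

SEP_ID = 3  # sep_id in config.json


def sample_next(logits, prev_id, temperature=0.8, top_k=20, rng=random):
    """Sample one token id from a list of logits for the last position."""
    scaled = [x / temperature for x in logits]
    # Assumed meaning of "SEP-veto": block <sep> right after another <sep>.
    if prev_id == SEP_ID:
        scaled[SEP_ID] = float("-inf")
    # Keep only the top_k highest-scoring token ids.
    keep = sorted(range(len(scaled)), key=scaled.__getitem__, reverse=True)[:top_k]
    # Softmax over the kept ids, shifted by the max for numerical stability.
    peak = scaled[keep[0]]
    weights = [math.exp(scaled[i] - peak) for i in keep]
    return rng.choices(keep, weights=weights, k=1)[0]
```

Setting `top_k=1` reduces this to greedy decoding, which makes it easy to compare the two modes on the same input.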

## Training

15k phrase triples from 5k COCO photos. Each photo was scored locally
against the cortexa_v10 head, and the phrase chains were generated by
`research.distill_adjectives.phrase_rules.scores_to_phrase`. Trained for
12 epochs with AdamW and a cosine schedule; validation loss fell from
2.31 to 1.87. See
`research/distill_students/train_marketing.py` in the app repo.

## License

Pleius internal. See https://pleius.com. Not for redistribution.
config.json
ADDED
@@ -0,0 +1,20 @@
{
  "modality": "marketing",
  "encoder": "openai/clip-vit-base-patch32",
  "encoder_dim": 768,
  "n_pillars": 4,
  "pillars": [
    "universal_appeal",
    "demographic_appeal",
    "audience_drive",
    "engagement"
  ],
  "d_model": 256,
  "n_layers": 4,
  "max_seq_len": 16,
  "vocab_size": 115,
  "bos_id": 1,
  "eos_id": 2,
  "pad_id": 0,
  "sep_id": 3
}
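
Because a runtime reads this file to shape its inputs, a loader may want to check it for internal consistency first. The sketch below is one way to do that; the function name `check_config` is hypothetical, and the 7-token minimum assumes a phrase triple is laid out as `<bos>`, three phrases joined by two `<sep>` tokens, then `<eos>`, which the commit does not spell out.

```python
import json

CONFIG_JSON = """
{
  "modality": "marketing",
  "encoder": "openai/clip-vit-base-patch32",
  "encoder_dim": 768,
  "n_pillars": 4,
  "pillars": ["universal_appeal", "demographic_appeal", "audience_drive", "engagement"],
  "d_model": 256,
  "n_layers": 4,
  "max_seq_len": 16,
  "vocab_size": 115,
  "bos_id": 1,
  "eos_id": 2,
  "pad_id": 0,
  "sep_id": 3
}
"""


def check_config(cfg):
    """Return a list of problems; an empty list means the config is consistent."""
    problems = []
    if cfg["n_pillars"] != len(cfg["pillars"]):
        problems.append("n_pillars does not match len(pillars)")
    special = [cfg[k] for k in ("pad_id", "bos_id", "eos_id", "sep_id")]
    if len(set(special)) != len(special):
        problems.append("special-token ids are not unique")
    if any(not 0 <= i < cfg["vocab_size"] for i in special):
        problems.append("special-token id outside the vocabulary")
    # Assumed layout: <bos> + 3 phrases + 2 <sep> + <eos> = 7 tokens.
    if cfg["max_seq_len"] < 7:
        problems.append("max_seq_len too short for a phrase triple")
    return problems


cfg = json.loads(CONFIG_JSON)
print(check_config(cfg))  # prints [] for the config above
```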
student_int8.onnx
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c2cb29dbdbdd5431d927b52a0d92fb6208085958e787cc14d8755a50c1eaed04
size 7226461
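
The three lines above are a Git LFS pointer, not the model weights. As a small sketch, the pointer can be parsed to recover the byte size and confirm that it matches the 6.9 figure quoted in the README when read as MiB; the helper name `parse_lfs_pointer` is hypothetical.

```python
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c2cb29dbdbdd5431d927b52a0d92fb6208085958e787cc14d8755a50c1eaed04
size 7226461
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash and size fields."""
    # Each non-empty line is "<key> <value>".
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
size_mib = info["size_bytes"] / 2**20
print(f"{info['size_bytes']} bytes = {size_mib:.1f} MiB")  # 7226461 bytes = 6.9 MiB
```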
tokenizer.json
ADDED
@@ -0,0 +1,120 @@
{
  "modality": "marketing",
  "tokens": [
    "<pad>",
    "<bos>",
    "<eos>",
    "<sep>",
    "eye catching",
    "scroll stopping",
    "thumb stopping",
    "pops on feed",
    "stops the scroll",
    "bold colors",
    "good contrast",
    "clean composition",
    "strong focal point",
    "good lighting",
    "well lit",
    "well framed",
    "color works",
    "color palette slaps",
    "vibe is right",
    "looks premium",
    "looks expensive",
    "feels intentional",
    "too busy",
    "blurry",
    "low contrast",
    "no clear focus",
    "bad lighting",
    "cluttered",
    "looks dated",
    "looks like 2014",
    "uncanny",
    "ai generated feel",
    "weird crop",
    "off center weird",
    "background too loud",
    "colors clash",
    "low effort",
    "lazy design",
    "no vibe",
    "looks like clip art",
    "on brand",
    "feels native to the platform",
    "looks like a real photo",
    "model looks natural",
    "feels like a real creator",
    "talks to the right person",
    "knows the audience",
    "feels organic",
    "doesn't feel like an ad",
    "lands for the target",
    "the right energy",
    "right vibe for the audience",
    "off brand",
    "screams ad",
    "looks like an ad",
    "stock photo feel",
    "feels like a stock photo",
    "wrong audience",
    "wrong tone",
    "feels generic",
    "feels templated",
    "model looks fake",
    "wrong vibe",
    "doesn't fit the platform",
    "clear cta",
    "the offer pops",
    "price tag clear",
    "deal feels real",
    "social proof shows",
    "you'd actually click",
    "you know what they sell",
    "product is the hero",
    "hero shot works",
    "instantly readable",
    "headline lands",
    "headline sells it",
    "weak cta",
    "no offer",
    "offer is unclear",
    "can't read the cta",
    "where's the product",
    "what are they selling",
    "headline buried",
    "headline doesn't sell",
    "no reason to click",
    "small text",
    "small cta",
    "the ask is buried",
    "clear product shot",
    "clear text",
    "memorable",
    "saveable",
    "shareable",
    "you'd save this",
    "you'd send this to a friend",
    "feels native",
    "screams brand",
    "logo placement good",
    "logo readable",
    "text hierarchy clean",
    "tight crop",
    "negative space works",
    "too much text",
    "wall of text",
    "cluttered text",
    "no hierarchy",
    "logo is huge",
    "logo is invisible",
    "you'd scroll past",
    "forgettable",
    "boring",
    "low contrast text",
    "text overlaps the product",
    "background fights the product",
    "looks clean"
  ]
}
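
Since every vocabulary entry is a whole phrase, decoding is little more than a lookup. The sketch below turns a sequence of ids back into a phrase chain. It assumes that a token's id is its position in the `tokens` array (consistent with the special-token ids in `config.json`) and that `<sep>` is rendered as " | "; the name `decode_chain` and the three-entry lookup table are illustrative only.

```python
PAD_ID, BOS_ID, EOS_ID, SEP_ID = 0, 1, 2, 3  # from config.json

# A small subset of the vocabulary, keyed by position in the "tokens" array.
# Treating array position as the token id is an assumption.
ID_TO_PHRASE = {5: "scroll stopping", 6: "thumb stopping", 64: "clear cta"}


def decode_chain(ids, id_to_phrase=ID_TO_PHRASE):
    """Turn decoder output ids into a ' | '-separated phrase chain."""
    phrases = []
    for token_id in ids:
        if token_id == EOS_ID:
            break  # everything after <eos> is ignored
        if token_id in (PAD_ID, BOS_ID, SEP_ID):
            continue  # structural tokens carry no text
        phrases.append(id_to_phrase[token_id])
    return " | ".join(phrases)


print(decode_chain([BOS_ID, 5, SEP_ID, 64, SEP_ID, 6, EOS_ID]))
# prints: scroll stopping | clear cta | thumb stopping
```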