M725 committed
Commit 339338b · verified · 1 Parent(s): 76cc921

upload cortexa-marketing-feedback v1

Files changed (4)
  1. README.md +66 -0
  2. config.json +20 -0
  3. student_int8.onnx +3 -0
  4. tokenizer.json +120 -0
README.md ADDED
@@ -0,0 +1,66 @@
+ ---
+ language:
+ - en
+ license: other
+ license_name: pleius-internal
+ tags:
+ - onnx
+ - conditional-text-generation
+ - ad-feedback
+ - distillation
+ - creator-tools
+ ---
+
+ # cortexa-marketing-feedback (distilled student)
+
+ A ~4.4M-parameter conditional decoder distilled from
+ `M725/cortexa-marketing-scorer` outputs. It takes CLIP-ViT-B/32 vision
+ features (768-d) plus the four marketing pillar scores (or a "no-scores"
+ sentinel for fast mode) and emits a creator-vernacular phrase chain:
+
+ ```
+ "scroll stopping | clear cta | thumb stopping"
+ "forgettable | looks clean | low contrast text"
+ "lazy design | model looks fake | low contrast"
+ ```
+
+ The student produces the *feedback callout* shown on the result
+ screen for paid users: plain-language pros and cons that accompany
+ the scorer's numeric output.
+
+ ## Files
+
+ | file | purpose |
+ |---|---|
+ | `student_int8.onnx` | TinyTransformer decoder, 4 layers / 256-dim / 4 heads, INT8 dynamic-quantized. 6.9 MiB (7,226,461 bytes). |
+ | `tokenizer.json` | Whole-phrase tokenizer (vocab 115, including the specials `<pad>`, `<bos>`, `<eos>`, `<sep>`). |
+ | `config.json` | Encoder dim, pillar names, vocab size, and special-token ids; read by the TS/JS runtime to shape inputs. |
+
+ ## Inference shape
+
+ ```
+ inputs:
+   encoder_feats   (1, 768)   float32  # mean-pooled CLIP-ViT-B/32 vision output
+   scores          (1, 4)     float32  # [universal_appeal, demographic_appeal, audience_drive, engagement] in [0,1]
+   scores_present  (1,)       float32  # 1.0 anchored, 0.0 fast-mode
+   input_ids       (1, T)     int64    # decoder context
+ outputs:
+   logits          (1, T, V)  float32
+ ```
+
+ Greedy decoding works, but **temperature 0.8 + top-k 20 + SEP-veto** is
+ the recommended sampling config when running on more than one input:
+ it prevents the greedy "forgettable | forgettable | forgettable"
+ collapse that the v0 model exhibited.
+
+ ## Training
+
+ 15k phrase triples from 5k COCO photos. Each photo was scored locally
+ against the cortexa_v10 head, and the phrase chains were generated by
+ `research.distill_adjectives.phrase_rules.scores_to_phrase`. Trained for
+ 12 epochs with AdamW and a cosine schedule. Val loss: 2.31 → 1.87. See
+ `research/distill_students/train_marketing.py` in the app repo.
+
+ ## License
+
+ Pleius internal — see https://pleius.com. Not for redistribution.
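
The sampling recipe named in the README's "Inference shape" section (temperature 0.8, top-k 20, SEP-veto) can be sketched as below. This is a minimal standard-library sketch under stated assumptions, not the app's runtime: `sample_next`, `generate`, and `step_fn` are hypothetical names, and the SEP-veto rule shown (never emit `<sep>` directly after `<bos>` or another `<sep>`) is only one assumed reading of a term the README does not define.

```python
import math
import random

# Special-token ids and sequence limit copied from config.json in this commit.
BOS_ID, EOS_ID, SEP_ID, MAX_SEQ_LEN = 1, 2, 3, 16


def sample_next(logits, prev_id, rng, temperature=0.8, top_k=20):
    """Sample one token id from the logits of the last decoder position."""
    scores = list(logits)
    # ASSUMPTION: "SEP-veto" is read here as "no <sep> right after <bos> or
    # <sep>", which rules out empty phrases. The README does not define it.
    if prev_id in (BOS_ID, SEP_ID):
        scores[SEP_ID] = -math.inf
    # Top-k: keep only the k highest-scoring token ids.
    k = min(top_k, len(scores))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Temperature softmax over the survivors, shifted by the max for stability.
    peak = scores[top[0]]
    weights = [math.exp((scores[i] - peak) / temperature) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


def generate(step_fn, rng):
    """Autoregressive loop. `step_fn(ids)` is a hypothetical stand-in for one
    model call that returns the last-position logits as a list of floats."""
    ids = [BOS_ID]
    while len(ids) < MAX_SEQ_LEN:
        next_id = sample_next(step_fn(ids), ids[-1], rng)
        if next_id == EOS_ID:
            break
        ids.append(next_id)
    return ids
```

Passing a seeded `random.Random` as `rng` keeps runs reproducible. In a real runtime, `step_fn` would wrap the ONNX session call and slice out the last row of `logits`.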
config.json ADDED
@@ -0,0 +1,20 @@
+ {
+   "modality": "marketing",
+   "encoder": "openai/clip-vit-base-patch32",
+   "encoder_dim": 768,
+   "n_pillars": 4,
+   "pillars": [
+     "universal_appeal",
+     "demographic_appeal",
+     "audience_drive",
+     "engagement"
+   ],
+   "d_model": 256,
+   "n_layers": 4,
+   "max_seq_len": 16,
+   "vocab_size": 115,
+   "bos_id": 1,
+   "eos_id": 2,
+   "pad_id": 0,
+   "sep_id": 3
+ }
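
The README says this config is read by the runtime to shape inputs. The sketch below shows one way that could look in Python, building the four documented inputs as nested lists. `build_feed` is a hypothetical helper, and two points are assumptions rather than documented behaviour: that fast mode sends all-zero scores alongside `scores_present = 0.0`, and that `max_seq_len` bounds the decoder context length `T`.

```python
import json

# Copy of config.json as added in this commit.
CONFIG = json.loads("""
{
  "modality": "marketing",
  "encoder": "openai/clip-vit-base-patch32",
  "encoder_dim": 768,
  "n_pillars": 4,
  "pillars": ["universal_appeal", "demographic_appeal",
              "audience_drive", "engagement"],
  "d_model": 256, "n_layers": 4, "max_seq_len": 16, "vocab_size": 115,
  "bos_id": 1, "eos_id": 2, "pad_id": 0, "sep_id": 3
}
""")


def build_feed(config, encoder_feats, scores=None, input_ids=None):
    """Return the four model inputs as nested lists with the documented shapes."""
    if len(encoder_feats) != config["encoder_dim"]:
        raise ValueError("encoder_feats must have encoder_dim entries")
    n = config["n_pillars"]
    if scores is None:
        # ASSUMPTION: fast mode sends zeros and relies on scores_present = 0.0;
        # the actual sentinel values are not documented.
        row, present = [0.0] * n, 0.0
    else:
        if len(scores) != n or not all(0.0 <= s <= 1.0 for s in scores):
            raise ValueError("expected one score in [0, 1] per pillar")
        row, present = [float(s) for s in scores], 1.0
    ids = [config["bos_id"]] if input_ids is None else list(input_ids)
    # ASSUMPTION: max_seq_len is an upper bound on the context length T.
    if not 1 <= len(ids) <= config["max_seq_len"]:
        raise ValueError("input_ids length must be in [1, max_seq_len]")
    return {
        "encoder_feats": [[float(x) for x in encoder_feats]],  # (1, 768)
        "scores": [row],                                        # (1, 4)
        "scores_present": [present],                            # (1,)
        "input_ids": [ids],                                     # (1, T)
    }
```

Before being handed to an ONNX runtime, these lists would still need converting to float32 arrays (int64 for `input_ids`).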
student_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c2cb29dbdbdd5431d927b52a0d92fb6208085958e787cc14d8755a50c1eaed04
+ size 7226461
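
The three lines above are a Git LFS pointer, not the model itself: they record the SHA-256 digest and byte size of the real `student_int8.onnx`. A downloaded copy can be checked against the pointer roughly as follows. This is a sketch using only `hashlib`; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if `data` (bytes) has the size and digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError("only sha256 oids are handled in this sketch")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )
```

For a file on disk, reading it with `open(path, "rb").read()` and passing the bytes to `matches_pointer` is enough at this file size (about 7 MB).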
tokenizer.json ADDED
@@ -0,0 +1,120 @@
+ {
+   "modality": "marketing",
+   "tokens": [
+     "<pad>",
+     "<bos>",
+     "<eos>",
+     "<sep>",
+     "eye catching",
+     "scroll stopping",
+     "thumb stopping",
+     "pops on feed",
+     "stops the scroll",
+     "bold colors",
+     "good contrast",
+     "clean composition",
+     "strong focal point",
+     "good lighting",
+     "well lit",
+     "well framed",
+     "color works",
+     "color palette slaps",
+     "vibe is right",
+     "looks premium",
+     "looks expensive",
+     "feels intentional",
+     "too busy",
+     "blurry",
+     "low contrast",
+     "no clear focus",
+     "bad lighting",
+     "cluttered",
+     "looks dated",
+     "looks like 2014",
+     "uncanny",
+     "ai generated feel",
+     "weird crop",
+     "off center weird",
+     "background too loud",
+     "colors clash",
+     "low effort",
+     "lazy design",
+     "no vibe",
+     "looks like clip art",
+     "on brand",
+     "feels native to the platform",
+     "looks like a real photo",
+     "model looks natural",
+     "feels like a real creator",
+     "talks to the right person",
+     "knows the audience",
+     "feels organic",
+     "doesn't feel like an ad",
+     "lands for the target",
+     "the right energy",
+     "right vibe for the audience",
+     "off brand",
+     "screams ad",
+     "looks like an ad",
+     "stock photo feel",
+     "feels like a stock photo",
+     "wrong audience",
+     "wrong tone",
+     "feels generic",
+     "feels templated",
+     "model looks fake",
+     "wrong vibe",
+     "doesn't fit the platform",
+     "clear cta",
+     "the offer pops",
+     "price tag clear",
+     "deal feels real",
+     "social proof shows",
+     "you'd actually click",
+     "you know what they sell",
+     "product is the hero",
+     "hero shot works",
+     "instantly readable",
+     "headline lands",
+     "headline sells it",
+     "weak cta",
+     "no offer",
+     "offer is unclear",
+     "can't read the cta",
+     "where's the product",
+     "what are they selling",
+     "headline buried",
+     "headline doesn't sell",
+     "no reason to click",
+     "small text",
+     "small cta",
+     "the ask is buried",
+     "clear product shot",
+     "clear text",
+     "memorable",
+     "saveable",
+     "shareable",
+     "you'd save this",
+     "you'd send this to a friend",
+     "feels native",
+     "screams brand",
+     "logo placement good",
+     "logo readable",
+     "text hierarchy clean",
+     "tight crop",
+     "negative space works",
+     "too much text",
+     "wall of text",
+     "cluttered text",
+     "no hierarchy",
+     "logo is huge",
+     "logo is invisible",
+     "you'd scroll past",
+     "forgettable",
+     "boring",
+     "low contrast text",
+     "text overlaps the product",
+     "background fights the product",
+     "looks clean"
+   ]
+ }
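
Because every vocabulary entry is a whole phrase, turning model output back into the README's `a | b | c` chain is a lookup plus a join. The sketch below assumes a token's id is its index in the `tokens` array (which agrees with the `pad_id`/`bos_id`/`eos_id`/`sep_id` values in `config.json`) and that `<sep>` is rendered as ` | `. `decode` is a hypothetical helper, and `TOKENS` holds only the first seven entries of the real list to keep the example short.

```python
# First seven entries of tokenizer.json's "tokens" array (truncated on purpose).
TOKENS = [
    "<pad>", "<bos>", "<eos>", "<sep>",
    "eye catching", "scroll stopping", "thumb stopping",
]


def decode(ids, tokens, pad_id=0, bos_id=1, eos_id=2, sep_id=3):
    """Map token ids to a ' | '-separated phrase chain.

    Stops at the first <eos>; <pad>, <bos>, and <sep> contribute no text.
    Since each remaining token is already a full phrase, joining with
    ' | ' reproduces the separators.
    """
    phrases = []
    for token_id in ids:
        if token_id == eos_id:
            break
        if token_id in (pad_id, bos_id, sep_id):
            continue
        phrases.append(tokens[token_id])
    return " | ".join(phrases)


print(decode([1, 5, 3, 6, 2, 0, 0], TOKENS))  # prints: scroll stopping | thumb stopping
```

In practice `TOKENS` would be loaded from `tokenizer.json` with `json.load(...)["tokens"]` rather than hard-coded.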