---
language:
- en
license: other
license_name: pleius-internal
tags:
- onnx
- conditional-text-generation
- video-feedback
- distillation
- creator-tools
---

# cortexa-create-feedback (distilled student)

A ~4.4M-parameter conditional decoder distilled from `M725/cortexa-create-scorer` outputs. Takes CLIP-ViT-B/32 vision features (mean-pooled across video frames, 768-d) + the 5 Create pillar scores and emits a creator-vernacular phrase chain about the short-form video:

```
"first frame slaps | feels intentional"
"thumb stopping | shareable"
"filler | feels rushed | first frame is nothing"
"feels off beat | slow open | no payoff"
```

## Files

| file | purpose |
|---|---|
| `student_int8.onnx` | TinyTransformer decoder, 4 layers / 256-dim / 4 heads, INT8 dynamic-quantized. 6.9 MB. |
| `tokenizer.json` | Whole-phrase tokenizer (vocab ~138; four special tokens). |
| `config.json` | Encoder dim, pillar names, vocab size, special-token ids. |

## Inference shape

```
inputs:
  encoder_feats   (1, 768)   float32  # mean-pooled CLIP-ViT-B/32 vision across frames
  scores          (1, 5)     float32  # [hook, hold, algorithmic_fit, brand_lift, overall] in [0,1]
  scores_present  (1,)       float32  # 1.0 anchored, 0.0 fast-mode
  input_ids       (1, T)     int64
outputs:
  logits          (1, T, V)  float32
```

Same sampling recommendation as `cortexa-marketing-feedback`: temperature 0.8 + top-k 20 + SEP-veto.

## Training

6k phrase triples from 3 real short-form videos (`public/create-tutorial/*.mp4`) + 1997 synthetic "videos" built by random-crop + color jitter over COCO stills (each frame goes through cortexa_v10 separately, so the per-frame curve has real variation). 15 epochs. Val loss 2.39 → 1.97. See `research/distill_students/train_create.py` in the app repo.

## License

Pleius internal; see https://pleius.com. Not for redistribution.
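
## Sampling sketch

The sampling recommendation under "Inference shape" (temperature 0.8, top-k 20, plus a veto) can be sketched in plain Python. This is a minimal illustration, not the code used in the app repo. The names `sample_next_token`, `generate`, `next_logits`, `start_id`, `stop_id`, and `veto` are all hypothetical. The exact SEP-veto rule is not specified in this card, so the sketch only exposes a generic hook that removes caller-chosen token ids before sampling.

```python
import math
import random


def sample_next_token(logits, temperature=0.8, top_k=20, banned_ids=(), rng=random):
    """Sample one token id from a 1-D sequence of logits.

    banned_ids are dropped before sampling. This is a generic veto hook;
    which ids to veto at which step (e.g. the separator) is an assumption
    left to the caller.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    banned = set(banned_ids)
    # Keep (id, logit) pairs for every token that is not vetoed.
    candidates = [(i, float(x)) for i, x in enumerate(logits) if i not in banned]
    if not candidates:
        raise ValueError("all tokens were vetoed")
    # Top-k: keep only the k highest-logit candidates.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    candidates = candidates[:top_k]
    # Temperature-scaled softmax weights, shifted by the max for stability.
    max_logit = candidates[0][1]
    weights = [math.exp((x - max_logit) / temperature) for _, x in candidates]
    ids = [i for i, _ in candidates]
    return rng.choices(ids, weights=weights, k=1)[0]


def generate(next_logits, start_id, stop_id, max_len=16, veto=lambda ids: (), rng=random):
    """Autoregressive loop around sample_next_token.

    next_logits(ids) is a hypothetical callable returning the logits for the
    last position; start_id and stop_id stand in for the special-token ids
    stored in config.json.
    """
    ids = [start_id]
    for _ in range(max_len):
        token = sample_next_token(next_logits(ids), banned_ids=veto(ids), rng=rng)
        if token == stop_id:
            break
        ids.append(token)
    return ids[1:]  # drop the start token
```

In practice, `next_logits` would presumably run `student_int8.onnx` (for example through an `onnxruntime.InferenceSession`) on `encoder_feats`, `scores`, `scores_present`, and the current `input_ids`, then return the last row of `logits`, i.e. `logits[0, -1]`. The `veto` callable is where a SEP-veto rule matching `cortexa-marketing-feedback` would plug in.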