# workflow_notes.md
[AGENTARIUM_ASSET]
Name: Viral Muse – Workflow Notes (Implementation)
Version: v1.0
Status: Draft
## Goal
Implement Viral Muse as a **dataset-driven** agent using:
- system prompt + reasoning template + personality fingerprint
- guardrails
- RAG over the 6 CSV datasets
- optional “knowledge map” layer for cross-dataset linking
- memory: user profile + project workspace
This guide assumes an orchestration runtime like **n8n**, but the logic applies to LangChain, Flowise, Dify, etc.
---
## 0) Folder sanity check (Agentarium v1)
You should have:
- `/core/` (system_prompt.md, reasoning_template.md, personality_fingerprint.md)
- `/datasets/` (6 CSVs)
- `/guardrails/guardrails.md`
- `/memory_schemas/` (2 CSV schemas + memory_rules.md)
- `/docs/` (this file + readme + use cases)
If you add a knowledge map later, put it in:
- `/datasets/knowledge_map.csv` (recommended) or `/datasets/master_grid.csv`
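
If you want to confirm the layout programmatically, a small check script works. A minimal sketch, assuming it runs from the asset root and uses Node's `fs`:

```ts
// check_layout.ts: verify the expected Agentarium v1 paths (sketch; adjust ROOT to your setup)
import { existsSync } from "fs";
import { join } from "path";

const ROOT = "."; // assumption: run from the asset root

const expected = [
  "core/system_prompt.md",
  "core/reasoning_template.md",
  "core/personality_fingerprint.md",
  "datasets",                        // should contain the 6 CSVs
  "guardrails/guardrails.md",
  "memory_schemas/memory_rules.md",
  "docs",
];

for (const rel of expected) {
  const ok = existsSync(join(ROOT, rel));
  console.log(`${ok ? "OK     " : "MISSING"}  ${rel}`);
}
```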
---
## 1) Implement the core behavior files
### 1.1 System Prompt
- Paste `/core/system_prompt.md` into your agent’s **system** message.
- This defines the agent’s role: pattern-first creative partner.
### 1.2 Reasoning Template
- Store `/core/reasoning_template.md` as internal guidance in your runtime (developer message / hidden instruction / “policy doc”).
- Your runtime should prepend it before each completion (or inject as a “rules” section).
### 1.3 Personality Fingerprint
- Add `/core/personality_fingerprint.md` as a style constraint layer.
- Use it to keep tone consistent: compact, direct, pattern-oriented.
**Result:** the model behaves consistently even before RAG.
---
## 2) Apply guardrails
- Load `/guardrails/guardrails.md` as a rules block.
- Enforce:
- no plagiarism / no “copy this hit song” behavior
- no made-up dataset facts
- no unsafe content requests
- outputs should be structured and testable
In n8n, you typically inject guardrails as part of prompt assembly (before the user message).
---
## 3) Prepare datasets for RAG
You have 6 CSV datasets in `/datasets/`.
Best practice is to convert each row into a **retrieval document** with:
- `source_dataset`
- `row_id`
- key fields
- a short “row summary” string for embeddings
### 3.1 Minimal row-to-document format (recommended)
For each CSV row, create a text payload like:
- Title line: `[DATASET=lyric_structure_map | id=LSM_012]`
- Then: `field=value` lines (only the meaningful ones)
- Then: a compact 1–2 sentence row summary
This makes retrieval clean and avoids embedding empty columns.
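
A sketch of that conversion; the column handling is illustrative, so adapt it to your actual CSV headers:

```ts
// rowToDocument.ts: turn one parsed CSV row into a retrieval document (sketch)
type CsvRow = Record<string, string>;

function rowToDocument(dataset: string, row: CsvRow, summary: string): string {
  const id = row["id"] ?? "UNKNOWN_ID"; // assumption: each CSV has an `id` column
  const lines = [`[DATASET=${dataset} | id=${id}]`];

  // keep only meaningful, non-empty fields; the id is already in the title line
  for (const [field, value] of Object.entries(row)) {
    if (field !== "id" && value && value.trim() !== "") {
      lines.push(`${field}=${value.trim()}`);
    }
  }

  lines.push(summary); // compact 1–2 sentence row summary
  return lines.join("\n");
}
```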
---
## 4) Upsert into a Vector DB (VDB)
You can use Pinecone, Qdrant, Weaviate, Chroma, FAISS — anything that supports:
- embeddings vector
- metadata filters
- similarity search
### 4.1 What to store per vector
**Vector record**
- `id`: stable id (ex: `lyric_structure_map:LSM_012`)
- `text`: the row-to-document payload
- `metadata`:
- `dataset` (one of the 6)
- tags / genre / pattern_type (if available)
- any fields you want to filter by
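
As a sketch, the record shape looks like this (field names are illustrative, not any specific VDB's API):

```ts
// vectorRecord.ts: the shape of one upserted record (sketch)
interface VectorRecord {
  id: string;                    // e.g. "lyric_structure_map:LSM_012"
  text: string;                  // the row-to-document payload
  metadata: {
    dataset: string;             // one of the 6 dataset names, used for filtered retrieval
    tags?: string[];
    genre?: string;
    pattern_type?: string;       // add any other fields you want to filter by
  };
}

// illustrative record; in practice the values come from a real dataset row
const example: VectorRecord = {
  id: "lyric_structure_map:LSM_012",
  text: "[DATASET=lyric_structure_map | id=LSM_012]\nsection=prechorus\n...",
  metadata: { dataset: "lyric_structure_map", pattern_type: "prechorus_lift" },
};
```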
### 4.2 n8n implementation (practical steps)
1) **Read file(s)**
- Node: “Read Binary File” (or fetch from GitHub / Drive)
2) **Parse CSV**
- Node: “Spreadsheet File” → Convert to JSON (or CSV Parse)
3) **Normalize rows**
- Node: “Function” (build `id`, `text`, `metadata`; see the sketch below)
4) **Create embeddings**
- Node: “OpenAI” → Embeddings (or any embedding provider)
5) **Upsert to VDB**
- Pinecone/Qdrant/Weaviate via:
- native node if available, OR
- “HTTP Request” node to the VDB REST API
6) **Verify**
- Run a test query and confirm you retrieve relevant rows.
**Tip:** store dataset name in metadata so you can filter retrieval per task:
- “only TikTok formats” → filter dataset=`tiktok_concept_patterns`
- “structure help” → dataset=`lyric_structure_map`
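
A minimal sketch of the normalize step (3) as the body of an n8n Code node, assuming the node runs once for all items and each incoming item is one parsed CSV row; the hard-coded dataset name is a placeholder you would normally set per run or read from a prior node:

```js
// n8n Code node (JavaScript): normalize parsed CSV rows into upsert-ready items (sketch)
const dataset = "lyric_structure_map"; // assumption: set per run or from a prior node

return $input.all().map((item) => {
  const row = item.json;
  const rowId = row.id ?? "UNKNOWN_ID";

  // build field=value lines for non-empty columns
  const fields = Object.entries(row)
    .filter(([k, v]) => k !== "id" && v != null && String(v).trim() !== "")
    .map(([k, v]) => `${k}=${String(v).trim()}`);

  return {
    json: {
      id: `${dataset}:${rowId}`,
      text: [`[DATASET=${dataset} | id=${rowId}]`, ...fields].join("\n"),
      metadata: { dataset }, // keeps per-dataset filtering possible at retrieval time
    },
  };
});
```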
---
## 5) RAG retrieval at runtime
At inference time, your agent should:
1) classify intent (hook / structure / tiktok / genre flip / audit)
2) select 1–3 datasets to query
3) retrieve top-K rows (ex: K=6–12)
4) synthesize output using retrieved rows only (no invented dataset claims)
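
A sketch of that flow; `VdbClient` stands in for whatever search call your VDB SDK exposes, and dataset names other than `tiktok_concept_patterns` and `lyric_structure_map` are placeholders:

```ts
// retrieve.ts: pick datasets by intent, then query the VDB (sketch)
type Intent = "hook" | "structure" | "tiktok" | "genre_flip" | "audit";

// minimal client interface; swap in your actual VDB SDK call (Pinecone, Qdrant, etc.)
interface VdbClient {
  search(query: string, opts: { topK: number; datasets: string[] }): Promise<{ text: string }[]>;
}

const DATASETS_BY_INTENT: Record<Intent, string[]> = {
  hook: ["hook_patterns"],                 // illustrative dataset names
  structure: ["lyric_structure_map"],
  tiktok: ["tiktok_concept_patterns"],
  genre_flip: ["genre_flip_matrix"],
  audit: ["viral_signals", "lyric_structure_map"],
};

async function retrieveContext(vdb: VdbClient, intent: Intent, query: string, k = 8): Promise<string> {
  const hits = await vdb.search(query, { topK: k, datasets: DATASETS_BY_INTENT[intent] });
  return hits.map((h) => h.text).join("\n\n"); // synthesis uses retrieved rows only
}
```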
### 5.1 Prompt assembly (runtime order)
1) System prompt
2) Guardrails
3) Reasoning template
4) Personality fingerprint
5) Memory snapshot (user profile + project workspace)
6) Retrieved context (RAG)
7) User message
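
A sketch of that assembly as a message stack; each layer is assumed to be loaded as a string, and the role labels should be adjusted to your provider's conventions:

```ts
// assemblePrompt.ts: build the message stack in the runtime order above (sketch)
interface Layers {
  systemPrompt: string;        // /core/system_prompt.md
  guardrails: string;          // /guardrails/guardrails.md
  reasoningTemplate: string;   // /core/reasoning_template.md
  personality: string;         // /core/personality_fingerprint.md
  memorySnapshot: string;      // user profile + project workspace
  retrievedContext: string;    // RAG results
}

function assembleMessages(layers: Layers, userMessage: string) {
  return [
    { role: "system", content: layers.systemPrompt },
    { role: "system", content: `GUARDRAILS:\n${layers.guardrails}` },
    { role: "system", content: `REASONING TEMPLATE:\n${layers.reasoningTemplate}` },
    { role: "system", content: `STYLE:\n${layers.personality}` },
    { role: "system", content: `MEMORY:\n${layers.memorySnapshot}` },
    { role: "system", content: `RETRIEVED CONTEXT:\n${layers.retrievedContext}` },
    { role: "user", content: userMessage },
  ];
}
```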
---
## 6) Knowledge map / “Master Grid” (optional but recommended)
If you want cross-dataset reasoning, add a **knowledge map** file to link patterns:
### 6.1 Simple schema (CSV)
Store links as triplets:
- `source_node`, `relation`, `target_node`, `weight`, `notes`
Examples:
- `tiktok_format:duet_bait` → `supports` → `viral_signal:comment_trigger`
- `structure:prechorus_lift` → `amplifies` → `viral_signal:anticipation`
- `genre_flip:reggaeton` → `prefers` → `hook_style:call_response`
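
In code terms, each row is one edge; a sketch using the examples above (the weights are illustrative):

```ts
// knowledgeMap.ts: one knowledge-map row as a typed edge (sketch)
interface KnowledgeEdge {
  source_node: string;
  relation: string;
  target_node: string;
  weight: number;    // link strength, e.g. 0–1; values below are illustrative
  notes?: string;
}

const exampleEdges: KnowledgeEdge[] = [
  { source_node: "tiktok_format:duet_bait", relation: "supports", target_node: "viral_signal:comment_trigger", weight: 0.8 },
  { source_node: "structure:prechorus_lift", relation: "amplifies", target_node: "viral_signal:anticipation", weight: 0.7 },
  { source_node: "genre_flip:reggaeton", relation: "prefers", target_node: "hook_style:call_response", weight: 0.6 },
];
```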
### 6.2 How to use it
- Upsert the knowledge map into the same VDB (or keep as a small local lookup table).
- When generating, retrieve:
- primary rows from the relevant dataset(s)
- plus 3–8 knowledge-map links that connect them
- Use those links to produce “why this works” explanations and better constraints.
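
If you keep the map as a local lookup table, pulling the connecting links is a simple filter over the retrieved node ids; a sketch building on the `KnowledgeEdge` shape above:

```ts
// linksFor.ts: find knowledge-map edges that touch the retrieved nodes (sketch)
function linksFor(edges: KnowledgeEdge[], retrievedNodeIds: string[], max = 8): KnowledgeEdge[] {
  const nodes = new Set(retrievedNodeIds);
  return edges
    .filter((e) => nodes.has(e.source_node) || nodes.has(e.target_node))
    .sort((a, b) => b.weight - a.weight)   // strongest links first
    .slice(0, max);                        // keep the 3–8 most relevant links
}
```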
---
## 7) Memory implementation (User Profile + Project Workspace)
Use the files in `/memory_schemas/`:
- `user_profile_memory.csv`
- `project_workspace_memory.csv`
- `memory_rules.md`
### 7.1 Read memory
Before responding:
- load active user profile facts (preferences, style constraints)
- load current project workspace (objectives, constraints, next actions)
### 7.2 Write memory
After responding, write only durable facts:
- user preferences that recur
- project decisions (selected concept, chosen genre, chosen structure)
- next actions (what to test next)
**Important:** append new rows; don’t overwrite old ones.
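
A sketch of the append-only write, assuming columns like `timestamp`, `project_id`, `fact_type`, `fact`; check `/memory_schemas/` for the real column names before wiring this up:

```ts
// appendMemory.ts: append-only memory write (sketch; column names are assumptions)
import { appendFileSync } from "fs";

interface MemoryFact {
  project_id: string;
  fact_type: "preference" | "decision" | "next_action";
  fact: string;
}

// minimal CSV escaping for one cell
const esc = (s: string) => `"${s.replace(/"/g, '""')}"`;

function appendWorkspaceFact(path: string, f: MemoryFact): void {
  const row = [new Date().toISOString(), f.project_id, f.fact_type, f.fact].map(esc).join(",");
  appendFileSync(path, row + "\n");   // append a new row; never overwrite existing ones
}

// usage (illustrative):
// appendWorkspaceFact("memory_schemas/project_workspace_memory.csv",
//   { project_id: "demo_track", fact_type: "decision", fact: "Chose cumbia flip for the chorus" });
```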
---
## 8) Quick acceptance test (you can run in any runtime)
Try these prompts and verify RAG is working:
1) “Give me 8 hook angles + why each is replayable.”
2) “Design a 30s TikTok loop concept. 1 prop, 1 angle.”
3) “Transform this concept into cumbia and then into alt-rock.”
4) “Audit this chorus for viral signals and give minimal fixes.”
If outputs reference your dataset concepts consistently, you’re done.
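
A minimal harness for that check; `askAgent` and the expected terms are placeholders to adapt to your runtime and your datasets' actual vocabulary:

```ts
// acceptanceTest.ts: run the four prompts and look for dataset vocabulary (sketch)
const TEST_PROMPTS = [
  "Give me 8 hook angles + why each is replayable.",
  "Design a 30s TikTok loop concept. 1 prop, 1 angle.",
  "Transform this concept into cumbia and then into alt-rock.",
  "Audit this chorus for viral signals and give minimal fixes.",
];

// terms you expect retrieved rows to surface; adjust to your dataset vocabulary
const EXPECTED_TERMS = ["pattern", "hook", "structure", "viral signal", "loop"];

async function runAcceptance(askAgent: (prompt: string) => Promise<string>) {
  for (const prompt of TEST_PROMPTS) {
    const answer = await askAgent(prompt);
    const hits = EXPECTED_TERMS.filter((t) => answer.toLowerCase().includes(t));
    console.log(`${hits.length >= 2 ? "PASS " : "CHECK"}  ${prompt}  matched: ${hits.join(", ") || "none"}`);
  }
}
```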
---
## 9) Common failure modes (and fixes)
- **Generic output** → increase retrieval K; tighten prompt to require citing retrieved patterns
- **Hallucinated claims** → enforce: “If not in retrieved context, say unknown”
- **Too long** → cap variants; default to compact bullet outputs
- **Bad retrieval** → improve row-to-document summaries; add better metadata filters