workflow_notes.md
[AGENTARIUM_ASSET] Name: Viral Muse – Workflow Notes (Implementation) Version: v1.0 Status: Draft
Goal
Implement Viral Muse as a dataset-driven agent using:
- system prompt + reasoning template + personality fingerprint
- guardrails
- RAG over the 6 CSV datasets
- optional “knowledge map” layer for cross-dataset linking
- memory: user profile + project workspace
This guide assumes an orchestration runtime like n8n, but the logic applies to LangChain, Flowise, Dify, etc.
0) Folder sanity check (Agentarium v1)
You should have:
- `/core/`: system_prompt.md, reasoning_template.md, personality_fingerprint.md
- `/datasets/`: 6 CSVs
- `/guardrails/`: guardrails.md
- `/memory_schemas/`: 2 CSV schemas + memory_rules.md
- `/docs/`: this file + readme + use cases
If you add a knowledge map later, put it in:
`/datasets/knowledge_map.csv` (recommended) or `/datasets/master_grid.csv`
1) Implement the core behavior files
1.1 System Prompt
- Paste `/core/system_prompt.md` into your agent’s system message.
- This defines the agent’s role: pattern-first creative partner.
1.2 Reasoning Template
- Store `/core/reasoning_template.md` as internal guidance in your runtime (developer message / hidden instruction / “policy doc”).
- Your runtime should prepend it before each completion (or inject it as a “rules” section).
1.3 Personality Fingerprint
- Add `/core/personality_fingerprint.md` as a style constraint layer.
- Use it to keep tone consistent: compact, direct, pattern-oriented.
Result: the model behaves consistently even before RAG.
2) Apply guardrails
- Load `/guardrails/guardrails.md` as a rules block.
- Enforce:
  - no plagiarism / no “copy this hit song” behavior
  - no made-up dataset facts
  - no unsafe content requests
  - outputs should be structured and testable
In n8n: you typically inject guardrails as part of the prompt assembly (before the user message).
3) Prepare datasets for RAG
You have 6 CSV datasets in /datasets/.
Best practice is to convert each row into a retrieval document with:
- `source_dataset`
- `row_id`
- key fields
- a short “row summary” string for embeddings
3.1 Minimal row-to-document format (recommended)
For each CSV row, create a text payload like:
- Title line: `[DATASET=lyric_structure_map | id=LSM_012]`
- Then: `field=value` lines (only the meaningful ones)
- Then: a compact 1–2 sentence row summary
This makes retrieval clean and avoids embedding empty columns.
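The row-to-document step above can be sketched as a small helper. This is a minimal sketch, not the product’s actual code: the `id` and `summary` column names are assumptions about your CSV layout, and the sample row is invented for illustration.

```python
import csv
from io import StringIO

def row_to_document(dataset: str, row: dict, id_field: str = "id") -> dict:
    """Build a retrieval document (title line, field=value lines, summary)
    from one CSV row. Empty columns are skipped so they are never embedded."""
    row_id = row.get(id_field, "")
    # Keep only non-empty fields; the id and summary get their own lines.
    fields = [f"{k}={v}" for k, v in row.items() if v and k not in (id_field, "summary")]
    lines = [f"[DATASET={dataset} | id={row_id}]"] + fields
    if row.get("summary"):
        lines.append(row["summary"])
    return {
        "id": f"{dataset}:{row_id}",
        "text": "\n".join(lines),
        "metadata": {"dataset": dataset, **{k: v for k, v in row.items() if v}},
    }

# Tiny inline example with hypothetical columns:
sample = "id,pattern,empty_col,summary\nLSM_012,prechorus_lift,,Builds anticipation before the chorus.\n"
doc = row_to_document("lyric_structure_map", next(csv.DictReader(StringIO(sample))))
```

The returned dict maps directly onto the vector record described in section 4.1.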
4) Upsert into a Vector DB (VDB)
You can use Pinecone, Qdrant, Weaviate, Chroma, FAISS — anything that supports:
- embeddings vector
- metadata filters
- similarity search
4.1 What to store per vector
Vector record:

- `id`: stable id (ex: `lyric_structure_map:LSM_012`)
- `text`: the row-to-document payload
- `metadata`:
  - `dataset` (one of the 6)
  - tags / genre / pattern_type (if available)
  - any fields you want to filter by
4.2 n8n implementation (practical steps)
- Read file(s)
  - Node: “Read Binary File” (or fetch from GitHub / Drive)
- Parse CSV
  - Node: “Spreadsheet File” → Convert to JSON (or CSV Parse)
- Normalize rows
  - Node: “Function” (build `id`, `text`, `metadata`)
- Create embeddings
  - Node: “OpenAI” → Embeddings (or any embedding provider)
- Upsert to VDB
  - Pinecone/Qdrant/Weaviate via a native node if available, or an “HTTP Request” node to the VDB REST API
- Verify
  - Run a test query and confirm you retrieve relevant rows.
Tip: store dataset name in metadata so you can filter retrieval per task:
- “only TikTok formats” → filter `dataset=tiktok_concept_patterns`
- “structure help” → filter `dataset=lyric_structure_map`
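To see the upsert → filtered-query flow end to end before wiring up a real VDB, a toy in-memory stand-in is enough. This sketch substitutes for Pinecone/Qdrant/etc.; the two records and their 2-d “embeddings” are invented for the demo.

```python
import math

class MiniVDB:
    """Toy in-memory stand-in for a vector DB: upsert, cosine similarity
    search, and a metadata filter on `dataset` (not for production)."""
    def __init__(self):
        self.records = {}  # id -> (vector, text, metadata)

    def upsert(self, rec_id, vector, text, metadata):
        self.records[rec_id] = (vector, text, metadata)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def query(self, vector, top_k=3, dataset=None):
        hits = [
            (self._cosine(vector, v), rid, text, meta)
            for rid, (v, text, meta) in self.records.items()
            if dataset is None or meta.get("dataset") == dataset
        ]
        return sorted(hits, key=lambda h: h[0], reverse=True)[:top_k]

db = MiniVDB()
db.upsert("lyric_structure_map:LSM_012", [1.0, 0.0],
          "prechorus lift", {"dataset": "lyric_structure_map"})
db.upsert("tiktok_concept_patterns:TCP_003", [0.0, 1.0],
          "duet bait", {"dataset": "tiktok_concept_patterns"})
# Filtering by dataset mirrors the metadata-filter tip above.
hits = db.query([0.9, 0.1], top_k=1, dataset="lyric_structure_map")
```

Swapping `MiniVDB` for a real client changes only the `upsert` and `query` calls; the record shape stays the same.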
5) RAG retrieval at runtime
At inference time, your agent should:
- classify intent (hook / structure / tiktok / genre flip / audit)
- select 1–3 datasets to query
- retrieve top-K rows (ex: K=6–12)
- synthesize output using retrieved rows only (no invented dataset claims)
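The intent → dataset routing step can be sketched as a keyword table. This is a hypothetical router (a production agent might use an LLM classifier), and every dataset name other than `lyric_structure_map` and `tiktok_concept_patterns` is a placeholder for whatever your 6 CSVs are actually called.

```python
# Placeholder mapping from intent keywords to dataset names.
INTENT_DATASETS = {
    "hook": ["hook_styles"],
    "structure": ["lyric_structure_map"],
    "tiktok": ["tiktok_concept_patterns"],
    "flip": ["genre_flip_map"],
    "audit": ["viral_signals", "lyric_structure_map"],
}

def route(message: str, top_k: int = 8) -> dict:
    """Pick 1-3 datasets to query based on keywords in the user message."""
    msg = message.lower()
    datasets = [d for key, ds in INTENT_DATASETS.items() if key in msg for d in ds]
    # De-duplicate while preserving order, then cap at 3 datasets.
    datasets = list(dict.fromkeys(datasets))[:3]
    return {"datasets": datasets or ["lyric_structure_map"], "top_k": top_k}

plan = route("Audit this chorus for viral signals")
```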
5.1 Prompt assembly (runtime order)
- System prompt
- Guardrails
- Reasoning template
- Personality fingerprint
- Memory snapshot (user profile + project workspace)
- Retrieved context (RAG)
- User message
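The assembly order above can be made concrete with a small join function. The `core` dict keys and the section headers are assumptions about how you load the `/core/` files; the sample strings are stand-ins for the real file contents.

```python
def assemble_prompt(core: dict, guardrails: str, memory: str,
                    retrieved: list, user_message: str) -> dict:
    """Join the prompt blocks in the runtime order listed above."""
    blocks = [
        core["system_prompt"],
        guardrails,
        core["reasoning_template"],
        core["personality_fingerprint"],
        "## Memory\n" + memory,
        "## Retrieved context\n" + "\n\n".join(retrieved),
    ]
    return {"system": "\n\n---\n\n".join(blocks), "user": user_message}

prompt = assemble_prompt(
    {"system_prompt": "You are Viral Muse.",
     "reasoning_template": "Classify intent, then use retrieved patterns only.",
     "personality_fingerprint": "Compact, direct, pattern-oriented."},
    "No plagiarism. No made-up dataset facts.",
    "User prefers cumbia.",
    ["[DATASET=lyric_structure_map | id=LSM_012] ..."],
    "Give me 8 hook angles.")
```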
6) Knowledge map / “Master Grid” (optional but recommended)
If you want cross-dataset reasoning, add a knowledge map file to link patterns:
6.1 Simple schema (CSV)
Store links as triplets:
source_node,relation,target_node,weight,notes
Examples:
- `tiktok_format:duet_bait` → supports → `viral_signal:comment_trigger`
- `structure:prechorus_lift` → amplifies → `viral_signal:anticipation`
- `genre_flip:reggaeton` → prefers → `hook_style:call_response`
6.2 How to use it
- Upsert the knowledge map into the same VDB (or keep as a small local lookup table).
- When generating, retrieve:
- primary rows from the relevant dataset(s)
- plus 3–8 knowledge-map links that connect them
- Use those links to produce “why this works” explanations and better constraints.
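Treating the knowledge map as a small local lookup table can be sketched like this. The inline CSV reuses the triplet schema and examples above; the weight values are invented for the demo.

```python
import csv
from io import StringIO

# Inline copy of the triplet schema; in practice read /datasets/knowledge_map.csv.
KM_CSV = (
    "source_node,relation,target_node,weight,notes\n"
    "tiktok_format:duet_bait,supports,viral_signal:comment_trigger,0.7,\n"
    "structure:prechorus_lift,amplifies,viral_signal:anticipation,0.8,\n"
    "genre_flip:reggaeton,prefers,hook_style:call_response,0.6,\n"
)

def links_for(node: str, triplets: list, limit: int = 8) -> list:
    """Return up to `limit` links touching `node`, strongest first."""
    hits = [t for t in triplets if node in (t["source_node"], t["target_node"])]
    return sorted(hits, key=lambda t: float(t["weight"]), reverse=True)[:limit]

triplets = list(csv.DictReader(StringIO(KM_CSV)))
links = links_for("viral_signal:anticipation", triplets)
```

The returned links are what you append to the retrieved context to back “why this works” explanations.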
7) Memory implementation (User Profile + Project Workspace)
Use the files in /memory_schemas/:
- `user_profile_memory.csv`
- `project_workspace_memory.csv`
- `memory_rules.md`
7.1 Read memory
Before responding:
- load active user profile facts (preferences, style constraints)
- load current project workspace (objectives, constraints, next actions)
7.2 Write memory
After responding, write only durable facts:
- user preferences that recur
- project decisions (selected concept, chosen genre, chosen structure)
- next actions (what to test next)
Important: append new rows; don’t overwrite old ones.
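The append-only write rule can be sketched with a tiny helper. The `key`/`value` column names are placeholders for whatever your memory schema CSVs actually define; the demo writes to a temp file so it never touches real memory files.

```python
import csv
import os
import tempfile

def append_memory(path: str, row: dict, fieldnames: list) -> None:
    """Append one durable fact as a new row; existing rows are never
    overwritten (per memory_rules.md). Writes the header on first use."""
    write_header = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerow(row)

# Demo against a temp file with placeholder columns:
path = os.path.join(tempfile.mkdtemp(), "project_workspace_memory.csv")
append_memory(path, {"key": "chosen_genre", "value": "cumbia"}, ["key", "value"])
append_memory(path, {"key": "next_action", "value": "test_hook_v2"}, ["key", "value"])
with open(path, newline="") as f:
    rows = list(csv.DictReader(f))
```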
8) Quick acceptance test (you can run in any runtime)
Try these prompts and verify RAG is working:
- “Give me 8 hook angles + why each is replayable.”
- “Design a 30s TikTok loop concept. 1 prop, 1 angle.”
- “Transform this concept into cumbia and then into alt-rock.”
- “Audit this chorus for viral signals and give minimal fixes.”
If outputs reference your dataset concepts consistently, you’re done.
9) Common failure modes (and fixes)
- Generic output → increase retrieval K; tighten prompt to require citing retrieved patterns
- Hallucinated claims → enforce: “If not in retrieved context, say unknown”
- Too long → cap variants; default to compact bullet outputs
- Bad retrieval → improve row-to-document summaries; add better metadata filters