Viral_Muse-Music_Pattern_Agent / docs / workflow_notes_viral_muse.md

[AGENTARIUM_ASSET]
Name: Viral Muse – Workflow Notes (Implementation)
Version: v1.0
Status: Draft

Goal

Implement Viral Muse as a dataset-driven agent using:

  • system prompt + reasoning template + personality fingerprint
  • guardrails
  • RAG over the 6 CSV datasets
  • optional “knowledge map” layer for cross-dataset linking
  • memory: user profile + project workspace

This guide assumes an orchestration runtime like n8n, but the logic applies to LangChain, Flowise, Dify, etc.


0) Folder sanity check (Agentarium v1)

You should have:

  • /core/ (system_prompt.md, reasoning_template.md, personality_fingerprint.md)
  • /datasets/ (6 CSVs)
  • /guardrails/guardrails.md
  • /memory_schemas/ (2 CSV schemas + memory_rules.md)
  • /docs/ (this file + readme + use cases)

If you add a knowledge map later, put it in:

  • /datasets/knowledge_map.csv (recommended) or /datasets/master_grid.csv

1) Implement the core behavior files

1.1 System Prompt

  • Paste /core/system_prompt.md into your agent’s system message.
  • This defines the agent’s role: pattern-first creative partner.

1.2 Reasoning Template

  • Store /core/reasoning_template.md as internal guidance in your runtime (developer message / hidden instruction / “policy doc”).
  • Your runtime should prepend it before each completion (or inject as a “rules” section).

1.3 Personality Fingerprint

  • Add /core/personality_fingerprint.md as a style constraint layer.
  • Use it to keep tone consistent: compact, direct, pattern-oriented.

Result: the model behaves consistently even before RAG.
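A minimal sketch of stacking the three core files into one system message (paths follow the Agentarium v1 layout above; adjust `root` to your repo):

```python
from pathlib import Path

# Order matters: role first, then internal guidance, then style constraints.
CORE_FILES = [
    "core/system_prompt.md",
    "core/reasoning_template.md",
    "core/personality_fingerprint.md",
]

def build_system_message(root: str = ".") -> str:
    """Concatenate the core behavior files into one system message."""
    parts = []
    for rel in CORE_FILES:
        path = Path(root) / rel
        parts.append(f"## {path.name}\n{path.read_text(encoding='utf-8').strip()}")
    return "\n\n".join(parts)
```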


2) Apply guardrails

  • Load /guardrails/guardrails.md as a rules block.
  • Enforce:
    • no plagiarism / no “copy this hit song” behavior
    • no made-up dataset facts
    • no unsafe content requests
    • outputs should be structured and testable

In n8n: you typically inject guardrails as part of the prompt assembly (before the user message).


3) Prepare datasets for RAG

You have 6 CSV datasets in /datasets/. Best practice is to convert each row into a retrieval document with:

  • source_dataset
  • row_id
  • key fields
  • a short “row summary” string for embeddings

3.1 Minimal row-to-document format (recommended)

For each CSV row, create a text payload like:

  • Title line: [DATASET=lyric_structure_map | id=LSM_012]
  • Then: field=value lines (only the meaningful ones)
  • Then: a compact 1–2 sentence row summary

This makes retrieval clean and avoids embedding empty columns.
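A Python sketch of this row-to-document conversion (the `summary` column name is an assumption; map it to whatever your CSVs actually use):

```python
import csv
import io

def row_to_document(dataset: str, row: dict, id_field: str = "id") -> str:
    """Title line, then non-empty field=value lines, then the row summary."""
    lines = [f"[DATASET={dataset} | id={row.get(id_field, 'unknown')}]"]
    for key, value in row.items():
        if key in (id_field, "summary"):
            continue
        if value and str(value).strip():  # skip empty columns
            lines.append(f"{key}={value}")
    if row.get("summary"):
        lines.append(row["summary"])
    return "\n".join(lines)

# Example with an in-memory CSV (column names are illustrative):
sample = ("id,pattern,genre,summary\n"
          "LSM_012,prechorus_lift,pop,Builds tension before the chorus.\n")
rows = list(csv.DictReader(io.StringIO(sample)))
doc = row_to_document("lyric_structure_map", rows[0])
```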


4) Upsert into a Vector DB (VDB)

You can use Pinecone, Qdrant, Weaviate, Chroma, FAISS — anything that supports:

  • embeddings vector
  • metadata filters
  • similarity search

4.1 What to store per vector

Vector record

  • id: stable id (ex: lyric_structure_map:LSM_012)
  • text: the row-to-document payload
  • metadata:
    • dataset (one of the 6)
    • tags / genre / pattern_type (if available)
    • any fields you want to filter by
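A runtime-agnostic sketch of one upsert record (the `values` key name varies by VDB; Pinecone uses `values`, others use `vector` or `embedding`):

```python
def make_vector_record(dataset, row_id, text, embedding, extra_meta=None):
    """Assemble a VDB-agnostic upsert record with a stable id."""
    metadata = {"dataset": dataset}       # enables per-dataset retrieval filters
    if extra_meta:
        metadata.update(extra_meta)       # tags / genre / pattern_type, if available
    return {
        "id": f"{dataset}:{row_id}",      # stable id, ex: lyric_structure_map:LSM_012
        "values": embedding,
        "text": text,
        "metadata": metadata,
    }
```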

4.2 n8n implementation (practical steps)

  1. Read file(s)
    • Node: “Read Binary File” (or fetch from GitHub / Drive)
  2. Parse CSV
    • Node: “Spreadsheet File” → Convert to JSON (or CSV Parse)
  3. Normalize rows
    • Node: “Function” (build id, text, metadata)
  4. Create embeddings
    • Node: “OpenAI” → Embeddings (or any embedding provider)
  5. Upsert to VDB
    • Pinecone/Qdrant/Weaviate via:
      • native node if available, OR
      • “HTTP Request” node to the VDB REST API
  6. Verify
    • Run a test query and confirm you retrieve relevant rows.

Tip: store dataset name in metadata so you can filter retrieval per task:

  • “only TikTok formats” → filter dataset=tiktok_concept_patterns
  • “structure help” → dataset=lyric_structure_map
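To see what the dataset filter buys you, here is a toy in-memory retriever with cosine similarity plus a metadata filter; a real VDB does the same thing server-side:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(records, query_vec, dataset=None, k=6):
    """Similarity search over upsert-style records, with a dataset filter."""
    pool = [r for r in records
            if dataset is None or r["metadata"]["dataset"] == dataset]
    pool.sort(key=lambda r: cosine(r["values"], query_vec), reverse=True)
    return pool[:k]
```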

5) RAG retrieval at runtime

At inference time, your agent should:

  1. classify intent (hook / structure / tiktok / genre flip / audit)
  2. select 1–3 datasets to query
  3. retrieve top-K rows (ex: K=6–12)
  4. synthesize output using retrieved rows only (no invented dataset claims)
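Steps 1–2 can be as simple as keyword routing. A sketch (dataset names other than tiktok_concept_patterns and lyric_structure_map are illustrative placeholders; substitute your actual six CSV names):

```python
INTENT_DATASETS = {
    "hook": ["hook_patterns"],
    "structure": ["lyric_structure_map"],
    "tiktok": ["tiktok_concept_patterns", "hook_patterns"],
    "genre flip": ["genre_flip_rules"],
    "audit": ["viral_signals", "hook_patterns"],
}

def route(user_message, default=("hook_patterns",)):
    """Crude keyword intent classifier; returns 1-3 datasets to query."""
    msg = user_message.lower()
    for intent, datasets in INTENT_DATASETS.items():
        if intent in msg:
            return datasets[:3]
    return list(default)
```

Swap this for an LLM-based classifier once the keyword version starts mis-routing.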

5.1 Prompt assembly (runtime order)

  1. System prompt
  2. Guardrails
  3. Reasoning template
  4. Personality fingerprint
  5. Memory snapshot (user profile + project workspace)
  6. Retrieved context (RAG)
  7. User message
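The assembly order above, sketched as a chat-message builder (splitting the static behavior files into the first system message and the dynamic context into a second is a design choice, not a requirement):

```python
def assemble_messages(system_prompt, guardrails, reasoning, fingerprint,
                      memory_snapshot, retrieved_context, user_message):
    """Assemble the prompt in runtime order 1-7 from above."""
    system = "\n\n".join([system_prompt, guardrails, reasoning, fingerprint])
    context = (f"## Memory snapshot\n{memory_snapshot}\n\n"
               f"## Retrieved context\n{retrieved_context}\n"
               "Only make dataset claims supported by the retrieved context.")
    return [
        {"role": "system", "content": system},
        {"role": "system", "content": context},  # or a developer message, per runtime
        {"role": "user", "content": user_message},
    ]
```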

6) Knowledge map / “Master Grid” (optional but recommended)

If you want cross-dataset reasoning, add a knowledge map file to link patterns:

6.1 Simple schema (CSV)

Store each link as a source–relation–target triple, plus an optional weight and notes:

  • source_node, relation, target_node, weight, notes

Examples:

  • tiktok_format:duet_bait, supports, viral_signal:comment_trigger
  • structure:prechorus_lift, amplifies, viral_signal:anticipation
  • genre_flip:reggaeton, prefers, hook_style:call_response

6.2 How to use it

  • Upsert the knowledge map into the same VDB (or keep as a small local lookup table).
  • When generating, retrieve:
    • primary rows from the relevant dataset(s)
    • plus 3–8 knowledge-map links that connect them
  • Use those links to produce “why this works” explanations and better constraints.
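The "small local lookup table" option, assuming the CSV schema from 6.1:

```python
import csv
import io

def load_knowledge_map(csv_text):
    """Parse knowledge_map.csv text into link dicts (schema from 6.1)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def links_for(links, nodes, limit=8):
    """Return up to `limit` links touching any retrieved node, highest weight first."""
    node_set = set(nodes)
    hits = [l for l in links
            if l["source_node"] in node_set or l["target_node"] in node_set]
    hits.sort(key=lambda l: float(l.get("weight") or 0), reverse=True)
    return hits[:limit]
```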

7) Memory implementation (User Profile + Project Workspace)

Use the files in /memory_schemas/:

  • user_profile_memory.csv
  • project_workspace_memory.csv
  • memory_rules.md

7.1 Read memory

Before responding:

  • load active user profile facts (preferences, style constraints)
  • load current project workspace (objectives, constraints, next actions)

7.2 Write memory

After responding, write only durable facts:

  • user preferences that recur
  • project decisions (selected concept, chosen genre, chosen structure)
  • next actions (what to test next)

Important: append new rows; don’t overwrite old ones.
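An append-only writer that enforces this rule (column names here are illustrative; match them to the schemas in /memory_schemas/):

```python
import csv
from pathlib import Path

def append_memory(path, row, fieldnames):
    """Append one durable fact as a new row; existing rows are never touched."""
    p = Path(path)
    write_header = not p.exists()
    with p.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```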


8) Quick acceptance test (you can run in any runtime)

Try these prompts and verify RAG is working:

  1. “Give me 8 hook angles + why each is replayable.”
  2. “Design a 30s TikTok loop concept. 1 prop, 1 angle.”
  3. “Transform this concept into cumbia and then into alt-rock.”
  4. “Audit this chorus for viral signals and give minimal fixes.”

If outputs reference your dataset concepts consistently, you’re done.


9) Common failure modes (and fixes)

  • Generic output → increase retrieval K; tighten prompt to require citing retrieved patterns
  • Hallucinated claims → enforce: “If not in retrieved context, say unknown”
  • Too long → cap variants; default to compact bullet outputs
  • Bad retrieval → improve row-to-document summaries; add better metadata filters