| [{"id":"auto_memory","name":"Auto Memory","meta":{"description":"Automatically store relevant information as Memories.","type":"filter","manifest":{"title":"Auto Memory","author":"@nokodo","description":"automatically identify and store valuable information from chats as Memories.","author_email":"nokodo@nokodo.net","author_url":"https://nokodo.net","repository_url":"https://nokodo.net/github/open-webui-extensions","version":"1.1.0-alpha1","required_open_webui_version":">= 0.5.0","funding_url":"https://ko-fi.com/nokodo","license":"see extension documentation file `auto_memory.md` (License section) for the licensing terms."}},"content":"\"\"\"\ntitle: Auto Memory\nauthor: @nokodo\ndescription: automatically identify and store valuable information from chats as Memories.\nauthor_email: nokodo@nokodo.net\nauthor_url: https://nokodo.net\nrepository_url: https://nokodo.net/github/open-webui-extensions\nversion: 1.1.0-alpha1\nrequired_open_webui_version: >= 0.5.0\nfunding_url: https://ko-fi.com/nokodo\nlicense: see extension documentation file `auto_memory.md` (License section) for the licensing terms.\n\"\"\"\n\nimport asyncio\nimport json\nimport logging\nimport re\nimport threading\nfrom datetime import datetime\nfrom typing import (\n Any,\n Awaitable,\n Callable,\n Literal,\n Optional,\n Type,\n TypeVar,\n Union,\n cast,\n overload,\n)\nfrom urllib.parse import urlparse\n\nfrom fastapi import HTTPException, Request\nfrom open_webui.main import app as webui_app\nfrom open_webui.models.users import UserModel, Users\nfrom open_webui.retrieval.vector.main import SearchResult\nfrom open_webui.routers.memories import (\n AddMemoryForm,\n MemoryUpdateModel,\n QueryMemoryForm,\n add_memory,\n delete_memory_by_id,\n query_memory,\n update_memory_by_id,\n)\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field, ValidationError, create_model\n\nLogLevel = Literal[\"debug\", \"info\", \"warning\", \"error\"]\n\nSTRINGIFIED_MESSAGE_TEMPLATE = \"-{index}. 
{role}: ```{content}```\"\n\n\nUNIFIED_SYSTEM_PROMPT = \"\"\"\\\nYou are maintaining a collection of Memories - individual \"journal entries\" or facts about a user, each automatically timestamped upon creation or update.\n\nYou will be provided with:\n1. Recent messages from a conversation (displayed with negative indices; -1 is the most recent overall message)\n2. Any existing related memories that might potentially be relevant\n\nYour job is to determine what actions to take on the memory collection based on the User's **latest** message (-2).\n\n<key_instructions>\n## Instructions\n1. Focus ONLY on the **User's most recent message** (-2). Older messages provide context but should not generate new memories unless explicitly referenced in the latest message.\n2. Each Memory should represent **a single fact or statement**. Never combine multiple facts into one Memory.\n3. When the User's latest message contradicts existing memories, **update the existing memory** rather than creating a conflicting new one.\n4. If memories are exact duplicates or direct conflicts about the same topic, **consolidate them by updating or deleting** as appropriate.\n5. **Link related Memories** by including brief references when relevant to maintain semantic connections.\n6. Capture anything valuable for **personalizing future interactions** with the User.\n7. Always **honor memory requests**, whether direct from the User (\"remember this\", \"forget that\", \"update X\") or implicit through the Assistant's commitment (\"I'll remember that\", \"I'll keep that in mind\"). Treat these as strong signals to store, update, or delete the referenced information.\n8. Each memory must be **self-contained and understandable without external context.** Avoid ambiguous references like \"it\", \"that\", or \"there\" - instead, include the specific subject being referenced. For example, prefer \"User's new TV broke\" over \"It broke\".\n9. 
Be alert to **sarcasm, jokes, and non-literal language.** If the User's statement appears to be hyperbole, sarcasm, or non-literal rather than a factual claim, do not store it as a memory.\n10. When determining which memory is \"most recent\" for conflict resolution, **refer to the `created_at` or `update_at` timestamps** from the existing memories.\n</key_instructions>\n\n<what_to_extract>\n## What you WANT to extract\n- Personal preferences, opinions, and feelings\n- Long-term personal information (likely true for months/years)\n- Future-oriented statements (\"from now on\", \"going forward\")\n- Direct memory requests (\"remember that\", \"note this\", \"forget that\")\n- Hobbies, interests, skills\n- Important life details (job, education, relationships, location)\n- Long term goals, plans, aspirations\n- Recurring patterns or habits\n- Strong likes/dislikes affecting future conversations\n</what_to_extract>\n\n<what_not_to_extract>\n## What you do NOT want to extract\n- User/assistant names (already in profile)\n- User gender, age and birthdate (already in profile)\n- ANY kind of short-term or ephemeral information that is unlikely to be relevant in future conversations\n- Information the assistant confirms is already known\n- Content from translation/rewrite/summarization/similar tasks (\"Please help me write my essay about x\")\n- Trivial observations or fleeting thoughts\n- Temporary activities\n- Sarcastic remarks or obvious jokes\n- Non-literal statements or hyperbole\n</what_not_to_extract>\n\n<actions_to_take>\nBased on your analysis, return a list of actions:\n\n**ADD**: Create new memory when:\n- New information not covered by existing memories\n- Distinct facts even if related to existing topics\n- User explicitly requests to remember something\n\n**UPDATE**: Modify existing memory when:\n- User provides updated/corrected information about the same fact\n- Consolidating small, inseparable or closely related facts into one memory\n- User explicitly 
asks to update something\n- New information refines but doesn't fundamentally change existing memory\n\n**DELETE**: Remove existing memory when:\n- User explicitly requests to forget something\n- User's statement directly contradicts an existing memory\n- Consolidating memories (update the oldest, delete the rest)\n- Memory is completely obsolete due to new information\n- Duplicate memories exist (keep oldest based on `created_at` timestamp)\n\nWhen updating or deleting, ONLY use the memory ID from the related memories list.\n</actions_to_take>\n\n<consolidation_rules>\n**Core Principle**: Default to keeping memories separate and granular for precise retrieval. Only consolidate when it meaningfully improves memory quality and coherence.\n\n**When to CONSOLIDATE** (merge existing memories):\n\n- **Exact Duplicates** - Same fact, different wording\n - Action: Delete the newer duplicate, keep the oldest (based on `created_at` timestamp)\n - Example: \"User prefers Python for scripting\" + \"User likes Python for scripting tasks\" → Keep oldest, delete duplicate\n\n- **Direct Conflicts** - Contradictory facts about the same subject\n - Action: Update the older memory to reflect the latest information, or delete if completely obsolete\n - Example: \"User lives in San Francisco\" conflicts with \"User moved to Mountain View\" → Update or delete old info\n\n- **Inseparable Facts** - Multiple facts about the same entity that would be incomplete or confusing if retrieved separately\n - Action: Merge into the oldest memory as a single self-contained statement, then delete the redundant memories\n - Test: Would retrieving one fact without the other create confusion or require additional context?\n - Example: \"User's cat is named Luna\" + \"User's cat is a Siamese\" → \"User has a Siamese cat named Luna\"\n - Counter-example: \"User works at Google\" + \"User started at Google in 2023\" → Keep separate (start date is distinct from employment)\n\n- **Small, better retrieved 
together** - Closely related facts that enhance understanding when combined\n - Action: Merge into the oldest memory, delete the others\n - Test: Would I prefer to retrieve these facts together every time, rather than separately?\n - Example: \"User loves Italian food\" + \"User loves Indian food\" → \"User loves Italian and Indian food\"\n\n**When to keep SEPARATE** (or split if wrongly combined):\n\nFacts should remain separate when they represent distinct, independently-retrievable information:\n\n- **Similar but distinct facts** - Related information representing different aspects or time periods\n - Example: \"User works at Google\" vs \"User got promoted to team lead\" (employment vs career progression)\n \n- **Past events as journal entries** - Historical facts that provide temporal context\n - Example: \"User bought a Samsung TV\" and \"User's Samsung TV broke\" (separate events in time)\n\n- **Related but separable facts** - Facts about the same topic that are meaningful independently\n - Example: \"User loves dogs\" vs \"User has a golden retriever named Max\" (general preference vs specific pet)\n\n- **Too long or complex** - Merging would create an overly long memory that contains too many distinct facts\n\nIf an existing memory wrongly combines separable facts: UPDATE the existing memory to contain one fact (preserves timestamp), then ADD new memories for the other facts. Deleting the original would lose the timestamp.\n\n**Guiding Question**: If vector search retrieves only one of these memories, would the user experience be degraded? If yes, consider merging. If no, keep separate.\n</consolidation_rules>\n\n<examples>\n**Example 1 - Store new memories when no related found**\nConversation:\n-2. user: ```I work as a senior data scientist at Tesla and my favorite programming language is Rust```\n-1. assistant: ```That's impressive! 
Working at Tesla must be exciting, and Rust is a great choice for systems programming```\n\nRelated Memories:\n[\n {\"mem_id\": \"1\", \"created_at\": \"2024-01-05T10:00:00\", \"update_at\": \"2024-01-05T10:00:00\", \"content\": \"User enjoys electric vehicles\"},\n {\"mem_id\": \"2\", \"created_at\": \"2024-02-10T14:00:00\", \"update_at\": \"2024-02-10T14:00:00\", \"content\": \"User has experience with Python and data analysis\"},\n {\"mem_id\": \"3\", \"created_at\": \"2024-01-20T09:30:00\", \"update_at\": \"2024-01-20T09:30:00\", \"content\": \"User likes reading science fiction novels\"}\n]\n\n**Analysis**\n- Existing memories might be tangentially related (electric vehicles/Tesla, data analysis) but don't actually cover the specific facts mentioned\n- User provides two distinct new facts: job/company and programming preference\n- Each should be stored as a separate new memory\n\nOutput:\n{\n \"actions\": [\n {\"action\": \"add\", \"content\": \"User works as a senior data scientist at Tesla\"},\n {\"action\": \"add\", \"content\": \"User's favorite programming language is Rust\"}\n ]\n}\n\n**Example 2 - Consolidate similar memories while retaining context**\nConversation:\n-2. user: ```Actually I prefer TypeScript over JavaScript for frontend work these days```\n-1. 
assistant: ```TypeScript's type safety definitely makes frontend development more maintainable!```\n\nRelated Memories:\n[\n {\"mem_id\": \"123\", \"created_at\": \"2024-01-15T10:00:00\", \"update_at\": \"2024-01-15T10:00:00\", \"content\": \"User likes JavaScript for web development\"},\n {\"mem_id\": \"456\", \"created_at\": \"2024-02-20T14:30:00\", \"update_at\": \"2024-02-20T14:30:00\", \"content\": \"User prefers JavaScript for frontend projects\"},\n {\"mem_id\": \"789\", \"created_at\": \"2024-03-01T09:00:00\", \"update_at\": \"2024-03-01T09:00:00\", \"content\": \"User is learning React\"}\n]\n\n**Analysis**\n- Two existing similar memories about JavaScript preference\n- User said they now prefer TypeScript, but it doesn't mean they don't *like* JavaScript anymore\n- Update one memory to reflect the new preference, leave all other memories untouched\n\nOutput:\n{\n \"actions\": [\n {\"action\": \"update\", \"id\": \"456\", \"new_content\": \"User prefers TypeScript for frontend work\"}\n ]\n}\n\n**Example 3 - Delete conflicting memory while retaining others**\nConversation:\n-2. user: ```I'm joking! I didn't actually buy the iPhone!```\n-1. assistant: ```Ahh, you got me there! 
No worries.```\n\nRelated Memories:\n[\n {\"mem_id\": \"789\", \"created_at\": \"2024-03-01T09:00:00\", \"update_at\": \"2024-03-01T09:00:00\", \"content\": \"User just bought a new iPhone\"},\n {\"mem_id\": \"012\", \"created_at\": \"2024-03-02T11:00:00\", \"update_at\": \"2024-03-02T11:00:00\", \"content\": \"User likes Apple products\"},\n {\"mem_id\": \"345\", \"created_at\": \"2024-03-02T11:00:00\", \"update_at\": \"2024-03-02T11:00:00\", \"content\": \"User is considering buying a new iPad\"}\n]\n\n**Analysis**\n- User negates a previous statement about buying an iPhone\n- We should delete the memory about the iPhone purchase\n- The other memories about liking Apple products and considering an iPad remain valid\n\nOutput:\n{\n \"actions\": [\n {\"action\": \"delete\", \"id\": \"789\"}\n ]\n}\n\n**Example 4 - Handling multiple updates while retaining context**\nConversation:\n-4. user: ```I'm thinking of switching from my current role```\n-3. assistant: ```What's motivating you to consider a change?```\n-2. user: ```Well, I got promoted to team lead last month, but I'm also interviewing at Google next week. The commute would be better since I just moved to Mountain View```\n-1. assistant: ```Congratulations on the promotion! That's interesting timing with the Google interview```\n\nRelated Memories:\n[\n {\"mem_id\": \"345\", \"created_at\": \"2024-02-15T10:00:00\", \"update_at\": \"2024-02-15T10:00:00\", \"content\": \"User lives in San Francisco\"},\n {\"mem_id\": \"678\", \"created_at\": \"2024-01-10T08:00:00\", \"update_at\": \"2024-01-10T08:00:00\", \"content\": \"User works as a software engineer\"}\n]\n\n**Analysis**\n- User reveals: promoted to team lead (updates role), moved to Mountain View (conflicts with SF), interviewing at Google (new info)\n- We don't want to forget any of the user's life details, unless there is a conflict. 
So we create a new memory, and update the legacy ones.\n- Add new memory about Google interview as it's a distinct future event\n\nOutput:\n{\n \"actions\": [\n {\"action\": \"update\", \"id\": \"345\", \"new_content\": \"User used to live in San Francisco\"},\n {\"action\": \"update\", \"id\": \"678\", \"new_content\": \"User works as a team lead software engineer\"},\n {\"action\": \"add\", \"content\": \"User got promoted to team lead\"},\n {\"action\": \"add\", \"content\": \"User has just moved to Mountain View\"},\n {\"action\": \"add\", \"content\": \"User lives in Mountain View\"},\n {\"action\": \"add\", \"content\": \"User has an interview at Google\"}\n ]\n}\n\n**Example 5 - Handling sarcasm and non-literal language**\nConversation:\n-3. assistant: ```As an AI assistant, I can perform extremely complex calculations in seconds.```\n-2. user: ```Oh yeah? I can do that with my eyes closed! I'm basically a human calculator!```\n-1. assistant: ```Sure you can!```\n\nRelated Memories:\n[]\n\n**Analysis**\n- The User's message is clearly sarcastic/joking - they're not literally claiming to be a human calculator\n- This is hyperbole used for humorous effect, not a factual statement about their abilities\n- No memories should be created from obvious sarcasm or jokes\n\nOutput:\n{\n \"actions\": []\n}\n\n**Example 6 - Cross-message context linking**\nConversation:\n-5. assistant: ```How's your new TV working out?```\n-4. user: ```Remember how I bought that Samsung OLED TV last week?```\n-3. assistant: ```Yes, I remember that. What about it?```\n-2. user: ```Well, it broke down today! The screen just went black.```\n-1. assistant: ```Oh no! 
That's terrible for such a new TV!```\n\nRelated Memories:\n[\n {\"mem_id\": \"101\", \"created_at\": \"2024-03-15T10:00:00\", \"update_at\": \"2024-03-15T10:00:00\", \"content\": \"User bought a Samsung OLED TV\"}\n]\n\n**Analysis**\n- The User's latest message provides new information about the TV breaking\n- We need to create a self-contained memory that includes context from earlier messages\n- The new memory should reference the Samsung OLED TV specifically, not just \"it\" or \"the TV\"\n- This helps semantically link to the existing memory about the purchase\n\nOutput:\n{\n \"actions\": [\n {\"action\": \"add\", \"content\": \"User's Samsung OLED TV, that was recently purchased, just broke down with a black screen\"}\n ]\n}\n\n**Example 7 - Memory maintenance: merging and deleting duplicates and bad memories**\nConversation:\n-2. user: ```Can you help me write a Python function to sort a list?```\n-1. assistant: ```Of course! Here's a simple example using sorted()...```\n\nRelated Memories:\n[\n {\"mem_id\": \"234\", \"created_at\": \"2024-02-10T09:00:00\", \"update_at\": \"2024-02-10T09:00:00\", \"content\": \"User prefers Python for scripting\"},\n {\"mem_id\": \"567\", \"created_at\": \"2024-03-15T14:30:00\", \"update_at\": \"2024-03-15T14:30:00\", \"content\": \"User likes Python for scripting tasks\"},\n {\"mem_id\": \"890\", \"created_at\": \"2024-01-05T10:00:00\", \"update_at\": \"2024-01-05T10:00:00\", \"content\": \"User knows Python programming\"},\n {\"mem_id\": \"123\", \"created_at\": \"2024-01-10T11:00:00\", \"update_at\": \"2024-01-10T11:00:00\", \"content\": \"User's name is Jake\"},\n {\"mem_id\": \"456\", \"created_at\": \"2024-01-15T08:00:00\", \"update_at\": \"2024-01-15T08:00:00\", \"content\": \"User's cat is named Luna\"},\n {\"mem_id\": \"789\", \"created_at\": \"2024-02-20T10:00:00\", \"update_at\": \"2024-02-20T10:00:00\", \"content\": \"User's cat is a Siamese\"}\n]\n\n**Analysis**\n- The current conversation is just a technical 
question about Python - no new personal information\n- However, the related memories show issues that need maintenance. We apply the relevant Memory rules:\n 1. **Delete bad memory**: Memory 123 contains the user's name, which violates the rule \"never store user/assistant names\" - should be deleted\n 2. **Delete duplicate**: Memory 234 and 567 express essentially the same preference (Python for scripting) - keep older (234), delete newer duplicate (567)\n 3. **Merge inseparable facts**: Memory 456 and 789 are about the same cat and should ALWAYS be retrieved together (cat's name + breed) - merge into oldest memory (456)\n- Memory 890 is distinct (knowledge vs preference) so it should remain\n\nOutput:\n{\n \"actions\": [\n {\"action\": \"delete\", \"id\": \"123\"},\n {\"action\": \"delete\", \"id\": \"567\"},\n {\"action\": \"update\", \"id\": \"456\", \"new_content\": \"User has a Siamese cat named Luna\"},\n {\"action\": \"delete\", \"id\": \"789\"}\n ]\n}\n\n**Example 8 - Explicit memory request**\nConversation:\n-4. user: ```Hey, do you remember what my dog's name is?```\n-3. assistant: ```I don't have that information. Could you tell me?```\n-2. user: ```Sure! His name is Max and he's a golden retriever.```\n-1. assistant: ```What a lovely name! Max sounds like a wonderful companion. I'll remember that.```\n\nRelated Memories:\n[\n {\"mem_id\": \"111\", \"created_at\": \"2024-01-20T10:00:00\", \"update_at\": \"2024-01-20T10:00:00\", \"content\": \"User loves dogs\"}\n]\n\n**Analysis**\n- Assistant explicitly expresses intent to remember something. 
We ALWAYS honor explicit memory requests.\n- User provides info about their dog's name and breed; these can be stored as a single memory as they are closely related\n- The existing memory about loving dogs is related but doesn't conflict\n\nOutput:\n{\n \"actions\": [\n {\"action\": \"add\", \"content\": \"User has a golden retriever named Max\"}\n ]\n}\n\n**Example 9 - Memory maintenance: splitting and adding context**\nConversation:\n-2. user: ```Sadie invited me to her birthday party next week, I'm excited!```\n-1. assistant: ```That's wonderful! I hope you have a great time at Sadie's party.```\n\nRelated Memories:\n[\n {\"mem_id\": \"555\", \"created_at\": \"2024-02-10T10:00:00\", \"update_at\": \"2024-02-10T10:00:00\", \"content\": \"User has an old time friend named Sadie who they grew up with, and whose mother is a long time friend of User's mother\"},\n {\"mem_id\": \"666\", \"created_at\": \"2024-02-12T14:00:00\", \"update_at\": \"2024-02-12T14:00:00\", \"content\": \"The two mothers also did their english courses together\"}\n]\n\n**Analysis**\n- User mentions Sadie's party (new event to store)\n- Memory 555 combines two separable facts: User's friendship with Sadie (including growing up together), and the mothers' friendship\n- Memory 666 lacks clear context - \"the two mothers\" is ambiguous without memory 555\n- This is a **passive maintenance scenario**: even though the conversation doesn't directly discuss the memory issues, we should fix them\n- Actions: update 555 to remove the mothers' friendship, add new memory for mothers' relationship, add context to 666\n\nOutput:\n{\n \"actions\": [\n {\"action\": \"add\", \"content\": \"User is invited to Sadie's birthday party next week\"},\n {\"action\": \"update\", \"id\": \"555\", \"new_content\": \"User has an old friend named Sadie who they grew up with\"},\n {\"action\": \"add\", \"content\": \"User's mother and Sadie's mother are long time friends\"},\n {\"action\": \"update\", \"id\": \"666\", 
\"new_content\": \"User's mother and Sadie's mother did their english courses together\"}\n ]\n}\n</examples>\\\n\"\"\"\n\n\nasync def emit_status(\n description: str,\n emitter: Any,\n status: Literal[\"in_progress\", \"complete\", \"error\"] = \"complete\",\n extra_data: Optional[dict] = None,\n):\n if not emitter:\n raise ValueError(\"Emitter is required to emit status updates\")\n\n await emitter(\n {\n \"type\": \"status\",\n \"data\": {\n \"description\": description,\n \"status\": status,\n \"done\": status in (\"complete\", \"error\"),\n \"error\": status == \"error\",\n **(extra_data or {}),\n },\n }\n )\n\n\nclass MemoryAddAction(BaseModel):\n action: Literal[\"add\"] = Field(..., description=\"Action type (add)\")\n content: str = Field(..., description=\"Content of the memory to add\")\n\n\nclass MemoryUpdateAction(BaseModel):\n action: Literal[\"update\"] = Field(..., description=\"Action type (update)\")\n id: str = Field(..., description=\"ID of the memory to update\")\n new_content: str = Field(..., description=\"New content for the memory\")\n\n\nclass MemoryDeleteAction(BaseModel):\n action: Literal[\"delete\"] = Field(..., description=\"Action type (delete)\")\n id: str = Field(..., description=\"ID of the memory to delete\")\n\n\nclass MemoryActionRequestStub(BaseModel):\n \"\"\"This is a stub model to correctly type parameters. 
Not used directly.\"\"\"\n\n actions: list[Union[MemoryAddAction, MemoryUpdateAction, MemoryDeleteAction]] = (\n Field(\n default_factory=list,\n description=\"List of actions to perform on memories\",\n max_length=20,\n )\n )\n\n\nclass Memory(BaseModel):\n \"\"\"Single memory entry with metadata.\"\"\"\n\n mem_id: str = Field(..., description=\"ID of the memory\")\n created_at: datetime = Field(..., description=\"Creation timestamp\")\n update_at: datetime = Field(..., description=\"Last update timestamp\")\n content: str = Field(..., description=\"Content of the memory\")\n similarity_score: Optional[float] = Field(\n None,\n description=\"Similarity score (0 to 1 - higher is **more similar** to user query) if available\",\n )\n\n\ndef build_actions_request_model(existing_ids: list[str]):\n \"\"\"Dynamically build versions of the Update/Delete action models whose `id` fields\n are Literal[...] constrained to the provided existing_ids. Returns the dynamically\n created `MemoriesActionRequest` model, whose `actions` list accepts those models.\n\n If existing_ids is empty, the returned model only accepts add actions, so that\n add-only flows still parse.\n \"\"\"\n if not existing_ids:\n # No IDs to constrain, so no relevant memories = can only create new memories\n allowed_actions = MemoryAddAction\n else:\n id_literal_type = Literal[tuple(existing_ids)]\n\n DynamicMemoryUpdateAction = create_model(\n \"MemoryUpdateAction\",\n id=(id_literal_type, ...),\n __base__=MemoryUpdateAction,\n )\n\n DynamicMemoryDeleteAction = create_model(\n \"MemoryDeleteAction\",\n id=(id_literal_type, ...),\n __base__=MemoryDeleteAction,\n )\n\n allowed_actions = Union[\n MemoryAddAction, DynamicMemoryUpdateAction, DynamicMemoryDeleteAction\n ]\n\n return create_model(\n \"MemoriesActionRequest\",\n actions=(\n list[allowed_actions],\n Field(\n default_factory=list,\n description=\"List of actions to perform on memories\",\n max_length=20,\n ),\n ),\n __base__=BaseModel,\n 
)\n\n\ndef searchresults_to_memories(results: SearchResult) -> list[Memory]:\n memories = []\n\n if not results.ids or not results.documents or not results.metadatas:\n raise ValueError(\"SearchResult must contain ids, documents, and metadatas\")\n\n for batch_idx, (ids_batch, docs_batch, metas_batch) in enumerate(\n zip(results.ids, results.documents, results.metadatas)\n ):\n distances_batch = results.distances[batch_idx] if results.distances else None\n\n for doc_idx, (mem_id, content, meta) in enumerate(\n zip(ids_batch, docs_batch, metas_batch)\n ):\n if not meta:\n raise ValueError(f\"Missing metadata for memory id={mem_id}\")\n if \"created_at\" not in meta:\n raise ValueError(\n f\"Missing 'created_at' in metadata for memory id={mem_id}\"\n )\n if \"updated_at\" not in meta:\n # If updated_at is missing, default to created_at\n meta[\"updated_at\"] = meta[\"created_at\"]\n\n created_at = datetime.fromtimestamp(meta[\"created_at\"])\n updated_at = datetime.fromtimestamp(meta[\"updated_at\"])\n\n # Extract similarity score if available\n similarity_score = None\n if distances_batch is not None and doc_idx < len(distances_batch):\n similarity_score = round(distances_batch[doc_idx], 3)\n\n mem = Memory(\n mem_id=mem_id,\n created_at=created_at,\n update_at=updated_at,\n content=content,\n similarity_score=similarity_score,\n )\n memories.append(mem)\n\n return memories\n\n\ndef _run_detached(coro):\n \"\"\"Helper to run coroutine in detached thread\"\"\"\n\n def _runner():\n loop = asyncio.new_event_loop()\n asyncio.set_event_loop(loop)\n try:\n loop.run_until_complete(coro)\n finally:\n loop.close()\n\n thread = threading.Thread(target=_runner, daemon=True)\n thread.start()\n\n\nR = TypeVar(\"R\", bound=BaseModel)\nValveType = TypeVar(\"ValveType\", str, int)\n\n\nclass Filter:\n class Valves(BaseModel):\n openai_api_url: str = Field(\n default=\"https://api.openai.com/v1\",\n description=\"openai compatible endpoint\",\n )\n model: str = Field(\n 
default=\"gpt-5-mini\",\n description=\"model to use to determine memory. an intelligent model is highly recommended, as it will be able to better understand the context of the conversation.\",\n )\n api_key: str = Field(\n default=\"\", description=\"API key for OpenAI compatible endpoint\"\n )\n messages_to_consider: int = Field(\n default=4,\n description=\"global default number of recent messages to consider for memory extraction (user override can supply a different value).\",\n )\n related_memories_n: int = Field(\n default=5,\n description=\"number of related memories to consider when updating memories\",\n )\n minimum_memory_similarity: Optional[float] = Field(\n default=None,\n ge=0.0,\n le=1.0,\n description=\"minimum similarity of memories to consider for updates. higher is more similar to user query. if not set, no filtering is applied.\",\n )\n allow_unsafe_user_overrides: bool = Field(\n default=False,\n description=\"SECURITY WARNING: allow users to override API URL/model without providing their own API key. this could allow users to steal your API key or use expensive models at your expense. only enable if you trust all users.\",\n )\n override_memory_context: bool = Field(\n default=False,\n description=\"intercept and override memory context injection in system prompts. when enabled, allows customization of how memories are presented to the model.\",\n )\n debug_mode: bool = Field(\n default=False,\n description=\"enable debug logging\",\n )\n\n class UserValves(BaseModel):\n enabled: bool = Field(\n default=True,\n description=\"whether to enable Auto Memory for this user\",\n )\n show_status: bool = Field(\n default=True, description=\"show status of the action.\"\n )\n openai_api_url: Optional[str] = Field(\n default=None,\n description=\"user-specific openai compatible endpoint (overrides global)\",\n )\n model: Optional[str] = Field(\n default=None,\n description=\"user-specific model to use (overrides global). 
an intelligent model is highly recommended, as it will be able to better understand the context of the conversation.\",\n )\n api_key: Optional[str] = Field(\n default=None, description=\"user-specific API key (overrides global)\"\n )\n messages_to_consider: Optional[int] = Field(\n default=None,\n description=\"override for number of recent messages to consider (falls back to global if null). includes assistant responses.\",\n )\n\n def log(self, message: str, level: LogLevel = \"info\"):\n if level == \"debug\" and not self.valves.debug_mode:\n return\n if level not in {\"debug\", \"info\", \"warning\", \"error\"}:\n level = \"info\"\n\n logger = logging.getLogger()\n getattr(logger, level, logger.info)(message)\n\n def messages_to_string(self, messages: list[dict[str, Any]]) -> str:\n stringified_messages: list[str] = []\n\n effective_messages_to_consider = self.get_restricted_user_valve(\n user_valve_value=self.user_valves.messages_to_consider,\n admin_fallback=self.valves.messages_to_consider,\n authorization_check=bool(\n self.user_valves.api_key and self.user_valves.api_key.strip()\n ),\n valve_name=\"messages_to_consider\",\n )\n\n self.log(\n f\"using last {effective_messages_to_consider} messages\",\n level=\"debug\",\n )\n\n for i in range(1, effective_messages_to_consider + 1):\n if i > len(messages):\n break\n try:\n message = messages[-i]\n stringified_messages.append(\n STRINGIFIED_MESSAGE_TEMPLATE.format(\n index=i,\n role=message.get(\"role\", \"user\"),\n content=message.get(\"content\", \"\"),\n )\n )\n except Exception as e:\n self.log(f\"error stringifying message {i}: {e}\", level=\"warning\")\n\n return \"\\n\".join(stringified_messages)\n\n @overload\n async def query_openai_sdk(\n self,\n system_prompt: str,\n user_message: str,\n response_model: Type[R],\n ) -> R: ...\n\n @overload\n async def query_openai_sdk(\n self,\n system_prompt: str,\n user_message: str,\n response_model: None = None,\n ) -> str: ...\n\n async def 
query_openai_sdk(\n self,\n system_prompt: str,\n user_message: str,\n response_model: Optional[Type[R]] = None,\n ) -> Union[str, R]:\n \"\"\"Generic wrapper around OpenAI chat completions.\n - Uses SDK for api.openai.com only\n - Structured outputs when official domain and response_model provided\n - Returns: model instance or raw string\n \"\"\"\n\n user_has_own_key = bool(\n self.user_valves.api_key and self.user_valves.api_key.strip()\n )\n\n api_url = self.get_restricted_user_valve(\n user_valve_value=self.user_valves.openai_api_url,\n admin_fallback=self.valves.openai_api_url,\n authorization_check=user_has_own_key,\n valve_name=\"openai_api_url\",\n ).rstrip(\"/\")\n\n model_name = self.get_restricted_user_valve(\n user_valve_value=self.user_valves.model,\n admin_fallback=self.valves.model,\n authorization_check=user_has_own_key,\n valve_name=\"model\",\n )\n api_key = self.user_valves.api_key or self.valves.api_key\n\n hostname = urlparse(api_url).hostname or \"\"\n enable_structured_outputs = (\n hostname == \"api.openai.com\" and response_model is not None\n )\n\n if \"gpt-5\" in model_name:\n temperature = 1.0\n extra_args = {\"reasoning_effort\": \"medium\"}\n else:\n temperature = 0.3\n extra_args = {}\n\n client = OpenAI(api_key=api_key, base_url=api_url)\n messages: list[dict[str, str]] = [\n {\"role\": \"system\", \"content\": system_prompt},\n {\"role\": \"user\", \"content\": user_message},\n ]\n\n if enable_structured_outputs:\n response_model = cast(Type[R], response_model)\n self.log(\n f\"using structured outputs with {response_model.__name__}\",\n level=\"debug\",\n )\n\n response = client.chat.completions.parse(\n model=model_name,\n messages=messages, # type: ignore[arg-type]\n temperature=temperature,\n response_format=response_model,\n **extra_args, # pyright: ignore[reportArgumentType]\n )\n\n message = response.choices[0].message\n if message.parsed is None:\n raise ValueError(\n f\"unable to parse structured response. 
message={message}\"\n )\n\n return cast(R, message.parsed)\n\n else:\n self.log(\"not using structured outputs\", level=\"debug\")\n\n response = client.chat.completions.create(\n model=model_name,\n messages=messages, # type: ignore[arg-type]\n temperature=temperature,\n **extra_args, # pyright: ignore[reportArgumentType]\n )\n self.log(f\"sdk response: {response}\", level=\"debug\")\n\n text_response = response.choices[0].message.content\n if text_response is None:\n raise ValueError(f\"no text response from LLM. message={response.choices[0].message}\")\n\n if response_model:\n try:\n return response_model.model_validate_json(text_response)\n except ValidationError as e:\n self.log(f\"response model validation error: {e}\", level=\"warning\")\n raise\n\n return text_response\n\n def __init__(self):\n self.valves = self.Valves()\n\n def extract_memory_context(self, content: str) -> Optional[tuple[str, list[dict]]]:\n \"\"\"\n Extract memory context from system message content.\n\n Returns:\n tuple of (full_match_string, parsed_memories_list) if found, None otherwise\n \"\"\"\n # Open WebUI uses this standard format\n pattern = r\"<memory_user_context>\\s*(\\[[\\s\\S]*?\\])\\s*</memory_user_context>\"\n match = re.search(pattern, content)\n\n if not match:\n self.log(\"no memory context found in system message\", level=\"debug\")\n return None\n\n try:\n memories_json = match.group(1)\n memories_list = json.loads(memories_json)\n self.log(\n f\"extracted {len(memories_list)} memories from context\", level=\"debug\"\n )\n return (match.group(0), memories_list)\n except json.JSONDecodeError as e:\n self.log(\n f\"failed to parse memory context JSON: {e}. 
raw content: {match.group(1)[:200]}...\",\n level=\"error\",\n )\n return None\n\n def format_memory_context(self, memories: list[dict]) -> str:\n \"\"\"\n Format memories into the memory context string.\n Override this method to customize how memories are presented.\n\n Args:\n memories: List of memory objects with 'content', 'created_at', 'updated_at', 'similarity_score'\n\n Returns:\n Formatted memory context string to inject into system prompt\n \"\"\"\n # Remove similarity_score from each memory\n memories = [\n {k: v for k, v in mem.items() if k != \"similarity_score\"}\n for mem in memories\n ]\n\n # Format with custom XML tag\n memories_json = json.dumps(memories, indent=2, ensure_ascii=False)\n return f\"<long_term_memory>\\n{memories_json}\\n</long_term_memory>\"\n\n def process_memory_context_in_messages(self, messages: list[dict]) -> list[dict]:\n \"\"\"\n Process messages to intercept and optionally override memory context.\n\n Args:\n messages: List of message dicts from the body\n\n Returns:\n Modified messages list\n \"\"\"\n found_any_memory_context = False\n\n # Find system message(s)\n for i, message in enumerate(messages):\n if message.get(\"role\") != \"system\":\n continue\n\n content = message.get(\"content\", \"\")\n if not content:\n continue\n\n # Try to extract existing memory context\n extraction_result = self.extract_memory_context(content)\n\n if extraction_result:\n found_any_memory_context = True\n full_match, memories_list = extraction_result\n\n # Override: format the memories using custom method\n new_context = self.format_memory_context(memories_list)\n\n # Replace in content\n messages[i][\"content\"] = content.replace(full_match, new_context)\n\n # Log successful override\n self.log(\n f\"overrode memory context in system message {i}: {len(memories_list)} memories processed, \"\n f\"similarity scores removed, XML tag changed to <long_term_memory>\",\n level=\"info\",\n )\n else:\n self.log(f\"no memory context in system message 
{i}\", level=\"debug\")\n\n # If valve is enabled and we didn't find any memory context, that's unusual\n if not found_any_memory_context:\n self.log(\n \"memory context override is enabled but no <memory_user_context> found in any system message\",\n level=\"warning\",\n )\n\n return messages\n\n def get_restricted_user_valve(\n self,\n user_valve_value: Optional[ValveType],\n admin_fallback: ValveType,\n authorization_check: Optional[bool] = None,\n valve_name: Optional[str] = None,\n ) -> ValveType:\n \"\"\"\n Get user valve value with security checks.\n\n Args:\n user_valve_value: The user's valve value to check\n admin_fallback: Admin's fallback value\n authorization_check: The valve value to check for authorization (e.g., user's API key)\n valve_name: Name of the valve being checked (for logging)\n\n Returns user's value only if:\n 1. authorization_check is provided and non-empty, OR\n 2. User is an admin, OR\n 3. Admin allows unsafe overrides\n\n Otherwise returns admin fallback.\n \"\"\"\n if authorization_check is None:\n authorization_check = False\n\n if authorization_check:\n if user_valve_value is not None:\n self.log(\n f\"'{valve_name or 'unknown'}' override authorized (user has own API key)\",\n level=\"debug\",\n )\n return user_valve_value if user_valve_value is not None else admin_fallback\n\n # Allow admins to override without providing their own API key\n if hasattr(self, \"current_user\") and self.current_user.get(\"role\") == \"admin\":\n if user_valve_value is not None:\n self.log(\n f\"'{valve_name or 'unknown'}' override allowed for admin user\",\n level=\"info\",\n )\n return user_valve_value if user_valve_value is not None else admin_fallback\n\n if self.valves.allow_unsafe_user_overrides:\n if user_valve_value is not None:\n self.log(\n f\"'{valve_name or 'unknown'}' override allowed (unsafe overrides enabled)\",\n level=\"warning\",\n )\n return user_valve_value if user_valve_value is not None else admin_fallback\n\n if 
user_valve_value is not None:\n self.log(\n f\"'{valve_name or 'unknown'}' override blocked - user attempted override without authorization, using admin defaults for security\",\n level=\"warning\",\n )\n return admin_fallback\n\n def build_memory_query(self, messages: list[dict[str, Any]]) -> str:\n \"\"\"\n Build a query string for memory retrieval from recent messages.\n\n Strategy:\n - Always include: last user message + last assistant response\n - If user message is short (≤8 words), also include the previous assistant message\n\n This gives embeddings enough context without overwhelming with noise.\n \"\"\"\n query_parts = []\n\n # Find last user message and its index\n last_user_idx = None\n last_user_msg = None\n for idx in range(len(messages) - 1, -1, -1):\n if messages[idx].get(\"role\") == \"user\":\n last_user_idx = idx\n last_user_msg = messages[idx].get(\"content\", \"\")\n break\n\n if last_user_msg is None or last_user_idx is None:\n raise ValueError(\"no user message found in messages\")\n\n # Count words in last user message\n user_word_count = len(last_user_msg.split())\n\n # Check if we should include extra context for short messages\n include_extra_context = user_word_count <= 8\n\n # Build query from most recent to older messages\n # Add last assistant response (if exists)\n if last_user_idx + 1 < len(messages):\n last_assistant_msg = messages[last_user_idx + 1].get(\"content\", \"\")\n if last_assistant_msg:\n query_parts.append(f\"Assistant: {last_assistant_msg}\")\n\n # Add last user message\n query_parts.append(f\"User: {last_user_msg}\")\n\n # If short message, add previous assistant context\n if include_extra_context and last_user_idx > 0:\n prev_assistant_msg = messages[last_user_idx - 1].get(\"content\", \"\")\n if (\n prev_assistant_msg\n and messages[last_user_idx - 1].get(\"role\") == \"assistant\"\n ):\n query_parts.append(f\"Assistant: {prev_assistant_msg}\")\n\n # Reverse to get chronological order and join\n 
query_parts.reverse()\n query = \"\\n\".join(query_parts)\n\n self.log(\n f\"built memory query with {len(query_parts)} messages (user message: {user_word_count} words)\",\n level=\"debug\",\n )\n self.log(f\"memory query: {query}\", level=\"debug\")\n\n return query\n\n async def get_related_memories(\n self,\n messages: list[dict[str, Any]],\n user: UserModel,\n ) -> list[Memory]:\n memory_query = self.build_memory_query(messages)\n\n # Query related memories\n try:\n results = await query_memory(\n request=Request(scope={\"type\": \"http\", \"app\": webui_app}),\n form_data=QueryMemoryForm(\n content=memory_query, k=self.valves.related_memories_n\n ),\n user=user,\n )\n except HTTPException as e:\n if e.status_code == 404:\n self.log(\"no related memories found\", level=\"info\")\n results = None\n else:\n self.log(\n f\"failed to query memories due to HTTP error {e.status_code}: {e.detail}\",\n level=\"error\",\n )\n raise RuntimeError(\"failed to query memories\") from e\n except Exception as e:\n self.log(f\"failed to query memories: {e}\", level=\"error\")\n raise RuntimeError(\"failed to query memories\") from e\n\n related_memories = searchresults_to_memories(results) if results else []\n self.log(\n f\"found {len(related_memories)} related memories before filtering\",\n level=\"info\",\n )\n\n # Filter by minimum similarity if configured\n if self.valves.minimum_memory_similarity is not None:\n filtered_memories = [\n mem\n for mem in related_memories\n if mem.similarity_score is not None\n and mem.similarity_score >= self.valves.minimum_memory_similarity\n ]\n filtered_count = len(related_memories) - len(filtered_memories)\n if filtered_count > 0:\n self.log(\n f\"filtered out {filtered_count} memories below similarity threshold {self.valves.minimum_memory_similarity}\",\n level=\"info\",\n )\n related_memories = filtered_memories\n\n self.log(f\"using {len(related_memories)} related memories\", level=\"info\")\n self.log(f\"related memories: 
{related_memories}\", level=\"debug\")\n\n return related_memories\n\n async def auto_memory(\n self,\n messages: list[dict[str, Any]],\n user: UserModel,\n emitter: Callable[[Any], Awaitable[None]],\n ) -> None:\n \"\"\"Execute the auto-memory extraction and update flow.\"\"\"\n\n if len(messages) < 2:\n self.log(\"need at least 2 messages for context\", level=\"debug\")\n return\n self.log(f\"flow started. user ID: {user.id}\", level=\"debug\")\n\n related_memories = await self.get_related_memories(messages=messages, user=user)\n\n stringified_memories = json.dumps(\n [memory.model_dump(mode=\"json\") for memory in related_memories]\n )\n conversation_str = self.messages_to_string(messages)\n\n try:\n action_plan = await self.query_openai_sdk(\n system_prompt=UNIFIED_SYSTEM_PROMPT,\n user_message=f\"Conversation snippet:\\n{conversation_str}\\n\\nRelated Memories:\\n{stringified_memories}\",\n response_model=build_actions_request_model(\n [m.mem_id for m in related_memories]\n ),\n )\n self.log(f\"action plan: {action_plan}\", level=\"debug\")\n\n await self.apply_memory_actions(\n action_plan=action_plan, # pyright: ignore[reportArgumentType]\n user=user,\n emitter=emitter,\n )\n\n except Exception as e:\n self.log(f\"LLM query failed: {e}\", level=\"error\")\n if self.user_valves.show_status:\n await emit_status(\n \"memory processing failed\", emitter=emitter, status=\"error\"\n )\n return None\n\n async def apply_memory_actions(\n self,\n action_plan: MemoryActionRequestStub,\n user: UserModel,\n emitter: Callable[[Any], Awaitable[None]],\n ) -> None:\n \"\"\"\n Execute memory actions from the plan.\n Order: delete -> update -> add (prevents conflicts)\n \"\"\"\n self.log(\"started apply_memory_actions\", level=\"debug\")\n actions = action_plan.actions\n\n # Show processing status\n if emitter and len(actions) > 0:\n self.log(f\"processing {len(actions)} memory actions\", level=\"debug\")\n await emit_status(\n f\"processing {len(actions)} memory 
actions\",\n emitter=emitter,\n status=\"in_progress\",\n )\n if self.valves.debug_mode:\n self.log(f\"memory actions to apply: {actions}\", level=\"debug\")\n\n # Group actions and define handlers\n operations = {\n \"delete\": {\n \"actions\": [a for a in actions if a.action == \"delete\"],\n \"handler\": lambda a: delete_memory_by_id(memory_id=a.id, user=user),\n \"log_msg\": lambda a: f\"deleted memory. id={a.id}\",\n \"error_msg\": lambda a, e: f\"failed to delete memory {a.id}: {e}\",\n \"skip_empty\": lambda a: False,\n \"status_verb\": \"deleted\",\n },\n \"update\": {\n \"actions\": [a for a in actions if a.action == \"update\"],\n \"handler\": lambda a: update_memory_by_id(\n memory_id=a.id,\n request=Request(scope={\"type\": \"http\", \"app\": webui_app}),\n form_data=MemoryUpdateModel(content=a.new_content),\n user=user,\n ),\n \"log_msg\": lambda a: f\"updated memory. id={a.id}\",\n \"error_msg\": lambda a, e: f\"failed to update memory {a.id}: {e}\",\n \"skip_empty\": lambda a: not a.new_content.strip(),\n \"status_verb\": \"updated\",\n },\n \"add\": {\n \"actions\": [a for a in actions if a.action == \"add\"],\n \"handler\": lambda a: add_memory(\n request=Request(scope={\"type\": \"http\", \"app\": webui_app}),\n form_data=AddMemoryForm(content=a.content),\n user=user,\n ),\n \"log_msg\": lambda a: f\"added memory. 
content={a.content}\",\n \"error_msg\": lambda a, e: f\"failed to add memory: {e}\",\n \"skip_empty\": lambda a: not a.content.strip(),\n \"status_verb\": \"saved\",\n },\n }\n\n # Process all operations in order\n counts = {}\n for op_name, op_config in operations.items():\n counts[op_name] = 0\n for action in op_config[\"actions\"]:\n if op_config[\"skip_empty\"](action):\n continue\n try:\n await op_config[\"handler\"](action)\n self.log(op_config[\"log_msg\"](action))\n counts[op_name] += 1\n except Exception as e:\n raise RuntimeError(op_config[\"error_msg\"](action, e))\n\n # Build status message\n status_parts = []\n for op_name, op_config in operations.items():\n count = counts[op_name]\n if count > 0:\n memory_word = \"memory\" if count == 1 else \"memories\"\n status_parts.append(f\"{op_config['status_verb']} {count} {memory_word}\")\n\n status_message = \", \".join(status_parts)\n self.log(status_message or \"no changes\", level=\"info\")\n\n if status_message and self.user_valves.show_status:\n await emit_status(status_message, emitter=emitter, status=\"complete\")\n\n def inlet(\n self,\n body: dict,\n __event_emitter__: Callable[[Any], Awaitable[None]],\n __user__: Optional[dict] = None,\n ) -> dict:\n self.log(f\"inlet: {__name__}\", level=\"info\")\n self.log(\n f\"inlet: user ID: {__user__.get('id') if __user__ else 'no user'}\",\n level=\"debug\",\n )\n\n # Process memory context interception if enabled\n if self.valves.override_memory_context and \"messages\" in body:\n try:\n body[\"messages\"] = self.process_memory_context_in_messages(\n body[\"messages\"]\n )\n except Exception as e:\n self.log(f\"error processing memory context: {e}\", level=\"error\")\n\n return body\n\n async def outlet(\n self,\n body: dict,\n __event_emitter__: Callable[[Any], Awaitable[None]],\n __user__: Optional[dict] = None,\n ) -> dict:\n\n self.log(\"outlet invoked\")\n if __user__ is None:\n raise ValueError(\"user information is required\")\n\n user = 
Users.get_user_by_id(__user__[\"id\"])\n if user is None:\n raise ValueError(\"user not found\")\n self.current_user = __user__\n\n self.log(f\"input user type = {type(__user__)}\", level=\"debug\")\n self.log(\n f\"user.id = {user.id} user.name = {user.name} user.email = {user.email}\",\n level=\"debug\",\n )\n\n self.user_valves = __user__.get(\"valves\", self.UserValves())\n if not isinstance(self.user_valves, self.UserValves):\n raise ValueError(\"invalid user valves\")\n self.user_valves = cast(Filter.UserValves, self.user_valves)\n self.log(f\"user valves = {self.user_valves}\", level=\"debug\")\n\n if not self.user_valves.enabled:\n self.log(\"component was disabled by user, skipping\", level=\"info\")\n return body\n\n _run_detached(\n self.auto_memory(\n body.get(\"messages\", []), user=user, emitter=__event_emitter__\n )\n )\n\n return body\n"}] |