Squashed commit of the following:
commit 9d4e1b68b2ab41eefc534ef1f48953496d7d1cc6
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sun Dec 15 18:01:51 2024 +0100
ctx window popup fix, default settings fix
commit 9ef32085651bc02610e1317b29f8d0b4913ae49f
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sun Dec 15 14:55:53 2024 +0100
models, settings, initializer refactor
Rate limiter fix
Models initialized JIT
Model call wrappers for agent
Message compression fix
Log progress update
Settings frontend numeric fields
commit f7b3e2540c0ab798658cc6fed783942423d74df8
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sun Dec 8 19:42:17 2024 +0100
knowledge import/reload
commit 4e028a3ce428cec581273a2261b6bdd8c474853e
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sun Dec 8 11:28:47 2024 +0100
Memory recall speedup
commit 8ec3b24696e0346a3aea22d0f304ca297004bdcc
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sun Dec 8 00:17:02 2024 +0100
keyboard input tool
commit a76a302f3f4e9fce9183aa0585595d6f18ae215a
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sat Dec 7 23:28:03 2024 +0100
solutions cleanup
commit 884007cdb0064ef9d550dc99e6c3066719242da1
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sat Dec 7 21:51:14 2024 +0100
console print edits for docker
commit 927c234d69312d57a90b03896bf9e62bf2583bd6
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sat Dec 7 20:40:28 2024 +0100
openai azure model func name fix
commit 53a46288f9a72928ec902b36625080ad07ebe2a1
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Dec 6 15:17:58 2024 +0100
mistral fix, error text output
commit 6aa37744fc9ca77271e33c195bf67a78dd7937a7
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Dec 6 14:58:10 2024 +0100
toast fix
commit f0be03ea77c3d4ef72ec8180df87e0bfffb46198
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Dec 6 14:33:28 2024 +0100
toast errors
commit 84346828128d230a39a14d4ad32d14dec9bea8b7
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Dec 6 11:30:34 2024 +0100
warnings cleanup
commit 2b94af895d517e32932f7f71ffea277a63dce940
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Dec 6 10:54:06 2024 +0100
Preload fix
commit 7f270d4a14032bbe05a8b22c873af0febfa78668
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Dec 6 09:44:13 2024 +0100
Server startup log msg
commit f9c9b5c93369269dd5eb71d222d8918f2bec6715
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Dec 6 07:50:15 2024 +0100
Update run_ui.py
commit f3ca7e0742b12a93a8d5cc6cce066d88ad56a63b
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Dec 6 06:21:14 2024 +0100
Update run_ui.py
commit 21975c5a7cc7b3ad8b9ab95f940b5e6f6a743231
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Thu Dec 5 20:45:51 2024 +0100
local models docker url
commit f0a8b07c4fd2b1a5daaaf52142f194a8fdb8fcef
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Thu Dec 5 16:40:49 2024 +0100
Server addr notice
commit 656612726a3bbf01e4cd6ede01a60ce05b855c97
Merge: 49594fe 7c2866c
Author: Jan Tomášek <38891707+frdel@users.noreply.github.com>
Date: Thu Dec 5 16:11:23 2024 +0100
Merge pull request #260 from 3clyp50/development
fix: toast handling, mobile breakpoint
commit 7c2866ca614fff704b820e8f1ff6c9f50006320b
Author: Alessandro <155005371+3clyp50@users.noreply.github.com>
Date: Wed Dec 4 19:37:50 2024 +0100
fix: toast handling, mobile breakpoint
`toast.css` and `index.js`
- fixed toasts disappearing right after showing
- simplified toast animation
`index.css`
- set 2nd mobile breakpoint at 640px
commit 49594fe6ec2d32a1855a2ccbd9479d4fda347651
Merge: f697754 70b1fa3
Author: Jan Tomášek <38891707+frdel@users.noreply.github.com>
Date: Wed Dec 4 10:39:58 2024 +0100
Merge pull request #259 from 3clyp50/development
CSS refactor and toasts
commit 70b1fa385af8d86d1d5280a5b34e1a8a9abeb3cf
Author: Alessandro <155005371+3clyp50@users.noreply.github.com>
Date: Wed Dec 4 02:17:50 2024 +0100
refactor: css, style: toasts, fix: z-index
- organized structure
- consolidated selectors and states
- shorthand everywhere
- modern toasts
- bigger action buttons for mobile
commit f6977546c11b63e2e47dce8367cad8a6c62248fc
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 22:42:36 2024 +0100
call subordinate fix
commit fbe47ac03e56cfb005a1cd2b044c6305e27ca436
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 21:19:03 2024 +0100
Minor fixes
commit 961dbc405af8a784ecadcfcbcd7652d1f8d9be28
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 21:10:45 2024 +0100
restart
commit 357909c16a66c0e7ec78a2a993a9a4e54dd67bf9
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 19:41:29 2024 +0100
whisper remote preload
commit e0b0b6f6367841c85dfa9c2156f46db755c88497
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 17:39:56 2024 +0100
nudge
commit 9fae02b2a55bb1760fae926c2b40cd07ee26a61c
Merge: 0ebc142 fedf2d4
Author: Jan Tomášek <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 14:57:18 2024 +0100
Merge pull request #256 from 3clyp50/development
feature: copy text button, nudge & fix: various styles
commit 0ebc142124fa3dcb370d95fd2a84bdba8f3145e8
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 14:56:33 2024 +0100
ssh connection retry
commit deae13d3834c7031a18cd30d5ee593f804b1b09a
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 14:38:57 2024 +0100
root pass fix
commit 9109fcbf60a8c9cc975c9e21306c619d81a2b43c
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 14:28:53 2024 +0100
root password change fix
commit 46689d6477d51966b9876b7d51b180e871569ebb
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Tue Dec 3 14:22:18 2024 +0100
RFC & SSH exchange for development
commit fedf2d4bdc6357f9e50e76b9202e06081c66db5e
Author: Alessandro <155005371+3clyp50@users.noreply.github.com>
Date: Tue Dec 3 04:03:14 2024 +0100
feature: copy text button, nudge & fix: various styles
- Copy button for all messages
- Nudge button front-end
- Fixed various non-styled light mode elements
to do -> css cleanup and whisper loading
commit 19f50d6d9509acdaea2a5ccd846b5de2722b4a07
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Sun Dec 1 20:50:17 2024 +0100
attachments, files, prompt extras, prompt caching, refactors, cleanups
commit c99b1a47d4f25d8184661a77418ebfafa5c00ee9
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Fri Nov 29 08:55:27 2024 +0100
Alpine fix version, STT fixes
commit 81e653ba2d710ad31e43d658738cf6a843461792
Merge: 857f8b6 89b8483
Author: Jan Tomášek <38891707+frdel@users.noreply.github.com>
Date: Thu Nov 28 23:08:09 2024 +0100
Merge pull request #255 from 3clyp50/development
feature: speech to text settings
commit 857f8b6d82ec6707f45c67fa7e2a360e535071b0
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Thu Nov 28 23:05:17 2024 +0100
download and remove folders in browser
commit 89b848312b6f553fe22a6c0039bc7ab93b716384
Author: Alessandro <155005371+3clyp50@users.noreply.github.com>
Date: Thu Nov 28 16:07:50 2024 +0100
feature: speech to text settings
- initial commit: voice settings
- Settings section for STT
commit b3a27bb442668e4a21e79be4ab96c73a09f2b864
Merge: 5e8d6b1 bb980ea
Author: Jan Tomášek <38891707+frdel@users.noreply.github.com>
Date: Thu Nov 28 08:39:01 2024 +0100
Merge pull request #254 from 3clyp50/development
fix: file browser bugs + final ui polishing
commit bb980ea6b93a074b24cf86c54b0be69596b34cb1
Author: Alessandro <155005371+3clyp50@users.noreply.github.com>
Date: Thu Nov 28 01:13:56 2024 +0100
fix: file browser deletion bug + parent directory
Underscore matters!
- fixed both bugs for the browser
Extra:
- style for toasts
quickfix generic modals
commit f0126a6ef87c43aa34e6fbc7595d89f10f6c3b27
Author: Alessandro <155005371+3clyp50@users.noreply.github.com>
Date: Wed Nov 27 23:44:20 2024 +0100
style: polishing and consistency
commit 5e8d6b1c7d3ec965eac864b2bb72c85360bae8c2
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Wed Nov 27 22:16:13 2024 +0100
Minor fixes
commit 184f8dcf53ec49733d20967246374f08469d7e84
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Wed Nov 27 22:05:23 2024 +0100
Pause button fix
commit 969f142af12c01abd9009a8e35e0cbbd225bca8d
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Wed Nov 27 22:01:06 2024 +0100
RFC fix, history bugfixes
commit 733b8de5163b3fc36c68df099a1860af210e6a1d
Merge: f2057d3 6a83e79
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Wed Nov 27 20:57:15 2024 +0100
Merge branch 'pr/253' into development
commit 6a83e79d5a42fb44bfcac88c5ade3fda85ba2b28
Author: Alessandro <real.eclypso@gmail.com>
Date: Wed Nov 27 20:41:53 2024 +0100
fix: bigger modals
commit f2057d390178a760b7a857f918a8bb4dee586194
Author: frdel <38891707+frdel@users.noreply.github.com>
Date: Wed Nov 27 17:30:19 2024 +0100
Squashed commit of the following:
commit e626817332661f48ec97da1d4ab42479ca40b50f
Author: Alessandro <155005371+3clyp50@users.noreply.github.com>
Date: Wed Nov 27 12:51:22 2024 +0100
refactor: modals css
Modals now get the base styles from modals.css, with any specifics in the individual files (settings.css, file_manager.css, etc.).
commit 306db0ca395a9f5691e558c7a18a02c9cecabaa3
Author: Alessandro <155005371+3clyp50@users.noreply.github.com>
Date: Wed Nov 27 03:17:20 2024 +0100
style: new action buttons + ghost buttons
Updated styles for but
- .vscode/settings.json +2 -0
- agent.py +119 -70
- initialize.py +40 -36
- models.py +36 -2
- python/extensions/message_loop_prompts/_50_recall_memories.py +3 -3
- python/extensions/message_loop_prompts/_51_recall_solutions.py +3 -3
- python/extensions/message_loop_prompts/_91_recall_wait.py +2 -2
- python/extensions/monologue_end/_50_memorize_fragments.py +5 -4
- python/extensions/monologue_end/_51_memorize_solutions.py +5 -4
- python/helpers/history.py +34 -40
- python/helpers/log.py +36 -9
- python/helpers/memory.py +33 -21
- python/helpers/persist_chat.py +1 -1
- python/helpers/rate_limiter.py +46 -58
- python/helpers/settings.py +168 -83
- python/tools/behaviour_adjustment.py +3 -3
- python/tools/response.py +0 -1
- run_ui.py +1 -1
- webui/css/settings.css +1 -0
- webui/index.html +10 -1
.vscode/settings.json
@@ -1,3 +1,5 @@
 {
     "python.analysis.typeCheckingMode": "standard",
+    "windsurfPyright.analysis.diagnosticMode": "workspace",
+    "windsurfPyright.analysis.typeCheckingMode": "standard",
 }
agent.py
@@ -2,8 +2,12 @@ import asyncio
 from collections import OrderedDict
 from dataclasses import dataclass, field
 import time, importlib, inspect, os, json
+import token
+from typing import Any, Awaitable, Optional, Dict, TypedDict
 import uuid
+import models
+
+from langchain_core.prompt_values import ChatPromptValue
 from python.helpers import extract_tools, rate_limiter, files, errors, history, tokens
 from python.helpers.print_style import PrintStyle
 from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
@@ -131,19 +135,25 @@ class AgentContext:
     agent.handle_critical_exception(e)


+@dataclass
+class ModelConfig:
+    provider: models.ModelProvider
+    name: str
+    ctx_length: int
+    limit_requests: int
+    limit_input: int
+    limit_output: int
+    kwargs: dict
+
+
 @dataclass
 class AgentConfig:
+    chat_model: ModelConfig
+    utility_model: ModelConfig
+    embeddings_model: ModelConfig
     prompts_subdir: str = ""
     memory_subdir: str = ""
     knowledge_subdirs: list[str] = field(default_factory=lambda: ["default", "custom"])
-    rate_limit_seconds: int = 60
-    rate_limit_requests: int = 15
-    rate_limit_input_tokens: int = 0
-    rate_limit_output_tokens: int = 0
-    response_timeout_seconds: int = 60
     code_exec_docker_enabled: bool = False
     code_exec_docker_name: str = "A0-dev"
     code_exec_docker_image: str = "frdel/agent-zero-run:development"
@@ -222,13 +232,6 @@ class Agent:
         self.history = history.History(self)
         self.last_user_message: history.Message | None = None
         self.intervention: UserMessage | None = None
-        self.rate_limiter = rate_limiter.RateLimiter(
-            self.context.log,
-            max_calls=self.config.rate_limit_requests,
-            max_input_tokens=self.config.rate_limit_input_tokens,
-            max_output_tokens=self.config.rate_limit_output_tokens,
-            window_seconds=self.config.rate_limit_seconds,
-        )
         self.data = {}  # free data object all the tools can use

     async def monologue(self):
@@ -245,20 +248,11 @@ class Agent:
         while True:

             self.context.streaming_agent = self  # mark self as current streamer
-            agent_response = ""
             self.loop_data.iteration += 1

             try:
                 # prepare LLM chain (model, system, history)
+                prompt = await self.prepare_prompt(loop_data=self.loop_data)
-                # rate limiter TODO - move to extension, make per-model
-                formatted_inputs = prompt.format()
-                self.set_data(self.DATA_NAME_CTX_WINDOW, formatted_inputs)
-                token_count = tokens.approximate_tokens(formatted_inputs)
-                self.rate_limiter.limit_call_and_input(token_count)

                 # output that the agent is starting
                 PrintStyle(
@@ -271,27 +265,18 @@ class Agent:
                     type="agent", heading=f"{self.agent_name}: Generating"
                 )

+                async def stream_callback(chunk: str, full: str):
+                    # output the agent response stream
+                    if chunk:
+                        printer.stream(chunk)
+                        self.log_from_stream(full, log)
+
+                # store as last context window content
+                self.set_data(Agent.DATA_NAME_CTX_WINDOW, prompt.format())
+
+                agent_response = await self.call_chat_model(
+                    prompt, callback=stream_callback
+                )
-                        content = chunk
-                    elif hasattr(chunk, "content"):
-                        content = str(chunk.content)
-                    else:
-                        content = str(chunk)
-
-                    printer.stream(content)
-                    # concatenate stream into the response
-                    agent_response += content
-                    self.log_from_stream(agent_response, log)

                 await self.handle_intervention(agent_response)
@@ -319,14 +304,14 @@ class Agent:
             # exceptions inside message loop:
             except InterventionException as e:
                 pass  # intervention message has been handled in handle_intervention(), proceed with conversation loop
+            except RepairableException as e:
+                # Forward repairable errors to the LLM, maybe it can fix them
                 error_message = errors.format_error(e)
                 await self.hist_add_warning(error_message)
                 PrintStyle(font_color="red", padding=True).print(error_message)
                 self.context.log.log(type="error", content=error_message)
+            except Exception as e:
+                # Other exception kill the loop
                 self.handle_critical_exception(e)

             finally:
@@ -345,7 +330,7 @@ class Agent:
         # call monologue_end extensions
         await self.call_extensions("monologue_end", loop_data=self.loop_data)  # type: ignore

+    async def prepare_prompt(self, loop_data: LoopData) -> ChatPromptTemplate:
         # set system prompt and message history
         loop_data.system = await self.get_system_prompt(self.loop_data)
         loop_data.history_output = self.history.output()
@@ -374,10 +359,7 @@ class Agent:
                 *history_langchain,
             ]
         )
-        # return callable chain
-        chain = prompt | self.config.chat_model
-        return chain, prompt
+        return prompt

     def handle_critical_exception(self, exception: Exception):
         if isinstance(exception, HandledException):
@@ -498,39 +480,106 @@ class Agent:
     ):  # TODO add param for message range, topic, history
         return self.history.output_text(human_label="user", ai_label="assistant")

+    async def call_utility_model(
+        self,
+        system: str,
+        message: str,
+        callback: Callable[[str], Awaitable[None]] | None = None,
+        background: bool = False,
     ):
         prompt = ChatPromptTemplate.from_messages(
+            [SystemMessage(content=system), HumanMessage(content=message)]
         )
-        chain = prompt | self.config.utility_model

         response = ""

+        # model class
+        model = models.get_model(
+            models.ModelType.CHAT,
+            self.config.utility_model.provider,
+            self.config.utility_model.name,
+            **self.config.utility_model.kwargs,
+        )
+
+        # rate limiter
+        limiter = await self.rate_limiter(
+            self.config.utility_model, prompt.format(), background
+        )
+
+        async for chunk in (prompt | model).astream({}):
             await self.handle_intervention()  # wait for intervention and handle it, if paused

+            content = models.parse_chunk(chunk)
+            limiter.add(output=tokens.approximate_tokens(content))
+            response += content

             if callback:
-                callback(content)
+                await callback(content)

+        return response
+
+    async def call_chat_model(
+        self,
+        prompt: ChatPromptTemplate,
+        callback: Callable[[str, str], Awaitable[None]] | None = None,
+    ):
+        response = ""
+
+        # model class
+        model = models.get_model(
+            models.ModelType.CHAT,
+            self.config.chat_model.provider,
+            self.config.chat_model.name,
+            **self.config.chat_model.kwargs,
+        )
+
+        # rate limiter
+        limiter = await self.rate_limiter(self.config.chat_model, prompt.format())
+
+        async for chunk in (prompt | model).astream({}):
+            await self.handle_intervention()  # wait for intervention and handle it, if paused
+
+            content = models.parse_chunk(chunk)
+            limiter.add(output=tokens.approximate_tokens(content))
             response += content

+            if callback:
+                await callback(content, response)

         return response

+    async def rate_limiter(
+        self, model_config: ModelConfig, input: str, background: bool = False
+    ):
+        # rate limiter log
+        wait_log = None
+
+        async def wait_callback(msg: str, key: str, total: int, limit: int):
+            nonlocal wait_log
+            if not wait_log:
+                wait_log = self.context.log.log(
+                    type="util",
+                    update_progress="none",
+                    heading=msg,
+                    model=f"{model_config.provider.value}\\{model_config.name}",
+                )
+            wait_log.update(heading=msg, key=key, value=total, limit=limit)
+            if not background:
+                self.context.log.set_progress(msg, -1)
+
+        # rate limiter
+        limiter = models.get_rate_limiter(
+            model_config.provider,
+            model_config.name,
+            model_config.limit_requests,
+            model_config.limit_input,
+            model_config.limit_output,
+        )
+        limiter.add(input=tokens.approximate_tokens(input))
+        limiter.add(requests=1)
+        await limiter.wait(callback=wait_callback)
+        return limiter
+
     async def handle_intervention(self, progress: str = ""):
         while self.context.paused:
             await asyncio.sleep(0.1)  # wait if paused
initialize.py
@@ -1,6 +1,6 @@
 import asyncio
 import models
-from agent import AgentConfig
+from agent import AgentConfig, ModelConfig
 from python.helpers import dotenv, files, rfc_exchange, runtime, settings, docker, log


@@ -8,36 +8,45 @@ def initialize():

     current_settings = settings.get_settings()

-    # utility model used for helper functions (cheaper, faster)
-    # utility_llm = chat_llm
-    utility_llm = settings.get_utility_model(
-        current_settings
-    )  # utility model from user settings
-
-    # embedding model used for memory
-    # embedding_llm = models.get_openai_embedding(model_name="text-embedding-3-small")
-    # embedding_llm = models.get_ollama_embedding(model_name="nomic-embed-text")
-    # embedding_llm = models.get_huggingface_embedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
-    # embedding_llm = models.get_lmstudio_embedding(model_name="nomic-ai/nomic-embed-text-v1.5-GGUF")
-    embedding_llm = settings.get_embedding_model(
-        current_settings
-    )  # embedding model from user settings
+    # chat model from user settings
+    chat_llm = ModelConfig(
+        provider=models.ModelProvider[current_settings["chat_model_provider"]],
+        name=current_settings["chat_model_name"],
+        ctx_length=current_settings["chat_model_ctx_length"],
+        limit_requests=current_settings["chat_model_rl_requests"],
+        limit_input=current_settings["chat_model_rl_input"],
+        limit_output=current_settings["chat_model_rl_output"],
+        kwargs={
+            "temperature": current_settings["chat_model_temperature"],
+            **current_settings["chat_model_kwargs"],
+        },
+    )
+
+    # utility model from user settings
+    utility_llm = ModelConfig(
+        provider=models.ModelProvider[current_settings["util_model_provider"]],
+        name=current_settings["util_model_name"],
+        ctx_length=current_settings["util_model_ctx_length"],
+        limit_requests=current_settings["util_model_rl_requests"],
+        limit_input=current_settings["util_model_rl_input"],
+        limit_output=current_settings["util_model_rl_output"],
+        kwargs={
+            "temperature": current_settings["util_model_temperature"],
+            **current_settings["util_model_kwargs"],
+        },
+    )
+    # embedding model from user settings
+    embedding_llm = ModelConfig(
+        provider=models.ModelProvider[current_settings["embed_model_provider"]],
+        name=current_settings["embed_model_name"],
+        ctx_length=0,
+        limit_requests=current_settings["embed_model_rl_requests"],
+        limit_input=0,
+        limit_output=0,
+        kwargs={
+            **current_settings["embed_model_kwargs"],
+        },
+    )
     # agent configuration
     config = AgentConfig(
         chat_model=chat_llm,
@@ -46,12 +55,7 @@ def initialize():
         prompts_subdir=current_settings["agent_prompts_subdir"],
         memory_subdir=current_settings["agent_memory_subdir"],
         knowledge_subdirs=["default", current_settings["agent_knowledge_subdir"]],
-        rate_limit_requests=30,
-        # rate_limit_input_tokens = 0,
-        # rate_limit_output_tokens = 0,
-        # response_timeout_seconds = 60,
-        code_exec_docker_enabled = False,
+        code_exec_docker_enabled=False,
         # code_exec_docker_name = "A0-dev",
         # code_exec_docker_image = "frdel/agent-zero-run:development",
         # code_exec_docker_ports = { "22/tcp": 55022, "80/tcp": 55080 }
models.py
@@ -1,5 +1,6 @@
 from enum import Enum
 import os
+from typing import Any
 from langchain_openai import (
     ChatOpenAI,
     OpenAI,
@@ -28,6 +29,7 @@ from langchain_mistralai import ChatMistralAI
 from pydantic.v1.types import SecretStr
 from python.helpers import dotenv, runtime
 from python.helpers.dotenv import load_dotenv
+from python.helpers.rate_limiter import RateLimiter

 # environment variables
 load_dotenv()
@@ -56,6 +58,9 @@ class ModelProvider(Enum):
     OTHER = "Other"


+rate_limiters: dict[str, RateLimiter] = {}
+
+
 # Utility function to get API keys from environment variables
 def get_api_key(service):
     return (
@@ -71,11 +76,36 @@ def get_model(type: ModelType, provider: ModelProvider, name: str, **kwargs):
     return model


+def get_rate_limiter(
+    provider: ModelProvider, name: str, requests: int, input: int, output: int
+) -> RateLimiter:
+    # get or create
+    key = f"{provider.name}\\{name}"
+    rate_limiters[key] = limiter = rate_limiters.get(key, RateLimiter(seconds=60))
+    # always update
+    limiter.limits["requests"] = requests or 0
+    limiter.limits["input"] = input or 0
+    limiter.limits["output"] = output or 0
+    return limiter
+
+
+def parse_chunk(chunk: Any):
+    if isinstance(chunk, str):
+        content = chunk
+    elif hasattr(chunk, "content"):
+        content = str(chunk.content)
+    else:
+        content = str(chunk)
+    return content


 # Ollama models
 def get_ollama_base_url():
+    return (
+        dotenv.get_dotenv_value("OLLAMA_BASE_URL")
+        or f"http://{runtime.get_local_url()}:11434"
+    )
+

 def get_ollama_chat(
     model_name: str,
@@ -138,7 +168,11 @@ def get_huggingface_embedding(model_name: str, **kwargs):

 # LM Studio and other OpenAI compatible interfaces
 def get_lmstudio_base_url():
+    return (
+        dotenv.get_dotenv_value("LM_STUDIO_BASE_URL")
+        or f"http://{runtime.get_local_url()}:1234/v1"
+    )
+

 def get_lmstudio_chat(
     model_name: str,
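For orientation, a minimal sketch of how these new helpers are consumed; it mirrors the call sites in agent.py above, but the limit values and the `prompt`/`model` objects are placeholders, and it assumes `wait()` may be called without a callback:

```python
import models
from python.helpers import tokens

async def rate_limited_call(prompt, model, provider, name: str):
    # one shared limiter per provider\model pair; limits are refreshed on every call
    limiter = models.get_rate_limiter(provider, name, requests=15, input=100_000, output=0)
    limiter.add(input=tokens.approximate_tokens(prompt.format()))
    limiter.add(requests=1)
    await limiter.wait()  # assumption: the callback argument is optional

    response = ""
    async for chunk in (prompt | model).astream({}):
        content = models.parse_chunk(chunk)  # normalizes str / message chunks to plain text
        limiter.add(output=tokens.approximate_tokens(content))
        response += content
    return response
```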
python/extensions/message_loop_prompts/_50_recall_memories.py
@@ -53,13 +53,13 @@ class RecallMemories(Extension):
         )

         # log query streamed by LLM
-        def log_callback(content):
+        async def log_callback(content):
             log_item.stream(query=content)

         # call util llm to summarize conversation
+        query = await self.agent.call_utility_model(
             system=system,
+            message=loop_data.user_message.output_text() if loop_data.user_message else "",
             callback=log_callback,
         )
python/extensions/message_loop_prompts/_51_recall_solutions.py
@@ -53,12 +53,12 @@ class RecallSolutions(Extension):
         )

         # log query streamed by LLM
-        def log_callback(content):
+        async def log_callback(content):
             log_item.stream(query=content)

         # call util llm to summarize conversation
-            system=system,
+        query = await self.agent.call_utility_model(
+            system=system, message=loop_data.user_message.output_text() if loop_data.user_message else "", callback=log_callback
         )

         # get solutions database
python/extensions/message_loop_prompts/_91_recall_wait.py
@@ -9,11 +9,11 @@ class RecallWait(Extension):

         task = self.agent.get_data(DATA_NAME_TASK_MEMORIES)
         if task and not task.done():
-            self.agent.context.log.set_progress("Recalling memories...")
+            # self.agent.context.log.set_progress("Recalling memories...")
             await task

         task = self.agent.get_data(DATA_NAME_TASK_SOLUTIONS)
         if task and not task.done():
-            self.agent.context.log.set_progress("Recalling solutions...")
+            # self.agent.context.log.set_progress("Recalling solutions...")
             await task
python/extensions/monologue_end/_50_memorize_fragments.py
@@ -35,14 +35,15 @@ class MemorizeMemories(Extension):
         msgs_text = self.agent.concat_messages(self.agent.history)

         # log query streamed by LLM
-        def log_callback(content):
+        async def log_callback(content):
             log_item.stream(content=content)

         # call util llm to find info in history
+        memories_json = await self.agent.call_utility_model(
             system=system,
+            message=msgs_text,
             callback=log_callback,
+            background=True,
         )

         memories = DirtyJson.parse_string(memories_json)
@@ -76,7 +77,7 @@ class MemorizeMemories(Extension):
         log_item.update(replaced=rem_txt)

         # insert new solution
-        db.insert_text(text=txt, metadata={"area": Memory.Area.FRAGMENTS.value})
+        await db.insert_text(text=txt, metadata={"area": Memory.Area.FRAGMENTS.value})

         log_item.update(
             result=f"{len(memories)} entries memorized.",
python/extensions/monologue_end/_51_memorize_solutions.py
@@ -33,14 +33,15 @@ class MemorizeSolutions(Extension):
         msgs_text = self.agent.concat_messages(self.agent.history)

         # log query streamed by LLM
-        def log_callback(content):
+        async def log_callback(content):
             log_item.stream(content=content)

         # call util llm to find solutions in history
+        solutions_json = await self.agent.call_utility_model(
             system=system,
+            message=msgs_text,
             callback=log_callback,
+            background=True,
         )

         solutions = DirtyJson.parse_string(solutions_json)
@@ -75,7 +76,7 @@ class MemorizeSolutions(Extension):
         log_item.update(replaced=rem_txt)

         # insert new solution
-        db.insert_text(text=txt, metadata={"area": Memory.Area.SOLUTIONS.value})
+        await db.insert_text(text=txt, metadata={"area": Memory.Area.SOLUTIONS.value})

         solutions_txt = solutions_txt.strip()
         log_item.update(solutions=solutions_txt)
python/helpers/history.py
@@ -130,12 +130,12 @@ class Topic(Record):
             * LARGE_MESSAGE_TO_TOPIC_RATIO
         )
         large_msgs = []
-        for m in self.messages:
+        for m in (m for m in self.messages if not m.summary):
             out = m.output()
             text = output_text(out)
             tok = tokens.approximate_tokens(text)
             leng = len(text)
+            if tok > msg_max_size:
                 large_msgs.append((m, tok, leng, out))
         large_msgs.sort(key=lambda x: x[1], reverse=True)
         for msg, tok, leng, out in large_msgs:
@@ -173,12 +173,11 @@ class Topic(Record):

     async def summarize_messages(self, messages: list[Message]):
         msg_txt = [m.output_text() for m in messages]
+        summary = await self.history.agent.call_utility_model(
             system=self.history.agent.read_prompt("fw.topic_summary.sys.md"),
             message=self.history.agent.read_prompt(
                 "fw.topic_summary.msg.md", content=msg_txt
             ),
-            model=settings.get_utility_model(),
         )
         return summary

@@ -218,12 +217,11 @@ class Bulk(Record):
         return False

     async def summarize(self):
+        self.summary = await self.history.agent.call_utility_model(
             system=self.history.agent.read_prompt("fw.topic_summary.sys.md"),
             message=self.history.agent.read_prompt(
                 "fw.topic_summary.msg.md", content=self.output_text()
             ),
-            model=settings.get_utility_model(),
         )
         return self.summary

@@ -309,42 +307,38 @@ class History(Record):
         return json.dumps(data)

     async def compress(self):
-        curr, hist, bulk = (
-            self.get_current_topic_tokens(),
-            self.get_topics_tokens(),
-            self.get_bulks_tokens(),
-        )
-        total = get_ctx_size_for_history()
         compressed = False
-        # try the whole function again to see if there is still a need for compression
-        if compressed:
-            await self.compress()
-
-        return compressed
+        while True:
+            curr, hist, bulk = (
+                self.get_current_topic_tokens(),
+                self.get_topics_tokens(),
+                self.get_bulks_tokens(),
+            )
+            total = get_ctx_size_for_history()
+            ratios = [
+                (curr, CURRENT_TOPIC_RATIO, "current_topic"),
+                (hist, HISTORY_TOPIC_RATIO, "history_topic"),
+                (bulk, HISTORY_BULK_RATIO, "history_bulk"),
+            ]
+            ratios = sorted(ratios, key=lambda x: (x[0] / total) / x[1], reverse=True)
+            compressed_part = False
+            for ratio in ratios:
+                if ratio[0] > ratio[1] * total:
+                    over_part = ratio[2]
+                    if over_part == "current_topic":
+                        compressed_part = await self.current.compress()
+                    elif over_part == "history_topic":
+                        compressed_part = await self.compress_topics()
+                    else:
+                        compressed_part = await self.compress_bulks()
+                    if compressed_part:
+                        break
+
+            if compressed_part:
+                compressed = True
+                continue
             else:
+                return compressed

     async def compress_topics(self) -> bool:
         # summarize topics one by one
python/helpers/log.py
@@ -19,6 +19,9 @@ Type = Literal[
     "warning",
 ]

+ProgressUpdate = Literal["persistent", "temporary", "none"]
+
+
 @dataclass
 class LogItem:
     log: "Log"
@@ -27,6 +30,7 @@ class LogItem:
     heading: str
     content: str
     temp: bool
+    update_progress: Optional[ProgressUpdate] = "persistent"
     kvps: Optional[OrderedDict] = None  # Use OrderedDict for kvps
     id: Optional[str] = None  # Add id field
     guid: str = ""
@@ -41,20 +45,27 @@ class LogItem:
         content: str | None = None,
         kvps: dict | None = None,
         temp: bool | None = None,
+        update_progress: ProgressUpdate | None = None,
         **kwargs,
     ):
         if self.guid == self.log.guid:
+            self.log._update_item(
                 self.no,
                 type=type,
                 heading=heading,
                 content=content,
                 kvps=kvps,
                 temp=temp,
+                update_progress=update_progress,
                 **kwargs,
             )

+    def stream(
+        self,
+        heading: str | None = None,
+        content: str | None = None,
+        **kwargs,
+    ):
         if heading is not None:
             self.update(heading=self.heading + heading)
         if content is not None:
@@ -75,6 +86,7 @@ class LogItem:
             "kvps": self.kvps,
         }

+
 class Log:

     def __init__(self):
@@ -90,7 +102,9 @@ class Log:
         content: str | None = None,
         kvps: dict | None = None,
         temp: bool | None = None,
+        update_progress: ProgressUpdate | None = None,
         id: Optional[str] = None,  # Add id parameter
+        **kwargs,
     ) -> LogItem:
         # Use OrderedDict if kvps is provided
         if kvps is not None:
@@ -101,17 +115,19 @@ class Log:
             type=type,
             heading=heading or "",
             content=content or "",
-            kvps=kvps,
+            kvps=OrderedDict({**(kvps or {}), **(kwargs or {})}),
+            update_progress=(
+                update_progress if update_progress is not None else "persistent"
+            ),
+            temp=temp if temp is not None else False,
             id=id,  # Pass id to LogItem
         )
         self.logs.append(item)
         self.updates += [item.no]
-        self.set_progress(heading, item.no)
+        self._update_progress_from_item(item)
         return item

+    def _update_item(
         self,
         no: int,
         type: str | None = None,
@@ -119,15 +135,16 @@ class Log:
         content: str | None = None,
         kvps: dict | None = None,
         temp: bool | None = None,
+        update_progress: ProgressUpdate | None = None,
         **kwargs,
     ):
         item = self.logs[no]
         if type is not None:
             item.type = type
+        if update_progress is not None:
+            item.update_progress = update_progress
         if heading is not None:
             item.heading = heading
-            if no >= self.progress_no:
-                self.set_progress(heading, no)
         if content is not None:
             item.content = content
         if kvps is not None:
@@ -143,6 +160,7 @@ class Log:
             item.kvps[k] = v

         self.updates += [item.no]
+        self._update_progress_from_item(item)

     def set_progress(self, progress: str, no: int = 0, active: bool = True):
         self.progress = progress
@@ -174,3 +192,12 @@ class Log:
         self.updates = []
         self.logs = []
         self.set_initial_progress()
+
+    def _update_progress_from_item(self, item: LogItem):
+        if item.heading and item.update_progress != "none":
+            if item.no >= self.progress_no:
+                self.set_progress(
+                    item.heading,
+                    (item.no if item.update_progress == "persistent" else -1),
+                )
python/helpers/memory.py
@@ -10,6 +10,8 @@ from langchain_community.docstore.in_memory import InMemoryDocstore
 from langchain_community.vectorstores.utils import (
     DistanceStrategy,
 )
+from langchain_core.embeddings import Embeddings
+
 import os, json

 import numpy as np
@@ -22,6 +24,7 @@ from python.helpers import knowledge_import
 from python.helpers.log import Log, LogItem
 from enum import Enum
 from agent import Agent
+import models


 class MyFaiss(FAISS):
@@ -54,7 +57,12 @@ class Memory:
             )
             db = Memory.initialize(
                 log_item,
+                models.get_model(
+                    models.ModelType.EMBEDDING,
+                    agent.config.embeddings_model.provider,
+                    agent.config.embeddings_model.name,
+                    **agent.config.embeddings_model.kwargs,
+                ),
                 memory_subdir,
                 False,
             )
@@ -82,7 +90,7 @@ class Memory:
     @staticmethod
     def initialize(
         log_item: LogItem | None,
-        embeddings_model,
+        embeddings_model: Embeddings,
         memory_subdir: str,
         in_memory=False,
     ) -> MyFaiss:
@@ -187,7 +195,7 @@ class Memory:
                     index[file]["ids"]
                 )  # remove original version
             if index[file]["state"] == "changed":
-                index[file]["ids"] = self.insert_documents(
+                index[file]["ids"] = await self.insert_documents(
                     index[file]["documents"]
                 )  # insert new version

@@ -234,6 +242,11 @@ class Memory:
         self, query: str, limit: int, threshold: float, filter: str = ""
     ):
         comparator = Memory._get_comparator(filter) if filter else None
+
+        #rate limiter
+        await self.agent.rate_limiter(
+            model_config=self.agent.config.embeddings_model, input=query)
+
         return await self.db.asearch(
             query,
             search_type="similarity_score_threshold",
@@ -287,30 +300,28 @@ class Memory:
         self._save_db()  # persist
         return rem_docs

-    def insert_text(self, text, metadata: dict = {}):
-        self.db.add_documents(
-            documents=[
-                Document(
-                    text,
-                    metadata={"id": id, "timestamp": self.get_timestamp(), **metadata},
-                )
-            ],
-            ids=[id],
-        )
-        self._save_db()  # persist
-        return id
-
-    def insert_documents(self, docs: list[Document]):
+    async def insert_text(self, text, metadata: dict = {}):
+        doc = Document(text, metadata=metadata)
+        ids = await self.insert_documents([doc])
+        return ids[0]

+    async def insert_documents(self, docs: list[Document]):
         ids = [str(uuid.uuid4()) for _ in range(len(docs))]
         timestamp = self.get_timestamp()
+
         if ids:
             for doc, id in zip(docs, ids):
                 doc.metadata["id"] = id  # add ids to documents metadata
                 doc.metadata["timestamp"] = timestamp  # add timestamp
+                if not doc.metadata.get("area", ""):
+                    doc.metadata["area"] = Memory.Area.MAIN.value
+
+            #rate limiter
+            docs_txt = "".join(self.format_docs_plain(docs))
+            await self.agent.rate_limiter(
+                model_config=self.agent.config.embeddings_model, input=docs_txt)
+
             self.db.add_documents(documents=docs, ids=ids)
             self._save_db()  # persist
             return ids
@@ -365,8 +376,9 @@ class Memory:
     def get_memory_subdir_abs(agent: Agent) -> str:
         return files.get_abs_path("memory", agent.config.memory_subdir or "default")

+
     def get_custom_knowledge_subdir_abs(agent: Agent) -> str:
         for dir in agent.config.knowledge_subdirs:
             if dir != "default":
                 return files.get_abs_path("knowledge", dir)
         raise Exception("No custom knowledge subdir set")
@@ -174,7 +174,7 @@ def _deserialize_log(data: dict[str, Any]) -> "Log":
|
|
| 174 |
log.logs.append(
|
| 175 |
LogItem(
|
| 176 |
log=log, # restore the log reference
|
| 177 |
-
no=item_data["no"],
|
| 178 |
type=item_data["type"],
|
| 179 |
heading=item_data.get("heading", ""),
|
| 180 |
content=item_data.get("content", ""),
|
|
|
|
| 174 |
log.logs.append(
|
| 175 |
LogItem(
|
| 176 |
log=log, # restore the log reference
|
| 177 |
+
no=i, #item_data["no"],
|
| 178 |
type=item_data["type"],
|
| 179 |
heading=item_data.get("heading", ""),
|
| 180 |
content=item_data.get("content", ""),
|
|
@@ -1,68 +1,56 @@
|
|
|
|
|
| 1 |
import time
|
| 2 |
-
from
|
| 3 |
-
from dataclasses import dataclass
|
| 4 |
-
from typing import List, Tuple
|
| 5 |
-
from .print_style import PrintStyle
|
| 6 |
-
from .log import Log
|
| 7 |
|
| 8 |
-
@dataclass
|
| 9 |
-
class CallRecord:
|
| 10 |
-
timestamp: float
|
| 11 |
-
input_tokens: int
|
| 12 |
-
output_tokens: int = 0 # Default to 0, will be set separately
|
| 13 |
|
| 14 |
class RateLimiter:
|
| 15 |
-
def __init__(self,
|
| 16 |
-
self.
|
| 17 |
-
self.
|
| 18 |
-
self.
|
| 19 |
-
self.
|
| 20 |
-
self.window_seconds = window_seconds
|
| 21 |
-
self.call_records: deque = deque()
|
| 22 |
|
| 23 |
-
def
|
| 24 |
-
|
| 25 |
-
|
|
|
|
|
|
|
|
|
|
| 26 |
|
| 27 |
-
def
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
|
|
|
| 32 |
|
| 33 |
-
def
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 34 |
while True:
|
| 35 |
-
self.
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
if self.max_input_tokens > 0 and input_tokens + new_input_tokens > self.max_input_tokens:
|
| 42 |
-
wait_reasons.append("max input tokens")
|
| 43 |
-
if self.max_output_tokens > 0 and output_tokens >= self.max_output_tokens:
|
| 44 |
-
wait_reasons.append("max output tokens")
|
| 45 |
-
|
| 46 |
-
if not wait_reasons:
|
| 47 |
-
break
|
| 48 |
-
|
| 49 |
-
oldest_record = self.call_records[0]
|
| 50 |
-
wait_time = oldest_record.timestamp + self.window_seconds - current_time
|
| 51 |
-
if wait_time > 0:
|
| 52 |
-
PrintStyle(font_color="yellow", padding=True).print(f"Rate limit exceeded. Waiting for {wait_time:.2f} seconds due to: {', '.join(wait_reasons)}")
|
| 53 |
-
self.logger.log("rate_limit","Rate limit exceeded",f"Rate limit exceeded. Waiting for {wait_time:.2f} seconds due to: {', '.join(wait_reasons)}")
|
| 54 |
-
# TODO rate limit log type
|
| 55 |
-
time.sleep(wait_time)
|
| 56 |
-
current_time = time.time()
|
| 57 |
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
|
| 61 |
-
|
| 62 |
-
|
| 63 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 64 |
|
| 65 |
-
|
| 66 |
-
if self.call_records:
|
| 67 |
-
self.call_records[-1].output_tokens += output_token_count
|
| 68 |
-
return self
|
|
|
|
| 1 |
+
import asyncio
|
| 2 |
import time
|
| 3 |
+
from typing import Callable, Awaitable
|
|
|
|
|
|
|
|
|
|
|
|
|
| 4 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 5 |
|
| 6 |
class RateLimiter:
|
| 7 |
+
def __init__(self, seconds: int = 60, **limits: int):
|
| 8 |
+
self.timeframe = seconds
|
| 9 |
+
self.limits = {key: value if isinstance(value, (int, float)) else 0 for key, value in (limits or {}).items()}
|
| 10 |
+
self.values = {key: [] for key in self.limits.keys()}
|
| 11 |
+
self._lock = asyncio.Lock()
|
|
|
|
|
|
|
| 12 |
|
| 13 |
+
def add(self, **kwargs: int):
|
| 14 |
+
now = time.time()
|
| 15 |
+
for key, value in kwargs.items():
|
| 16 |
+
if not key in self.values:
|
| 17 |
+
self.values[key] = []
|
| 18 |
+
self.values[key].append((now, value))
|
| 19 |
|
| 20 |
+
async def cleanup(self):
|
| 21 |
+
async with self._lock:
|
| 22 |
+
now = time.time()
|
| 23 |
+
cutoff = now - self.timeframe
|
| 24 |
+
for key in self.values:
|
| 25 |
+
self.values[key] = [(t, v) for t, v in self.values[key] if t > cutoff]
|
| 26 |
|
| 27 |
+
async def get_total(self, key: str) -> int:
|
| 28 |
+
async with self._lock:
|
| 29 |
+
if not key in self.values:
|
| 30 |
+
return 0
|
| 31 |
+
return sum(value for _, value in self.values[key])
|
| 32 |
+
|
| 33 |
+
async def wait(
|
| 34 |
+
self,
|
| 35 |
+
callback: Callable[[str, str, int, int], Awaitable[None]] | None = None,
|
| 36 |
+
):
|
| 37 |
while True:
|
| 38 |
+
await self.cleanup()
|
| 39 |
+
should_wait = False
|
| 40 |
+
|
| 41 |
+
for key, limit in self.limits.items():
|
| 42 |
+
if limit <= 0: # Skip if no limit set
|
| 43 |
+
continue
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 44 |
|
| 45 |
+
total = await self.get_total(key)
|
| 46 |
+
if total > limit:
|
| 47 |
+
if callback:
|
| 48 |
+
msg = f"Rate limit exceeded for {key} ({total}/{limit}), waiting..."
|
| 49 |
+
await callback(msg, key, total, limit)
|
| 50 |
+
should_wait = True
|
| 51 |
+
break
|
| 52 |
+
|
| 53 |
+
if not should_wait:
|
| 54 |
+
break
|
| 55 |
|
| 56 |
+
await asyncio.sleep(1)
|
|
|
|
|
|
|
|
|
|
@@ -8,10 +8,6 @@ from typing import Any, Literal, TypedDict
|
|
| 8 |
import models
|
| 9 |
from python.helpers import runtime, whisper, defer
|
| 10 |
from . import files, dotenv
|
| 11 |
-
from models import get_model, ModelProvider, ModelType
|
| 12 |
-
from langchain_core.language_models.chat_models import BaseChatModel
|
| 13 |
-
from langchain_core.embeddings import Embeddings
|
| 14 |
-
|
| 15 |
|
| 16 |
class Settings(TypedDict):
|
| 17 |
chat_model_provider: str
|
|
@@ -20,15 +16,26 @@ class Settings(TypedDict):
|
|
| 20 |
chat_model_kwargs: dict[str, str]
|
| 21 |
chat_model_ctx_length: int
|
| 22 |
chat_model_ctx_history: float
|
|
|
|
|
|
|
|
|
|
| 23 |
|
| 24 |
util_model_provider: str
|
| 25 |
util_model_name: str
|
| 26 |
util_model_temperature: float
|
| 27 |
util_model_kwargs: dict[str, str]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 28 |
|
|
|
|
| 29 |
embed_model_provider: str
|
| 30 |
embed_model_name: str
|
| 31 |
embed_model_kwargs: dict[str, str]
|
|
|
|
|
|
|
| 32 |
|
| 33 |
agent_prompts_subdir: str
|
| 34 |
agent_memory_subdir: str
|
|
@@ -66,7 +73,7 @@ class SettingsField(TypedDict, total=False):
|
|
| 66 |
id: str
|
| 67 |
title: str
|
| 68 |
description: str
|
| 69 |
-
type: Literal["
|
| 70 |
value: Any
|
| 71 |
min: float
|
| 72 |
max: float
|
|
@@ -91,6 +98,8 @@ _settings: Settings | None = None
|
|
| 91 |
|
| 92 |
|
| 93 |
def convert_out(settings: Settings) -> SettingsOutput:
|
|
|
|
|
|
|
| 94 |
|
| 95 |
# main model section
|
| 96 |
chat_model_fields: list[SettingsField] = []
|
|
@@ -109,7 +118,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 109 |
"id": "chat_model_name",
|
| 110 |
"title": "Chat model name",
|
| 111 |
"description": "Exact name of model from selected provider",
|
| 112 |
-
"type": "
|
| 113 |
"value": settings["chat_model_name"],
|
| 114 |
}
|
| 115 |
)
|
|
@@ -132,7 +141,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 132 |
"id": "chat_model_ctx_length",
|
| 133 |
"title": "Chat model context length",
|
| 134 |
"description": "Maximum number of tokens in the context window for LLM. System prompt, chat history, RAG and response all count towards this limit.",
|
| 135 |
-
"type": "
|
| 136 |
"value": settings["chat_model_ctx_length"],
|
| 137 |
}
|
| 138 |
)
|
|
@@ -150,6 +159,36 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 150 |
}
|
| 151 |
)
|
| 152 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 153 |
chat_model_fields.append(
|
| 154 |
{
|
| 155 |
"id": "chat_model_kwargs",
|
|
@@ -183,7 +222,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 183 |
"id": "util_model_name",
|
| 184 |
"title": "Utility model name",
|
| 185 |
"description": "Exact name of model from selected provider",
|
| 186 |
-
"type": "
|
| 187 |
"value": settings["util_model_name"],
|
| 188 |
}
|
| 189 |
)
|
|
@@ -200,6 +239,58 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 200 |
"value": settings["util_model_temperature"],
|
| 201 |
}
|
| 202 |
)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 203 |
|
| 204 |
util_model_fields.append(
|
| 205 |
{
|
|
@@ -234,46 +325,28 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 234 |
"id": "embed_model_name",
|
| 235 |
"title": "Embedding model name",
|
| 236 |
"description": "Exact name of model from selected provider",
|
| 237 |
-
"type": "
|
| 238 |
"value": settings["embed_model_name"],
|
| 239 |
}
|
| 240 |
)
|
| 241 |
-
|
| 242 |
embed_model_fields.append(
|
| 243 |
{
|
| 244 |
-
"id": "
|
| 245 |
-
"title": "
|
| 246 |
-
"description": "
|
| 247 |
-
"type": "
|
| 248 |
-
"value":
|
| 249 |
}
|
| 250 |
)
|
| 251 |
|
| 252 |
-
embed_model_section: SettingsSection = {
|
| 253 |
-
"title": "Embedding Model",
|
| 254 |
-
"description": "Settings for the embedding model used by Agent Zero.",
|
| 255 |
-
"fields": embed_model_fields,
|
| 256 |
-
}
|
| 257 |
-
|
| 258 |
-
# embedding model section
|
| 259 |
-
embed_model_fields: list[SettingsField] = []
|
| 260 |
-
embed_model_fields.append(
|
| 261 |
-
{
|
| 262 |
-
"id": "embed_model_provider",
|
| 263 |
-
"title": "Embedding model provider",
|
| 264 |
-
"description": "Select provider for embedding model used by the framework",
|
| 265 |
-
"type": "select",
|
| 266 |
-
"value": settings["embed_model_provider"],
|
| 267 |
-
"options": [{"value": p.name, "label": p.value} for p in ModelProvider],
|
| 268 |
-
}
|
| 269 |
-
)
|
| 270 |
embed_model_fields.append(
|
| 271 |
{
|
| 272 |
-
"id": "
|
| 273 |
-
"title": "
|
| 274 |
-
"description": "
|
| 275 |
-
"type": "
|
| 276 |
-
"value": settings["
|
| 277 |
}
|
| 278 |
)
|
| 279 |
|
|
@@ -301,7 +374,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 301 |
"id": "auth_login",
|
| 302 |
"title": "UI Login",
|
| 303 |
"description": "Set user name for web UI",
|
| 304 |
-
"type": "
|
| 305 |
"value": dotenv.get_dotenv_value(dotenv.KEY_AUTH_LOGIN) or "",
|
| 306 |
}
|
| 307 |
)
|
|
@@ -423,7 +496,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 423 |
# "id": "rfc_auto_docker",
|
| 424 |
# "title": "RFC Auto Docker Management",
|
| 425 |
# "description": "Automatically create dockerized instance of A0 for RFCs using this instance's code base and, settings and .env.",
|
| 426 |
-
# "type": "
|
| 427 |
# "value": settings["rfc_auto_docker"],
|
| 428 |
# }
|
| 429 |
# )
|
|
@@ -433,7 +506,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 433 |
"id": "rfc_url",
|
| 434 |
"title": "RFC Destination URL",
|
| 435 |
"description": "URL of dockerized A0 instance for remote function calls. Do not specify port here.",
|
| 436 |
-
"type": "
|
| 437 |
"value": settings["rfc_url"],
|
| 438 |
}
|
| 439 |
)
|
|
@@ -458,7 +531,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 458 |
"id": "rfc_port_http",
|
| 459 |
"title": "RFC HTTP port",
|
| 460 |
"description": "HTTP port for dockerized instance of A0.",
|
| 461 |
-
"type": "
|
| 462 |
"value": settings["rfc_port_http"],
|
| 463 |
}
|
| 464 |
)
|
|
@@ -468,7 +541,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 468 |
"id": "rfc_port_ssh",
|
| 469 |
"title": "RFC SSH port",
|
| 470 |
"description": "SSH port for dockerized instance of A0.",
|
| 471 |
-
"type": "
|
| 472 |
"value": settings["rfc_port_ssh"],
|
| 473 |
}
|
| 474 |
)
|
|
@@ -505,7 +578,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 505 |
"id": "stt_language",
|
| 506 |
"title": "Language Code",
|
| 507 |
"description": "Language code (e.g. en, fr, it)",
|
| 508 |
-
"type": "
|
| 509 |
"value": settings["stt_language"],
|
| 510 |
}
|
| 511 |
)
|
|
@@ -528,7 +601,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 528 |
"id": "stt_silence_duration",
|
| 529 |
"title": "Silence duration (ms)",
|
| 530 |
"description": "Duration of silence before the server considers speaking to have ended.",
|
| 531 |
-
"type": "
|
| 532 |
"value": settings["stt_silence_duration"],
|
| 533 |
}
|
| 534 |
)
|
|
@@ -538,7 +611,7 @@ def convert_out(settings: Settings) -> SettingsOutput:
|
|
| 538 |
"id": "stt_waiting_timeout",
|
| 539 |
"title": "Waiting timeout (ms)",
|
| 540 |
"description": "Duration before the server closes the microphone.",
|
| 541 |
-
"type": "
|
| 542 |
"value": settings["stt_waiting_timeout"],
|
| 543 |
}
|
| 544 |
)
|
|
@@ -617,43 +690,43 @@ def normalize_settings(settings: Settings) -> Settings:
|
|
| 617 |
try:
|
| 618 |
copy[key] = type(value)(copy[key]) # type: ignore
|
| 619 |
except (ValueError, TypeError):
|
| 620 |
-
|
| 621 |
return copy
|
| 622 |
|
| 623 |
|
| 624 |
-
def get_chat_model(settings: Settings | None = None) -> BaseChatModel:
|
| 625 |
-
|
| 626 |
-
|
| 627 |
-
|
| 628 |
-
|
| 629 |
-
|
| 630 |
-
|
| 631 |
-
|
| 632 |
-
|
| 633 |
-
|
| 634 |
-
|
| 635 |
-
|
| 636 |
-
def get_utility_model(settings: Settings | None = None) -> BaseChatModel:
|
| 637 |
-
|
| 638 |
-
|
| 639 |
-
|
| 640 |
-
|
| 641 |
-
|
| 642 |
-
|
| 643 |
-
|
| 644 |
-
|
| 645 |
-
|
| 646 |
-
|
| 647 |
-
|
| 648 |
-
def get_embedding_model(settings: Settings | None = None) -> Embeddings:
|
| 649 |
-
|
| 650 |
-
|
| 651 |
-
|
| 652 |
-
|
| 653 |
-
|
| 654 |
-
|
| 655 |
-
|
| 656 |
-
|
| 657 |
|
| 658 |
|
| 659 |
def _read_settings_file() -> Settings | None:
|
|
@@ -697,20 +770,32 @@ def _write_sensitive_settings(settings: Settings):
|
|
| 697 |
|
| 698 |
|
| 699 |
def get_default_settings() -> Settings:
|
|
|
|
|
|
|
| 700 |
return Settings(
|
| 701 |
chat_model_provider=ModelProvider.OPENAI.name,
|
| 702 |
chat_model_name="gpt-4o-mini",
|
| 703 |
-
chat_model_temperature=0,
|
| 704 |
chat_model_kwargs={},
|
| 705 |
-
chat_model_ctx_length=
|
| 706 |
chat_model_ctx_history=0.7,
|
|
|
|
|
|
|
|
|
|
| 707 |
util_model_provider=ModelProvider.OPENAI.name,
|
| 708 |
util_model_name="gpt-4o-mini",
|
| 709 |
-
util_model_temperature=0,
|
|
|
|
|
|
|
| 710 |
util_model_kwargs={},
|
|
|
|
|
|
|
|
|
|
| 711 |
embed_model_provider=ModelProvider.OPENAI.name,
|
| 712 |
embed_model_name="text-embedding-3-small",
|
| 713 |
embed_model_kwargs={},
|
|
|
|
|
|
|
| 714 |
api_keys={},
|
| 715 |
auth_login="",
|
| 716 |
auth_password="",
|
|
|
|
| 8 |
import models
|
| 9 |
from python.helpers import runtime, whisper, defer
|
| 10 |
from . import files, dotenv
|
|
|
|
|
|
|
|
|
|
|
|
|
| 11 |
|
| 12 |
class Settings(TypedDict):
|
| 13 |
chat_model_provider: str
|
|
|
|
| 16 |
chat_model_kwargs: dict[str, str]
|
| 17 |
chat_model_ctx_length: int
|
| 18 |
chat_model_ctx_history: float
|
| 19 |
+
chat_model_rl_requests: int
|
| 20 |
+
chat_model_rl_input: int
|
| 21 |
+
chat_model_rl_output: int
|
| 22 |
|
| 23 |
util_model_provider: str
|
| 24 |
util_model_name: str
|
| 25 |
util_model_temperature: float
|
| 26 |
util_model_kwargs: dict[str, str]
|
| 27 |
+
util_model_ctx_length: int
|
| 28 |
+
util_model_ctx_input: float
|
| 29 |
+
util_model_rl_requests: int
|
| 30 |
+
util_model_rl_input: int
|
| 31 |
+
util_model_rl_output: int
|
| 32 |
|
| 33 |
+
|
| 34 |
embed_model_provider: str
|
| 35 |
embed_model_name: str
|
| 36 |
embed_model_kwargs: dict[str, str]
|
| 37 |
+
embed_model_rl_requests: int
|
| 38 |
+
embed_model_rl_input: int
|
| 39 |
|
| 40 |
agent_prompts_subdir: str
|
| 41 |
agent_memory_subdir: str
|
|
|
|
| 73 |
id: str
|
| 74 |
title: str
|
| 75 |
description: str
|
| 76 |
+
type: Literal["text", "number", "select", "range", "textarea", "password"]
|
| 77 |
value: Any
|
| 78 |
min: float
|
| 79 |
max: float
|
|
|
|
| 98 |
|
| 99 |
|
| 100 |
def convert_out(settings: Settings) -> SettingsOutput:
|
| 101 |
+
from models import ModelProvider
|
| 102 |
+
|
| 103 |
|
| 104 |
# main model section
|
| 105 |
chat_model_fields: list[SettingsField] = []
|
|
|
|
| 118 |
"id": "chat_model_name",
|
| 119 |
"title": "Chat model name",
|
| 120 |
"description": "Exact name of model from selected provider",
|
| 121 |
+
"type": "text",
|
| 122 |
"value": settings["chat_model_name"],
|
| 123 |
}
|
| 124 |
)
|
|
|
|
| 141 |
"id": "chat_model_ctx_length",
|
| 142 |
"title": "Chat model context length",
|
| 143 |
"description": "Maximum number of tokens in the context window for LLM. System prompt, chat history, RAG and response all count towards this limit.",
|
| 144 |
+
"type": "number",
|
| 145 |
"value": settings["chat_model_ctx_length"],
|
| 146 |
}
|
| 147 |
)
|
|
|
|
| 159 |
}
|
| 160 |
)
|
| 161 |
|
| 162 |
+
chat_model_fields.append(
|
| 163 |
+
{
|
| 164 |
+
"id": "chat_model_rl_requests",
|
| 165 |
+
"title": "Requests per minute limit",
|
| 166 |
+
"description": "Limits the number of requests per minute to the chat model. Waits if the limit is exceeded. Set to 0 to disable rate limiting.",
|
| 167 |
+
"type": "number",
|
| 168 |
+
"value": settings["chat_model_rl_requests"],
|
| 169 |
+
}
|
| 170 |
+
)
|
| 171 |
+
|
| 172 |
+
chat_model_fields.append(
|
| 173 |
+
{
|
| 174 |
+
"id": "chat_model_rl_input",
|
| 175 |
+
"title": "Input tokens per minute limit",
|
| 176 |
+
"description": "Limits the number of input tokens per minute to the chat model. Waits if the limit is exceeded. Set to 0 to disable rate limiting.",
|
| 177 |
+
"type": "number",
|
| 178 |
+
"value": settings["chat_model_rl_input"],
|
| 179 |
+
}
|
| 180 |
+
)
|
| 181 |
+
|
| 182 |
+
chat_model_fields.append(
|
| 183 |
+
{
|
| 184 |
+
"id": "chat_model_rl_output",
|
| 185 |
+
"title": "Output tokens per minute limit",
|
| 186 |
+
"description": "Limits the number of output tokens per minute to the chat model. Waits if the limit is exceeded. Set to 0 to disable rate limiting.",
|
| 187 |
+
"type": "number",
|
| 188 |
+
"value": settings["chat_model_rl_output"],
|
| 189 |
+
}
|
| 190 |
+
)
|
| 191 |
+
|
| 192 |
chat_model_fields.append(
|
| 193 |
{
|
| 194 |
"id": "chat_model_kwargs",
|
|
|
|
| 222 |
"id": "util_model_name",
|
| 223 |
"title": "Utility model name",
|
| 224 |
"description": "Exact name of model from selected provider",
|
| 225 |
+
"type": "text",
|
| 226 |
"value": settings["util_model_name"],
|
| 227 |
}
|
| 228 |
)
|
|
|
|
| 239 |
"value": settings["util_model_temperature"],
|
| 240 |
}
|
| 241 |
)
|
| 242 |
+
|
| 243 |
+
# util_model_fields.append(
|
| 244 |
+
# {
|
| 245 |
+
# "id": "util_model_ctx_length",
|
| 246 |
+
# "title": "Utility model context length",
|
| 247 |
+
# "description": "Maximum number of tokens in the context window for LLM. System prompt, message and response all count towards this limit.",
|
| 248 |
+
# "type": "number",
|
| 249 |
+
# "value": settings["util_model_ctx_length"],
|
| 250 |
+
# }
|
| 251 |
+
# )
|
| 252 |
+
# util_model_fields.append(
|
| 253 |
+
# {
|
| 254 |
+
# "id": "util_model_ctx_input",
|
| 255 |
+
# "title": "Context window space for input tokens",
|
| 256 |
+
# "description": "Portion of context window dedicated to input tokens. The remaining space can be filled with response.",
|
| 257 |
+
# "type": "range",
|
| 258 |
+
# "min": 0.01,
|
| 259 |
+
# "max": 1,
|
| 260 |
+
# "step": 0.01,
|
| 261 |
+
# "value": settings["util_model_ctx_input"],
|
| 262 |
+
# }
|
| 263 |
+
# )
|
| 264 |
+
|
| 265 |
+
util_model_fields.append(
|
| 266 |
+
{
|
| 267 |
+
"id": "util_model_rl_requests",
|
| 268 |
+
"title": "Requests per minute limit",
|
| 269 |
+
"description": "Limits the number of requests per minute to the utility model. Waits if the limit is exceeded. Set to 0 to disable rate limiting.",
|
| 270 |
+
"type": "number",
|
| 271 |
+
"value": settings["util_model_rl_requests"],
|
| 272 |
+
}
|
| 273 |
+
)
|
| 274 |
+
|
| 275 |
+
util_model_fields.append(
|
| 276 |
+
{
|
| 277 |
+
"id": "util_model_rl_input",
|
| 278 |
+
"title": "Input tokens per minute limit",
|
| 279 |
+
"description": "Limits the number of input tokens per minute to the utility model. Waits if the limit is exceeded. Set to 0 to disable rate limiting.",
|
| 280 |
+
"type": "number",
|
| 281 |
+
"value": settings["util_model_rl_input"],
|
| 282 |
+
}
|
| 283 |
+
)
|
| 284 |
+
|
| 285 |
+
util_model_fields.append(
|
| 286 |
+
{
|
| 287 |
+
"id": "util_model_rl_output",
|
| 288 |
+
"title": "Output tokens per minute limit",
|
| 289 |
+
"description": "Limits the number of output tokens per minute to the utility model. Waits if the limit is exceeded. Set to 0 to disable rate limiting.",
|
| 290 |
+
"type": "number",
|
| 291 |
+
"value": settings["util_model_rl_output"],
|
| 292 |
+
}
|
| 293 |
+
)
|
| 294 |
|
| 295 |
util_model_fields.append(
|
| 296 |
{
|
|
|
|
| 325 |
"id": "embed_model_name",
|
| 326 |
"title": "Embedding model name",
|
| 327 |
"description": "Exact name of model from selected provider",
|
| 328 |
+
"type": "text",
|
| 329 |
"value": settings["embed_model_name"],
|
| 330 |
}
|
| 331 |
)
|
| 332 |
+
|
| 333 |
embed_model_fields.append(
|
| 334 |
{
|
| 335 |
+
"id": "embed_model_rl_requests",
|
| 336 |
+
"title": "Requests per minute limit",
|
| 337 |
+
"description": "Limits the number of requests per minute to the embedding model. Waits if the limit is exceeded. Set to 0 to disable rate limiting.",
|
| 338 |
+
"type": "number",
|
| 339 |
+
"value": settings["embed_model_rl_requests"],
|
| 340 |
}
|
| 341 |
)
|
| 342 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 343 |
embed_model_fields.append(
|
| 344 |
{
|
| 345 |
+
"id": "embed_model_rl_input",
|
| 346 |
+
"title": "Input tokens per minute limit",
|
| 347 |
+
"description": "Limits the number of input tokens per minute to the embedding model. Waits if the limit is exceeded. Set to 0 to disable rate limiting.",
|
| 348 |
+
"type": "number",
|
| 349 |
+
"value": settings["embed_model_rl_input"],
|
| 350 |
}
|
| 351 |
)
|
| 352 |
|
|
|
|
| 374 |
"id": "auth_login",
|
| 375 |
"title": "UI Login",
|
| 376 |
"description": "Set user name for web UI",
|
| 377 |
+
"type": "text",
|
| 378 |
"value": dotenv.get_dotenv_value(dotenv.KEY_AUTH_LOGIN) or "",
|
| 379 |
}
|
| 380 |
)
|
|
|
|
| 496 |
# "id": "rfc_auto_docker",
|
| 497 |
# "title": "RFC Auto Docker Management",
|
| 498 |
# "description": "Automatically create dockerized instance of A0 for RFCs using this instance's code base and, settings and .env.",
|
| 499 |
+
# "type": "text",
|
| 500 |
# "value": settings["rfc_auto_docker"],
|
| 501 |
# }
|
| 502 |
# )
|
|
|
|
| 506 |
"id": "rfc_url",
|
| 507 |
"title": "RFC Destination URL",
|
| 508 |
"description": "URL of dockerized A0 instance for remote function calls. Do not specify port here.",
|
| 509 |
+
"type": "text",
|
| 510 |
"value": settings["rfc_url"],
|
| 511 |
}
|
| 512 |
)
|
|
|
|
| 531 |
"id": "rfc_port_http",
|
| 532 |
"title": "RFC HTTP port",
|
| 533 |
"description": "HTTP port for dockerized instance of A0.",
|
| 534 |
+
"type": "text",
|
| 535 |
"value": settings["rfc_port_http"],
|
| 536 |
}
|
| 537 |
)
|
|
|
|
| 541 |
"id": "rfc_port_ssh",
|
| 542 |
"title": "RFC SSH port",
|
| 543 |
"description": "SSH port for dockerized instance of A0.",
|
| 544 |
+
"type": "text",
|
| 545 |
"value": settings["rfc_port_ssh"],
|
| 546 |
}
|
| 547 |
)
|
|
|
|
| 578 |
"id": "stt_language",
|
| 579 |
"title": "Language Code",
|
| 580 |
"description": "Language code (e.g. en, fr, it)",
|
| 581 |
+
"type": "text",
|
| 582 |
"value": settings["stt_language"],
|
| 583 |
}
|
| 584 |
)
|
|
|
|
| 601 |
"id": "stt_silence_duration",
|
| 602 |
"title": "Silence duration (ms)",
|
| 603 |
"description": "Duration of silence before the server considers speaking to have ended.",
|
| 604 |
+
"type": "text",
|
| 605 |
"value": settings["stt_silence_duration"],
|
| 606 |
}
|
| 607 |
)
|
|
|
|
| 611 |
"id": "stt_waiting_timeout",
|
| 612 |
"title": "Waiting timeout (ms)",
|
| 613 |
"description": "Duration before the server closes the microphone.",
|
| 614 |
+
"type": "text",
|
| 615 |
"value": settings["stt_waiting_timeout"],
|
| 616 |
}
|
| 617 |
)
|
|
|
|
| 690 |
try:
|
| 691 |
copy[key] = type(value)(copy[key]) # type: ignore
|
| 692 |
except (ValueError, TypeError):
|
| 693 |
+
copy[key] = value # make default instead
|
| 694 |
return copy
|
| 695 |
|
| 696 |
|
| 697 |
+
# def get_chat_model(settings: Settings | None = None) -> BaseChatModel:
|
| 698 |
+
# if not settings:
|
| 699 |
+
# settings = get_settings()
|
| 700 |
+
# return get_model(
|
| 701 |
+
# type=ModelType.CHAT,
|
| 702 |
+
# provider=ModelProvider[settings["chat_model_provider"]],
|
| 703 |
+
# name=settings["chat_model_name"],
|
| 704 |
+
# temperature=settings["chat_model_temperature"],
|
| 705 |
+
# **settings["chat_model_kwargs"],
|
| 706 |
+
# )
|
| 707 |
+
|
| 708 |
+
|
| 709 |
+
# def get_utility_model(settings: Settings | None = None) -> BaseChatModel:
|
| 710 |
+
# if not settings:
|
| 711 |
+
# settings = get_settings()
|
| 712 |
+
# return get_model(
|
| 713 |
+
# type=ModelType.CHAT,
|
| 714 |
+
# provider=ModelProvider[settings["util_model_provider"]],
|
| 715 |
+
# name=settings["util_model_name"],
|
| 716 |
+
# temperature=settings["util_model_temperature"],
|
| 717 |
+
# **settings["util_model_kwargs"],
|
| 718 |
+
# )
|
| 719 |
+
|
| 720 |
+
|
| 721 |
+
# def get_embedding_model(settings: Settings | None = None) -> Embeddings:
|
| 722 |
+
# if not settings:
|
| 723 |
+
# settings = get_settings()
|
| 724 |
+
# return get_model(
|
| 725 |
+
# type=ModelType.EMBEDDING,
|
| 726 |
+
# provider=ModelProvider[settings["embed_model_provider"]],
|
| 727 |
+
# name=settings["embed_model_name"],
|
| 728 |
+
# **settings["embed_model_kwargs"],
|
| 729 |
+
# )
|
| 730 |
|
| 731 |
|
| 732 |
def _read_settings_file() -> Settings | None:
|
|
|
|
| 770 |
|
| 771 |
|
| 772 |
def get_default_settings() -> Settings:
|
| 773 |
+
from models import ModelProvider
|
| 774 |
+
|
| 775 |
return Settings(
|
| 776 |
chat_model_provider=ModelProvider.OPENAI.name,
|
| 777 |
chat_model_name="gpt-4o-mini",
|
| 778 |
+
chat_model_temperature=0.0,
|
| 779 |
chat_model_kwargs={},
|
| 780 |
+
chat_model_ctx_length=120000,
|
| 781 |
chat_model_ctx_history=0.7,
|
| 782 |
+
chat_model_rl_requests=0,
|
| 783 |
+
chat_model_rl_input=0,
|
| 784 |
+
chat_model_rl_output=0,
|
| 785 |
util_model_provider=ModelProvider.OPENAI.name,
|
| 786 |
util_model_name="gpt-4o-mini",
|
| 787 |
+
util_model_temperature=0.0,
|
| 788 |
+
util_model_ctx_length=120000,
|
| 789 |
+
util_model_ctx_input=0.7,
|
| 790 |
util_model_kwargs={},
|
| 791 |
+
util_model_rl_requests=60,
|
| 792 |
+
util_model_rl_input=0,
|
| 793 |
+
util_model_rl_output=0,
|
| 794 |
embed_model_provider=ModelProvider.OPENAI.name,
|
| 795 |
embed_model_name="text-embedding-3-small",
|
| 796 |
embed_model_kwargs={},
|
| 797 |
+
embed_model_rl_requests=0,
|
| 798 |
+
embed_model_rl_input=0,
|
| 799 |
api_keys={},
|
| 800 |
auth_login="",
|
| 801 |
auth_password="",
|
|
@@ -21,15 +21,15 @@ async def update_behaviour(agent: Agent, log_item: LogItem, adjustments: str):
|
|
| 21 |
current_rules = read_rules(agent)
|
| 22 |
|
| 23 |
# log query streamed by LLM
|
| 24 |
-
def log_callback(content):
|
| 25 |
log_item.stream(ruleset=content)
|
| 26 |
|
| 27 |
msg = agent.read_prompt("behaviour.merge.msg.md", current_rules=current_rules, adjustments=adjustments)
|
| 28 |
|
| 29 |
# call util llm to find solutions in history
|
| 30 |
-
adjustments_merge = await agent.
|
| 31 |
system=system,
|
| 32 |
-
|
| 33 |
callback=log_callback,
|
| 34 |
)
|
| 35 |
|
|
|
|
| 21 |
current_rules = read_rules(agent)
|
| 22 |
|
| 23 |
# log query streamed by LLM
|
| 24 |
+
async def log_callback(content):
|
| 25 |
log_item.stream(ruleset=content)
|
| 26 |
|
| 27 |
msg = agent.read_prompt("behaviour.merge.msg.md", current_rules=current_rules, adjustments=adjustments)
|
| 28 |
|
| 29 |
# call util llm to find solutions in history
|
| 30 |
+
adjustments_merge = await agent.call_utility_model(
|
| 31 |
system=system,
|
| 32 |
+
message=msg,
|
| 33 |
callback=log_callback,
|
| 34 |
)
|
| 35 |
|
|
@@ -3,7 +3,6 @@ from python.helpers.tool import Tool, Response
|
|
| 3 |
class ResponseTool(Tool):
|
| 4 |
|
| 5 |
async def execute(self,**kwargs):
|
| 6 |
-
self.agent.set_data("timeout", self.agent.config.response_timeout_seconds)
|
| 7 |
return Response(message=self.args["text"], break_loop=True)
|
| 8 |
|
| 9 |
async def before_execution(self, **kwargs):
|
|
|
|
| 3 |
class ResponseTool(Tool):
|
| 4 |
|
| 5 |
async def execute(self,**kwargs):
|
|
|
|
| 6 |
return Response(message=self.args["text"], break_loop=True)
|
| 7 |
|
| 8 |
async def before_execution(self, **kwargs):
|
|
@@ -92,7 +92,7 @@ def run():
|
|
| 92 |
|
| 93 |
server = None
|
| 94 |
|
| 95 |
-
def register_api_handler(app, handler):
|
| 96 |
name = handler.__module__.split(".")[-1]
|
| 97 |
instance = handler(app, lock)
|
| 98 |
@requires_auth
|
|
|
|
| 92 |
|
| 93 |
server = None
|
| 94 |
|
| 95 |
+
def register_api_handler(app, handler: type[ApiHandler]):
|
| 96 |
name = handler.__module__.split(".")[-1]
|
| 97 |
instance = handler(app, lock)
|
| 98 |
@requires_auth
|
|
@@ -42,6 +42,7 @@
|
|
| 42 |
/* Input Styles */
|
| 43 |
input[type="text"],
|
| 44 |
input[type="password"],
|
|
|
|
| 45 |
textarea,
|
| 46 |
select {
|
| 47 |
width: 100%;
|
|
|
|
| 42 |
/* Input Styles */
|
| 43 |
input[type="text"],
|
| 44 |
input[type="password"],
|
| 45 |
+
input[type="number"],
|
| 46 |
textarea,
|
| 47 |
select {
|
| 48 |
width: 100%;
|
|
@@ -451,12 +451,21 @@
|
|
| 451 |
|
| 452 |
<div class="field-control">
|
| 453 |
<!-- Input field -->
|
| 454 |
-
<template x-if="field.type === '
|
| 455 |
<input type="text" :class="field.classes" :value="field.value"
|
| 456 |
:readonly="field.readonly === true"
|
| 457 |
@input="field.value = $event.target.value">
|
| 458 |
</template>
|
| 459 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 460 |
<!-- Password field -->
|
| 461 |
<template x-if="field.type === 'password'">
|
| 462 |
<input type="password" :class="field.classes" :value="field.value"
|
|
|
|
| 451 |
|
| 452 |
<div class="field-control">
|
| 453 |
<!-- Input field -->
|
| 454 |
+
<template x-if="field.type === 'text'">
|
| 455 |
<input type="text" :class="field.classes" :value="field.value"
|
| 456 |
:readonly="field.readonly === true"
|
| 457 |
@input="field.value = $event.target.value">
|
| 458 |
</template>
|
| 459 |
|
| 460 |
+
<!-- Number field -->
|
| 461 |
+
<template x-if="field.type === 'number'">
|
| 462 |
+
<input type="number" :class="field.classes" :value="field.value"
|
| 463 |
+
:readonly="field.readonly === true"
|
| 464 |
+
@input="field.value = $event.target.value"
|
| 465 |
+
:min="field.min" :max="field.max" :step="field.step">
|
| 466 |
+
</template>
|
| 467 |
+
|
| 468 |
+
|
| 469 |
<!-- Password field -->
|
| 470 |
<template x-if="field.type === 'password'">
|
| 471 |
<input type="password" :class="field.classes" :value="field.value"
|