Tsitsi19 commited on
Commit
22df7bd
·
verified ·
1 Parent(s): c5440f3

Upload folder using huggingface_hub

Browse files
This view is limited to 50 files because it contains too many changes.   See raw diff
Files changed (50) hide show
  1. .gitattributes +32 -35
  2. .github/ISSUE_TEMPLATE/config.yml +5 -0
  3. .github/ISSUE_TEMPLATE/request_new_features.yaml +21 -0
  4. .github/ISSUE_TEMPLATE/show_me_the_bug.yaml +44 -0
  5. .github/PULL_REQUEST_TEMPLATE.md +17 -0
  6. .github/dependabot.yml +58 -0
  7. .github/workflows/build-package.yaml +33 -0
  8. .github/workflows/environment-corrupt-check.yaml +33 -0
  9. .github/workflows/pr-autodiff.yaml +138 -0
  10. .github/workflows/pre-commit.yaml +26 -0
  11. .github/workflows/stale.yaml +23 -0
  12. .github/workflows/top-issues.yaml +29 -0
  13. .gitignore +202 -0
  14. .pre-commit-config.yaml +37 -0
  15. .vscode/extensions.json +8 -0
  16. .vscode/settings.json +20 -0
  17. CODE_OF_CONDUCT.md +162 -0
  18. Dockerfile +67 -0
  19. LICENSE +21 -0
  20. README.md +195 -10
  21. README_ja.md +193 -0
  22. README_ko.md +192 -0
  23. README_zh.md +198 -0
  24. app/__init__.py +10 -0
  25. app/agent/__init__.py +16 -0
  26. app/agent/base.py +196 -0
  27. app/agent/browser.py +129 -0
  28. app/agent/data_analysis.py +37 -0
  29. app/agent/manus.py +165 -0
  30. app/agent/mcp.py +185 -0
  31. app/agent/react.py +38 -0
  32. app/agent/sandbox_agent.py +223 -0
  33. app/agent/swe.py +24 -0
  34. app/agent/toolcall.py +250 -0
  35. app/bedrock.py +334 -0
  36. app/config.py +384 -0
  37. app/daytona/README.md +57 -0
  38. app/daytona/sandbox.py +165 -0
  39. app/daytona/tool_base.py +138 -0
  40. app/exceptions.py +13 -0
  41. app/flow/__init__.py +0 -0
  42. app/flow/base.py +57 -0
  43. app/flow/flow_factory.py +30 -0
  44. app/flow/planning.py +442 -0
  45. app/llm.py +766 -0
  46. app/logger.py +42 -0
  47. app/mcp/__init__.py +0 -0
  48. app/mcp/server.py +180 -0
  49. app/prompt/__init__.py +0 -0
  50. app/prompt/browser.py +94 -0
.gitattributes CHANGED
@@ -1,35 +1,32 @@
1
- *.7z filter=lfs diff=lfs merge=lfs -text
2
- *.arrow filter=lfs diff=lfs merge=lfs -text
3
- *.bin filter=lfs diff=lfs merge=lfs -text
4
- *.bz2 filter=lfs diff=lfs merge=lfs -text
5
- *.ckpt filter=lfs diff=lfs merge=lfs -text
6
- *.ftz filter=lfs diff=lfs merge=lfs -text
7
- *.gz filter=lfs diff=lfs merge=lfs -text
8
- *.h5 filter=lfs diff=lfs merge=lfs -text
9
- *.joblib filter=lfs diff=lfs merge=lfs -text
10
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
- *.model filter=lfs diff=lfs merge=lfs -text
13
- *.msgpack filter=lfs diff=lfs merge=lfs -text
14
- *.npy filter=lfs diff=lfs merge=lfs -text
15
- *.npz filter=lfs diff=lfs merge=lfs -text
16
- *.onnx filter=lfs diff=lfs merge=lfs -text
17
- *.ot filter=lfs diff=lfs merge=lfs -text
18
- *.parquet filter=lfs diff=lfs merge=lfs -text
19
- *.pb filter=lfs diff=lfs merge=lfs -text
20
- *.pickle filter=lfs diff=lfs merge=lfs -text
21
- *.pkl filter=lfs diff=lfs merge=lfs -text
22
- *.pt filter=lfs diff=lfs merge=lfs -text
23
- *.pth filter=lfs diff=lfs merge=lfs -text
24
- *.rar filter=lfs diff=lfs merge=lfs -text
25
- *.safetensors filter=lfs diff=lfs merge=lfs -text
26
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
- *.tar.* filter=lfs diff=lfs merge=lfs -text
28
- *.tar filter=lfs diff=lfs merge=lfs -text
29
- *.tflite filter=lfs diff=lfs merge=lfs -text
30
- *.tgz filter=lfs diff=lfs merge=lfs -text
31
- *.wasm filter=lfs diff=lfs merge=lfs -text
32
- *.xz filter=lfs diff=lfs merge=lfs -text
33
- *.zip filter=lfs diff=lfs merge=lfs -text
34
- *.zst filter=lfs diff=lfs merge=lfs -text
35
- *tfevents* filter=lfs diff=lfs merge=lfs -text
 
1
+ # HTML code is incorrectly calculated into statistics, so ignore them
2
+ *.html linguist-detectable=false
3
+ # Auto detect text files and perform LF normalization
4
+ * text=auto eol=lf
5
+ # Ensure shell scripts use LF (Linux style) line endings on Windows
6
+ *.sh text eol=lf
7
+ # Treat specific binary files as binary and prevent line ending conversion
8
+ *.png binary
9
+ *.jpg binary
10
+ *.gif binary
11
+ *.ico binary
12
+ *.jpeg binary
13
+ *.mp3 binary
14
+ *.zip binary
15
+ *.bin binary
16
+ # Preserve original line endings for specific document files
17
+ *.doc text eol=crlf
18
+ *.docx text eol=crlf
19
+ *.pdf binary
20
+ # Ensure source code and script files use LF line endings
21
+ *.py text eol=lf
22
+ *.js text eol=lf
23
+ *.html text eol=lf
24
+ *.css text eol=lf
25
+ # Specify custom diff driver for specific file types
26
+ *.md diff=markdown
27
+ *.json diff=json
28
+ *.mp4 filter=lfs diff=lfs merge=lfs -text
29
+ *.mov filter=lfs diff=lfs merge=lfs -text
30
+ *.webm filter=lfs diff=lfs merge=lfs -text
31
+ assets/community_group.png filter=lfs diff=lfs merge=lfs -text
32
+ examples/use_case/pictures/japan-travel-plan-1.png filter=lfs diff=lfs merge=lfs -text
 
 
 
.github/ISSUE_TEMPLATE/config.yml ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ blank_issues_enabled: false
2
+ contact_links:
3
+ - name: "Join the Community Group"
4
+ about: Join the OpenManus community to discuss and get help from others
5
+ url: https://github.com/FoundationAgents/OpenManus?tab=readme-ov-file#community-group
.github/ISSUE_TEMPLATE/request_new_features.yaml ADDED
@@ -0,0 +1,21 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: "🤔 Request new features"
2
+ description: Suggest ideas or features you’d like to see implemented in OpenManus.
3
+ labels: enhancement
4
+ body:
5
+ - type: textarea
6
+ id: feature-description
7
+ attributes:
8
+ label: Feature description
9
+ description: |
10
+ Provide a clear and concise description of the proposed feature
11
+ validations:
12
+ required: true
13
+ - type: textarea
14
+ id: your-feature
15
+ attributes:
16
+ label: Your Feature
17
+ description: |
18
+ Explain your idea or implementation process, if any. Optionally, include a Pull Request URL.
19
+ Ensure accompanying docs/tests/examples are provided for review.
20
+ validations:
21
+ required: false
.github/ISSUE_TEMPLATE/show_me_the_bug.yaml ADDED
@@ -0,0 +1,44 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: "🪲 Show me the Bug"
2
+ description: Report a bug encountered while using OpenManus and seek assistance.
3
+ labels: bug
4
+ body:
5
+ - type: textarea
6
+ id: bug-description
7
+ attributes:
8
+ label: Bug Description
9
+ description: |
10
+ Clearly describe the bug you encountered
11
+ validations:
12
+ required: true
13
+ - type: textarea
14
+ id: solve-method
15
+ attributes:
16
+ label: Bug solved method
17
+ description: |
18
+ If resolved, explain the solution. Optionally, include a Pull Request URL.
19
+ If unresolved, provide additional details to aid investigation
20
+ validations:
21
+ required: true
22
+ - type: textarea
23
+ id: environment-information
24
+ attributes:
25
+ label: Environment information
26
+ description: |
27
+ System: e.g., Ubuntu 22.04
28
+ Python: e.g., 3.12
29
+ OpenManus version: e.g., 0.1.0
30
+ value: |
31
+ - System version:
32
+ - Python version:
33
+ - OpenManus version or branch:
34
+ - Installation method (e.g., `pip install -r requirements.txt` or `pip install -e .`):
35
+ validations:
36
+ required: true
37
+ - type: textarea
38
+ id: extra-information
39
+ attributes:
40
+ label: Extra information
41
+ description: |
42
+ For example, attach screenshots or logs to help diagnose the issue
43
+ validations:
44
+ required: false
.github/PULL_REQUEST_TEMPLATE.md ADDED
@@ -0,0 +1,17 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ **Features**
2
+ <!-- Describe the features or bug fixes in this PR. For bug fixes, link to the issue. -->
3
+
4
+ - Feature 1
5
+ - Feature 2
6
+
7
+ **Feature Docs**
8
+ <!-- Provide RFC, tutorial, or use case links for significant updates. Optional for minor changes. -->
9
+
10
+ **Influence**
11
+ <!-- Explain the impact of these changes for reviewer focus. -->
12
+
13
+ **Result**
14
+ <!-- Include screenshots or logs of unit tests or running results. -->
15
+
16
+ **Other**
17
+ <!-- Additional notes about this PR. -->
.github/dependabot.yml ADDED
@@ -0,0 +1,58 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: 2
2
+ updates:
3
+ - package-ecosystem: "pip"
4
+ directory: "/"
5
+ schedule:
6
+ interval: "weekly"
7
+ open-pull-requests-limit: 4
8
+ groups:
9
+ # Group critical packages that might need careful review
10
+ core-dependencies:
11
+ patterns:
12
+ - "pydantic*"
13
+ - "openai"
14
+ - "fastapi"
15
+ - "tiktoken"
16
+ browsergym-related:
17
+ patterns:
18
+ - "browsergym*"
19
+ - "browser-use"
20
+ - "playwright"
21
+ search-tools:
22
+ patterns:
23
+ - "googlesearch-python"
24
+ - "baidusearch"
25
+ - "duckduckgo_search"
26
+ pre-commit:
27
+ patterns:
28
+ - "pre-commit"
29
+ security-all:
30
+ applies-to: "security-updates"
31
+ patterns:
32
+ - "*"
33
+ version-all:
34
+ applies-to: "version-updates"
35
+ patterns:
36
+ - "*"
37
+ exclude-patterns:
38
+ - "pydantic*"
39
+ - "openai"
40
+ - "fastapi"
41
+ - "tiktoken"
42
+ - "browsergym*"
43
+ - "browser-use"
44
+ - "playwright"
45
+ - "googlesearch-python"
46
+ - "baidusearch"
47
+ - "duckduckgo_search"
48
+ - "pre-commit"
49
+
50
+ - package-ecosystem: "github-actions"
51
+ directory: "/"
52
+ schedule:
53
+ interval: "weekly"
54
+ open-pull-requests-limit: 4
55
+ groups:
56
+ actions:
57
+ patterns:
58
+ - "*"
.github/workflows/build-package.yaml ADDED
@@ -0,0 +1,33 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: Build and upload Python package
2
+
3
+ on:
4
+ workflow_dispatch:
5
+ release:
6
+ types: [created, published]
7
+
8
+ jobs:
9
+ deploy:
10
+ runs-on: ubuntu-latest
11
+ steps:
12
+ - uses: actions/checkout@v4
13
+ - name: Set up Python
14
+ uses: actions/setup-python@v5
15
+ with:
16
+ python-version: '3.12'
17
+ cache: 'pip'
18
+ - name: Install dependencies
19
+ run: |
20
+ python -m pip install --upgrade pip
21
+ pip install -r requirements.txt
22
+ pip install setuptools wheel twine
23
+ - name: Set package version
24
+ run: |
25
+ export VERSION="${GITHUB_REF#refs/tags/v}"
26
+ sed -i "s/version=.*/version=\"${VERSION}\",/" setup.py
27
+ - name: Build and publish
28
+ env:
29
+ TWINE_USERNAME: __token__
30
+ TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
31
+ run: |
32
+ python setup.py bdist_wheel sdist
33
+ twine upload dist/*
.github/workflows/environment-corrupt-check.yaml ADDED
@@ -0,0 +1,33 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: Environment Corruption Check
2
+ on:
3
+ push:
4
+ branches: ["main"]
5
+ paths:
6
+ - requirements.txt
7
+ pull_request:
8
+ branches: ["main"]
9
+ paths:
10
+ - requirements.txt
11
+ concurrency:
12
+ group: ${{ github.workflow }}-${{ github.event_name }}-${{ github.ref }}
13
+ cancel-in-progress: true
14
+ jobs:
15
+ test-python-versions:
16
+ runs-on: ubuntu-latest
17
+ strategy:
18
+ matrix:
19
+ python-version: ["3.11.11", "3.12.8", "3.13.2"]
20
+ fail-fast: false
21
+ steps:
22
+ - name: Checkout repository
23
+ uses: actions/checkout@v4
24
+ - name: Set up Python ${{ matrix.python-version }}
25
+ uses: actions/setup-python@v5
26
+ with:
27
+ python-version: ${{ matrix.python-version }}
28
+ - name: Upgrade pip
29
+ run: |
30
+ python -m pip install --upgrade pip
31
+ - name: Install dependencies
32
+ run: |
33
+ pip install -r requirements.txt
.github/workflows/pr-autodiff.yaml ADDED
@@ -0,0 +1,138 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: PR Diff Summarization
2
+ on:
3
+ # pull_request:
4
+ # branches: [main]
5
+ # types: [opened, ready_for_review, reopened]
6
+ issue_comment:
7
+ types: [created]
8
+ permissions:
9
+ contents: read
10
+ pull-requests: write
11
+ jobs:
12
+ pr-diff-summarization:
13
+ runs-on: ubuntu-latest
14
+ if: |
15
+ (github.event_name == 'pull_request') ||
16
+ (github.event_name == 'issue_comment' &&
17
+ contains(github.event.comment.body, '!pr-diff') &&
18
+ (github.event.comment.author_association == 'CONTRIBUTOR' || github.event.comment.author_association == 'COLLABORATOR' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') &&
19
+ github.event.issue.pull_request)
20
+ steps:
21
+ - name: Get PR head SHA
22
+ id: get-pr-sha
23
+ run: |
24
+ PR_URL="${{ github.event.issue.pull_request.url || github.event.pull_request.url }}"
25
+ # https://api.github.com/repos/OpenManus/pulls/1
26
+ RESPONSE=$(curl -s -H "Authorization: Bearer ${{ secrets.GITHUB_TOKEN }}" $PR_URL)
27
+ SHA=$(echo $RESPONSE | jq -r '.head.sha')
28
+ TARGET_BRANCH=$(echo $RESPONSE | jq -r '.base.ref')
29
+ echo "pr_sha=$SHA" >> $GITHUB_OUTPUT
30
+ echo "target_branch=$TARGET_BRANCH" >> $GITHUB_OUTPUT
31
+ echo "Retrieved PR head SHA from API: $SHA, target branch: $TARGET_BRANCH"
32
+ - name: Check out code
33
+ uses: actions/checkout@v4
34
+ with:
35
+ ref: ${{ steps.get-pr-sha.outputs.pr_sha }}
36
+ fetch-depth: 0
37
+ - name: Set up Python
38
+ uses: actions/setup-python@v5
39
+ with:
40
+ python-version: '3.11'
41
+ - name: Install dependencies
42
+ run: |
43
+ python -m pip install --upgrade pip
44
+ pip install openai requests
45
+ - name: Create and run Python script
46
+ env:
47
+ OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
48
+ OPENAI_BASE_URL: ${{ secrets.OPENAI_BASE_URL }}
49
+ GH_TOKEN: ${{ github.token }}
50
+ PR_NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}
51
+ TARGET_BRANCH: ${{ steps.get-pr-sha.outputs.target_branch }}
52
+ run: |-
53
+ cat << 'EOF' > /tmp/_workflow_core.py
54
+ import os
55
+ import subprocess
56
+ import json
57
+ import requests
58
+ from openai import OpenAI
59
+
60
+ def get_diff():
61
+ result = subprocess.run(
62
+ ['git', 'diff', 'origin/' + os.getenv('TARGET_BRANCH') + '...HEAD'],
63
+ capture_output=True, text=True, check=True)
64
+ return '\n'.join(
65
+ line for line in result.stdout.split('\n')
66
+ if any(line.startswith(c) for c in ('+', '-'))
67
+ and not line.startswith(('---', '+++'))
68
+ )[:round(200000 * 0.4)] # Truncate to prevent overflow
69
+
70
+ def generate_comment(diff_content):
71
+ client = OpenAI(
72
+ base_url=os.getenv("OPENAI_BASE_URL"),
73
+ api_key=os.getenv("OPENAI_API_KEY")
74
+ )
75
+
76
+ guidelines = '''
77
+ 1. English version first, Chinese Simplified version after
78
+ 2. Example format:
79
+ # Diff Report
80
+ ## English
81
+ - Added `ABC` class
82
+ - Fixed `f()` behavior in `foo` module
83
+
84
+ ### Comments Highlight
85
+ - `config.toml` needs to be configured properly to make sure new features work as expected.
86
+
87
+ ### Spelling/Offensive Content Check
88
+ - No spelling mistakes or offensive content found in the code or comments.
89
+
90
+ ## 中文(简体)
91
+ - 新增了 `ABC` 类
92
+ - `foo` 模块中的 `f()` 行为已修复
93
+
94
+ ### 评论高亮
95
+ - `config.toml` 需要正确配置才能确保新功能正常运行。
96
+
97
+ ### 内容检查
98
+ - 没有发现代码或注释中的拼写错误或不当措辞。
99
+
100
+ 3. Highlight non-English comments
101
+ 4. Check for spelling/offensive content'''
102
+
103
+ response = client.chat.completions.create(
104
+ model="o3-mini",
105
+ messages=[{
106
+ "role": "system",
107
+ "content": "Generate bilingual code review feedback."
108
+ }, {
109
+ "role": "user",
110
+ "content": f"Review these changes per guidelines:\n{guidelines}\n\nDIFF:\n{diff_content}"
111
+ }]
112
+ )
113
+ return response.choices[0].message.content
114
+
115
+ def post_comment(comment):
116
+ repo = os.getenv("GITHUB_REPOSITORY")
117
+ pr_number = os.getenv("PR_NUMBER")
118
+
119
+ headers = {
120
+ "Authorization": f"Bearer {os.getenv('GH_TOKEN')}",
121
+ "Accept": "application/vnd.github.v3+json"
122
+ }
123
+ url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"
124
+
125
+ requests.post(url, json={"body": comment}, headers=headers)
126
+
127
+ if __name__ == "__main__":
128
+ diff_content = get_diff()
129
+ if not diff_content.strip():
130
+ print("No meaningful diff detected.")
131
+ exit(0)
132
+
133
+ comment = generate_comment(diff_content)
134
+ post_comment(comment)
135
+ print("Comment posted successfully.")
136
+ EOF
137
+
138
+ python /tmp/_workflow_core.py
.github/workflows/pre-commit.yaml ADDED
@@ -0,0 +1,26 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: Pre-commit checks
2
+
3
+ on:
4
+ pull_request:
5
+ branches:
6
+ - '**'
7
+ push:
8
+ branches:
9
+ - '**'
10
+
11
+ jobs:
12
+ pre-commit-check:
13
+ runs-on: ubuntu-latest
14
+ steps:
15
+ - name: Checkout Source Code
16
+ uses: actions/checkout@v4
17
+ - name: Set up Python 3.12
18
+ uses: actions/setup-python@v5
19
+ with:
20
+ python-version: '3.12'
21
+ - name: Install pre-commit and tools
22
+ run: |
23
+ python -m pip install --upgrade pip
24
+ pip install pre-commit black==23.1.0 isort==5.12.0 autoflake==2.0.1
25
+ - name: Run pre-commit hooks
26
+ run: pre-commit run --all-files
.github/workflows/stale.yaml ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: Close inactive issues
2
+
3
+ on:
4
+ schedule:
5
+ - cron: "5 0 * * *"
6
+
7
+ jobs:
8
+ close-issues:
9
+ runs-on: ubuntu-latest
10
+ permissions:
11
+ issues: write
12
+ pull-requests: write
13
+ steps:
14
+ - uses: actions/stale@v9
15
+ with:
16
+ days-before-issue-stale: 30
17
+ days-before-issue-close: 14
18
+ stale-issue-label: "inactive"
19
+ stale-issue-message: "This issue has been inactive for 30 days. Please comment if you have updates."
20
+ close-issue-message: "This issue was closed due to 45 days of inactivity. Reopen if still relevant."
21
+ days-before-pr-stale: -1
22
+ days-before-pr-close: -1
23
+ repo-token: ${{ secrets.GITHUB_TOKEN }}
.github/workflows/top-issues.yaml ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: Top issues
2
+ on:
3
+ schedule:
4
+ - cron: '0 0/2 * * *'
5
+ workflow_dispatch:
6
+ jobs:
7
+ ShowAndLabelTopIssues:
8
+ permissions:
9
+ issues: write
10
+ pull-requests: write
11
+ actions: read
12
+ contents: read
13
+ name: Display and label top issues
14
+ runs-on: ubuntu-latest
15
+ if: github.repository == 'FoundationAgents/OpenManus'
16
+ steps:
17
+ - name: Run top issues action
18
+ uses: rickstaa/top-issues-action@7e8dda5d5ae3087670f9094b9724a9a091fc3ba1 # v1.3.101
19
+ env:
20
+ github_token: ${{ secrets.GITHUB_TOKEN }}
21
+ with:
22
+ label: true
23
+ dashboard: true
24
+ dashboard_show_total_reactions: true
25
+ top_issues: true
26
+ top_features: true
27
+ top_bugs: true
28
+ top_pull_requests: true
29
+ top_list_size: 14
.gitignore ADDED
@@ -0,0 +1,202 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ### Project-specific ###
2
+ # Logs
3
+ logs/
4
+
5
+ # Data
6
+ data/
7
+
8
+ # Workspace
9
+ workspace/
10
+
11
+ ### Python ###
12
+ # Byte-compiled / optimized / DLL files
13
+ __pycache__/
14
+ *.py[cod]
15
+ *$py.class
16
+
17
+ # C extensions
18
+ *.so
19
+
20
+ # Distribution / packaging
21
+ .Python
22
+ build/
23
+ develop-eggs/
24
+ dist/
25
+ downloads/
26
+ eggs/
27
+ .eggs/
28
+ lib/
29
+ lib64/
30
+ parts/
31
+ sdist/
32
+ var/
33
+ wheels/
34
+ share/python-wheels/
35
+ *.egg-info/
36
+ .installed.cfg
37
+ *.egg
38
+ MANIFEST
39
+
40
+ # PyInstaller
41
+ # Usually these files are written by a python script from a template
42
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
43
+ *.manifest
44
+ *.spec
45
+
46
+ # Installer logs
47
+ pip-log.txt
48
+ pip-delete-this-directory.txt
49
+
50
+ # Unit test / coverage reports
51
+ htmlcov/
52
+ .tox/
53
+ .nox/
54
+ .coverage
55
+ .coverage.*
56
+ .cache
57
+ nosetests.xml
58
+ coverage.xml
59
+ *.cover
60
+ *.py,cover
61
+ .hypothesis/
62
+ .pytest_cache/
63
+ cover/
64
+
65
+ # Translations
66
+ *.mo
67
+ *.pot
68
+
69
+ # Django stuff:
70
+ *.log
71
+ local_settings.py
72
+ db.sqlite3
73
+ db.sqlite3-journal
74
+
75
+ # Flask stuff:
76
+ instance/
77
+ .webassets-cache
78
+
79
+ # Scrapy stuff:
80
+ .scrapy
81
+
82
+ # Sphinx documentation
83
+ docs/_build/
84
+
85
+ # PyBuilder
86
+ .pybuilder/
87
+ target/
88
+
89
+ # Jupyter Notebook
90
+ .ipynb_checkpoints
91
+
92
+ # IPython
93
+ profile_default/
94
+ ipython_config.py
95
+
96
+ # pyenv
97
+ # For a library or package, you might want to ignore these files since the code is
98
+ # intended to run in multiple environments; otherwise, check them in:
99
+ # .python-version
100
+
101
+ # pipenv
102
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
103
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
104
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
105
+ # install all needed dependencies.
106
+ #Pipfile.lock
107
+
108
+ # UV
109
+ # Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
110
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
111
+ # commonly ignored for libraries.
112
+ #uv.lock
113
+
114
+ # poetry
115
+ # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
116
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
117
+ # commonly ignored for libraries.
118
+ # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
119
+ #poetry.lock
120
+
121
+ # pdm
122
+ # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
123
+ #pdm.lock
124
+ # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
125
+ # in version control.
126
+ # https://pdm.fming.dev/latest/usage/project/#working-with-version-control
127
+ .pdm.toml
128
+ .pdm-python
129
+ .pdm-build/
130
+
131
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
132
+ __pypackages__/
133
+
134
+ # Celery stuff
135
+ celerybeat-schedule
136
+ celerybeat.pid
137
+
138
+ # SageMath parsed files
139
+ *.sage.py
140
+
141
+ # Environments
142
+ .env
143
+ .venv
144
+ env/
145
+ venv/
146
+ ENV/
147
+ env.bak/
148
+ venv.bak/
149
+
150
+ # Spyder project settings
151
+ .spyderproject
152
+ .spyproject
153
+
154
+ # Rope project settings
155
+ .ropeproject
156
+
157
+ # mkdocs documentation
158
+ /site
159
+
160
+ # mypy
161
+ .mypy_cache/
162
+ .dmypy.json
163
+ dmypy.json
164
+
165
+ # Pyre type checker
166
+ .pyre/
167
+
168
+ # pytype static type analyzer
169
+ .pytype/
170
+
171
+ # Cython debug symbols
172
+ cython_debug/
173
+
174
+ # PyCharm
175
+ # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
176
+ # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
177
+ # and can be added to the global gitignore or merged into this file. For a more nuclear
178
+ # option (not recommended) you can uncomment the following to ignore the entire idea folder.
179
+ .idea/
180
+
181
+ # PyPI configuration file
182
+ .pypirc
183
+
184
+ ### Visual Studio Code ###
185
+ .vscode/*
186
+ !.vscode/settings.json
187
+ !.vscode/tasks.json
188
+ !.vscode/launch.json
189
+ !.vscode/extensions.json
190
+ !.vscode/*.code-snippets
191
+
192
+ # Local History for Visual Studio Code
193
+ .history/
194
+
195
+ # Built Visual Studio Code Extensions
196
+ *.vsix
197
+
198
+ # OSX
199
+ .DS_Store
200
+
201
+ # node
202
+ node_modules
.pre-commit-config.yaml ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ repos:
2
+ - repo: https://github.com/psf/black
3
+ rev: 23.1.0
4
+ hooks:
5
+ - id: black
6
+
7
+ - repo: https://github.com/pre-commit/pre-commit-hooks
8
+ rev: v4.4.0
9
+ hooks:
10
+ - id: trailing-whitespace
11
+ - id: end-of-file-fixer
12
+ - id: check-yaml
13
+ - id: check-added-large-files
14
+
15
+ - repo: https://github.com/PyCQA/autoflake
16
+ rev: v2.0.1
17
+ hooks:
18
+ - id: autoflake
19
+ args:
20
+ [
21
+ --remove-all-unused-imports,
22
+ --ignore-init-module-imports,
23
+ --expand-star-imports,
24
+ --remove-duplicate-keys,
25
+ --remove-unused-variables,
26
+ --recursive,
27
+ --in-place,
28
+ --exclude=__init__.py,
29
+ ]
30
+ files: \.py$
31
+
32
+ - repo: https://github.com/pycqa/isort
33
+ rev: 5.12.0
34
+ hooks:
35
+ - id: isort
36
+ args:
37
+ ["--profile", "black", "--filter-files", "--lines-after-imports=2"]
.vscode/extensions.json ADDED
@@ -0,0 +1,8 @@
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "recommendations": [
3
+ "tamasfe.even-better-toml",
4
+ "ms-python.black-formatter",
5
+ "ms-python.isort"
6
+ ],
7
+ "unwantedRecommendations": []
8
+ }
.vscode/settings.json ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "[python]": {
3
+ "editor.defaultFormatter": "ms-python.black-formatter",
4
+ "editor.codeActionsOnSave": {
5
+ "source.organizeImports": "always"
6
+ }
7
+ },
8
+ "[toml]": {
9
+ "editor.defaultFormatter": "tamasfe.even-better-toml",
10
+ },
11
+ "pre-commit-helper.runOnSave": "none",
12
+ "pre-commit-helper.config": ".pre-commit-config.yaml",
13
+ "evenBetterToml.schema.enabled": true,
14
+ "evenBetterToml.schema.associations": {
15
+ "^.+config[/\\\\].+\\.toml$": "../config/schema.config.json"
16
+ },
17
+ "files.insertFinalNewline": true,
18
+ "files.trimTrailingWhitespace": true,
19
+ "editor.formatOnSave": true
20
+ }
CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,162 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ We as members, contributors, and leaders pledge to make participation in our
6
+ community a harassment-free experience for everyone, regardless of age, body
7
+ size, visible or invisible disability, ethnicity, sex characteristics, gender
8
+ identity and expression, level of experience, education, socio-economic status,
9
+ nationality, personal appearance, race, caste, color, religion, or sexual
10
+ identity and orientation.
11
+
12
+ We pledge to act and interact in ways that contribute to an open, welcoming,
13
+ diverse, inclusive, and healthy community.
14
+
15
+ ## Our Standards
16
+
17
+ Examples of behavior that contributes to a positive environment for our
18
+ community include:
19
+
20
+ * Demonstrating empathy and kindness toward other people.
21
+ * Being respectful of differing opinions, viewpoints, and experiences.
22
+ * Giving and gracefully accepting constructive feedback.
23
+ * Accepting responsibility and apologizing to those affected by our mistakes,
24
+ and learning from the experience.
25
+ * Focusing on what is best not just for us as individuals, but for the overall
26
+ community.
27
+
28
+ Examples of unacceptable behavior include:
29
+
30
+ * The use of sexualized language or imagery, and sexual attention or advances of
31
+ any kind.
32
+ * Trolling, insulting or derogatory comments, and personal or political attacks.
33
+ * Public or private harassment.
34
+ * Publishing others' private information, such as a physical or email address,
35
+ without their explicit permission.
36
+ * Other conduct which could reasonably be considered inappropriate in a
37
+ professional setting.
38
+
39
+ ## Enforcement Responsibilities
40
+
41
+ Community leaders are responsible for clarifying and enforcing our standards of
42
+ acceptable behavior and will take appropriate and fair corrective action in
43
+ response to any behavior that they deem inappropriate, threatening, offensive,
44
+ or harmful.
45
+
46
+ Community leaders have the right and responsibility to remove, edit, or reject
47
+ comments, commits, code, wiki edits, issues, and other contributions that are
48
+ not aligned to this Code of Conduct, and will communicate reasons for moderation
49
+ decisions when appropriate.
50
+
51
+ ## Scope
52
+
53
+ This Code of Conduct applies within all community spaces, and also applies when
54
+ an individual is officially representing the community in public spaces.
55
+ Examples of representing our community include using an official email address,
56
+ posting via an official social media account, or acting as an appointed
57
+ representative at an online or offline event.
58
+
59
+ ## Enforcement
60
+
61
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
62
+ reported to the community leaders responsible for enforcement at
63
+ mannaandpoem@gmail.com
64
+ All complaints will be reviewed and investigated promptly and fairly.
65
+
66
+ All community leaders are obligated to respect the privacy and security of the
67
+ reporter of any incident.
68
+
69
+ ## Enforcement Guidelines
70
+
71
+ Community leaders will follow these Community Impact Guidelines in determining
72
+ the consequences for any action they deem in violation of this Code of Conduct:
73
+
74
+ ### 1. Correction
75
+
76
+ **Community Impact**: Use of inappropriate language or other behavior deemed
77
+ unprofessional or unwelcome in the community.
78
+
79
+ **Consequence**: A private, written warning from community leaders, providing
80
+ clarity around the nature of the violation and an explanation of why the
81
+ behavior was inappropriate. A public apology may be requested.
82
+
83
+ ### 2. Warning
84
+
85
+ **Community Impact**: A violation through a single incident or series of
86
+ actions.
87
+
88
+ **Consequence**: A warning with consequences for continued behavior. No
89
+ interaction with the people involved, including unsolicited interaction with
90
+ those enforcing the Code of Conduct, for a specified period of time. This
91
+ includes avoiding interactions in community spaces as well as external channels
92
+ like social media. Violating these terms may lead to a temporary or permanent
93
+ ban.
94
+
95
+ ### 3. Temporary Ban
96
+
97
+ **Community Impact**: A serious violation of community standards, including
98
+ sustained inappropriate behavior.
99
+
100
+ **Consequence**: A temporary ban from any sort of interaction or public
101
+ communication with the community for a specified period of time. No public or
102
+ private interaction with the people involved, including unsolicited interaction
103
+ with those enforcing the Code of Conduct, is allowed during this period.
104
+ Violating these terms may lead to a permanent ban.
105
+
106
+ ### 4. Permanent Ban
107
+
108
+ **Community Impact**: Demonstrating a pattern of violation of community
109
+ standards, including sustained inappropriate behavior, harassment of an
110
+ individual, or aggression toward or disparagement of classes of individuals.
111
+
112
+ **Consequence**: A permanent ban from any sort of public interaction within the
113
+ community.
114
+
115
+ ### Slack and Discord Etiquettes
116
+
117
+ These Slack and Discord etiquette guidelines are designed to foster an inclusive, respectful, and productive environment
118
+ for all community members. By following these best practices, we ensure effective communication and collaboration while
119
+ minimizing disruptions. Let’s work together to build a supportive and welcoming community!
120
+
121
+ - Communicate respectfully and professionally, avoiding sarcasm or harsh language, and remember that tone can be
122
+ difficult to interpret in text.
123
+ - Use threads for specific discussions to keep channels organized and easier to follow.
124
+ - Tag others only when their input is critical or urgent, and use @here, @channel or @everyone sparingly to minimize
125
+ disruptions.
126
+ - Be patient, as open-source contributors and maintainers often have other commitments and may need time to respond.
127
+ - Post questions or discussions in the most relevant
128
+ channel ([discord - #general](https://discord.com/channels/1125308739348594758/1138430348557025341)).
129
+ - When asking for help or raising issues, include necessary details like links, screenshots, or clear explanations to
130
+ provide context.
131
+ - Keep discussions in public channels whenever possible to allow others to benefit from the conversation, unless the
132
+ matter is sensitive or private.
133
+ - Always adhere to [our standards](https://github.com/FoundationAgents/OpenManus/blob/main/CODE_OF_CONDUCT.md#our-standards)
134
+ to ensure a welcoming and collaborative environment.
135
+ - If you choose to mute a channel, consider setting up alerts for topics that still interest you to stay engaged. For
136
+ Slack, Go to Settings → Notifications → My Keywords to add specific keywords that will notify you when mentioned. For
137
+ example, if you're here for discussions about LLMs, mute the channel if it’s too busy, but set notifications to alert
138
+ you only when “LLMs” appears in messages. Also for Discord, go to the channel notifications and choose the option that
139
+ best describes your need.
140
+
141
+ ## Attribution
142
+
143
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage],
144
+ version 2.1, available at
145
+ [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
146
+
147
+ Community Impact Guidelines were inspired by
148
+ [Mozilla's code of conduct enforcement ladder][Mozilla CoC].
149
+
150
+ For answers to common questions about this code of conduct, see the FAQ at
151
+ [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
152
+ [https://www.contributor-covenant.org/translations][translations].
153
+
154
+ [homepage]: https://www.contributor-covenant.org
155
+
156
+ [v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
157
+
158
+ [Mozilla CoC]: https://github.com/mozilla/diversity
159
+
160
+ [FAQ]: https://www.contributor-covenant.org/faq
161
+
162
+ [translations]: https://www.contributor-covenant.org/translations
Dockerfile ADDED
@@ -0,0 +1,67 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ FROM python:3.12-slim
2
+
3
+ # Éviter les questions lors de l'installation des paquets
4
+ ENV DEBIAN_FRONTEND=noninteractive
5
+ ENV PYTHONUNBUFFERED=1
6
+
7
+ # Créer un utilisateur non-root pour Hugging Face Spaces
8
+ RUN useradd -m -u 1000 user
9
+ WORKDIR /home/user/app
10
+
11
+ # Installer les dépendances système nécessaires
12
+ RUN apt-get update && apt-get install -y --no-install-recommends \
13
+ git \
14
+ curl \
15
+ wget \
16
+ gnupg \
17
+ ca-certificates \
18
+ libglib2.0-0 \
19
+ libnss3 \
20
+ libnspr4 \
21
+ libatk1.0-0 \
22
+ libatk-bridge2.0-0 \
23
+ libcups2 \
24
+ libdrm2 \
25
+ libdbus-1-3 \
26
+ libxcb1 \
27
+ libxkbcommon0 \
28
+ libx11-6 \
29
+ libxcomposite1 \
30
+ libxdamage1 \
31
+ libxext6 \
32
+ libxfixes3 \
33
+ libxrandr2 \
34
+ libgbm1 \
35
+ libpango-1.0-0 \
36
+ libcairo2 \
37
+ libasound2 \
38
+ && rm -rf /var/lib/apt/lists/*
39
+
40
+ # Copier les fichiers du projet
41
+ COPY --chown=user:user . .
42
+
43
+ # Installer les dépendances Python
44
+ RUN pip install --no-cache-dir -r requirements.txt
45
+
46
+ # Installer Playwright et ses navigateurs
47
+ RUN pip install playwright && playwright install --with-deps chromium
48
+
49
+ # S'assurer que le répertoire de travail appartient à l'utilisateur
50
+ RUN chown -R user:user /home/user/app
51
+
52
+ # Passer à l'utilisateur non-root
53
+ USER user
54
+
55
+ # Exposer le port (Hugging Face utilise souvent 7860 par défaut pour Gradio/Streamlit,
56
+ # mais ici c'est un agent CLI. On peut ajouter une interface simple si besoin,
57
+ # mais pour l'instant on suit les instructions de déploiement Docker standard.)
58
+ EXPOSE 7860
59
+
60
+ # Commande par défaut (on peut lancer main.py ou un script d'attente)
61
+ # Pour Hugging Face Spaces, il faut souvent un service qui écoute sur un port.
62
+ # Si OpenManus est purement CLI, on pourrait avoir besoin d'un wrapper web.
63
+ # Cependant, l'utilisateur demande de vérifier que l'agent répond via l'URL du Space.
64
+ # Je vais ajouter un petit script serveur web minimal pour maintenir le Space actif
65
+ # et éventuellement fournir une interface de chat basique si OpenManus n'en a pas.
66
+
67
+ CMD ["python", "app_hf.py"]
LICENSE ADDED
@@ -0,0 +1,21 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ MIT License
2
+
3
+ Copyright (c) 2025 manna_and_poem
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
README.md CHANGED
@@ -1,10 +1,195 @@
1
- ---
2
- title: OpenManus Gemini
3
- emoji: 🏢
4
- colorFrom: green
5
- colorTo: pink
6
- sdk: docker
7
- pinned: false
8
- ---
9
-
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <p align="center">
2
+ <img src="assets/logo.jpg" width="200"/>
3
+ </p>
4
+
5
+ English | [中文](README_zh.md) | [한국어](README_ko.md) | [日本語](README_ja.md)
6
+
7
+ [![GitHub stars](https://img.shields.io/github/stars/FoundationAgents/OpenManus?style=social)](https://github.com/FoundationAgents/OpenManus/stargazers)
8
+ &ensp;
9
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) &ensp;
10
+ [![Discord Follow](https://dcbadge.vercel.app/api/server/DYn29wFk9z?style=flat)](https://discord.gg/DYn29wFk9z)
11
+ [![Demo](https://img.shields.io/badge/Demo-Hugging%20Face-yellow)](https://huggingface.co/spaces/lyh-917/OpenManusDemo)
12
+ [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15186407.svg)](https://doi.org/10.5281/zenodo.15186407)
13
+
14
+ # 👋 OpenManus
15
+
16
+ Manus is incredible, but OpenManus can achieve any idea without an *Invite Code* 🛫!
17
+
18
+ Our team members [@Xinbin Liang](https://github.com/mannaandpoem) and [@Jinyu Xiang](https://github.com/XiangJinyu) (core authors), along with [@Zhaoyang Yu](https://github.com/MoshiQAQ), [@Jiayi Zhang](https://github.com/didiforgithub), and [@Sirui Hong](https://github.com/stellaHSR), are from [@MetaGPT](https://github.com/geekan/MetaGPT). The prototype was launched within 3 hours and we keep building!
19
+
20
+ It's a simple implementation, so we welcome any suggestions, contributions, and feedback!
21
+
22
+ Enjoy your own agent with OpenManus!
23
+
24
+ We're also excited to introduce [OpenManus-RL](https://github.com/OpenManus/OpenManus-RL), an open-source project dedicated to reinforcement learning (RL)- based (such as GRPO) tuning methods for LLM agents, developed collaboratively by researchers from UIUC and OpenManus.
25
+
26
+ ## Project Demo
27
+
28
+ <video src="https://private-user-images.githubusercontent.com/61239030/420168772-6dcfd0d2-9142-45d9-b74e-d10aa75073c6.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEzMTgwNTksIm5iZiI6MTc0MTMxNzc1OSwicGF0aCI6Ii82MTIzOTAzMC80MjAxNjg3NzItNmRjZmQwZDItOTE0Mi00NWQ5LWI3NGUtZDEwYWE3NTA3M2M2Lm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA3VDAzMjIzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdiZjFkNjlmYWNjMmEzOTliM2Y3M2VlYjgyNDRlZDJmOWE3NWZhZjE1MzhiZWY4YmQ3NjdkNTYwYTU5ZDA2MzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.UuHQCgWYkh0OQq9qsUWqGsUbhG3i9jcZDAMeHjLt5T4" data-canonical-src="https://private-user-images.githubusercontent.com/61239030/420168772-6dcfd0d2-9142-45d9-b74e-d10aa75073c6.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEzMTgwNTksIm5iZiI6MTc0MTMxNzc1OSwicGF0aCI6Ii82MTIzOTAzMC80MjAxNjg3NzItNmRjZmQwZDItOTE0Mi00NWQ5LWI3NGUtZDEwYWE3NTA3M2M2Lm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA3VDAzMjIzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdiZjFkNjlmYWNjMmEzOTliM2Y3M2VlYjgyNDRlZDJmOWE3NWZhZjE1MzhiZWY4YmQ3NjdkNTYwYTU5ZDA2MzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.UuHQCgWYkh0OQq9qsUWqGsUbhG3i9jcZDAMeHjLt5T4" controls="controls" muted="muted" class="d-block rounded-bottom-2 border-top width-fit" style="max-height:640px; min-height: 200px"></video>
29
+
30
+ ## Installation
31
+
32
+ We provide two installation methods. Method 2 (using uv) is recommended for faster installation and better dependency management.
33
+
34
+ ### Method 1: Using conda
35
+
36
+ 1. Create a new conda environment:
37
+
38
+ ```bash
39
+ conda create -n open_manus python=3.12
40
+ conda activate open_manus
41
+ ```
42
+
43
+ 2. Clone the repository:
44
+
45
+ ```bash
46
+ git clone https://github.com/FoundationAgents/OpenManus.git
47
+ cd OpenManus
48
+ ```
49
+
50
+ 3. Install dependencies:
51
+
52
+ ```bash
53
+ pip install -r requirements.txt
54
+ ```
55
+
56
+ ### Method 2: Using uv (Recommended)
57
+
58
+ 1. Install uv (A fast Python package installer and resolver):
59
+
60
+ ```bash
61
+ curl -LsSf https://astral.sh/uv/install.sh | sh
62
+ ```
63
+
64
+ 2. Clone the repository:
65
+
66
+ ```bash
67
+ git clone https://github.com/FoundationAgents/OpenManus.git
68
+ cd OpenManus
69
+ ```
70
+
71
+ 3. Create a new virtual environment and activate it:
72
+
73
+ ```bash
74
+ uv venv --python 3.12
75
+ source .venv/bin/activate # On Unix/macOS
76
+ # Or on Windows:
77
+ # .venv\Scripts\activate
78
+ ```
79
+
80
+ 4. Install dependencies:
81
+
82
+ ```bash
83
+ uv pip install -r requirements.txt
84
+ ```
85
+
86
+ ### Browser Automation Tool (Optional)
87
+ ```bash
88
+ playwright install
89
+ ```
90
+
91
+ ## Configuration
92
+
93
+ OpenManus requires configuration for the LLM APIs it uses. Follow these steps to set up your configuration:
94
+
95
+ 1. Create a `config.toml` file in the `config` directory (you can copy from the example):
96
+
97
+ ```bash
98
+ cp config/config.example.toml config/config.toml
99
+ ```
100
+
101
+ 2. Edit `config/config.toml` to add your API keys and customize settings:
102
+
103
+ ```toml
104
+ # Global LLM configuration
105
+ [llm]
106
+ model = "gpt-4o"
107
+ base_url = "https://api.openai.com/v1"
108
+ api_key = "sk-..." # Replace with your actual API key
109
+ max_tokens = 4096
110
+ temperature = 0.0
111
+
112
+ # Optional configuration for specific LLM models
113
+ [llm.vision]
114
+ model = "gpt-4o"
115
+ base_url = "https://api.openai.com/v1"
116
+ api_key = "sk-..." # Replace with your actual API key
117
+ ```
118
+
119
+ ## Quick Start
120
+
121
+ One line to run OpenManus:
122
+
123
+ ```bash
124
+ python main.py
125
+ ```
126
+
127
+ Then input your idea via terminal!
128
+
129
+ For MCP tool version, you can run:
130
+ ```bash
131
+ python run_mcp.py
132
+ ```
133
+
134
+ For unstable multi-agent version, you also can run:
135
+
136
+ ```bash
137
+ python run_flow.py
138
+ ```
139
+
140
+ ### Custom Adding Multiple Agents
141
+
142
+ Currently, besides the general OpenManus Agent, we have also integrated the DataAnalysis Agent, which is suitable for data analysis and data visualization tasks. You can add this agent to `run_flow` in `config.toml`.
143
+
144
+ ```toml
145
+ # Optional configuration for run-flow
146
+ [runflow]
147
+ use_data_analysis_agent = true # Disabled by default, change to true to activate
148
+ ```
149
+ In addition, you need to install the relevant dependencies to ensure the agent runs properly: [Detailed Installation Guide](app/tool/chart_visualization/README.md#installation)
150
+
151
+ ## How to contribute
152
+
153
+ We welcome any friendly suggestions and helpful contributions! Just create issues or submit pull requests.
154
+
155
+ Or contact @mannaandpoem via 📧email: mannaandpoem@gmail.com
156
+
157
+ **Note**: Before submitting a pull request, please use the pre-commit tool to check your changes. Run `pre-commit run --all-files` to execute the checks.
158
+
159
+ ## Community Group
160
+ Join our networking group on Feishu and share your experience with other developers!
161
+
162
+ <div align="center" style="display: flex; gap: 20px;">
163
+ <img src="assets/community_group.jpg" alt="OpenManus 交流群" width="300" />
164
+ </div>
165
+
166
+ ## Star History
167
+
168
+ [![Star History Chart](https://api.star-history.com/svg?repos=FoundationAgents/OpenManus&type=Date)](https://star-history.com/#FoundationAgents/OpenManus&Date)
169
+
170
+ ## Sponsors
171
+ Thanks to [PPIO](https://ppinfra.com/user/register?invited_by=OCPKCN&utm_source=github_openmanus&utm_medium=github_readme&utm_campaign=link) for computing source support.
172
+ > PPIO: The most affordable and easily-integrated MaaS and GPU cloud solution.
173
+
174
+
175
+ ## Acknowledgement
176
+
177
+ Thanks to [anthropic-computer-use](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo), [browser-use](https://github.com/browser-use/browser-use) and [crawl4ai](https://github.com/unclecode/crawl4ai) for providing basic support for this project!
178
+
179
+ Additionally, we are grateful to [AAAJ](https://github.com/metauto-ai/agent-as-a-judge), [MetaGPT](https://github.com/geekan/MetaGPT), [OpenHands](https://github.com/All-Hands-AI/OpenHands) and [SWE-agent](https://github.com/SWE-agent/SWE-agent).
180
+
181
+ We also thank stepfun(阶跃星辰) for supporting our Hugging Face demo space.
182
+
183
+ OpenManus is built by contributors from MetaGPT. Huge thanks to this agent community!
184
+
185
+ ## Cite
186
+ ```bibtex
187
+ @misc{openmanus2025,
188
+ author = {Xinbin Liang and Jinyu Xiang and Zhaoyang Yu and Jiayi Zhang and Sirui Hong and Sheng Fan and Xiao Tang and Bang Liu and Yuyu Luo and Chenglin Wu},
189
+ title = {OpenManus: An open-source framework for building general AI agents},
190
+ year = {2025},
191
+ publisher = {Zenodo},
192
+ doi = {10.5281/zenodo.15186407},
193
+ url = {https://doi.org/10.5281/zenodo.15186407},
194
+ }
195
+ ```
README_ja.md ADDED
@@ -0,0 +1,193 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <p align="center">
2
+ <img src="assets/logo.jpg" width="200"/>
3
+ </p>
4
+
5
+ [English](README.md) | [中文](README_zh.md) | [한국어](README_ko.md) | 日本語
6
+
7
+ [![GitHub stars](https://img.shields.io/github/stars/FoundationAgents/OpenManus?style=social)](https://github.com/FoundationAgents/OpenManus/stargazers)
8
+ &ensp;
9
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) &ensp;
10
+ [![Discord Follow](https://dcbadge.vercel.app/api/server/DYn29wFk9z?style=flat)](https://discord.gg/DYn29wFk9z)
11
+ [![Demo](https://img.shields.io/badge/Demo-Hugging%20Face-yellow)](https://huggingface.co/spaces/lyh-917/OpenManusDemo)
12
+ [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15186407.svg)](https://doi.org/10.5281/zenodo.15186407)
13
+
14
+ # 👋 OpenManus
15
+
16
+ Manusは素晴らしいですが、OpenManusは*招待コード*なしでどんなアイデアも実現できます!🛫
17
+
18
+ 私たちのチームメンバー [@Xinbin Liang](https://github.com/mannaandpoem) と [@Jinyu Xiang](https://github.com/XiangJinyu)(主要開発者)、そして [@Zhaoyang Yu](https://github.com/MoshiQAQ)、[@Jiayi Zhang](https://github.com/didiforgithub)、[@Sirui Hong](https://github.com/stellaHSR) は [@MetaGPT](https://github.com/geekan/MetaGPT) から来ました。プロトタイプは3時間以内に立ち上げられ、継続的に開発を進めています!
19
+
20
+ これはシンプルな実装ですので、どんな提案、貢献、フィードバックも歓迎します!
21
+
22
+ OpenManusで自分だけのエージェントを楽しみましょう!
23
+
24
+ また、UIUCとOpenManusの研究者が共同開発した[OpenManus-RL](https://github.com/OpenManus/OpenManus-RL)をご紹介できることを嬉しく思います。これは強化学習(RL)ベース(GRPOなど)のLLMエージェントチューニング手法に特化したオープンソースプロジェクトです。
25
+
26
+ ## プロジェクトデモ
27
+
28
+ <video src="https://private-user-images.githubusercontent.com/61239030/420168772-6dcfd0d2-9142-45d9-b74e-d10aa75073c6.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEzMTgwNTksIm5iZiI6MTc0MTMxNzc1OSwicGF0aCI6Ii82MTIzOTAzMC80MjAxNjg3NzItNmRjZmQwZDItOTE0Mi00NWQ5LWI3NGUtZDEwYWE3NTA3M2M2Lm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA3VDAzMjIzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdiZjFkNjlmYWNjMmEzOTliM2Y3M2VlYjgyNDRlZDJmOWE3NWZhZjE1MzhiZWY4YmQ3NjdkNTYwYTU5ZDA2MzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.UuHQCgWYkh0OQq9qsUWqGsUbhG3i9jcZDAMeHjLt5T4" data-canonical-src="https://private-user-images.githubusercontent.com/61239030/420168772-6dcfd0d2-9142-45d9-b74e-d10aa75073c6.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEzMTgwNTksIm5iZiI6MTc0MTMxNzc1OSwicGF0aCI6Ii82MTIzOTAzMC80MjAxNjg3NzItNmRjZmQwZDItOTE0Mi00NWQ5LWI3NGUtZDEwYWE3NTA3M2M2Lm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA3VDAzMjIzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdiZjFkNjlmYWNjMmEzOTliM2Y3M2VlYjgyNDRlZDJmOWE3NWZhZjE1MzhiZWY4YmQ3NjdkNTYwYTU5ZDA2MzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.UuHQCgWYkh0OQq9qsUWqGsUbhG3i9jcZDAMeHjLt5T4" controls="controls" muted="muted" class="d-block rounded-bottom-2 border-top width-fit" style="max-height:640px; min-height: 200px"></video>
29
+
30
+ ## インストール方法
31
+
32
+ インストール方法は2つ提供しています。方法2(uvを使用)は、より高速なインストールと優れた依存関係管理のため推奨されています。
33
+
34
+ ### 方法1:condaを使用
35
+
36
+ 1. 新しいconda環境を作成します:
37
+
38
+ ```bash
39
+ conda create -n open_manus python=3.12
40
+ conda activate open_manus
41
+ ```
42
+
43
+ 2. リポジトリをクローンします:
44
+
45
+ ```bash
46
+ git clone https://github.com/FoundationAgents/OpenManus.git
47
+ cd OpenManus
48
+ ```
49
+
50
+ 3. 依存関係をインストールします:
51
+
52
+ ```bash
53
+ pip install -r requirements.txt
54
+ ```
55
+
56
+ ### 方法2:uvを使用(推奨)
57
+
58
+ 1. uv(高速なPythonパッケージインストーラーと管理機能)をインストールします:
59
+
60
+ ```bash
61
+ curl -LsSf https://astral.sh/uv/install.sh | sh
62
+ ```
63
+
64
+ 2. リポジトリをクローンします:
65
+
66
+ ```bash
67
+ git clone https://github.com/FoundationAgents/OpenManus.git
68
+ cd OpenManus
69
+ ```
70
+
71
+ 3. 新しい仮想環境を作成してアクティベートします:
72
+
73
+ ```bash
74
+ uv venv --python 3.12
75
+ source .venv/bin/activate # Unix/macOSの場合
76
+ # Windowsの場合:
77
+ # .venv\Scripts\activate
78
+ ```
79
+
80
+ 4. 依存関係をインストールします:
81
+
82
+ ```bash
83
+ uv pip install -r requirements.txt
84
+ ```
85
+
86
+ ### ブラウザ自動化ツール(オプション)
87
+ ```bash
88
+ playwright install
89
+ ```
90
+
91
+ ## 設定
92
+
93
+ OpenManusを使用するには、LLM APIの設定が必要です。以下の手順に従って設定してください:
94
+
95
+ 1. `config`ディレクトリに`config.toml`ファイルを作成します(サンプルからコピーできます):
96
+
97
+ ```bash
98
+ cp config/config.example.toml config/config.toml
99
+ ```
100
+
101
+ 2. `config/config.toml`を編集してAPIキーを追加し、設定をカスタマイズします:
102
+
103
+ ```toml
104
+ # グローバルLLM設定
105
+ [llm]
106
+ model = "gpt-4o"
107
+ base_url = "https://api.openai.com/v1"
108
+ api_key = "sk-..." # 実際のAPIキーに置き換えてください
109
+ max_tokens = 4096
110
+ temperature = 0.0
111
+
112
+ # 特定のLLMモデル用のオプション設定
113
+ [llm.vision]
114
+ model = "gpt-4o"
115
+ base_url = "https://api.openai.com/v1"
116
+ api_key = "sk-..." # 実際のAPIキーに置き換えてください
117
+ ```
118
+
119
+ ## クイックスタート
120
+
121
+ OpenManusを実行する一行コマンド:
122
+
123
+ ```bash
124
+ python main.py
125
+ ```
126
+
127
+ その後、ターミナルからプロンプトを入力してください!
128
+
129
+ MCP ツールバージョンを使用する場合は、以下を実行します:
130
+ ```bash
131
+ python run_mcp.py
132
+ ```
133
+
134
+ 開発中のマルチエージェントバージョンを試すには、以下を実行します:
135
+
136
+ ```bash
137
+ python run_flow.py
138
+ ```
139
+
140
+ ## カスタムマルチエージェントの追加
141
+
142
+ 現在、一般的なOpenManusエージェントに加えて、データ分析とデータ可視化タスクに適したDataAnalysisエージェントが組み込まれています。このエージェントを`config.toml`の`run_flow`に追加することができます。
143
+
144
+ ```toml
145
+ # run-flowのオプション設定
146
+ [runflow]
147
+ use_data_analysis_agent = true # デフォルトでは無効、trueに変更すると有効化されます
148
+ ```
149
+
150
+ これに加えて、エージェントが正常に動作するために必要な依存関係をインストールする必要があります:[具体的なインストールガイド](app/tool/chart_visualization/README_ja.md#インストール)
151
+
152
+
153
+ ## 貢献方法
154
+
155
+ 我々は建設的な意見や有益な貢献を歓迎します!issueを作成するか、プルリクエストを提出してください。
156
+
157
+ または @mannaandpoem に📧メールでご連絡ください:mannaandpoem@gmail.com
158
+
159
+ **注意**: プルリクエストを送信する前に、pre-commitツールを使用して変更を確認してください。`pre-commit run --all-files`を実行してチェックを実行します。
160
+
161
+ ## コミュニティグループ
162
+ Feishuのネットワーキンググループに参加して、他の開発者と経験を共有しましょう!
163
+
164
+ <div align="center" style="display: flex; gap: 20px;">
165
+ <img src="assets/community_group.jpg" alt="OpenManus 交流群" width="300" />
166
+ </div>
167
+
168
+ ## スター履歴
169
+
170
+ [![Star History Chart](https://api.star-history.com/svg?repos=FoundationAgents/OpenManus&type=Date)](https://star-history.com/#FoundationAgents/OpenManus&Date)
171
+
172
+ ## 謝辞
173
+
174
+ このプロジェクトの基本的なサポートを提供してくれた[anthropic-computer-use](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo)
175
+ と[browser-use](https://github.com/browser-use/browser-use)に感謝します!
176
+
177
+ さらに、[AAAJ](https://github.com/metauto-ai/agent-as-a-judge)、[MetaGPT](https://github.com/geekan/MetaGPT)、[OpenHands](https://github.com/All-Hands-AI/OpenHands)、[SWE-agent](https://github.com/SWE-agent/SWE-agent)にも感謝します。
178
+
179
+ また、Hugging Face デモスペースをサポートしてくださった阶跃星辰 (stepfun)にも感謝いたします。
180
+
181
+ OpenManusはMetaGPTのコントリビューターによって構築されました。このエージェントコミュニティに大きな感謝を!
182
+
183
+ ## 引用
184
+ ```bibtex
185
+ @misc{openmanus2025,
186
+ author = {Xinbin Liang and Jinyu Xiang and Zhaoyang Yu and Jiayi Zhang and Sirui Hong and Sheng Fan and Xiao Tang},
187
+ title = {OpenManus: An open-source framework for building general AI agents},
188
+ year = {2025},
189
+ publisher = {Zenodo},
190
+ doi = {10.5281/zenodo.15186407},
191
+ url = {https://doi.org/10.5281/zenodo.15186407},
192
+ }
193
+ ```
README_ko.md ADDED
@@ -0,0 +1,192 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <p align="center">
2
+ <img src="assets/logo.jpg" width="200"/>
3
+ </p>
4
+
5
+ [English](README.md) | [中文](README_zh.md) | 한국어 | [日本語](README_ja.md)
6
+
7
+ [![GitHub stars](https://img.shields.io/github/stars/FoundationAgents/OpenManus?style=social)](https://github.com/FoundationAgents/OpenManus/stargazers)
8
+ &ensp;
9
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) &ensp;
10
+ [![Discord Follow](https://dcbadge.vercel.app/api/server/DYn29wFk9z?style=flat)](https://discord.gg/DYn29wFk9z)
11
+ [![Demo](https://img.shields.io/badge/Demo-Hugging%20Face-yellow)](https://huggingface.co/spaces/lyh-917/OpenManusDemo)
12
+ [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15186407.svg)](https://doi.org/10.5281/zenodo.15186407)
13
+
14
+ # 👋 OpenManus
15
+
16
+ Manus는 놀라운 도구지만, OpenManus는 *초대 코드* 없이도 모든 아이디어를 실현할 수 있습니다! 🛫
17
+
18
+ 우리 팀의 멤버인 [@Xinbin Liang](https://github.com/mannaandpoem)와 [@Jinyu Xiang](https://github.com/XiangJinyu) (핵심 작성자), 그리고 [@Zhaoyang Yu](https://github.com/MoshiQAQ), [@Jiayi Zhang](https://github.com/didiforgithub), [@Sirui Hong](https://github.com/stellaHSR)이 함께 했습니다. 우리는 [@MetaGPT](https://github.com/geekan/MetaGPT)로부터 왔습니다. 프로토타입은 단 3시간 만에 출시되었으며, 계속해서 발전하고 있습니다!
19
+
20
+ 이 프로젝트는 간단한 구현에서 시작되었으며, 여러분의 제안, 기여 및 피드백을 환영합니다!
21
+
22
+ OpenManus를 통해 여러분만의 에이전트를 즐겨보세요!
23
+
24
+ 또한 [OpenManus-RL](https://github.com/OpenManus/OpenManus-RL)을 소개하게 되어 기쁩니다. OpenManus와 UIUC 연구자들이 공동 개발한 이 오픈소스 프로젝트는 LLM 에이전트에 대해 강화 학습(RL) 기반 (예: GRPO) 튜닝 방법을 제공합니다.
25
+
26
+ ## 프로젝트 데모
27
+
28
+ <video src="https://private-user-images.githubusercontent.com/61239030/420168772-6dcfd0d2-9142-45d9-b74e-d10aa75073c6.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEzMTgwNTksIm5iZiI6MTc0MTMxNzc1OSwicGF0aCI6Ii82MTIzOTAzMC80MjAxNjg3NzItNmRjZmQwZDItOTE0Mi00NWQ5LWI3NGUtZDEwYWE3NTA3M2M2Lm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA3VDAzMjIzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdiZjFkNjlmYWNjMmEzOTliM2Y3M2VlYjgyNDRlZDJmOWE3NWZhZjE1MzhiZWY4YmQ3NjdkNTYwYTU5ZDA2MzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.UuHQCgWYkh0OQq9qsUWqGsUbhG3i9jcZDAMeHjLt5T4" data-canonical-src="https://private-user-images.githubusercontent.com/61239030/420168772-6dcfd0d2-9142-45d9-b74e-d10aa75073c6.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEzMTgwNTksIm5iZiI6MTc0MTMxNzc1OSwicGF0aCI6Ii82MTIzOTAzMC80MjAxNjg3NzItNmRjZmQwZDItOTE0Mi00NWQ5LWI3NGUtZDEwYWE3NTA3M2M2Lm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA3VDAzMjIzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdiZjFkNjlmYWNjMmEzOTliM2Y3M2VlYjgyNDRlZDJmOWE3NWZhZjE1MzhiZWY4YmQ3NjdkNTYwYTU5ZDA2MzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.UuHQCgWYkh0OQq9qsUWqGsUbhG3i9jcZDAMeHjLt5T4" controls="controls" muted="muted" class="d-block rounded-bottom-2 border-top width-fit" style="max-height:640px; min-height: 200px"></video>
29
+
30
+ ## 설치 방법
31
+
32
+ 두 가지 설치 방법을 제공합니다. **방법 2 (uv 사용)** 이 더 빠른 설치와 효율적인 종속성 관리를 위해 권장됩니다.
33
+
34
+ ### 방법 1: conda 사용
35
+
36
+ 1. 새로운 conda 환경을 생성합니다:
37
+
38
+ ```bash
39
+ conda create -n open_manus python=3.12
40
+ conda activate open_manus
41
+ ```
42
+
43
+ 2. 저장소를 클론합니다:
44
+
45
+ ```bash
46
+ git clone https://github.com/FoundationAgents/OpenManus.git
47
+ cd OpenManus
48
+ ```
49
+
50
+ 3. 종속성을 설치합니다:
51
+
52
+ ```bash
53
+ pip install -r requirements.txt
54
+ ```
55
+
56
+ ### 방법 2: uv 사용 (권장)
57
+
58
+ 1. uv를 설치합니다. (빠른 Python 패키지 설치 및 종속성 관리 도구):
59
+
60
+ ```bash
61
+ curl -LsSf https://astral.sh/uv/install.sh | sh
62
+ ```
63
+
64
+ 2. 저장소를 클론합니다:
65
+
66
+ ```bash
67
+ git clone https://github.com/FoundationAgents/OpenManus.git
68
+ cd OpenManus
69
+ ```
70
+
71
+ 3. 새로운 가상 환경을 생성하고 활성화합니다:
72
+
73
+ ```bash
74
+ uv venv --python 3.12
75
+ source .venv/bin/activate # Unix/macOS의 경우
76
+ # Windows의 경우:
77
+ # .venv\Scripts\activate
78
+ ```
79
+
80
+ 4. 종속성을 설치합니다:
81
+
82
+ ```bash
83
+ uv pip install -r requirements.txt
84
+ ```
85
+
86
+ ### 브라우저 자동화 도구 (선택사항)
87
+ ```bash
88
+ playwright install
89
+ ```
90
+
91
+ ## 설정 방법
92
+
93
+ OpenManus를 사용하려면 사용하는 LLM API에 대한 설정이 필요합니다. 아래 단계를 따라 설정을 완료하세요:
94
+
95
+ 1. `config` 디렉토리에 `config.toml` 파일을 생성하세요 (예제 파일을 복사하여 사용할 수 있습니다):
96
+
97
+ ```bash
98
+ cp config/config.example.toml config/config.toml
99
+ ```
100
+
101
+ 2. `config/config.toml` 파일을 편집하여 API 키를 추가하고 설정을 커스터마이징하세요:
102
+
103
+ ```toml
104
+ # 전역 LLM 설정
105
+ [llm]
106
+ model = "gpt-4o"
107
+ base_url = "https://api.openai.com/v1"
108
+ api_key = "sk-..." # 실제 API 키로 변경하세요
109
+ max_tokens = 4096
110
+ temperature = 0.0
111
+
112
+ # 특정 LLM 모델에 대한 선택적 설정
113
+ [llm.vision]
114
+ model = "gpt-4o"
115
+ base_url = "https://api.openai.com/v1"
116
+ api_key = "sk-..." # 실제 API 키로 변경하세요
117
+ ```
118
+
119
+ ## 빠른 시작
120
+
121
+ OpenManus를 실행하는 한 줄 명령어:
122
+
123
+ ```bash
124
+ python main.py
125
+ ```
126
+
127
+ 이후 터미널에서 아이디어를 작성하세요!
128
+
129
+ MCP 도구 버전을 사용하려면 다음을 실행하세요:
130
+ ```bash
131
+ python run_mcp.py
132
+ ```
133
+
134
+ 불안정한 멀티 에이전트 버전을 실행하려면 다음을 실행할 수 있습니다:
135
+
136
+ ```bash
137
+ python run_flow.py
138
+ ```
139
+
140
+ ### 사용자 정의 다중 에이전트 추가
141
+
142
+ 현재 일반 OpenManus 에이전트 외에도 데이터 분석 및 데이터 시각화 작업에 적합한 DataAnalysis 에이전트를 통합했습니다. 이 에이전트를 `config.toml`의 `run_flow`에 추가할 수 있습니다.
143
+
144
+ ```toml
145
+ # run-flow에 대한 선택적 구성
146
+ [runflow]
147
+ use_data_analysis_agent = true # 기본적으로 비활성화되어 있으며, 활성화하려면 true로 변경
148
+ ```
149
+
150
+ 또한, 에이전트가 제대로 작동하도록 관련 종속성을 설치해야 합니다: [상세 설치 가이드](app/tool/chart_visualization/README.md#installation)
151
+
152
+ ## 기여 방법
153
+
154
+ 모든 친절한 제안과 유용한 기여를 환영합니다! 이슈를 생성하거나 풀 리퀘스트를 제출해 주세요.
155
+
156
+ 또는 📧 메일로 연락주세요. @mannaandpoem : mannaandpoem@gmail.com
157
+
158
+ **참고**: pull request를 제출하기 전에 pre-commit 도구를 사용하여 변경 사항을 확인하십시오. `pre-commit run --all-files`를 실행하여 검사를 실행합니다.
159
+
160
+ ## 커뮤니티 그룹
161
+ Feishu 네트워킹 그룹에 참여하여 다른 개발자들과 경험을 공유하세요!
162
+
163
+ <div align="center" style="display: flex; gap: 20px;">
164
+ <img src="assets/community_group.jpg" alt="OpenManus 交流群" width="300" />
165
+ </div>
166
+
167
+ ## Star History
168
+
169
+ [![Star History Chart](https://api.star-history.com/svg?repos=FoundationAgents/OpenManus&type=Date)](https://star-history.com/#FoundationAgents/OpenManus&Date)
170
+
171
+ ## 감사의 글
172
+
173
+ 이 프로젝트에 기본적인 지원을 제공해 주신 [anthropic-computer-use](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo)와
174
+ [browser-use](https://github.com/browser-use/browser-use)에게 감사드립니다!
175
+
176
+ 또한, [AAAJ](https://github.com/metauto-ai/agent-as-a-judge), [MetaGPT](https://github.com/geekan/MetaGPT), [OpenHands](https://github.com/All-Hands-AI/OpenHands), [SWE-agent](https://github.com/SWE-agent/SWE-agent)에 깊은 감사를 드립니다.
177
+
178
+ 또한 Hugging Face 데모 공간을 지원해 주신 阶跃星辰 (stepfun)에게 감사드립니다.
179
+
180
+ OpenManus는 MetaGPT 기여자들에 의해 개발되었습니다. 이 에이전트 커뮤니티에 깊은 감사를 전합니다!
181
+
182
+ ## 인용
183
+ ```bibtex
184
+ @misc{openmanus2025,
185
+ author = {Xinbin Liang and Jinyu Xiang and Zhaoyang Yu and Jiayi Zhang and Sirui Hong and Sheng Fan and Xiao Tang},
186
+ title = {OpenManus: An open-source framework for building general AI agents},
187
+ year = {2025},
188
+ publisher = {Zenodo},
189
+ doi = {10.5281/zenodo.15186407},
190
+ url = {https://doi.org/10.5281/zenodo.15186407},
191
+ }
192
+ ```
README_zh.md ADDED
@@ -0,0 +1,198 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <p align="center">
2
+ <img src="assets/logo.jpg" width="200"/>
3
+ </p>
4
+
5
+ [English](README.md) | 中文 | [한국어](README_ko.md) | [日本語](README_ja.md)
6
+
7
+ [![GitHub stars](https://img.shields.io/github/stars/FoundationAgents/OpenManus?style=social)](https://github.com/FoundationAgents/OpenManus/stargazers)
8
+ &ensp;
9
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) &ensp;
10
+ [![Discord Follow](https://dcbadge.vercel.app/api/server/DYn29wFk9z?style=flat)](https://discord.gg/DYn29wFk9z)
11
+ [![Demo](https://img.shields.io/badge/Demo-Hugging%20Face-yellow)](https://huggingface.co/spaces/lyh-917/OpenManusDemo)
12
+ [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15186407.svg)](https://doi.org/10.5281/zenodo.15186407)
13
+
14
+ # 👋 OpenManus
15
+
16
+ Manus 非常棒,但 OpenManus 无需邀请码即可实现任何创意 🛫!
17
+
18
+ 我们的团队成员 [@Xinbin Liang](https://github.com/mannaandpoem) 和 [@Jinyu Xiang](https://github.com/XiangJinyu)(核心作者),以及 [@Zhaoyang Yu](https://github.com/MoshiQAQ)、[@Jiayi Zhang](https://github.com/didiforgithub) 和 [@Sirui Hong](https://github.com/stellaHSR),来自 [@MetaGPT](https://github.com/geekan/MetaGPT)团队。我们在 3
19
+ 小时内完成了开发并持续迭代中!
20
+
21
+ 这是一个简洁的实现方案,欢迎任何建议、贡献和反馈!
22
+
23
+ 用 OpenManus 开启你的智能体之旅吧!
24
+
25
+ 我们也非常高兴地向大家介绍 [OpenManus-RL](https://github.com/OpenManus/OpenManus-RL),这是一个专注于基于强化学习(RL,例如 GRPO)的方法来优化大语言模型(LLM)智能体的开源项目,由来自UIUC 和 OpenManus 的研究人员合作开发。
26
+
27
+ ## 项目演示
28
+
29
+ <video src="https://private-user-images.githubusercontent.com/61239030/420168772-6dcfd0d2-9142-45d9-b74e-d10aa75073c6.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEzMTgwNTksIm5iZiI6MTc0MTMxNzc1OSwicGF0aCI6Ii82MTIzOTAzMC80MjAxNjg3NzItNmRjZmQwZDItOTE0Mi00NWQ5LWI3NGUtZDEwYWE3NTA3M2M2Lm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA3VDAzMjIzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdiZjFkNjlmYWNjMmEzOTliM2Y3M2VlYjgyNDRlZDJmOWE3NWZhZjE1MzhiZWY4YmQ3NjdkNTYwYTU5ZDA2MzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.UuHQCgWYkh0OQq9qsUWqGsUbhG3i9jcZDAMeHjLt5T4" data-canonical-src="https://private-user-images.githubusercontent.com/61239030/420168772-6dcfd0d2-9142-45d9-b74e-d10aa75073c6.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NDEzMTgwNTksIm5iZiI6MTc0MTMxNzc1OSwicGF0aCI6Ii82MTIzOTAzMC80MjAxNjg3NzItNmRjZmQwZDItOTE0Mi00NWQ5LWI3NGUtZDEwYWE3NTA3M2M2Lm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAzMDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMzA3VDAzMjIzOVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdiZjFkNjlmYWNjMmEzOTliM2Y3M2VlYjgyNDRlZDJmOWE3NWZhZjE1MzhiZWY4YmQ3NjdkNTYwYTU5ZDA2MzYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.UuHQCgWYkh0OQq9qsUWqGsUbhG3i9jcZDAMeHjLt5T4" controls="controls" muted="muted" class="d-block rounded-bottom-2 border-top width-fit" style="max-height:640px; min-height: 200px"></video>
30
+
31
+ ## 安装指南
32
+
33
+ 我们提供两种安装方式。推荐使用方式二(uv),因为它能提供更快的安装速度和更好的依赖管理。
34
+
35
+ ### 方式一:使用 conda
36
+
37
+ 1. 创建新的 conda 环境:
38
+
39
+ ```bash
40
+ conda create -n open_manus python=3.12
41
+ conda activate open_manus
42
+ ```
43
+
44
+ 2. 克隆仓库:
45
+
46
+ ```bash
47
+ git clone https://github.com/FoundationAgents/OpenManus.git
48
+ cd OpenManus
49
+ ```
50
+
51
+ 3. 安装依赖:
52
+
53
+ ```bash
54
+ pip install -r requirements.txt
55
+ ```
56
+
57
+ ### 方式二:使用 uv(推荐)
58
+
59
+ 1. 安装 uv(一个快速的 Python 包管理器):
60
+
61
+ ```bash
62
+ curl -LsSf https://astral.sh/uv/install.sh | sh
63
+ ```
64
+
65
+ 2. 克隆仓库:
66
+
67
+ ```bash
68
+ git clone https://github.com/FoundationAgents/OpenManus.git
69
+ cd OpenManus
70
+ ```
71
+
72
+ 3. 创建并激活虚拟环境:
73
+
74
+ ```bash
75
+ uv venv --python 3.12
76
+ source .venv/bin/activate # Unix/macOS 系统
77
+ # Windows 系统使用:
78
+ # .venv\Scripts\activate
79
+ ```
80
+
81
+ 4. 安装依赖:
82
+
83
+ ```bash
84
+ uv pip install -r requirements.txt
85
+ ```
86
+
87
+ ### 浏览器自动化工具(可选)
88
+ ```bash
89
+ playwright install
90
+ ```
91
+
92
+ ## 配置说明
93
+
94
+ OpenManus 需要配置使用的 LLM API,请按以下步骤设置:
95
+
96
+ 1. 在 `config` 目录创建 `config.toml` 文件(可从示例复制):
97
+
98
+ ```bash
99
+ cp config/config.example.toml config/config.toml
100
+ ```
101
+
102
+ 2. 编辑 `config/config.toml` 添加 API 密钥和自定义设置:
103
+
104
+ ```toml
105
+ # 全局 LLM 配置
106
+ [llm]
107
+ model = "gpt-4o"
108
+ base_url = "https://api.openai.com/v1"
109
+ api_key = "sk-..." # 替换为真实 API 密钥
110
+ max_tokens = 4096
111
+ temperature = 0.0
112
+
113
+ # 可选特定 LLM 模型配置
114
+ [llm.vision]
115
+ model = "gpt-4o"
116
+ base_url = "https://api.openai.com/v1"
117
+ api_key = "sk-..." # 替换为真实 API 密钥
118
+ ```
119
+
120
+ ## 快速启动
121
+
122
+ 一行命令运行 OpenManus:
123
+
124
+ ```bash
125
+ python main.py
126
+ ```
127
+
128
+ 然后通过终端输入你的创意!
129
+
130
+ 如需使用 MCP 工具版本,可运行:
131
+ ```bash
132
+ python run_mcp.py
133
+ ```
134
+
135
+ 如需体验不稳定的多智能体版本,可运行:
136
+
137
+ ```bash
138
+ python run_flow.py
139
+ ```
140
+
141
+ ## 添加自定义多智能体
142
+
143
+ 目前除了通用的 OpenManus Agent, 我们还内置了DataAnalysis Agent,适用于数据分析和数据可视化任务,你可以在`config.toml`中将这个智能体加入到`run_flow`中
144
+ ```toml
145
+ # run-flow可选配置
146
+ [runflow]
147
+ use_data_analysis_agent = true # 默认关闭,将其改为true则为激活
148
+ ```
149
+ 除此之外,你还需要安装相关的依赖来确保智能体正常运行:[具体安装指南](app/tool/chart_visualization/README_zh.md#安装)
150
+
151
+
152
+ ## 贡献指南
153
+
154
+ 我们欢迎任何友好的建议和有价值的贡献!可以直接创建 issue 或提交 pull request。
155
+
156
+ 或通过 📧 邮件联系 @mannaandpoem:mannaandpoem@gmail.com
157
+
158
+ **注意**: 在提交 pull request 之前,请使用 pre-commit 工具检查您的更改。运行 `pre-commit run --all-files` 来执行检查。
159
+
160
+ ## 交流群
161
+
162
+ 加入我们的飞书交流群,与其他开发者分享经验!
163
+
164
+ <div align="center" style="display: flex; gap: 20px;">
165
+ <img src="assets/community_group.jpg" alt="OpenManus 交流群" width="300" />
166
+ </div>
167
+
168
+ ## Star 数量
169
+
170
+ [![Star History Chart](https://api.star-history.com/svg?repos=FoundationAgents/OpenManus&type=Date)](https://star-history.com/#FoundationAgents/OpenManus&Date)
171
+
172
+
173
+ ## 赞助商
174
+ 感谢[PPIO](https://ppinfra.com/user/register?invited_by=OCPKCN&utm_source=github_openmanus&utm_medium=github_readme&utm_campaign=link) 提供的算力支持。
175
+ > PPIO派欧云:一键调用高性价比的开源模型API和GPU容器
176
+
177
+ ## 致谢
178
+
179
+ 特别感谢 [anthropic-computer-use](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo)
180
+ 和 [browser-use](https://github.com/browser-use/browser-use) 为本项目提供的基础支持!
181
+
182
+ 此外,我们感谢 [AAAJ](https://github.com/metauto-ai/agent-as-a-judge),[MetaGPT](https://github.com/geekan/MetaGPT),[OpenHands](https://github.com/All-Hands-AI/OpenHands) 和 [SWE-agent](https://github.com/SWE-agent/SWE-agent).
183
+
184
+ 我们也感谢阶跃星辰 (stepfun) 提供的 Hugging Face 演示空间支持。
185
+
186
+ OpenManus 由 MetaGPT 社区的贡献者共同构建,感谢这个充满活力的智能体开发者社区!
187
+
188
+ ## 引用
189
+ ```bibtex
190
+ @misc{openmanus2025,
191
+ author = {Xinbin Liang and Jinyu Xiang and Zhaoyang Yu and Jiayi Zhang and Sirui Hong and Sheng Fan and Xiao Tang},
192
+ title = {OpenManus: An open-source framework for building general AI agents},
193
+ year = {2025},
194
+ publisher = {Zenodo},
195
+ doi = {10.5281/zenodo.15186407},
196
+ url = {https://doi.org/10.5281/zenodo.15186407},
197
+ }
198
+ ```
app/__init__.py ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
 
1
+ # Python version check: 3.11-3.13
2
+ import sys
3
+
4
+
5
+ if sys.version_info < (3, 11) or sys.version_info > (3, 13):
6
+ print(
7
+ "Warning: Unsupported Python version {ver}, please use 3.11-3.13".format(
8
+ ver=".".join(map(str, sys.version_info))
9
+ )
10
+ )
app/agent/__init__.py ADDED
@@ -0,0 +1,16 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from app.agent.base import BaseAgent
2
+ from app.agent.browser import BrowserAgent
3
+ from app.agent.mcp import MCPAgent
4
+ from app.agent.react import ReActAgent
5
+ from app.agent.swe import SWEAgent
6
+ from app.agent.toolcall import ToolCallAgent
7
+
8
+
9
+ __all__ = [
10
+ "BaseAgent",
11
+ "BrowserAgent",
12
+ "ReActAgent",
13
+ "SWEAgent",
14
+ "ToolCallAgent",
15
+ "MCPAgent",
16
+ ]
app/agent/base.py ADDED
@@ -0,0 +1,196 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from abc import ABC, abstractmethod
2
+ from contextlib import asynccontextmanager
3
+ from typing import List, Optional
4
+
5
+ from pydantic import BaseModel, Field, model_validator
6
+
7
+ from app.llm import LLM
8
+ from app.logger import logger
9
+ from app.sandbox.client import SANDBOX_CLIENT
10
+ from app.schema import ROLE_TYPE, AgentState, Memory, Message
11
+
12
+
13
+ class BaseAgent(BaseModel, ABC):
14
+ """Abstract base class for managing agent state and execution.
15
+
16
+ Provides foundational functionality for state transitions, memory management,
17
+ and a step-based execution loop. Subclasses must implement the `step` method.
18
+ """
19
+
20
+ # Core attributes
21
+ name: str = Field(..., description="Unique name of the agent")
22
+ description: Optional[str] = Field(None, description="Optional agent description")
23
+
24
+ # Prompts
25
+ system_prompt: Optional[str] = Field(
26
+ None, description="System-level instruction prompt"
27
+ )
28
+ next_step_prompt: Optional[str] = Field(
29
+ None, description="Prompt for determining next action"
30
+ )
31
+
32
+ # Dependencies
33
+ llm: LLM = Field(default_factory=LLM, description="Language model instance")
34
+ memory: Memory = Field(default_factory=Memory, description="Agent's memory store")
35
+ state: AgentState = Field(
36
+ default=AgentState.IDLE, description="Current agent state"
37
+ )
38
+
39
+ # Execution control
40
+ max_steps: int = Field(default=10, description="Maximum steps before termination")
41
+ current_step: int = Field(default=0, description="Current step in execution")
42
+
43
+ duplicate_threshold: int = 2
44
+
45
+ class Config:
46
+ arbitrary_types_allowed = True
47
+ extra = "allow" # Allow extra fields for flexibility in subclasses
48
+
49
+ @model_validator(mode="after")
50
+ def initialize_agent(self) -> "BaseAgent":
51
+ """Initialize agent with default settings if not provided."""
52
+ if self.llm is None or not isinstance(self.llm, LLM):
53
+ self.llm = LLM(config_name=self.name.lower())
54
+ if not isinstance(self.memory, Memory):
55
+ self.memory = Memory()
56
+ return self
57
+
58
+ @asynccontextmanager
59
+ async def state_context(self, new_state: AgentState):
60
+ """Context manager for safe agent state transitions.
61
+
62
+ Args:
63
+ new_state: The state to transition to during the context.
64
+
65
+ Yields:
66
+ None: Allows execution within the new state.
67
+
68
+ Raises:
69
+ ValueError: If the new_state is invalid.
70
+ """
71
+ if not isinstance(new_state, AgentState):
72
+ raise ValueError(f"Invalid state: {new_state}")
73
+
74
+ previous_state = self.state
75
+ self.state = new_state
76
+ try:
77
+ yield
78
+ except Exception as e:
79
+ self.state = AgentState.ERROR # Transition to ERROR on failure
80
+ raise e
81
+ finally:
82
+ self.state = previous_state # Revert to previous state
83
+
84
+ def update_memory(
85
+ self,
86
+ role: ROLE_TYPE, # type: ignore
87
+ content: str,
88
+ base64_image: Optional[str] = None,
89
+ **kwargs,
90
+ ) -> None:
91
+ """Add a message to the agent's memory.
92
+
93
+ Args:
94
+ role: The role of the message sender (user, system, assistant, tool).
95
+ content: The message content.
96
+ base64_image: Optional base64 encoded image.
97
+ **kwargs: Additional arguments (e.g., tool_call_id for tool messages).
98
+
99
+ Raises:
100
+ ValueError: If the role is unsupported.
101
+ """
102
+ message_map = {
103
+ "user": Message.user_message,
104
+ "system": Message.system_message,
105
+ "assistant": Message.assistant_message,
106
+ "tool": lambda content, **kw: Message.tool_message(content, **kw),
107
+ }
108
+
109
+ if role not in message_map:
110
+ raise ValueError(f"Unsupported message role: {role}")
111
+
112
+ # Create message with appropriate parameters based on role
113
+ kwargs = {"base64_image": base64_image, **(kwargs if role == "tool" else {})}
114
+ self.memory.add_message(message_map[role](content, **kwargs))
115
+
116
+ async def run(self, request: Optional[str] = None) -> str:
117
+ """Execute the agent's main loop asynchronously.
118
+
119
+ Args:
120
+ request: Optional initial user request to process.
121
+
122
+ Returns:
123
+ A string summarizing the execution results.
124
+
125
+ Raises:
126
+ RuntimeError: If the agent is not in IDLE state at start.
127
+ """
128
+ if self.state != AgentState.IDLE:
129
+ raise RuntimeError(f"Cannot run agent from state: {self.state}")
130
+
131
+ if request:
132
+ self.update_memory("user", request)
133
+
134
+ results: List[str] = []
135
+ async with self.state_context(AgentState.RUNNING):
136
+ while (
137
+ self.current_step < self.max_steps and self.state != AgentState.FINISHED
138
+ ):
139
+ self.current_step += 1
140
+ logger.info(f"Executing step {self.current_step}/{self.max_steps}")
141
+ step_result = await self.step()
142
+
143
+ # Check for stuck state
144
+ if self.is_stuck():
145
+ self.handle_stuck_state()
146
+
147
+ results.append(f"Step {self.current_step}: {step_result}")
148
+
149
+ if self.current_step >= self.max_steps:
150
+ self.current_step = 0
151
+ self.state = AgentState.IDLE
152
+ results.append(f"Terminated: Reached max steps ({self.max_steps})")
153
+ await SANDBOX_CLIENT.cleanup()
154
+ return "\n".join(results) if results else "No steps executed"
155
+
156
+ @abstractmethod
157
+ async def step(self) -> str:
158
+ """Execute a single step in the agent's workflow.
159
+
160
+ Must be implemented by subclasses to define specific behavior.
161
+ """
162
+
163
+ def handle_stuck_state(self):
164
+ """Handle stuck state by adding a prompt to change strategy"""
165
+ stuck_prompt = "\
166
+ Observed duplicate responses. Consider new strategies and avoid repeating ineffective paths already attempted."
167
+ self.next_step_prompt = f"{stuck_prompt}\n{self.next_step_prompt}"
168
+ logger.warning(f"Agent detected stuck state. Added prompt: {stuck_prompt}")
169
+
170
+ def is_stuck(self) -> bool:
171
+ """Check if the agent is stuck in a loop by detecting duplicate content"""
172
+ if len(self.memory.messages) < 2:
173
+ return False
174
+
175
+ last_message = self.memory.messages[-1]
176
+ if not last_message.content:
177
+ return False
178
+
179
+ # Count identical content occurrences
180
+ duplicate_count = sum(
181
+ 1
182
+ for msg in reversed(self.memory.messages[:-1])
183
+ if msg.role == "assistant" and msg.content == last_message.content
184
+ )
185
+
186
+ return duplicate_count >= self.duplicate_threshold
187
+
188
+ @property
189
+ def messages(self) -> List[Message]:
190
+ """Retrieve a list of messages from the agent's memory."""
191
+ return self.memory.messages
192
+
193
+ @messages.setter
194
+ def messages(self, value: List[Message]):
195
+ """Set the list of messages in the agent's memory."""
196
+ self.memory.messages = value
app/agent/browser.py ADDED
@@ -0,0 +1,129 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import json
2
+ from typing import TYPE_CHECKING, Optional
3
+
4
+ from pydantic import Field, model_validator
5
+
6
+ from app.agent.toolcall import ToolCallAgent
7
+ from app.logger import logger
8
+ from app.prompt.browser import NEXT_STEP_PROMPT, SYSTEM_PROMPT
9
+ from app.schema import Message, ToolChoice
10
+ from app.tool import BrowserUseTool, Terminate, ToolCollection
11
+ from app.tool.sandbox.sb_browser_tool import SandboxBrowserTool
12
+
13
+
14
+ # Avoid circular import if BrowserAgent needs BrowserContextHelper
15
+ if TYPE_CHECKING:
16
+ from app.agent.base import BaseAgent # Or wherever memory is defined
17
+
18
+
19
+ class BrowserContextHelper:
20
+ def __init__(self, agent: "BaseAgent"):
21
+ self.agent = agent
22
+ self._current_base64_image: Optional[str] = None
23
+
24
+ async def get_browser_state(self) -> Optional[dict]:
25
+ browser_tool = self.agent.available_tools.get_tool(BrowserUseTool().name)
26
+ if not browser_tool:
27
+ browser_tool = self.agent.available_tools.get_tool(
28
+ SandboxBrowserTool().name
29
+ )
30
+ if not browser_tool or not hasattr(browser_tool, "get_current_state"):
31
+ logger.warning("BrowserUseTool not found or doesn't have get_current_state")
32
+ return None
33
+ try:
34
+ result = await browser_tool.get_current_state()
35
+ if result.error:
36
+ logger.debug(f"Browser state error: {result.error}")
37
+ return None
38
+ if hasattr(result, "base64_image") and result.base64_image:
39
+ self._current_base64_image = result.base64_image
40
+ else:
41
+ self._current_base64_image = None
42
+ return json.loads(result.output)
43
+ except Exception as e:
44
+ logger.debug(f"Failed to get browser state: {str(e)}")
45
+ return None
46
+
47
+ async def format_next_step_prompt(self) -> str:
48
+ """Gets browser state and formats the browser prompt."""
49
+ browser_state = await self.get_browser_state()
50
+ url_info, tabs_info, content_above_info, content_below_info = "", "", "", ""
51
+ results_info = "" # Or get from agent if needed elsewhere
52
+
53
+ if browser_state and not browser_state.get("error"):
54
+ url_info = f"\n URL: {browser_state.get('url', 'N/A')}\n Title: {browser_state.get('title', 'N/A')}"
55
+ tabs = browser_state.get("tabs", [])
56
+ if tabs:
57
+ tabs_info = f"\n {len(tabs)} tab(s) available"
58
+ pixels_above = browser_state.get("pixels_above", 0)
59
+ pixels_below = browser_state.get("pixels_below", 0)
60
+ if pixels_above > 0:
61
+ content_above_info = f" ({pixels_above} pixels)"
62
+ if pixels_below > 0:
63
+ content_below_info = f" ({pixels_below} pixels)"
64
+
65
+ if self._current_base64_image:
66
+ image_message = Message.user_message(
67
+ content="Current browser screenshot:",
68
+ base64_image=self._current_base64_image,
69
+ )
70
+ self.agent.memory.add_message(image_message)
71
+ self._current_base64_image = None # Consume the image after adding
72
+
73
+ return NEXT_STEP_PROMPT.format(
74
+ url_placeholder=url_info,
75
+ tabs_placeholder=tabs_info,
76
+ content_above_placeholder=content_above_info,
77
+ content_below_placeholder=content_below_info,
78
+ results_placeholder=results_info,
79
+ )
80
+
81
+ async def cleanup_browser(self):
82
+ browser_tool = self.agent.available_tools.get_tool(BrowserUseTool().name)
83
+ if browser_tool and hasattr(browser_tool, "cleanup"):
84
+ await browser_tool.cleanup()
85
+
86
+
87
+ class BrowserAgent(ToolCallAgent):
88
+ """
89
+ A browser agent that uses the browser_use library to control a browser.
90
+
91
+ This agent can navigate web pages, interact with elements, fill forms,
92
+ extract content, and perform other browser-based actions to accomplish tasks.
93
+ """
94
+
95
+ name: str = "browser"
96
+ description: str = "A browser agent that can control a browser to accomplish tasks"
97
+
98
+ system_prompt: str = SYSTEM_PROMPT
99
+ next_step_prompt: str = NEXT_STEP_PROMPT
100
+
101
+ max_observe: int = 10000
102
+ max_steps: int = 20
103
+
104
+ # Configure the available tools
105
+ available_tools: ToolCollection = Field(
106
+ default_factory=lambda: ToolCollection(BrowserUseTool(), Terminate())
107
+ )
108
+
109
+ # Use Auto for tool choice to allow both tool usage and free-form responses
110
+ tool_choices: ToolChoice = ToolChoice.AUTO
111
+ special_tool_names: list[str] = Field(default_factory=lambda: [Terminate().name])
112
+
113
+ browser_context_helper: Optional[BrowserContextHelper] = None
114
+
115
+ @model_validator(mode="after")
116
+ def initialize_helper(self) -> "BrowserAgent":
117
+ self.browser_context_helper = BrowserContextHelper(self)
118
+ return self
119
+
120
+ async def think(self) -> bool:
121
+ """Process current state and decide next actions using tools, with browser state info added"""
122
+ self.next_step_prompt = (
123
+ await self.browser_context_helper.format_next_step_prompt()
124
+ )
125
+ return await super().think()
126
+
127
+ async def cleanup(self):
128
+ """Clean up browser agent resources by calling parent cleanup."""
129
+ await self.browser_context_helper.cleanup_browser()
app/agent/data_analysis.py ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from pydantic import Field
2
+
3
+ from app.agent.toolcall import ToolCallAgent
4
+ from app.config import config
5
+ from app.prompt.visualization import NEXT_STEP_PROMPT, SYSTEM_PROMPT
6
+ from app.tool import Terminate, ToolCollection
7
+ from app.tool.chart_visualization.chart_prepare import VisualizationPrepare
8
+ from app.tool.chart_visualization.data_visualization import DataVisualization
9
+ from app.tool.chart_visualization.python_execute import NormalPythonExecute
10
+
11
+
12
+ class DataAnalysis(ToolCallAgent):
13
+ """
14
+ A data analysis agent that uses planning to solve various data analysis tasks.
15
+
16
+ This agent extends ToolCallAgent with a comprehensive set of tools and capabilities,
17
+ including Data Analysis, Chart Visualization, Data Report.
18
+ """
19
+
20
+ name: str = "Data_Analysis"
21
+ description: str = "An analytical agent that utilizes python and data visualization tools to solve diverse data analysis tasks"
22
+
23
+ system_prompt: str = SYSTEM_PROMPT.format(directory=config.workspace_root)
24
+ next_step_prompt: str = NEXT_STEP_PROMPT
25
+
26
+ max_observe: int = 15000
27
+ max_steps: int = 20
28
+
29
+ # Add general-purpose tools to the tool collection
30
+ available_tools: ToolCollection = Field(
31
+ default_factory=lambda: ToolCollection(
32
+ NormalPythonExecute(),
33
+ VisualizationPrepare(),
34
+ DataVisualization(),
35
+ Terminate(),
36
+ )
37
+ )
app/agent/manus.py ADDED
@@ -0,0 +1,165 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import Dict, List, Optional
2
+
3
+ from pydantic import Field, model_validator
4
+
5
+ from app.agent.browser import BrowserContextHelper
6
+ from app.agent.toolcall import ToolCallAgent
7
+ from app.config import config
8
+ from app.logger import logger
9
+ from app.prompt.manus import NEXT_STEP_PROMPT, SYSTEM_PROMPT
10
+ from app.tool import Terminate, ToolCollection
11
+ from app.tool.ask_human import AskHuman
12
+ from app.tool.browser_use_tool import BrowserUseTool
13
+ from app.tool.mcp import MCPClients, MCPClientTool
14
+ from app.tool.python_execute import PythonExecute
15
+ from app.tool.str_replace_editor import StrReplaceEditor
16
+
17
+
18
+ class Manus(ToolCallAgent):
19
+ """A versatile general-purpose agent with support for both local and MCP tools."""
20
+
21
+ name: str = "Manus"
22
+ description: str = "A versatile agent that can solve various tasks using multiple tools including MCP-based tools"
23
+
24
+ system_prompt: str = SYSTEM_PROMPT.format(directory=config.workspace_root)
25
+ next_step_prompt: str = NEXT_STEP_PROMPT
26
+
27
+ max_observe: int = 10000
28
+ max_steps: int = 20
29
+
30
+ # MCP clients for remote tool access
31
+ mcp_clients: MCPClients = Field(default_factory=MCPClients)
32
+
33
+ # Add general-purpose tools to the tool collection
34
+ available_tools: ToolCollection = Field(
35
+ default_factory=lambda: ToolCollection(
36
+ PythonExecute(),
37
+ BrowserUseTool(),
38
+ StrReplaceEditor(),
39
+ AskHuman(),
40
+ Terminate(),
41
+ )
42
+ )
43
+
44
+ special_tool_names: list[str] = Field(default_factory=lambda: [Terminate().name])
45
+ browser_context_helper: Optional[BrowserContextHelper] = None
46
+
47
+ # Track connected MCP servers
48
+ connected_servers: Dict[str, str] = Field(
49
+ default_factory=dict
50
+ ) # server_id -> url/command
51
+ _initialized: bool = False
52
+
53
+ @model_validator(mode="after")
54
+ def initialize_helper(self) -> "Manus":
55
+ """Initialize basic components synchronously."""
56
+ self.browser_context_helper = BrowserContextHelper(self)
57
+ return self
58
+
59
+ @classmethod
60
+ async def create(cls, **kwargs) -> "Manus":
61
+ """Factory method to create and properly initialize a Manus instance."""
62
+ instance = cls(**kwargs)
63
+ await instance.initialize_mcp_servers()
64
+ instance._initialized = True
65
+ return instance
66
+
67
+ async def initialize_mcp_servers(self) -> None:
68
+ """Initialize connections to configured MCP servers."""
69
+ for server_id, server_config in config.mcp_config.servers.items():
70
+ try:
71
+ if server_config.type == "sse":
72
+ if server_config.url:
73
+ await self.connect_mcp_server(server_config.url, server_id)
74
+ logger.info(
75
+ f"Connected to MCP server {server_id} at {server_config.url}"
76
+ )
77
+ elif server_config.type == "stdio":
78
+ if server_config.command:
79
+ await self.connect_mcp_server(
80
+ server_config.command,
81
+ server_id,
82
+ use_stdio=True,
83
+ stdio_args=server_config.args,
84
+ )
85
+ logger.info(
86
+ f"Connected to MCP server {server_id} using command {server_config.command}"
87
+ )
88
+ except Exception as e:
89
+ logger.error(f"Failed to connect to MCP server {server_id}: {e}")
90
+
91
+ async def connect_mcp_server(
92
+ self,
93
+ server_url: str,
94
+ server_id: str = "",
95
+ use_stdio: bool = False,
96
+ stdio_args: List[str] = None,
97
+ ) -> None:
98
+ """Connect to an MCP server and add its tools."""
99
+ if use_stdio:
100
+ await self.mcp_clients.connect_stdio(
101
+ server_url, stdio_args or [], server_id
102
+ )
103
+ self.connected_servers[server_id or server_url] = server_url
104
+ else:
105
+ await self.mcp_clients.connect_sse(server_url, server_id)
106
+ self.connected_servers[server_id or server_url] = server_url
107
+
108
+ # Update available tools with only the new tools from this server
109
+ new_tools = [
110
+ tool for tool in self.mcp_clients.tools if tool.server_id == server_id
111
+ ]
112
+ self.available_tools.add_tools(*new_tools)
113
+
114
+ async def disconnect_mcp_server(self, server_id: str = "") -> None:
115
+ """Disconnect from an MCP server and remove its tools."""
116
+ await self.mcp_clients.disconnect(server_id)
117
+ if server_id:
118
+ self.connected_servers.pop(server_id, None)
119
+ else:
120
+ self.connected_servers.clear()
121
+
122
+ # Rebuild available tools without the disconnected server's tools
123
+ base_tools = [
124
+ tool
125
+ for tool in self.available_tools.tools
126
+ if not isinstance(tool, MCPClientTool)
127
+ ]
128
+ self.available_tools = ToolCollection(*base_tools)
129
+ self.available_tools.add_tools(*self.mcp_clients.tools)
130
+
131
+ async def cleanup(self):
132
+ """Clean up Manus agent resources."""
133
+ if self.browser_context_helper:
134
+ await self.browser_context_helper.cleanup_browser()
135
+ # Disconnect from all MCP servers only if we were initialized
136
+ if self._initialized:
137
+ await self.disconnect_mcp_server()
138
+ self._initialized = False
139
+
140
+ async def think(self) -> bool:
141
+ """Process current state and decide next actions with appropriate context."""
142
+ if not self._initialized:
143
+ await self.initialize_mcp_servers()
144
+ self._initialized = True
145
+
146
+ original_prompt = self.next_step_prompt
147
+ recent_messages = self.memory.messages[-3:] if self.memory.messages else []
148
+ browser_in_use = any(
149
+ tc.function.name == BrowserUseTool().name
150
+ for msg in recent_messages
151
+ if msg.tool_calls
152
+ for tc in msg.tool_calls
153
+ )
154
+
155
+ if browser_in_use:
156
+ self.next_step_prompt = (
157
+ await self.browser_context_helper.format_next_step_prompt()
158
+ )
159
+
160
+ result = await super().think()
161
+
162
+ # Restore original prompt
163
+ self.next_step_prompt = original_prompt
164
+
165
+ return result
app/agent/mcp.py ADDED
@@ -0,0 +1,185 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import Any, Dict, List, Optional, Tuple
2
+
3
+ from pydantic import Field
4
+
5
+ from app.agent.toolcall import ToolCallAgent
6
+ from app.logger import logger
7
+ from app.prompt.mcp import MULTIMEDIA_RESPONSE_PROMPT, NEXT_STEP_PROMPT, SYSTEM_PROMPT
8
+ from app.schema import AgentState, Message
9
+ from app.tool.base import ToolResult
10
+ from app.tool.mcp import MCPClients
11
+
12
+
13
+ class MCPAgent(ToolCallAgent):
14
+ """Agent for interacting with MCP (Model Context Protocol) servers.
15
+
16
+ This agent connects to an MCP server using either SSE or stdio transport
17
+ and makes the server's tools available through the agent's tool interface.
18
+ """
19
+
20
+ name: str = "mcp_agent"
21
+ description: str = "An agent that connects to an MCP server and uses its tools."
22
+
23
+ system_prompt: str = SYSTEM_PROMPT
24
+ next_step_prompt: str = NEXT_STEP_PROMPT
25
+
26
+ # Initialize MCP tool collection
27
+ mcp_clients: MCPClients = Field(default_factory=MCPClients)
28
+ available_tools: MCPClients = None # Will be set in initialize()
29
+
30
+ max_steps: int = 20
31
+ connection_type: str = "stdio" # "stdio" or "sse"
32
+
33
+ # Track tool schemas to detect changes
34
+ tool_schemas: Dict[str, Dict[str, Any]] = Field(default_factory=dict)
35
+ _refresh_tools_interval: int = 5 # Refresh tools every N steps
36
+
37
+ # Special tool names that should trigger termination
38
+ special_tool_names: List[str] = Field(default_factory=lambda: ["terminate"])
39
+
40
+ async def initialize(
41
+ self,
42
+ connection_type: Optional[str] = None,
43
+ server_url: Optional[str] = None,
44
+ command: Optional[str] = None,
45
+ args: Optional[List[str]] = None,
46
+ ) -> None:
47
+ """Initialize the MCP connection.
48
+
49
+ Args:
50
+ connection_type: Type of connection to use ("stdio" or "sse")
51
+ server_url: URL of the MCP server (for SSE connection)
52
+ command: Command to run (for stdio connection)
53
+ args: Arguments for the command (for stdio connection)
54
+ """
55
+ if connection_type:
56
+ self.connection_type = connection_type
57
+
58
+ # Connect to the MCP server based on connection type
59
+ if self.connection_type == "sse":
60
+ if not server_url:
61
+ raise ValueError("Server URL is required for SSE connection")
62
+ await self.mcp_clients.connect_sse(server_url=server_url)
63
+ elif self.connection_type == "stdio":
64
+ if not command:
65
+ raise ValueError("Command is required for stdio connection")
66
+ await self.mcp_clients.connect_stdio(command=command, args=args or [])
67
+ else:
68
+ raise ValueError(f"Unsupported connection type: {self.connection_type}")
69
+
70
+ # Set available_tools to our MCP instance
71
+ self.available_tools = self.mcp_clients
72
+
73
+ # Store initial tool schemas
74
+ await self._refresh_tools()
75
+
76
+ # Add system message about available tools
77
+ tool_names = list(self.mcp_clients.tool_map.keys())
78
+ tools_info = ", ".join(tool_names)
79
+
80
+ # Add system prompt and available tools information
81
+ self.memory.add_message(
82
+ Message.system_message(
83
+ f"{self.system_prompt}\n\nAvailable MCP tools: {tools_info}"
84
+ )
85
+ )
86
+
87
+ async def _refresh_tools(self) -> Tuple[List[str], List[str]]:
88
+ """Refresh the list of available tools from the MCP server.
89
+
90
+ Returns:
91
+ A tuple of (added_tools, removed_tools)
92
+ """
93
+ if not self.mcp_clients.sessions:
94
+ return [], []
95
+
96
+ # Get current tool schemas directly from the server
97
+ response = await self.mcp_clients.list_tools()
98
+ current_tools = {tool.name: tool.inputSchema for tool in response.tools}
99
+
100
+ # Determine added, removed, and changed tools
101
+ current_names = set(current_tools.keys())
102
+ previous_names = set(self.tool_schemas.keys())
103
+
104
+ added_tools = list(current_names - previous_names)
105
+ removed_tools = list(previous_names - current_names)
106
+
107
+ # Check for schema changes in existing tools
108
+ changed_tools = []
109
+ for name in current_names.intersection(previous_names):
110
+ if current_tools[name] != self.tool_schemas.get(name):
111
+ changed_tools.append(name)
112
+
113
+ # Update stored schemas
114
+ self.tool_schemas = current_tools
115
+
116
+ # Log and notify about changes
117
+ if added_tools:
118
+ logger.info(f"Added MCP tools: {added_tools}")
119
+ self.memory.add_message(
120
+ Message.system_message(f"New tools available: {', '.join(added_tools)}")
121
+ )
122
+ if removed_tools:
123
+ logger.info(f"Removed MCP tools: {removed_tools}")
124
+ self.memory.add_message(
125
+ Message.system_message(
126
+ f"Tools no longer available: {', '.join(removed_tools)}"
127
+ )
128
+ )
129
+ if changed_tools:
130
+ logger.info(f"Changed MCP tools: {changed_tools}")
131
+
132
+ return added_tools, removed_tools
133
+
134
+ async def think(self) -> bool:
135
+ """Process current state and decide next action."""
136
+ # Check MCP session and tools availability
137
+ if not self.mcp_clients.sessions or not self.mcp_clients.tool_map:
138
+ logger.info("MCP service is no longer available, ending interaction")
139
+ self.state = AgentState.FINISHED
140
+ return False
141
+
142
+ # Refresh tools periodically
143
+ if self.current_step % self._refresh_tools_interval == 0:
144
+ await self._refresh_tools()
145
+ # All tools removed indicates shutdown
146
+ if not self.mcp_clients.tool_map:
147
+ logger.info("MCP service has shut down, ending interaction")
148
+ self.state = AgentState.FINISHED
149
+ return False
150
+
151
+ # Use the parent class's think method
152
+ return await super().think()
153
+
154
+ async def _handle_special_tool(self, name: str, result: Any, **kwargs) -> None:
155
+ """Handle special tool execution and state changes"""
156
+ # First process with parent handler
157
+ await super()._handle_special_tool(name, result, **kwargs)
158
+
159
+ # Handle multimedia responses
160
+ if isinstance(result, ToolResult) and result.base64_image:
161
+ self.memory.add_message(
162
+ Message.system_message(
163
+ MULTIMEDIA_RESPONSE_PROMPT.format(tool_name=name)
164
+ )
165
+ )
166
+
167
+ def _should_finish_execution(self, name: str, **kwargs) -> bool:
168
+ """Determine if tool execution should finish the agent"""
169
+ # Terminate if the tool name is 'terminate'
170
+ return name.lower() == "terminate"
171
+
172
+ async def cleanup(self) -> None:
173
+ """Clean up MCP connection when done."""
174
+ if self.mcp_clients.sessions:
175
+ await self.mcp_clients.disconnect()
176
+ logger.info("MCP connection closed")
177
+
178
+ async def run(self, request: Optional[str] = None) -> str:
179
+ """Run the agent with cleanup when done."""
180
+ try:
181
+ result = await super().run(request)
182
+ return result
183
+ finally:
184
+ # Ensure cleanup happens even if there's an error
185
+ await self.cleanup()
app/agent/react.py ADDED
@@ -0,0 +1,38 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from abc import ABC, abstractmethod
2
+ from typing import Optional
3
+
4
+ from pydantic import Field
5
+
6
+ from app.agent.base import BaseAgent
7
+ from app.llm import LLM
8
+ from app.schema import AgentState, Memory
9
+
10
+
11
class ReActAgent(BaseAgent, ABC):
    """Abstract agent following the ReAct loop: think, then (optionally) act.

    Subclasses implement `think` (choose what to do) and `act` (do it); `step`
    ties the two together for the base agent's run loop.
    """

    name: str
    description: Optional[str] = None

    system_prompt: Optional[str] = None
    next_step_prompt: Optional[str] = None

    llm: Optional[LLM] = Field(default_factory=LLM)
    memory: Memory = Field(default_factory=Memory)
    state: AgentState = AgentState.IDLE

    max_steps: int = 10
    current_step: int = 0

    @abstractmethod
    async def think(self) -> bool:
        """Process current state; return True when an action should follow."""

    @abstractmethod
    async def act(self) -> str:
        """Execute the decided actions and return a result summary."""

    async def step(self) -> str:
        """Execute a single step: think and act."""
        if not await self.think():
            # Thinking decided nothing needs to happen this step.
            return "Thinking complete - no action needed"
        return await self.act()
app/agent/sandbox_agent.py ADDED
@@ -0,0 +1,223 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import Dict, List, Optional
2
+
3
+ from pydantic import Field, model_validator
4
+
5
+ from app.agent.browser import BrowserContextHelper
6
+ from app.agent.toolcall import ToolCallAgent
7
+ from app.config import config
8
+ from app.daytona.sandbox import create_sandbox, delete_sandbox
9
+ from app.daytona.tool_base import SandboxToolsBase
10
+ from app.logger import logger
11
+ from app.prompt.manus import NEXT_STEP_PROMPT, SYSTEM_PROMPT
12
+ from app.tool import Terminate, ToolCollection
13
+ from app.tool.ask_human import AskHuman
14
+ from app.tool.mcp import MCPClients, MCPClientTool
15
+ from app.tool.sandbox.sb_browser_tool import SandboxBrowserTool
16
+ from app.tool.sandbox.sb_files_tool import SandboxFilesTool
17
+ from app.tool.sandbox.sb_shell_tool import SandboxShellTool
18
+ from app.tool.sandbox.sb_vision_tool import SandboxVisionTool
19
+
20
+
21
class SandboxManus(ToolCallAgent):
    """A versatile general-purpose agent that works inside a Daytona sandbox.

    Combines local tools, sandbox tools (shell / files / browser / vision) and
    remote MCP tools behind a single ToolCallAgent interface.
    """

    name: str = "SandboxManus"
    description: str = "A versatile agent that can solve various tasks using multiple sandbox-tools including MCP-based tools"

    system_prompt: str = SYSTEM_PROMPT.format(directory=config.workspace_root)
    next_step_prompt: str = NEXT_STEP_PROMPT

    max_observe: int = 10000
    max_steps: int = 20

    # MCP clients for remote tool access
    mcp_clients: MCPClients = Field(default_factory=MCPClients)

    # Base tool collection; sandbox tools are added once the sandbox exists.
    available_tools: ToolCollection = Field(
        default_factory=lambda: ToolCollection(
            # PythonExecute(),
            # BrowserUseTool(),
            # StrReplaceEditor(),
            AskHuman(),
            Terminate(),
        )
    )

    special_tool_names: list[str] = Field(default_factory=lambda: [Terminate().name])
    browser_context_helper: Optional[BrowserContextHelper] = None

    # Track connected MCP servers: server_id -> url/command
    connected_servers: Dict[str, str] = Field(default_factory=dict)
    _initialized: bool = False
    # Fix: declared as a model field so the assignment in
    # initialize_sandbox_tools is valid on the pydantic model; holds the
    # Daytona sandbox handle once created.
    sandbox: Optional[object] = None
    # sandbox_id -> {"vnc": <url>, "website": <url>}
    sandbox_link: Optional[dict[str, dict[str, str]]] = Field(default_factory=dict)

    @model_validator(mode="after")
    def initialize_helper(self) -> "SandboxManus":
        """Initialize basic components synchronously."""
        self.browser_context_helper = BrowserContextHelper(self)
        return self

    @classmethod
    async def create(cls, **kwargs) -> "SandboxManus":
        """Factory method to create and properly initialize an agent instance."""
        instance = cls(**kwargs)
        await instance.initialize_mcp_servers()
        await instance.initialize_sandbox_tools()
        instance._initialized = True
        return instance

    async def initialize_sandbox_tools(
        self,
        password: str = config.daytona.VNC_password,
    ) -> None:
        """Create a sandbox and register its shell/file/browser/vision tools.

        Args:
            password: VNC password for the sandbox. NOTE: the default is
                evaluated once at import time from the config.

        Raises:
            ValueError: If no password is available.
        """
        try:
            # Create a new sandbox
            if password:
                sandbox = create_sandbox(password=password)
                self.sandbox = sandbox
            else:
                raise ValueError("password must be provided")
            # Preview links for the noVNC (6080) and web (8080) ports.
            vnc_link = sandbox.get_preview_link(6080)
            website_link = sandbox.get_preview_link(8080)
            vnc_url = vnc_link.url if hasattr(vnc_link, "url") else str(vnc_link)
            website_url = (
                website_link.url if hasattr(website_link, "url") else str(website_link)
            )

            # Get the actual sandbox_id from the created sandbox
            actual_sandbox_id = sandbox.id if hasattr(sandbox, "id") else "new_sandbox"
            if not self.sandbox_link:
                self.sandbox_link = {}
            self.sandbox_link[actual_sandbox_id] = {
                "vnc": vnc_url,
                "website": website_url,
            }
            logger.info(f"VNC URL: {vnc_url}")
            logger.info(f"Website URL: {website_url}")
            SandboxToolsBase._urls_printed = True
            sb_tools = [
                SandboxBrowserTool(sandbox),
                SandboxFilesTool(sandbox),
                SandboxShellTool(sandbox),
                SandboxVisionTool(sandbox),
            ]
            self.available_tools.add_tools(*sb_tools)

        except Exception as e:
            logger.error(f"Error initializing sandbox tools: {e}")
            raise

    async def initialize_mcp_servers(self) -> None:
        """Initialize connections to configured MCP servers."""
        for server_id, server_config in config.mcp_config.servers.items():
            try:
                if server_config.type == "sse":
                    if server_config.url:
                        await self.connect_mcp_server(server_config.url, server_id)
                        logger.info(
                            f"Connected to MCP server {server_id} at {server_config.url}"
                        )
                elif server_config.type == "stdio":
                    if server_config.command:
                        await self.connect_mcp_server(
                            server_config.command,
                            server_id,
                            use_stdio=True,
                            stdio_args=server_config.args,
                        )
                        logger.info(
                            f"Connected to MCP server {server_id} using command {server_config.command}"
                        )
            except Exception as e:
                # A failing server must not prevent the others from connecting.
                logger.error(f"Failed to connect to MCP server {server_id}: {e}")

    async def connect_mcp_server(
        self,
        server_url: str,
        server_id: str = "",
        use_stdio: bool = False,
        stdio_args: Optional[List[str]] = None,
    ) -> None:
        """Connect to an MCP server and add its tools.

        Args:
            server_url: Server URL (SSE) or executable command (stdio).
            server_id: Identifier for the connection; falls back to the URL.
            use_stdio: Connect over stdio instead of SSE.
            stdio_args: Arguments for the stdio command.
        """
        if use_stdio:
            await self.mcp_clients.connect_stdio(
                server_url, stdio_args or [], server_id
            )
            self.connected_servers[server_id or server_url] = server_url
        else:
            await self.mcp_clients.connect_sse(server_url, server_id)
            self.connected_servers[server_id or server_url] = server_url

        # Update available tools with only the new tools from this server
        new_tools = [
            tool for tool in self.mcp_clients.tools if tool.server_id == server_id
        ]
        self.available_tools.add_tools(*new_tools)

    async def disconnect_mcp_server(self, server_id: str = "") -> None:
        """Disconnect from an MCP server (or all servers) and remove its tools."""
        await self.mcp_clients.disconnect(server_id)
        if server_id:
            self.connected_servers.pop(server_id, None)
        else:
            self.connected_servers.clear()

        # Rebuild available tools without the disconnected server's tools
        base_tools = [
            tool
            for tool in self.available_tools.tools
            if not isinstance(tool, MCPClientTool)
        ]
        self.available_tools = ToolCollection(*base_tools)
        self.available_tools.add_tools(*self.mcp_clients.tools)

    async def delete_sandbox(self, sandbox_id: str) -> None:
        """Delete a sandbox by ID and drop its cached preview links."""
        try:
            # Resolves to the module-level delete_sandbox helper, not this method.
            await delete_sandbox(sandbox_id)
            logger.info(f"Sandbox {sandbox_id} deleted successfully")
            if sandbox_id in self.sandbox_link:
                del self.sandbox_link[sandbox_id]
        except Exception as e:
            logger.error(f"Error deleting sandbox {sandbox_id}: {e}")
            raise e

    async def cleanup(self):
        """Clean up Manus agent resources."""
        if self.browser_context_helper:
            await self.browser_context_helper.cleanup_browser()
        # Disconnect from all MCP servers only if we were initialized
        if self._initialized:
            await self.disconnect_mcp_server()
            # Fix: only delete a sandbox that was actually created; the
            # previous code passed the placeholder id "unknown" (and could
            # raise AttributeError) when no sandbox existed.
            if self.sandbox is not None:
                await self.delete_sandbox(self.sandbox.id)
            self._initialized = False

    async def think(self) -> bool:
        """Process current state and decide next actions with appropriate context."""
        if not self._initialized:
            await self.initialize_mcp_servers()
            self._initialized = True

        original_prompt = self.next_step_prompt
        recent_messages = self.memory.messages[-3:] if self.memory.messages else []
        # NOTE(review): SandboxBrowserTool() is instantiated without a sandbox
        # here just to read its name — confirm its constructor allows that.
        browser_in_use = any(
            tc.function.name == SandboxBrowserTool().name
            for msg in recent_messages
            if msg.tool_calls
            for tc in msg.tool_calls
        )

        if browser_in_use:
            # Swap in a browser-aware prompt for this step only.
            self.next_step_prompt = (
                await self.browser_context_helper.format_next_step_prompt()
            )

        result = await super().think()

        # Restore original prompt
        self.next_step_prompt = original_prompt

        return result
app/agent/swe.py ADDED
@@ -0,0 +1,24 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from typing import List
2
+
3
+ from pydantic import Field
4
+
5
+ from app.agent.toolcall import ToolCallAgent
6
+ from app.prompt.swe import SYSTEM_PROMPT
7
+ from app.tool import Bash, StrReplaceEditor, Terminate, ToolCollection
8
+
9
+
10
class SWEAgent(ToolCallAgent):
    """An agent that implements the SWEAgent paradigm for executing code and natural conversations."""

    name: str = "swe"
    description: str = "an autonomous AI programmer that interacts directly with the computer to solve tasks."

    system_prompt: str = SYSTEM_PROMPT
    # No per-step prompt: the system prompt alone drives each turn.
    next_step_prompt: str = ""

    # Shell access, file editing, and termination are the only capabilities exposed.
    available_tools: ToolCollection = ToolCollection(
        Bash(), StrReplaceEditor(), Terminate()
    )
    # Executing 'terminate' ends the run (see ToolCallAgent._handle_special_tool).
    special_tool_names: List[str] = Field(default_factory=lambda: [Terminate().name])

    max_steps: int = 20
app/agent/toolcall.py ADDED
@@ -0,0 +1,250 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ import json
3
+ from typing import Any, List, Optional, Union
4
+
5
+ from pydantic import Field
6
+
7
+ from app.agent.react import ReActAgent
8
+ from app.exceptions import TokenLimitExceeded
9
+ from app.logger import logger
10
+ from app.prompt.toolcall import NEXT_STEP_PROMPT, SYSTEM_PROMPT
11
+ from app.schema import TOOL_CHOICE_TYPE, AgentState, Message, ToolCall, ToolChoice
12
+ from app.tool import CreateChatCompletion, Terminate, ToolCollection
13
+
14
+
15
# Error message used by act() when tool_choice is REQUIRED but the LLM
# produced no tool calls.
TOOL_CALL_REQUIRED = "Tool calls required but none provided"
16
+
17
+
18
class ToolCallAgent(ReActAgent):
    """Base agent class for handling tool/function calls with enhanced abstraction.

    Implements the ReAct contract: think() asks the LLM which tools to call,
    act() executes the recorded calls and feeds results back into memory.
    """

    name: str = "toolcall"
    description: str = "an agent that can execute tool calls."

    system_prompt: str = SYSTEM_PROMPT
    next_step_prompt: str = NEXT_STEP_PROMPT

    # Default toolset; subclasses typically replace this collection.
    available_tools: ToolCollection = ToolCollection(
        CreateChatCompletion(), Terminate()
    )
    # How the LLM may pick tools: AUTO / REQUIRED / NONE.
    tool_choices: TOOL_CHOICE_TYPE = ToolChoice.AUTO  # type: ignore
    # Tools whose execution may finish the agent (see _handle_special_tool).
    special_tool_names: List[str] = Field(default_factory=lambda: [Terminate().name])

    # Tool calls selected by the most recent think() pass, consumed by act().
    tool_calls: List[ToolCall] = Field(default_factory=list)
    # Base64 image attached to the current tool's result, if any.
    _current_base64_image: Optional[str] = None

    max_steps: int = 30
    # If set, each tool result is truncated to this many characters before
    # being stored. NOTE(review): a bool True here truncates to 1 char —
    # confirm whether bool values are actually used by callers.
    max_observe: Optional[Union[int, bool]] = None

    async def think(self) -> bool:
        """Process current state and decide next actions using tools.

        Returns:
            True when act() should run next (tool calls were selected, or
            plain content should continue the loop); False to stop this step.
        """
        # Nudge the model with the per-step prompt, if one is configured.
        if self.next_step_prompt:
            user_msg = Message.user_message(self.next_step_prompt)
            self.messages += [user_msg]

        try:
            # Get response with tool options
            response = await self.llm.ask_tool(
                messages=self.messages,
                system_msgs=(
                    [Message.system_message(self.system_prompt)]
                    if self.system_prompt
                    else None
                ),
                tools=self.available_tools.to_params(),
                tool_choice=self.tool_choices,
            )
        except ValueError:
            # ValueErrors are programming/config errors; let them propagate.
            raise
        except Exception as e:
            # Check if this is a RetryError containing TokenLimitExceeded
            if hasattr(e, "__cause__") and isinstance(e.__cause__, TokenLimitExceeded):
                token_limit_error = e.__cause__
                logger.error(
                    f"🚨 Token limit error (from RetryError): {token_limit_error}"
                )
                # Record the failure and end the run gracefully.
                self.memory.add_message(
                    Message.assistant_message(
                        f"Maximum token limit reached, cannot continue execution: {str(token_limit_error)}"
                    )
                )
                self.state = AgentState.FINISHED
                return False
            raise

        # Normalize a possibly-empty response into lists/strings.
        self.tool_calls = tool_calls = (
            response.tool_calls if response and response.tool_calls else []
        )
        content = response.content if response and response.content else ""

        # Log response info
        logger.info(f"✨ {self.name}'s thoughts: {content}")
        logger.info(
            f"🛠️ {self.name} selected {len(tool_calls) if tool_calls else 0} tools to use"
        )
        if tool_calls:
            logger.info(
                f"🧰 Tools being prepared: {[call.function.name for call in tool_calls]}"
            )
            # Only the first call's arguments are logged here.
            logger.info(f"🔧 Tool arguments: {tool_calls[0].function.arguments}")

        try:
            if response is None:
                raise RuntimeError("No response received from the LLM")

            # Handle different tool_choices modes
            if self.tool_choices == ToolChoice.NONE:
                if tool_calls:
                    logger.warning(
                        f"🤔 Hmm, {self.name} tried to use tools when they weren't available!"
                    )
                if content:
                    self.memory.add_message(Message.assistant_message(content))
                    return True
                return False

            # Create and add assistant message
            assistant_msg = (
                Message.from_tool_calls(content=content, tool_calls=self.tool_calls)
                if self.tool_calls
                else Message.assistant_message(content)
            )
            self.memory.add_message(assistant_msg)

            if self.tool_choices == ToolChoice.REQUIRED and not self.tool_calls:
                return True  # Will be handled in act()

            # For 'auto' mode, continue with content if no commands but content exists
            if self.tool_choices == ToolChoice.AUTO and not self.tool_calls:
                return bool(content)

            return bool(self.tool_calls)
        except Exception as e:
            logger.error(f"🚨 Oops! The {self.name}'s thinking process hit a snag: {e}")
            self.memory.add_message(
                Message.assistant_message(
                    f"Error encountered while processing: {str(e)}"
                )
            )
            return False

    async def act(self) -> str:
        """Execute tool calls recorded by think() and handle their results.

        Returns:
            All tool results joined with blank lines, or the last message's
            content when there were no tool calls.

        Raises:
            ValueError: When tool_choice is REQUIRED but no calls were made.
        """
        if not self.tool_calls:
            if self.tool_choices == ToolChoice.REQUIRED:
                raise ValueError(TOOL_CALL_REQUIRED)

            # Return last message content if no tool calls
            return self.messages[-1].content or "No content or commands to execute"

        results = []
        for command in self.tool_calls:
            # Reset base64_image for each tool call
            self._current_base64_image = None

            result = await self.execute_tool(command)

            # Truncate oversized tool output if a limit is configured.
            if self.max_observe:
                result = result[: self.max_observe]

            logger.info(
                f"🎯 Tool '{command.function.name}' completed its mission! Result: {result}"
            )

            # Add tool response to memory
            tool_msg = Message.tool_message(
                content=result,
                tool_call_id=command.id,
                name=command.function.name,
                base64_image=self._current_base64_image,
            )
            self.memory.add_message(tool_msg)
            results.append(result)

        return "\n\n".join(results)

    async def execute_tool(self, command: ToolCall) -> str:
        """Execute a single tool call with robust error handling.

        Never raises: every failure path is converted into an "Error: ..."
        string so the result can be stored as a tool message.
        """
        if not command or not command.function or not command.function.name:
            return "Error: Invalid command format"

        name = command.function.name
        if name not in self.available_tools.tool_map:
            return f"Error: Unknown tool '{name}'"

        try:
            # Parse arguments (empty string is treated as no arguments).
            args = json.loads(command.function.arguments or "{}")

            # Execute the tool
            logger.info(f"🔧 Activating tool: '{name}'...")
            result = await self.available_tools.execute(name=name, tool_input=args)

            # Handle special tools
            await self._handle_special_tool(name=name, result=result)

            # Check if result is a ToolResult with base64_image
            if hasattr(result, "base64_image") and result.base64_image:
                # Store the base64_image for later use in tool_message
                self._current_base64_image = result.base64_image

            # Format result for display (standard case)
            observation = (
                f"Observed output of cmd `{name}` executed:\n{str(result)}"
                if result
                else f"Cmd `{name}` completed with no output"
            )

            return observation
        except json.JSONDecodeError:
            error_msg = f"Error parsing arguments for {name}: Invalid JSON format"
            logger.error(
                f"📝 Oops! The arguments for '{name}' don't make sense - invalid JSON, arguments:{command.function.arguments}"
            )
            return f"Error: {error_msg}"
        except Exception as e:
            error_msg = f"⚠️ Tool '{name}' encountered a problem: {str(e)}"
            logger.exception(error_msg)
            return f"Error: {error_msg}"

    async def _handle_special_tool(self, name: str, result: Any, **kwargs):
        """Handle special tool execution and state changes."""
        if not self._is_special_tool(name):
            return

        if self._should_finish_execution(name=name, result=result, **kwargs):
            # Set agent state to finished
            logger.info(f"🏁 Special tool '{name}' has completed the task!")
            self.state = AgentState.FINISHED

    @staticmethod
    def _should_finish_execution(**kwargs) -> bool:
        """Determine if tool execution should finish the agent.

        Default policy: any special tool ends the run; subclasses may override.
        """
        return True

    def _is_special_tool(self, name: str) -> bool:
        """Check if tool name is in special tools list (case-insensitive)."""
        return name.lower() in [n.lower() for n in self.special_tool_names]

    async def cleanup(self):
        """Clean up resources used by the agent's tools.

        Calls each tool's async cleanup() if it has one; errors are logged
        but never propagated so every tool gets a chance to clean up.
        """
        logger.info(f"🧹 Cleaning up resources for agent '{self.name}'...")
        for tool_name, tool_instance in self.available_tools.tool_map.items():
            if hasattr(tool_instance, "cleanup") and asyncio.iscoroutinefunction(
                tool_instance.cleanup
            ):
                try:
                    logger.debug(f"🧼 Cleaning up tool: {tool_name}")
                    await tool_instance.cleanup()
                except Exception as e:
                    logger.error(
                        f"🚨 Error cleaning up tool '{tool_name}': {e}", exc_info=True
                    )
        logger.info(f"✨ Cleanup complete for agent '{self.name}'.")

    async def run(self, request: Optional[str] = None) -> str:
        """Run the agent with cleanup when done (success, error, or cancel)."""
        try:
            return await super().run(request)
        finally:
            await self.cleanup()
app/bedrock.py ADDED
@@ -0,0 +1,334 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import json
2
+ import sys
3
+ import time
4
+ import uuid
5
+ from datetime import datetime
6
+ from typing import Dict, List, Literal, Optional
7
+
8
+ import boto3
9
+
10
+
11
# Global variables to track the current tool use ID across function calls
# Tmp solution
# NOTE(review): module-level mutable state; not safe if multiple conversations
# are converted concurrently — consider threading the id through call arguments.
CURRENT_TOOLUSE_ID = None
14
+
15
+
16
+ # Class to handle OpenAI-style response formatting
17
class OpenAIResponse:
    """Lightweight attribute-access wrapper that mimics an OpenAI SDK response.

    Nested dicts (including dicts inside lists) become nested OpenAIResponse
    objects so callers can use dotted access like `response.choices[0].message`.
    """

    def __init__(self, data):
        # Recursively convert nested dicts and lists to OpenAIResponse objects
        for key, value in data.items():
            if isinstance(value, dict):
                value = OpenAIResponse(value)
            elif isinstance(value, list):
                value = [
                    OpenAIResponse(item) if isinstance(item, dict) else item
                    for item in value
                ]
            setattr(self, key, value)

    def model_dump(self, *args, **kwargs):
        """Return the object's fields as a dict plus a 'created_at' timestamp.

        Fix: returns a shallow copy instead of the live __dict__, so repeated
        calls do not inject 'created_at' as an attribute of the instance and
        callers mutating the result cannot corrupt the object. Nested values
        remain OpenAIResponse objects, as before.
        """
        data = dict(self.__dict__)
        data["created_at"] = datetime.now().isoformat()
        return data
35
+
36
+
37
+ # Main client class for interacting with Amazon Bedrock
38
# Main client class for interacting with Amazon Bedrock
class BedrockClient:
    """Entry-point client exposing an OpenAI-style `chat.completions` API
    backed by the Amazon Bedrock runtime."""

    def __init__(self):
        # Initialize Bedrock client, you need to configure AWS env first
        # (credentials and region, e.g. via AWS_* environment variables).
        try:
            self.client = boto3.client("bedrock-runtime")
            self.chat = Chat(self.client)
        except Exception as e:
            # NOTE(review): sys.exit(1) in a library constructor kills the whole
            # process on any init failure — consider raising instead.
            print(f"Error initializing Bedrock client: {e}")
            sys.exit(1)
47
+
48
+
49
+ # Chat interface class
50
# Chat interface class
class Chat:
    """Namespace object mirroring the OpenAI SDK's `client.chat` attribute."""

    def __init__(self, client):
        # Expose completions as chat.completions, matching the OpenAI SDK shape.
        self.completions = ChatCompletions(client)
53
+
54
+
55
+ # Core class handling chat completions functionality
56
# Core class handling chat completions functionality
class ChatCompletions:
    """Translates OpenAI-style chat requests to Bedrock `converse` calls
    (streaming and non-streaming) and converts the responses back."""

    def __init__(self, client):
        # A boto3 "bedrock-runtime" client.
        self.client = client

    def _convert_openai_tools_to_bedrock_format(self, tools):
        # Convert OpenAI function calling format to Bedrock tool format.
        # Only entries with type == "function" are converted; others are dropped.
        bedrock_tools = []
        for tool in tools:
            if tool.get("type") == "function":
                function = tool.get("function", {})
                bedrock_tool = {
                    "toolSpec": {
                        "name": function.get("name", ""),
                        "description": function.get("description", ""),
                        "inputSchema": {
                            "json": {
                                "type": "object",
                                "properties": function.get("parameters", {}).get(
                                    "properties", {}
                                ),
                                "required": function.get("parameters", {}).get(
                                    "required", []
                                ),
                            }
                        },
                    }
                }
                bedrock_tools.append(bedrock_tool)
        return bedrock_tools

    def _convert_openai_messages_to_bedrock_format(self, messages):
        # Convert OpenAI message format to Bedrock message format.
        # Returns (system_prompt, bedrock_messages): Bedrock takes the system
        # prompt separately rather than as a message.
        bedrock_messages = []
        system_prompt = []
        for message in messages:
            if message.get("role") == "system":
                # Last system message wins (earlier ones are overwritten).
                system_prompt = [{"text": message.get("content")}]
            elif message.get("role") == "user":
                bedrock_message = {
                    "role": message.get("role", "user"),
                    "content": [{"text": message.get("content")}],
                }
                bedrock_messages.append(bedrock_message)
            elif message.get("role") == "assistant":
                # NOTE(review): assumes content is a plain string; a None
                # content would produce {"text": None} — confirm callers.
                bedrock_message = {
                    "role": "assistant",
                    "content": [{"text": message.get("content")}],
                }
                # Only the FIRST tool call is forwarded to Bedrock.
                openai_tool_calls = message.get("tool_calls", [])
                if openai_tool_calls:
                    bedrock_tool_use = {
                        "toolUseId": openai_tool_calls[0]["id"],
                        "name": openai_tool_calls[0]["function"]["name"],
                        "input": json.loads(
                            openai_tool_calls[0]["function"]["arguments"]
                        ),
                    }
                    bedrock_message["content"].append({"toolUse": bedrock_tool_use})
                    global CURRENT_TOOLUSE_ID
                    CURRENT_TOOLUSE_ID = openai_tool_calls[0]["id"]
                bedrock_messages.append(bedrock_message)
            elif message.get("role") == "tool":
                # Bedrock has no "tool" role: tool results are sent as a user
                # message carrying a toolResult block, correlated via the
                # module-level CURRENT_TOOLUSE_ID.
                bedrock_message = {
                    "role": "user",
                    "content": [
                        {
                            "toolResult": {
                                "toolUseId": CURRENT_TOOLUSE_ID,
                                "content": [{"text": message.get("content")}],
                            }
                        }
                    ],
                }
                bedrock_messages.append(bedrock_message)
            else:
                raise ValueError(f"Invalid role: {message.get('role')}")
        return system_prompt, bedrock_messages

    def _convert_bedrock_response_to_openai_format(self, bedrock_response):
        # Convert Bedrock response format to OpenAI format.
        # Joins all text blocks into one content string and maps toolUse
        # blocks onto OpenAI-style tool_calls.
        content = ""
        if bedrock_response.get("output", {}).get("message", {}).get("content"):
            content_array = bedrock_response["output"]["message"]["content"]
            content = "".join(item.get("text", "") for item in content_array)
        if content == "":
            # Some callers reject empty content; substitute a placeholder dot.
            content = "."

        # Handle tool calls in response
        openai_tool_calls = []
        if bedrock_response.get("output", {}).get("message", {}).get("content"):
            for content_item in bedrock_response["output"]["message"]["content"]:
                if content_item.get("toolUse"):
                    bedrock_tool_use = content_item["toolUse"]
                    global CURRENT_TOOLUSE_ID
                    CURRENT_TOOLUSE_ID = bedrock_tool_use["toolUseId"]
                    openai_tool_call = {
                        "id": CURRENT_TOOLUSE_ID,
                        "type": "function",
                        "function": {
                            "name": bedrock_tool_use["name"],
                            "arguments": json.dumps(bedrock_tool_use["input"]),
                        },
                    }
                    openai_tool_calls.append(openai_tool_call)

        # Construct final OpenAI format response
        openai_format = {
            "id": f"chatcmpl-{uuid.uuid4()}",
            "created": int(time.time()),
            "object": "chat.completion",
            "system_fingerprint": None,
            "choices": [
                {
                    "finish_reason": bedrock_response.get("stopReason", "end_turn"),
                    "index": 0,
                    "message": {
                        "content": content,
                        "role": bedrock_response.get("output", {})
                        .get("message", {})
                        .get("role", "assistant"),
                        "tool_calls": openai_tool_calls
                        if openai_tool_calls != []
                        else None,
                        "function_call": None,
                    },
                }
            ],
            "usage": {
                "completion_tokens": bedrock_response.get("usage", {}).get(
                    "outputTokens", 0
                ),
                "prompt_tokens": bedrock_response.get("usage", {}).get(
                    "inputTokens", 0
                ),
                "total_tokens": bedrock_response.get("usage", {}).get("totalTokens", 0),
            },
        }
        return OpenAIResponse(openai_format)

    async def _invoke_bedrock(
        self,
        model: str,
        messages: List[Dict[str, str]],
        max_tokens: int,
        temperature: float,
        tools: Optional[List[dict]] = None,
        tool_choice: Literal["none", "auto", "required"] = "auto",
        **kwargs,
    ) -> OpenAIResponse:
        # Non-streaming invocation of Bedrock model.
        # tool_choice is accepted for interface parity but not forwarded.
        (
            system_prompt,
            bedrock_messages,
        ) = self._convert_openai_messages_to_bedrock_format(messages)
        # NOTE(review): passing toolConfig=None when there are no tools may be
        # rejected by boto3 parameter validation — confirm against boto3 docs.
        response = self.client.converse(
            modelId=model,
            system=system_prompt,
            messages=bedrock_messages,
            inferenceConfig={"temperature": temperature, "maxTokens": max_tokens},
            toolConfig={"tools": tools} if tools else None,
        )
        openai_response = self._convert_bedrock_response_to_openai_format(response)
        return openai_response

    async def _invoke_bedrock_stream(
        self,
        model: str,
        messages: List[Dict[str, str]],
        max_tokens: int,
        temperature: float,
        tools: Optional[List[dict]] = None,
        tool_choice: Literal["none", "auto", "required"] = "auto",
        **kwargs,
    ) -> OpenAIResponse:
        # Streaming invocation of Bedrock model: consumes the event stream,
        # echoing text to stdout, then returns one aggregated response.
        (
            system_prompt,
            bedrock_messages,
        ) = self._convert_openai_messages_to_bedrock_format(messages)
        response = self.client.converse_stream(
            modelId=model,
            system=system_prompt,
            messages=bedrock_messages,
            inferenceConfig={"temperature": temperature, "maxTokens": max_tokens},
            toolConfig={"tools": tools} if tools else None,
        )

        # Initialize response structure
        bedrock_response = {
            "output": {"message": {"role": "", "content": []}},
            "stopReason": "",
            "usage": {},
            "metrics": {},
        }
        bedrock_response_text = ""
        bedrock_response_tool_input = ""

        # Process streaming response.
        # NOTE(review): the aggregation hard-codes content block index 0 as
        # text and index 1 as the tool call — confirm this matches the
        # converse_stream event ordering for the models in use.
        stream = response.get("stream")
        if stream:
            for event in stream:
                if event.get("messageStart", {}).get("role"):
                    bedrock_response["output"]["message"]["role"] = event[
                        "messageStart"
                    ]["role"]
                if event.get("contentBlockDelta", {}).get("delta", {}).get("text"):
                    bedrock_response_text += event["contentBlockDelta"]["delta"]["text"]
                    print(
                        event["contentBlockDelta"]["delta"]["text"], end="", flush=True
                    )
                if event.get("contentBlockStop", {}).get("contentBlockIndex") == 0:
                    # First block finished: commit accumulated text.
                    bedrock_response["output"]["message"]["content"].append(
                        {"text": bedrock_response_text}
                    )
                if event.get("contentBlockStart", {}).get("start", {}).get("toolUse"):
                    bedrock_tool_use = event["contentBlockStart"]["start"]["toolUse"]
                    tool_use = {
                        "toolUseId": bedrock_tool_use["toolUseId"],
                        "name": bedrock_tool_use["name"],
                    }
                    bedrock_response["output"]["message"]["content"].append(
                        {"toolUse": tool_use}
                    )
                    global CURRENT_TOOLUSE_ID
                    CURRENT_TOOLUSE_ID = bedrock_tool_use["toolUseId"]
                if event.get("contentBlockDelta", {}).get("delta", {}).get("toolUse"):
                    # Tool-call arguments arrive as incremental JSON text.
                    bedrock_response_tool_input += event["contentBlockDelta"]["delta"][
                        "toolUse"
                    ]["input"]
                    print(
                        event["contentBlockDelta"]["delta"]["toolUse"]["input"],
                        end="",
                        flush=True,
                    )
                if event.get("contentBlockStop", {}).get("contentBlockIndex") == 1:
                    # Second block finished: parse the accumulated JSON input.
                    bedrock_response["output"]["message"]["content"][1]["toolUse"][
                        "input"
                    ] = json.loads(bedrock_response_tool_input)
        print()
        openai_response = self._convert_bedrock_response_to_openai_format(
            bedrock_response
        )
        return openai_response

    def create(
        self,
        model: str,
        messages: List[Dict[str, str]],
        max_tokens: int,
        temperature: float,
        stream: Optional[bool] = True,
        tools: Optional[List[dict]] = None,
        tool_choice: Literal["none", "auto", "required"] = "auto",
        **kwargs,
    ) -> OpenAIResponse:
        # Main entry point for chat completion. Synchronous method returning
        # a coroutine (the _invoke_* helpers are async) — callers must await
        # the returned value.
        bedrock_tools = []
        if tools is not None:
            bedrock_tools = self._convert_openai_tools_to_bedrock_format(tools)
        if stream:
            return self._invoke_bedrock_stream(
                model,
                messages,
                max_tokens,
                temperature,
                bedrock_tools,
                tool_choice,
                **kwargs,
            )
        else:
            return self._invoke_bedrock(
                model,
                messages,
                max_tokens,
                temperature,
                bedrock_tools,
                tool_choice,
                **kwargs,
            )
app/config.py ADDED
@@ -0,0 +1,384 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import json
2
+ import threading
3
+ import tomllib
4
+ import os
5
+ from pathlib import Path
6
+ from typing import Dict, List, Optional
7
+
8
+ from pydantic import BaseModel, Field
9
+
10
+
11
+ def get_project_root() -> Path:
12
+ """Get the project root directory"""
13
+ return Path(__file__).resolve().parent.parent
14
+
15
+
16
+ PROJECT_ROOT = get_project_root()
17
+ WORKSPACE_ROOT = PROJECT_ROOT / "workspace"
18
+
19
+
20
+ class LLMSettings(BaseModel):
21
+ model: str = Field(..., description="Model name")
22
+ base_url: str = Field(..., description="API base URL")
23
+ api_key: str = Field(..., description="API key")
24
+ max_tokens: int = Field(4096, description="Maximum number of tokens per request")
25
+ max_input_tokens: Optional[int] = Field(
26
+ None,
27
+ description="Maximum input tokens to use across all requests (None for unlimited)",
28
+ )
29
+ temperature: float = Field(1.0, description="Sampling temperature")
30
+ api_type: str = Field(..., description="Azure, Openai, or Ollama")
31
+ api_version: str = Field(..., description="Azure Openai version if AzureOpenai")
32
+
33
+
34
+ class ProxySettings(BaseModel):
35
+ server: str = Field(None, description="Proxy server address")
36
+ username: Optional[str] = Field(None, description="Proxy username")
37
+ password: Optional[str] = Field(None, description="Proxy password")
38
+
39
+
40
+ class SearchSettings(BaseModel):
41
+ engine: str = Field(default="Google", description="Search engine the llm to use")
42
+ fallback_engines: List[str] = Field(
43
+ default_factory=lambda: ["DuckDuckGo", "Baidu", "Bing"],
44
+ description="Fallback search engines to try if the primary engine fails",
45
+ )
46
+ retry_delay: int = Field(
47
+ default=60,
48
+ description="Seconds to wait before retrying all engines again after they all fail",
49
+ )
50
+ max_retries: int = Field(
51
+ default=3,
52
+ description="Maximum number of times to retry all engines when all fail",
53
+ )
54
+ lang: str = Field(
55
+ default="en",
56
+ description="Language code for search results (e.g., en, zh, fr)",
57
+ )
58
+ country: str = Field(
59
+ default="us",
60
+ description="Country code for search results (e.g., us, cn, uk)",
61
+ )
62
+
63
+
64
+ class RunflowSettings(BaseModel):
65
+ use_data_analysis_agent: bool = Field(
66
+ default=False, description="Enable data analysis agent in run flow"
67
+ )
68
+
69
+
70
+ class BrowserSettings(BaseModel):
71
+ headless: bool = Field(False, description="Whether to run browser in headless mode")
72
+ disable_security: bool = Field(
73
+ True, description="Disable browser security features"
74
+ )
75
+ extra_chromium_args: List[str] = Field(
76
+ default_factory=list, description="Extra arguments to pass to the browser"
77
+ )
78
+ chrome_instance_path: Optional[str] = Field(
79
+ None, description="Path to a Chrome instance to use"
80
+ )
81
+ wss_url: Optional[str] = Field(
82
+ None, description="Connect to a browser instance via WebSocket"
83
+ )
84
+ cdp_url: Optional[str] = Field(
85
+ None, description="Connect to a browser instance via CDP"
86
+ )
87
+ proxy: Optional[ProxySettings] = Field(
88
+ None, description="Proxy settings for the browser"
89
+ )
90
+ max_content_length: int = Field(
91
+ 2000, description="Maximum length for content retrieval operations"
92
+ )
93
+
94
+
95
+ class SandboxSettings(BaseModel):
96
+ """Configuration for the execution sandbox"""
97
+
98
+ use_sandbox: bool = Field(False, description="Whether to use the sandbox")
99
+ image: str = Field("python:3.12-slim", description="Base image")
100
+ work_dir: str = Field("/workspace", description="Container working directory")
101
+ memory_limit: str = Field("512m", description="Memory limit")
102
+ cpu_limit: float = Field(1.0, description="CPU limit")
103
+ timeout: int = Field(300, description="Default command timeout (seconds)")
104
+ network_enabled: bool = Field(
105
+ False, description="Whether network access is allowed"
106
+ )
107
+
108
+
109
+ class DaytonaSettings(BaseModel):
110
+ daytona_api_key: str
111
+ daytona_server_url: Optional[str] = Field(
112
+ "https://app.daytona.io/api", description=""
113
+ )
114
+ daytona_target: Optional[str] = Field("us", description="enum ['eu', 'us']")
115
+ sandbox_image_name: Optional[str] = Field("whitezxj/sandbox:0.1.0", description="")
116
+ sandbox_entrypoint: Optional[str] = Field(
117
+ "/usr/bin/supervisord -n -c /etc/supervisor/conf.d/supervisord.conf",
118
+ description="",
119
+ )
120
+ # sandbox_id: Optional[str] = Field(
121
+ # None, description="ID of the daytona sandbox to use, if any"
122
+ # )
123
+ VNC_password: Optional[str] = Field(
124
+ "123456", description="VNC password for the vnc service in sandbox"
125
+ )
126
+
127
+
128
+ class MCPServerConfig(BaseModel):
129
+ """Configuration for a single MCP server"""
130
+
131
+ type: str = Field(..., description="Server connection type (sse or stdio)")
132
+ url: Optional[str] = Field(None, description="Server URL for SSE connections")
133
+ command: Optional[str] = Field(None, description="Command for stdio connections")
134
+ args: List[str] = Field(
135
+ default_factory=list, description="Arguments for stdio command"
136
+ )
137
+
138
+
139
+ class MCPSettings(BaseModel):
140
+ """Configuration for MCP (Model Context Protocol)"""
141
+
142
+ server_reference: str = Field(
143
+ "app.mcp.server", description="Module reference for the MCP server"
144
+ )
145
+ servers: Dict[str, MCPServerConfig] = Field(
146
+ default_factory=dict, description="MCP server configurations"
147
+ )
148
+
149
+ @classmethod
150
+ def load_server_config(cls) -> Dict[str, MCPServerConfig]:
151
+ """Load MCP server configuration from JSON file"""
152
+ config_path = PROJECT_ROOT / "config" / "mcp.json"
153
+
154
+ try:
155
+ config_file = config_path if config_path.exists() else None
156
+ if not config_file:
157
+ return {}
158
+
159
+ with config_file.open() as f:
160
+ data = json.load(f)
161
+ servers = {}
162
+
163
+ for server_id, server_config in data.get("mcpServers", {}).items():
164
+ servers[server_id] = MCPServerConfig(
165
+ type=server_config["type"],
166
+ url=server_config.get("url"),
167
+ command=server_config.get("command"),
168
+ args=server_config.get("args", []),
169
+ )
170
+ return servers
171
+ except Exception as e:
172
+ raise ValueError(f"Failed to load MCP server config: {e}")
173
+
174
+
175
+ class AppConfig(BaseModel):
176
+ llm: Dict[str, LLMSettings]
177
+ sandbox: Optional[SandboxSettings] = Field(
178
+ None, description="Sandbox configuration"
179
+ )
180
+ browser_config: Optional[BrowserSettings] = Field(
181
+ None, description="Browser configuration"
182
+ )
183
+ search_config: Optional[SearchSettings] = Field(
184
+ None, description="Search configuration"
185
+ )
186
+ mcp_config: Optional[MCPSettings] = Field(None, description="MCP configuration")
187
+ run_flow_config: Optional[RunflowSettings] = Field(
188
+ None, description="Run flow configuration"
189
+ )
190
+ daytona_config: Optional[DaytonaSettings] = Field(
191
+ None, description="Daytona configuration"
192
+ )
193
+
194
+ class Config:
195
+ arbitrary_types_allowed = True
196
+
197
+
198
+ class Config:
199
+ _instance = None
200
+ _lock = threading.Lock()
201
+ _initialized = False
202
+
203
+ def __new__(cls):
204
+ if cls._instance is None:
205
+ with cls._lock:
206
+ if cls._instance is None:
207
+ cls._instance = super().__new__(cls)
208
+ return cls._instance
209
+
210
+ def __init__(self):
211
+ if not self._initialized:
212
+ with self._lock:
213
+ if not self._initialized:
214
+ self._config = None
215
+ self._load_initial_config()
216
+ self._initialized = True
217
+
218
+ @staticmethod
219
+ def _get_config_path() -> Path:
220
+ root = PROJECT_ROOT
221
+ config_path = root / "config" / "config.toml"
222
+ if config_path.exists():
223
+ return config_path
224
+ example_path = root / "config" / "config.example.toml"
225
+ if example_path.exists():
226
+ return example_path
227
+ raise FileNotFoundError("No configuration file found in config directory")
228
+
229
+ def _load_config(self) -> dict:
230
+ config_path = self._get_config_path()
231
+ with config_path.open("rb") as f:
232
+ config_data = tomllib.load(f)
233
+
234
+ # Override with environment variables if present
235
+ if "llm" not in config_data:
236
+ config_data["llm"] = {}
237
+
238
+ if os.environ.get("GEMINI_API_KEY"):
239
+ config_data["llm"]["api_key"] = os.environ.get("GEMINI_API_KEY")
240
+ config_data["llm"]["base_url"] = "https://generativelanguage.googleapis.com/v1beta/openai/"
241
+ config_data["llm"]["model"] = os.environ.get("GEMINI_MODEL", "gemini-1.5-pro")
242
+
243
+ return config_data
244
+
245
+ def _load_initial_config(self):
246
+ raw_config = self._load_config()
247
+ base_llm = raw_config.get("llm", {})
248
+ llm_overrides = {
249
+ k: v for k, v in raw_config.get("llm", {}).items() if isinstance(v, dict)
250
+ }
251
+
252
+ default_settings = {
253
+ "model": base_llm.get("model"),
254
+ "base_url": base_llm.get("base_url"),
255
+ "api_key": base_llm.get("api_key"),
256
+ "max_tokens": base_llm.get("max_tokens", 4096),
257
+ "max_input_tokens": base_llm.get("max_input_tokens"),
258
+ "temperature": base_llm.get("temperature", 1.0),
259
+ "api_type": base_llm.get("api_type", ""),
260
+ "api_version": base_llm.get("api_version", ""),
261
+ }
262
+
263
+ # handle browser config.
264
+ browser_config = raw_config.get("browser", {})
265
+ browser_settings = None
266
+
267
+ if browser_config:
268
+ # handle proxy settings.
269
+ proxy_config = browser_config.get("proxy", {})
270
+ proxy_settings = None
271
+
272
+ if proxy_config and proxy_config.get("server"):
273
+ proxy_settings = ProxySettings(
274
+ **{
275
+ k: v
276
+ for k, v in proxy_config.items()
277
+ if k in ["server", "username", "password"] and v
278
+ }
279
+ )
280
+
281
+ # filter valid browser config parameters.
282
+ valid_browser_params = {
283
+ k: v
284
+ for k, v in browser_config.items()
285
+ if k in BrowserSettings.__annotations__ and v is not None
286
+ }
287
+
288
+ # if there is proxy settings, add it to the parameters.
289
+ if proxy_settings:
290
+ valid_browser_params["proxy"] = proxy_settings
291
+
292
+ # only create BrowserSettings when there are valid parameters.
293
+ if valid_browser_params:
294
+ browser_settings = BrowserSettings(**valid_browser_params)
295
+
296
+ search_config = raw_config.get("search", {})
297
+ search_settings = None
298
+ if search_config:
299
+ search_settings = SearchSettings(**search_config)
300
+ sandbox_config = raw_config.get("sandbox", {})
301
+ if sandbox_config:
302
+ sandbox_settings = SandboxSettings(**sandbox_config)
303
+ else:
304
+ sandbox_settings = SandboxSettings()
305
+ daytona_config = raw_config.get("daytona", {})
306
+ if daytona_config:
307
+ daytona_settings = DaytonaSettings(**daytona_config)
308
+ else:
309
+ daytona_settings = DaytonaSettings()
310
+
311
+ mcp_config = raw_config.get("mcp", {})
312
+ mcp_settings = None
313
+ if mcp_config:
314
+ # Load server configurations from JSON
315
+ mcp_config["servers"] = MCPSettings.load_server_config()
316
+ mcp_settings = MCPSettings(**mcp_config)
317
+ else:
318
+ mcp_settings = MCPSettings(servers=MCPSettings.load_server_config())
319
+
320
+ run_flow_config = raw_config.get("runflow")
321
+ if run_flow_config:
322
+ run_flow_settings = RunflowSettings(**run_flow_config)
323
+ else:
324
+ run_flow_settings = RunflowSettings()
325
+ config_dict = {
326
+ "llm": {
327
+ "default": default_settings,
328
+ **{
329
+ name: {**default_settings, **override_config}
330
+ for name, override_config in llm_overrides.items()
331
+ },
332
+ },
333
+ "sandbox": sandbox_settings,
334
+ "browser_config": browser_settings,
335
+ "search_config": search_settings,
336
+ "mcp_config": mcp_settings,
337
+ "run_flow_config": run_flow_settings,
338
+ "daytona_config": daytona_settings,
339
+ }
340
+
341
+ self._config = AppConfig(**config_dict)
342
+
343
+ @property
344
+ def llm(self) -> Dict[str, LLMSettings]:
345
+ return self._config.llm
346
+
347
+ @property
348
+ def sandbox(self) -> SandboxSettings:
349
+ return self._config.sandbox
350
+
351
+ @property
352
+ def daytona(self) -> DaytonaSettings:
353
+ return self._config.daytona_config
354
+
355
+ @property
356
+ def browser_config(self) -> Optional[BrowserSettings]:
357
+ return self._config.browser_config
358
+
359
+ @property
360
+ def search_config(self) -> Optional[SearchSettings]:
361
+ return self._config.search_config
362
+
363
+ @property
364
+ def mcp_config(self) -> MCPSettings:
365
+ """Get the MCP configuration"""
366
+ return self._config.mcp_config
367
+
368
+ @property
369
+ def run_flow_config(self) -> RunflowSettings:
370
+ """Get the Run Flow configuration"""
371
+ return self._config.run_flow_config
372
+
373
+ @property
374
+ def workspace_root(self) -> Path:
375
+ """Get the workspace root directory"""
376
+ return WORKSPACE_ROOT
377
+
378
+ @property
379
+ def root_path(self) -> Path:
380
+ """Get the root path of the application"""
381
+ return PROJECT_ROOT
382
+
383
+
384
+ config = Config()
app/daytona/README.md ADDED
@@ -0,0 +1,57 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Agent with Daytona sandbox
2
+
3
+
4
+
5
+
6
+ ## Prerequisites
7
+ - conda activate 'Your OpenManus python env'
8
+ - pip install daytona==0.21.8 structlog==25.4.0
9
+
10
+
11
+
12
+ ## Setup & Running
13
+
14
+ 1. daytona config :
15
+ ```bash
16
+ cd OpenManus
17
+ cp config/config.example-daytona.toml config/config.toml
18
+ ```
19
+ 2. get daytona apikey :
20
+ goto https://app.daytona.io/dashboard/keys and create your apikey
21
+
22
+ 3. set your apikey in config.toml
23
+ ```toml
24
+ # daytona config
25
+ [daytona]
26
+ daytona_api_key = ""
27
+ #daytona_server_url = "https://app.daytona.io/api"
28
+ #daytona_target = "us" #Daytona is currently available in the following regions:United States (us)、Europe (eu)
29
+ #sandbox_image_name = "whitezxj/sandbox:0.1.0" #If you don't use this default image,sandbox tools may be useless
30
+ #sandbox_entrypoint = "/usr/bin/supervisord -n -c /etc/supervisor/conf.d/supervisord.conf" #If you change this entrypoint,server in sandbox may be useless
31
+ #VNC_password = #The password you set to log in sandbox by VNC,it will be 123456 if you don't set
32
+ ```
33
+ 4. Run :
34
+
35
+ ```bash
36
+ cd OpenManus
37
+ python sandbox_main.py
38
+ ```
39
+
40
+ 5. Send tasks to Agent
41
+ You can send tasks to the agent from the terminal; the agent will use sandbox tools to handle your tasks.
42
+
43
+ 6. See results
44
+ If the agent uses the sb_browser_use tool, you can watch its operations via the VNC link, which is printed in the terminal, e.g. https://6080-sandbox-123456.h7890.daytona.work.
45
+ If the agent uses the sb_shell tool, you can see the results in the sandbox terminal at https://app.daytona.io/dashboard/sandboxes.
46
+ The agent can use the sb_files tool to manage files in the sandbox.
47
+
48
+
49
+ ## Example
50
+
51
+ You can send task e.g.:"帮我在https://hk.trip.com/travel-guide/guidebook/nanjing-9696/?ishideheader=true&isHideNavBar=YES&disableFontScaling=1&catalogId=514634&locale=zh-HK查询相关信息上制定一份南京旅游攻略,并在工作区保存为index.html"
52
+
53
+ Then you can see the agent's browser action in VNC link(https://6080-sandbox-123456.h7890.proxy.daytona.work) and you can see the html made by agent in Website URL(https://8080-sandbox-123456.h7890.proxy.daytona.work).
54
+
55
+ ## Learn More
56
+
57
+ - [Daytona Documentation](https://www.daytona.io/docs/)
app/daytona/sandbox.py ADDED
@@ -0,0 +1,165 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import time
2
+
3
+ from daytona import (
4
+ CreateSandboxFromImageParams,
5
+ Daytona,
6
+ DaytonaConfig,
7
+ Resources,
8
+ Sandbox,
9
+ SandboxState,
10
+ SessionExecuteRequest,
11
+ )
12
+
13
+ from app.config import config
14
+ from app.utils.logger import logger
15
+
16
+
17
+ # load_dotenv()
18
+ daytona_settings = config.daytona
19
+ logger.info("Initializing Daytona sandbox configuration")
20
+ daytona_config = DaytonaConfig(
21
+ api_key=daytona_settings.daytona_api_key,
22
+ server_url=daytona_settings.daytona_server_url,
23
+ target=daytona_settings.daytona_target,
24
+ )
25
+
26
+ if daytona_config.api_key:
27
+ logger.info("Daytona API key configured successfully")
28
+ else:
29
+ logger.warning("No Daytona API key found in environment variables")
30
+
31
+ if daytona_config.server_url:
32
+ logger.info(f"Daytona server URL set to: {daytona_config.server_url}")
33
+ else:
34
+ logger.warning("No Daytona server URL found in environment variables")
35
+
36
+ if daytona_config.target:
37
+ logger.info(f"Daytona target set to: {daytona_config.target}")
38
+ else:
39
+ logger.warning("No Daytona target found in environment variables")
40
+
41
+ daytona = Daytona(daytona_config)
42
+ logger.info("Daytona client initialized")
43
+
44
+
45
+ async def get_or_start_sandbox(sandbox_id: str):
46
+ """Retrieve a sandbox by ID, check its state, and start it if needed."""
47
+
48
+ logger.info(f"Getting or starting sandbox with ID: {sandbox_id}")
49
+
50
+ try:
51
+ sandbox = daytona.get(sandbox_id)
52
+
53
+ # Check if sandbox needs to be started
54
+ if (
55
+ sandbox.state == SandboxState.ARCHIVED
56
+ or sandbox.state == SandboxState.STOPPED
57
+ ):
58
+ logger.info(f"Sandbox is in {sandbox.state} state. Starting...")
59
+ try:
60
+ daytona.start(sandbox)
61
+ # Wait a moment for the sandbox to initialize
62
+ # sleep(5)
63
+ # Refresh sandbox state after starting
64
+ sandbox = daytona.get(sandbox_id)
65
+
66
+ # Start supervisord in a session when restarting
67
+ start_supervisord_session(sandbox)
68
+ except Exception as e:
69
+ logger.error(f"Error starting sandbox: {e}")
70
+ raise e
71
+
72
+ logger.info(f"Sandbox {sandbox_id} is ready")
73
+ return sandbox
74
+
75
+ except Exception as e:
76
+ logger.error(f"Error retrieving or starting sandbox: {str(e)}")
77
+ raise e
78
+
79
+
80
+ def start_supervisord_session(sandbox: Sandbox):
81
+ """Start supervisord in a session."""
82
+ session_id = "supervisord-session"
83
+ try:
84
+ logger.info(f"Creating session {session_id} for supervisord")
85
+ sandbox.process.create_session(session_id)
86
+
87
+ # Execute supervisord command
88
+ sandbox.process.execute_session_command(
89
+ session_id,
90
+ SessionExecuteRequest(
91
+ command="exec /usr/bin/supervisord -n -c /etc/supervisor/conf.d/supervisord.conf",
92
+ var_async=True,
93
+ ),
94
+ )
95
+ time.sleep(25) # Wait a bit to ensure supervisord starts properly
96
+ logger.info(f"Supervisord started in session {session_id}")
97
+ except Exception as e:
98
+ logger.error(f"Error starting supervisord session: {str(e)}")
99
+ raise e
100
+
101
+
102
+ def create_sandbox(password: str, project_id: str = None):
103
+ """Create a new sandbox with all required services configured and running."""
104
+
105
+ logger.info("Creating new Daytona sandbox environment")
106
+ logger.info("Configuring sandbox with browser-use image and environment variables")
107
+
108
+ labels = None
109
+ if project_id:
110
+ logger.info(f"Using sandbox_id as label: {project_id}")
111
+ labels = {"id": project_id}
112
+
113
+ params = CreateSandboxFromImageParams(
114
+ image=daytona_settings.sandbox_image_name,
115
+ public=True,
116
+ labels=labels,
117
+ env_vars={
118
+ "CHROME_PERSISTENT_SESSION": "true",
119
+ "RESOLUTION": "1024x768x24",
120
+ "RESOLUTION_WIDTH": "1024",
121
+ "RESOLUTION_HEIGHT": "768",
122
+ "VNC_PASSWORD": password,
123
+ "ANONYMIZED_TELEMETRY": "false",
124
+ "CHROME_PATH": "",
125
+ "CHROME_USER_DATA": "",
126
+ "CHROME_DEBUGGING_PORT": "9222",
127
+ "CHROME_DEBUGGING_HOST": "localhost",
128
+ "CHROME_CDP": "",
129
+ },
130
+ resources=Resources(
131
+ cpu=2,
132
+ memory=4,
133
+ disk=5,
134
+ ),
135
+ auto_stop_interval=15,
136
+ auto_archive_interval=24 * 60,
137
+ )
138
+
139
+ # Create the sandbox
140
+ sandbox = daytona.create(params)
141
+ logger.info(f"Sandbox created with ID: {sandbox.id}")
142
+
143
+ # Start supervisord in a session for new sandbox
144
+ start_supervisord_session(sandbox)
145
+
146
+ logger.info(f"Sandbox environment successfully initialized")
147
+ return sandbox
148
+
149
+
150
+ async def delete_sandbox(sandbox_id: str):
151
+ """Delete a sandbox by its ID."""
152
+ logger.info(f"Deleting sandbox with ID: {sandbox_id}")
153
+
154
+ try:
155
+ # Get the sandbox
156
+ sandbox = daytona.get(sandbox_id)
157
+
158
+ # Delete the sandbox
159
+ daytona.delete(sandbox)
160
+
161
+ logger.info(f"Successfully deleted sandbox {sandbox_id}")
162
+ return True
163
+ except Exception as e:
164
+ logger.error(f"Error deleting sandbox {sandbox_id}: {str(e)}")
165
+ raise e
app/daytona/tool_base.py ADDED
@@ -0,0 +1,138 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from dataclasses import dataclass, field
2
+ from datetime import datetime
3
+ from typing import Any, ClassVar, Dict, Optional
4
+
5
+ from daytona import Daytona, DaytonaConfig, Sandbox, SandboxState
6
+ from pydantic import Field
7
+
8
+ from app.config import config
9
+ from app.daytona.sandbox import create_sandbox, start_supervisord_session
10
+ from app.tool.base import BaseTool
11
+ from app.utils.files_utils import clean_path
12
+ from app.utils.logger import logger
13
+
14
+
15
+ # load_dotenv()
16
+ daytona_settings = config.daytona
17
+ daytona_config = DaytonaConfig(
18
+ api_key=daytona_settings.daytona_api_key,
19
+ server_url=daytona_settings.daytona_server_url,
20
+ target=daytona_settings.daytona_target,
21
+ )
22
+ daytona = Daytona(daytona_config)
23
+
24
+
25
+ @dataclass
26
+ class ThreadMessage:
27
+ """
28
+ Represents a message to be added to a thread.
29
+ """
30
+
31
+ type: str
32
+ content: Dict[str, Any]
33
+ is_llm_message: bool = False
34
+ metadata: Optional[Dict[str, Any]] = None
35
+ timestamp: Optional[float] = field(
36
+ default_factory=lambda: datetime.now().timestamp()
37
+ )
38
+
39
+ def to_dict(self) -> Dict[str, Any]:
40
+ """Convert the message to a dictionary for API calls"""
41
+ return {
42
+ "type": self.type,
43
+ "content": self.content,
44
+ "is_llm_message": self.is_llm_message,
45
+ "metadata": self.metadata or {},
46
+ "timestamp": self.timestamp,
47
+ }
48
+
49
+
50
+ class SandboxToolsBase(BaseTool):
51
+ """Base class for all sandbox tools that provides project-based sandbox access."""
52
+
53
+ # Class variable to track if sandbox URLs have been printed
54
+ _urls_printed: ClassVar[bool] = False
55
+
56
+ # Required fields
57
+ project_id: Optional[str] = None
58
+ # thread_manager: Optional[ThreadManager] = None
59
+
60
+ # Private fields (not part of the model schema)
61
+ _sandbox: Optional[Sandbox] = None
62
+ _sandbox_id: Optional[str] = None
63
+ _sandbox_pass: Optional[str] = None
64
+ workspace_path: str = Field(default="/workspace", exclude=True)
65
+ _sessions: dict[str, str] = {}
66
+
67
+ class Config:
68
+ arbitrary_types_allowed = True # Allow non-pydantic types like ThreadManager
69
+ underscore_attrs_are_private = True
70
+
71
+ async def _ensure_sandbox(self) -> Sandbox:
72
+ """Ensure we have a valid sandbox instance, retrieving it from the project if needed."""
73
+ if self._sandbox is None:
74
+ # Get or start the sandbox
75
+ try:
76
+ self._sandbox = create_sandbox(password=config.daytona.VNC_password)
77
+ # Log URLs if not already printed
78
+ if not SandboxToolsBase._urls_printed:
79
+ vnc_link = self._sandbox.get_preview_link(6080)
80
+ website_link = self._sandbox.get_preview_link(8080)
81
+
82
+ vnc_url = (
83
+ vnc_link.url if hasattr(vnc_link, "url") else str(vnc_link)
84
+ )
85
+ website_url = (
86
+ website_link.url
87
+ if hasattr(website_link, "url")
88
+ else str(website_link)
89
+ )
90
+
91
+ print("\033[95m***")
92
+ print(f"VNC URL: {vnc_url}")
93
+ print(f"Website URL: {website_url}")
94
+ print("***\033[0m")
95
+ SandboxToolsBase._urls_printed = True
96
+ except Exception as e:
97
+ logger.error(f"Error retrieving or starting sandbox: {str(e)}")
98
+ raise e
99
+ else:
100
+ if (
101
+ self._sandbox.state == SandboxState.ARCHIVED
102
+ or self._sandbox.state == SandboxState.STOPPED
103
+ ):
104
+ logger.info(f"Sandbox is in {self._sandbox.state} state. Starting...")
105
+ try:
106
+ daytona.start(self._sandbox)
107
+ # Wait a moment for the sandbox to initialize
108
+ # sleep(5)
109
+ # Refresh sandbox state after starting
110
+
111
+ # Start supervisord in a session when restarting
112
+ start_supervisord_session(self._sandbox)
113
+ except Exception as e:
114
+ logger.error(f"Error starting sandbox: {e}")
115
+ raise e
116
+ return self._sandbox
117
+
118
+ @property
119
+ def sandbox(self) -> Sandbox:
120
+ """Get the sandbox instance, ensuring it exists."""
121
+ if self._sandbox is None:
122
+ raise RuntimeError("Sandbox not initialized. Call _ensure_sandbox() first.")
123
+ return self._sandbox
124
+
125
+ @property
126
+ def sandbox_id(self) -> str:
127
+ """Get the sandbox ID, ensuring it exists."""
128
+ if self._sandbox_id is None:
129
+ raise RuntimeError(
130
+ "Sandbox ID not initialized. Call _ensure_sandbox() first."
131
+ )
132
+ return self._sandbox_id
133
+
134
+ def clean_path(self, path: str) -> str:
135
+ """Clean and normalize a path to be relative to /workspace."""
136
+ cleaned_path = clean_path(path, self.workspace_path)
137
+ logger.debug(f"Cleaned path: {path} -> {cleaned_path}")
138
+ return cleaned_path
app/exceptions.py ADDED
@@ -0,0 +1,13 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ class ToolError(Exception):
2
+ """Raised when a tool encounters an error."""
3
+
4
+ def __init__(self, message):
5
+ self.message = message
6
+
7
+
8
+ class OpenManusError(Exception):
9
+ """Base exception for all OpenManus errors"""
10
+
11
+
12
+ class TokenLimitExceeded(OpenManusError):
13
+ """Exception raised when the token limit is exceeded"""
app/flow/__init__.py ADDED
File without changes
app/flow/base.py ADDED
@@ -0,0 +1,57 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from abc import ABC, abstractmethod
2
+ from typing import Dict, List, Optional, Union
3
+
4
+ from pydantic import BaseModel
5
+
6
+ from app.agent.base import BaseAgent
7
+
8
+
9
+ class BaseFlow(BaseModel, ABC):
10
+ """Base class for execution flows supporting multiple agents"""
11
+
12
+ agents: Dict[str, BaseAgent]
13
+ tools: Optional[List] = None
14
+ primary_agent_key: Optional[str] = None
15
+
16
+ class Config:
17
+ arbitrary_types_allowed = True
18
+
19
+ def __init__(
20
+ self, agents: Union[BaseAgent, List[BaseAgent], Dict[str, BaseAgent]], **data
21
+ ):
22
+ # Handle different ways of providing agents
23
+ if isinstance(agents, BaseAgent):
24
+ agents_dict = {"default": agents}
25
+ elif isinstance(agents, list):
26
+ agents_dict = {f"agent_{i}": agent for i, agent in enumerate(agents)}
27
+ else:
28
+ agents_dict = agents
29
+
30
+ # If primary agent not specified, use first agent
31
+ primary_key = data.get("primary_agent_key")
32
+ if not primary_key and agents_dict:
33
+ primary_key = next(iter(agents_dict))
34
+ data["primary_agent_key"] = primary_key
35
+
36
+ # Set the agents dictionary
37
+ data["agents"] = agents_dict
38
+
39
+ # Initialize using BaseModel's init
40
+ super().__init__(**data)
41
+
42
+ @property
43
+ def primary_agent(self) -> Optional[BaseAgent]:
44
+ """Get the primary agent for the flow"""
45
+ return self.agents.get(self.primary_agent_key)
46
+
47
+ def get_agent(self, key: str) -> Optional[BaseAgent]:
48
+ """Get a specific agent by key"""
49
+ return self.agents.get(key)
50
+
51
+ def add_agent(self, key: str, agent: BaseAgent) -> None:
52
+ """Add a new agent to the flow"""
53
+ self.agents[key] = agent
54
+
55
+ @abstractmethod
56
+ async def execute(self, input_text: str) -> str:
57
+ """Execute the flow with given input"""
app/flow/flow_factory.py ADDED
@@ -0,0 +1,30 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from enum import Enum
2
+ from typing import Dict, List, Union
3
+
4
+ from app.agent.base import BaseAgent
5
+ from app.flow.base import BaseFlow
6
+ from app.flow.planning import PlanningFlow
7
+
8
+
9
+ class FlowType(str, Enum):
10
+ PLANNING = "planning"
11
+
12
+
13
+ class FlowFactory:
14
+ """Factory for creating different types of flows with support for multiple agents"""
15
+
16
+ @staticmethod
17
+ def create_flow(
18
+ flow_type: FlowType,
19
+ agents: Union[BaseAgent, List[BaseAgent], Dict[str, BaseAgent]],
20
+ **kwargs,
21
+ ) -> BaseFlow:
22
+ flows = {
23
+ FlowType.PLANNING: PlanningFlow,
24
+ }
25
+
26
+ flow_class = flows.get(flow_type)
27
+ if not flow_class:
28
+ raise ValueError(f"Unknown flow type: {flow_type}")
29
+
30
+ return flow_class(agents, **kwargs)
app/flow/planning.py ADDED
@@ -0,0 +1,442 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import json
2
+ import time
3
+ from enum import Enum
4
+ from typing import Dict, List, Optional, Union
5
+
6
+ from pydantic import Field
7
+
8
+ from app.agent.base import BaseAgent
9
+ from app.flow.base import BaseFlow
10
+ from app.llm import LLM
11
+ from app.logger import logger
12
+ from app.schema import AgentState, Message, ToolChoice
13
+ from app.tool import PlanningTool
14
+
15
+
16
class PlanStepStatus(str, Enum):
    """Enum class defining possible statuses of a plan step"""

    NOT_STARTED = "not_started"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    BLOCKED = "blocked"

    @classmethod
    def get_all_statuses(cls) -> list[str]:
        """Return a list of all possible step status values"""
        return [member.value for member in cls]

    @classmethod
    def get_active_statuses(cls) -> list[str]:
        """Return a list of values representing active statuses (not started or in progress)"""
        return [cls.NOT_STARTED.value, cls.IN_PROGRESS.value]

    @classmethod
    def get_status_marks(cls) -> Dict[str, str]:
        """Return a mapping of statuses to their marker symbols"""
        marks = {
            cls.COMPLETED.value: "[✓]",
            cls.IN_PROGRESS.value: "[→]",
            cls.BLOCKED.value: "[!]",
            cls.NOT_STARTED.value: "[ ]",
        }
        return marks
43
+
44
+
45
class PlanningFlow(BaseFlow):
    """A flow that manages planning and execution of tasks using agents.

    The flow asks an LLM (through ``PlanningTool``) to produce a step-by-step
    plan, then iterates over the plan: each step is handed to an executor
    agent, marked in-progress, executed, and marked completed. When no active
    steps remain, the LLM produces a closing summary.
    """

    # LLM used for plan creation and the final summary.
    llm: LLM = Field(default_factory=lambda: LLM())
    # Tool holding plan state (steps, statuses, notes) keyed by plan ID.
    planning_tool: PlanningTool = Field(default_factory=PlanningTool)
    # Keys of agents eligible to execute steps; defaults to all agent keys.
    executor_keys: List[str] = Field(default_factory=list)
    # Identifier of the plan this flow is driving.
    active_plan_id: str = Field(default_factory=lambda: f"plan_{int(time.time())}")
    # Index of the step currently being executed, or None when idle.
    current_step_index: Optional[int] = None

    def __init__(
        self, agents: Union[BaseAgent, List[BaseAgent], Dict[str, BaseAgent]], **data
    ):
        """Normalize convenience kwargs (`executors`, `plan_id`) and delegate to BaseFlow."""
        # Set executor keys before super().__init__
        if "executors" in data:
            data["executor_keys"] = data.pop("executors")

        # Set plan ID if provided
        if "plan_id" in data:
            data["active_plan_id"] = data.pop("plan_id")

        # Initialize the planning tool if not provided
        if "planning_tool" not in data:
            planning_tool = PlanningTool()
            data["planning_tool"] = planning_tool

        # Call parent's init with the processed data
        super().__init__(agents, **data)

        # Set executor_keys to all agent keys if not specified
        if not self.executor_keys:
            self.executor_keys = list(self.agents.keys())

    def get_executor(self, step_type: Optional[str] = None) -> BaseAgent:
        """
        Get an appropriate executor agent for the current step.
        Can be extended to select agents based on step type/requirements.
        """
        # If step type is provided and matches an agent key, use that agent
        if step_type and step_type in self.agents:
            return self.agents[step_type]

        # Otherwise use the first available executor or fall back to primary agent
        for key in self.executor_keys:
            if key in self.agents:
                return self.agents[key]

        # Fallback to primary agent
        return self.primary_agent

    async def execute(self, input_text: str) -> str:
        """Execute the planning flow with agents.

        Creates a plan from `input_text` (when non-empty), then runs each
        active step with an executor agent until the plan is exhausted or an
        executor reports AgentState.FINISHED. Returns the concatenated step
        results plus the final summary; on error, returns an error string.
        """
        try:
            if not self.primary_agent:
                raise ValueError("No primary agent available")

            # Create initial plan if input provided
            if input_text:
                await self._create_initial_plan(input_text)

                # Verify plan was created successfully
                if self.active_plan_id not in self.planning_tool.plans:
                    logger.error(
                        f"Plan creation failed. Plan ID {self.active_plan_id} not found in planning tool."
                    )
                    return f"Failed to create plan for: {input_text}"

            result = ""
            while True:
                # Get current step to execute
                self.current_step_index, step_info = await self._get_current_step_info()

                # Exit if no more steps or plan completed
                if self.current_step_index is None:
                    result += await self._finalize_plan()
                    break

                # Execute current step with appropriate agent
                step_type = step_info.get("type") if step_info else None
                executor = self.get_executor(step_type)
                step_result = await self._execute_step(executor, step_info)
                result += step_result + "\n"

                # Check if agent wants to terminate
                if hasattr(executor, "state") and executor.state == AgentState.FINISHED:
                    break

            return result
        except Exception as e:
            logger.error(f"Error in PlanningFlow: {str(e)}")
            return f"Execution failed: {str(e)}"

    async def _create_initial_plan(self, request: str) -> None:
        """Create an initial plan based on the request using the flow's LLM and PlanningTool.

        Falls back to a generic three-step plan if the LLM does not issue a
        usable `planning` tool call.
        """
        logger.info(f"Creating initial plan with ID: {self.active_plan_id}")

        system_message_content = (
            "You are a planning assistant. Create a concise, actionable plan with clear steps. "
            "Focus on key milestones rather than detailed sub-steps. "
            "Optimize for clarity and efficiency."
        )
        agents_description = []
        for key in self.executor_keys:
            if key in self.agents:
                agents_description.append(
                    {
                        "name": key.upper(),
                        "description": self.agents[key].description,
                    }
                )
        if len(agents_description) > 1:
            # Add description of agents to select.
            # BUG FIX: the original interpolated the whole list of dicts where
            # the agent COUNT was intended; use len() so the prompt reads
            # "Now we have N agents" (also fixes the "infomation" typo).
            system_message_content += (
                f"\nNow we have {len(agents_description)} agents. "
                f"The information of them are below: {json.dumps(agents_description)}\n"
                "When creating steps in the planning tool, please specify the agent names using the format '[agent_name]'."
            )

        # Create a system message for plan creation
        system_message = Message.system_message(system_message_content)

        # Create a user message with the request
        user_message = Message.user_message(
            f"Create a reasonable plan with clear steps to accomplish the task: {request}"
        )

        # Call LLM with PlanningTool
        response = await self.llm.ask_tool(
            messages=[user_message],
            system_msgs=[system_message],
            tools=[self.planning_tool.to_param()],
            tool_choice=ToolChoice.AUTO,
        )

        # Process tool calls if present
        if response.tool_calls:
            for tool_call in response.tool_calls:
                if tool_call.function.name == "planning":
                    # Parse the arguments
                    args = tool_call.function.arguments
                    if isinstance(args, str):
                        try:
                            args = json.loads(args)
                        except json.JSONDecodeError:
                            logger.error(f"Failed to parse tool arguments: {args}")
                            continue

                    # Ensure plan_id is set correctly and execute the tool
                    args["plan_id"] = self.active_plan_id

                    # Execute the tool via ToolCollection instead of directly
                    result = await self.planning_tool.execute(**args)

                    logger.info(f"Plan creation result: {str(result)}")
                    return

        # If execution reached here, create a default plan
        logger.warning("Creating default plan")

        # Create default plan using the ToolCollection
        await self.planning_tool.execute(
            **{
                "command": "create",
                "plan_id": self.active_plan_id,
                "title": f"Plan for: {request[:50]}{'...' if len(request) > 50 else ''}",
                "steps": ["Analyze request", "Execute task", "Verify results"],
            }
        )

    async def _get_current_step_info(self) -> tuple[Optional[int], Optional[dict]]:
        """
        Parse the current plan to identify the first non-completed step's index and info.
        Returns (None, None) if no active step is found.
        """
        import re  # hoisted out of the loop (was re-imported per iteration)

        if (
            not self.active_plan_id
            or self.active_plan_id not in self.planning_tool.plans
        ):
            logger.error(f"Plan with ID {self.active_plan_id} not found")
            return None, None

        try:
            # Direct access to plan data from planning tool storage
            plan_data = self.planning_tool.plans[self.active_plan_id]
            steps = plan_data.get("steps", [])
            step_statuses = plan_data.get("step_statuses", [])

            # Find first non-completed step
            for i, step in enumerate(steps):
                if i >= len(step_statuses):
                    status = PlanStepStatus.NOT_STARTED.value
                else:
                    status = step_statuses[i]

                if status in PlanStepStatus.get_active_statuses():
                    # Extract step type/category if available
                    step_info = {"text": step}

                    # Try to extract step type from the text (e.g., [SEARCH] or [CODE])
                    type_match = re.search(r"\[([A-Z_]+)\]", step)
                    if type_match:
                        step_info["type"] = type_match.group(1).lower()

                    # Mark current step as in_progress
                    try:
                        await self.planning_tool.execute(
                            command="mark_step",
                            plan_id=self.active_plan_id,
                            step_index=i,
                            step_status=PlanStepStatus.IN_PROGRESS.value,
                        )
                    except Exception as e:
                        logger.warning(f"Error marking step as in_progress: {e}")
                        # Update step status directly if needed
                        if i < len(step_statuses):
                            step_statuses[i] = PlanStepStatus.IN_PROGRESS.value
                        else:
                            # Pad with NOT_STARTED up to index i, then record i as IN_PROGRESS.
                            while len(step_statuses) < i:
                                step_statuses.append(PlanStepStatus.NOT_STARTED.value)
                            step_statuses.append(PlanStepStatus.IN_PROGRESS.value)

                        plan_data["step_statuses"] = step_statuses

                    return i, step_info

            return None, None  # No active step found

        except Exception as e:
            logger.warning(f"Error finding current step index: {e}")
            return None, None

    async def _execute_step(self, executor: BaseAgent, step_info: dict) -> str:
        """Execute the current step with the specified agent using agent.run()."""
        # Prepare context for the agent with current plan status
        plan_status = await self._get_plan_text()
        step_text = step_info.get("text", f"Step {self.current_step_index}")

        # Create a prompt for the agent to execute the current step
        step_prompt = f"""
        CURRENT PLAN STATUS:
        {plan_status}

        YOUR CURRENT TASK:
        You are now working on step {self.current_step_index}: "{step_text}"

        Please only execute this current step using the appropriate tools. When you're done, provide a summary of what you accomplished.
        """

        # Use agent.run() to execute the step
        try:
            step_result = await executor.run(step_prompt)

            # Mark the step as completed after successful execution
            await self._mark_step_completed()

            return step_result
        except Exception as e:
            logger.error(f"Error executing step {self.current_step_index}: {e}")
            return f"Error executing step {self.current_step_index}: {str(e)}"

    async def _mark_step_completed(self) -> None:
        """Mark the current step as completed."""
        if self.current_step_index is None:
            return

        try:
            # Mark the step as completed
            await self.planning_tool.execute(
                command="mark_step",
                plan_id=self.active_plan_id,
                step_index=self.current_step_index,
                step_status=PlanStepStatus.COMPLETED.value,
            )
            logger.info(
                f"Marked step {self.current_step_index} as completed in plan {self.active_plan_id}"
            )
        except Exception as e:
            logger.warning(f"Failed to update plan status: {e}")
            # Update step status directly in planning tool storage
            if self.active_plan_id in self.planning_tool.plans:
                plan_data = self.planning_tool.plans[self.active_plan_id]
                step_statuses = plan_data.get("step_statuses", [])

                # Ensure the step_statuses list is long enough
                while len(step_statuses) <= self.current_step_index:
                    step_statuses.append(PlanStepStatus.NOT_STARTED.value)

                # Update the status
                step_statuses[self.current_step_index] = PlanStepStatus.COMPLETED.value
                plan_data["step_statuses"] = step_statuses

    async def _get_plan_text(self) -> str:
        """Get the current plan as formatted text."""
        try:
            result = await self.planning_tool.execute(
                command="get", plan_id=self.active_plan_id
            )
            return result.output if hasattr(result, "output") else str(result)
        except Exception as e:
            logger.error(f"Error getting plan: {e}")
            return self._generate_plan_text_from_storage()

    def _generate_plan_text_from_storage(self) -> str:
        """Generate plan text directly from storage if the planning tool fails."""
        try:
            if self.active_plan_id not in self.planning_tool.plans:
                return f"Error: Plan with ID {self.active_plan_id} not found"

            plan_data = self.planning_tool.plans[self.active_plan_id]
            title = plan_data.get("title", "Untitled Plan")
            steps = plan_data.get("steps", [])
            step_statuses = plan_data.get("step_statuses", [])
            step_notes = plan_data.get("step_notes", [])

            # Ensure step_statuses and step_notes match the number of steps
            while len(step_statuses) < len(steps):
                step_statuses.append(PlanStepStatus.NOT_STARTED.value)
            while len(step_notes) < len(steps):
                step_notes.append("")

            # Count steps by status
            status_counts = {status: 0 for status in PlanStepStatus.get_all_statuses()}

            for status in step_statuses:
                if status in status_counts:
                    status_counts[status] += 1

            completed = status_counts[PlanStepStatus.COMPLETED.value]
            total = len(steps)
            progress = (completed / total) * 100 if total > 0 else 0

            plan_text = f"Plan: {title} (ID: {self.active_plan_id})\n"
            plan_text += "=" * len(plan_text) + "\n\n"

            plan_text += (
                f"Progress: {completed}/{total} steps completed ({progress:.1f}%)\n"
            )
            plan_text += f"Status: {status_counts[PlanStepStatus.COMPLETED.value]} completed, {status_counts[PlanStepStatus.IN_PROGRESS.value]} in progress, "
            plan_text += f"{status_counts[PlanStepStatus.BLOCKED.value]} blocked, {status_counts[PlanStepStatus.NOT_STARTED.value]} not started\n\n"
            plan_text += "Steps:\n"

            status_marks = PlanStepStatus.get_status_marks()

            for i, (step, status, notes) in enumerate(
                zip(steps, step_statuses, step_notes)
            ):
                # Use status marks to indicate step status
                status_mark = status_marks.get(
                    status, status_marks[PlanStepStatus.NOT_STARTED.value]
                )

                plan_text += f"{i}. {status_mark} {step}\n"
                if notes:
                    plan_text += f"   Notes: {notes}\n"

            return plan_text
        except Exception as e:
            logger.error(f"Error generating plan text from storage: {e}")
            return f"Error: Unable to retrieve plan with ID {self.active_plan_id}"

    async def _finalize_plan(self) -> str:
        """Finalize the plan and provide a summary using the flow's LLM directly."""
        plan_text = await self._get_plan_text()

        # Create a summary using the flow's LLM directly
        try:
            system_message = Message.system_message(
                "You are a planning assistant. Your task is to summarize the completed plan."
            )

            user_message = Message.user_message(
                f"The plan has been completed. Here is the final plan status:\n\n{plan_text}\n\nPlease provide a summary of what was accomplished and any final thoughts."
            )

            response = await self.llm.ask(
                messages=[user_message], system_msgs=[system_message]
            )

            return f"Plan completed:\n\n{response}"
        except Exception as e:
            logger.error(f"Error finalizing plan with LLM: {e}")

            # Fallback to using an agent for the summary
            try:
                agent = self.primary_agent
                summary_prompt = f"""
                The plan has been completed. Here is the final plan status:

                {plan_text}

                Please provide a summary of what was accomplished and any final thoughts.
                """
                summary = await agent.run(summary_prompt)
                return f"Plan completed:\n\n{summary}"
            except Exception as e2:
                logger.error(f"Error finalizing plan with agent: {e2}")
                return "Plan completed. Error generating summary."
app/llm.py ADDED
@@ -0,0 +1,766 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import math
2
+ from typing import Dict, List, Optional, Union
3
+
4
+ import tiktoken
5
+ from openai import (
6
+ APIError,
7
+ AsyncAzureOpenAI,
8
+ AsyncOpenAI,
9
+ AuthenticationError,
10
+ OpenAIError,
11
+ RateLimitError,
12
+ )
13
+ from openai.types.chat import ChatCompletion, ChatCompletionMessage
14
+ from tenacity import (
15
+ retry,
16
+ retry_if_exception_type,
17
+ stop_after_attempt,
18
+ wait_random_exponential,
19
+ )
20
+
21
+ from app.bedrock import BedrockClient
22
+ from app.config import LLMSettings, config
23
+ from app.exceptions import TokenLimitExceeded
24
+ from app.logger import logger # Assuming a logger is set up in your app
25
+ from app.schema import (
26
+ ROLE_VALUES,
27
+ TOOL_CHOICE_TYPE,
28
+ TOOL_CHOICE_VALUES,
29
+ Message,
30
+ ToolChoice,
31
+ )
32
+
33
+
34
# Models that take `max_completion_tokens` instead of `max_tokens` in requests.
REASONING_MODELS = ["o1", "o3-mini"]

# Models that accept image content parts in chat messages.
MULTIMODAL_MODELS = [
    "gpt-4-vision-preview",
    "gpt-4o",
    "gpt-4o-mini",
    "claude-3-opus-20240229",
    "claude-3-sonnet-20240229",
    "claude-3-haiku-20240307",
]
43
+
44
+
45
class TokenCounter:
    """Estimates token usage for chat messages (text, images, tool calls)."""

    # Token constants
    BASE_MESSAGE_TOKENS = 4
    FORMAT_TOKENS = 2
    LOW_DETAIL_IMAGE_TOKENS = 85
    HIGH_DETAIL_TILE_TOKENS = 170

    # Image processing constants
    MAX_SIZE = 2048
    HIGH_DETAIL_TARGET_SHORT_SIDE = 768
    TILE_SIZE = 512

    def __init__(self, tokenizer):
        # Any object exposing `.encode(str) -> sequence` works here.
        self.tokenizer = tokenizer

    def count_text(self, text: str) -> int:
        """Calculate tokens for a text string (0 for empty input)."""
        if not text:
            return 0
        return len(self.tokenizer.encode(text))

    def count_image(self, image_item: dict) -> int:
        """
        Calculate tokens for an image based on detail level and dimensions

        For "low" detail: fixed 85 tokens
        For "high" detail:
        1. Scale to fit in 2048x2048 square
        2. Scale shortest side to 768px
        3. Count 512px tiles (170 tokens each)
        4. Add 85 tokens
        """
        detail = image_item.get("detail", "medium")

        # Low detail is a fixed flat cost.
        if detail == "low":
            return self.LOW_DETAIL_IMAGE_TOKENS

        # High/medium with known dimensions: compute tile-based cost.
        # (Medium has no separate formula, so it reuses the high-detail one.)
        if detail in ("high", "medium") and "dimensions" in image_item:
            width, height = image_item["dimensions"]
            return self._calculate_high_detail_tokens(width, height)

        # No dimensions available: high detail assumes a 1024x1024 image;
        # anything else falls back to a flat 1024-token estimate.
        if detail == "high":
            return self._calculate_high_detail_tokens(1024, 1024)
        return 1024

    def _calculate_high_detail_tokens(self, width: int, height: int) -> int:
        """Calculate tokens for high detail images based on dimensions."""
        # Step 1: shrink so both sides fit inside MAX_SIZE x MAX_SIZE
        longest = max(width, height)
        if longest > self.MAX_SIZE:
            shrink = self.MAX_SIZE / longest
            width = int(width * shrink)
            height = int(height * shrink)

        # Step 2: rescale so the shortest side is HIGH_DETAIL_TARGET_SHORT_SIDE
        scale = self.HIGH_DETAIL_TARGET_SHORT_SIDE / min(width, height)
        scaled_width = int(width * scale)
        scaled_height = int(height * scale)

        # Steps 3-4: per-tile cost plus the fixed base cost
        tiles_x = math.ceil(scaled_width / self.TILE_SIZE)
        tiles_y = math.ceil(scaled_height / self.TILE_SIZE)
        return (tiles_x * tiles_y) * self.HIGH_DETAIL_TILE_TOKENS + self.LOW_DETAIL_IMAGE_TOKENS

    def count_content(self, content: Union[str, List[Union[str, dict]]]) -> int:
        """Calculate tokens for message content (plain string or list of parts)."""
        if not content:
            return 0

        if isinstance(content, str):
            return self.count_text(content)

        total = 0
        for part in content:
            if isinstance(part, str):
                total += self.count_text(part)
            elif isinstance(part, dict):
                # Text parts are counted as text; image parts via count_image.
                if "text" in part:
                    total += self.count_text(part["text"])
                elif "image_url" in part:
                    total += self.count_image(part)
        return total

    def count_tool_calls(self, tool_calls: List[dict]) -> int:
        """Calculate tokens for tool calls (function name + serialized arguments)."""
        return sum(
            self.count_text(call["function"].get("name", ""))
            + self.count_text(call["function"].get("arguments", ""))
            for call in tool_calls
            if "function" in call
        )

    def count_message_tokens(self, messages: List[dict]) -> int:
        """Calculate the total number of tokens in a message list."""
        running = self.FORMAT_TOKENS  # base format tokens for the whole request

        for message in messages:
            # Per-message overhead plus the role string.
            per_message = self.BASE_MESSAGE_TOKENS
            per_message += self.count_text(message.get("role", ""))

            if "content" in message:
                per_message += self.count_content(message["content"])
            if "tool_calls" in message:
                per_message += self.count_tool_calls(message["tool_calls"])

            # Optional name / tool_call_id fields also cost tokens.
            per_message += self.count_text(message.get("name", ""))
            per_message += self.count_text(message.get("tool_call_id", ""))

            running += per_message

        return running
172
+
173
+
174
+ class LLM:
175
+ _instances: Dict[str, "LLM"] = {}
176
+
177
+ def __new__(
178
+ cls, config_name: str = "default", llm_config: Optional[LLMSettings] = None
179
+ ):
180
+ if config_name not in cls._instances:
181
+ instance = super().__new__(cls)
182
+ instance.__init__(config_name, llm_config)
183
+ cls._instances[config_name] = instance
184
+ return cls._instances[config_name]
185
+
186
+ def __init__(
187
+ self, config_name: str = "default", llm_config: Optional[LLMSettings] = None
188
+ ):
189
+ if not hasattr(self, "client"): # Only initialize if not already initialized
190
+ llm_config = llm_config or config.llm
191
+ llm_config = llm_config.get(config_name, llm_config["default"])
192
+ self.model = llm_config.model
193
+ self.max_tokens = llm_config.max_tokens
194
+ self.temperature = llm_config.temperature
195
+ self.api_type = llm_config.api_type
196
+ self.api_key = llm_config.api_key
197
+ self.api_version = llm_config.api_version
198
+ self.base_url = llm_config.base_url
199
+
200
+ # Add token counting related attributes
201
+ self.total_input_tokens = 0
202
+ self.total_completion_tokens = 0
203
+ self.max_input_tokens = (
204
+ llm_config.max_input_tokens
205
+ if hasattr(llm_config, "max_input_tokens")
206
+ else None
207
+ )
208
+
209
+ # Initialize tokenizer
210
+ try:
211
+ self.tokenizer = tiktoken.encoding_for_model(self.model)
212
+ except KeyError:
213
+ # If the model is not in tiktoken's presets, use cl100k_base as default
214
+ self.tokenizer = tiktoken.get_encoding("cl100k_base")
215
+
216
+ if self.api_type == "azure":
217
+ self.client = AsyncAzureOpenAI(
218
+ base_url=self.base_url,
219
+ api_key=self.api_key,
220
+ api_version=self.api_version,
221
+ )
222
+ elif self.api_type == "aws":
223
+ self.client = BedrockClient()
224
+ else:
225
+ self.client = AsyncOpenAI(api_key=self.api_key, base_url=self.base_url)
226
+
227
+ self.token_counter = TokenCounter(self.tokenizer)
228
+
229
+ def count_tokens(self, text: str) -> int:
230
+ """Calculate the number of tokens in a text"""
231
+ if not text:
232
+ return 0
233
+ return len(self.tokenizer.encode(text))
234
+
235
+ def count_message_tokens(self, messages: List[dict]) -> int:
236
+ return self.token_counter.count_message_tokens(messages)
237
+
238
+ def update_token_count(self, input_tokens: int, completion_tokens: int = 0) -> None:
239
+ """Update token counts"""
240
+ # Only track tokens if max_input_tokens is set
241
+ self.total_input_tokens += input_tokens
242
+ self.total_completion_tokens += completion_tokens
243
+ logger.info(
244
+ f"Token usage: Input={input_tokens}, Completion={completion_tokens}, "
245
+ f"Cumulative Input={self.total_input_tokens}, Cumulative Completion={self.total_completion_tokens}, "
246
+ f"Total={input_tokens + completion_tokens}, Cumulative Total={self.total_input_tokens + self.total_completion_tokens}"
247
+ )
248
+
249
+ def check_token_limit(self, input_tokens: int) -> bool:
250
+ """Check if token limits are exceeded"""
251
+ if self.max_input_tokens is not None:
252
+ return (self.total_input_tokens + input_tokens) <= self.max_input_tokens
253
+ # If max_input_tokens is not set, always return True
254
+ return True
255
+
256
+ def get_limit_error_message(self, input_tokens: int) -> str:
257
+ """Generate error message for token limit exceeded"""
258
+ if (
259
+ self.max_input_tokens is not None
260
+ and (self.total_input_tokens + input_tokens) > self.max_input_tokens
261
+ ):
262
+ return f"Request may exceed input token limit (Current: {self.total_input_tokens}, Needed: {input_tokens}, Max: {self.max_input_tokens})"
263
+
264
+ return "Token limit exceeded"
265
+
266
+ @staticmethod
267
+ def format_messages(
268
+ messages: List[Union[dict, Message]], supports_images: bool = False
269
+ ) -> List[dict]:
270
+ """
271
+ Format messages for LLM by converting them to OpenAI message format.
272
+
273
+ Args:
274
+ messages: List of messages that can be either dict or Message objects
275
+ supports_images: Flag indicating if the target model supports image inputs
276
+
277
+ Returns:
278
+ List[dict]: List of formatted messages in OpenAI format
279
+
280
+ Raises:
281
+ ValueError: If messages are invalid or missing required fields
282
+ TypeError: If unsupported message types are provided
283
+
284
+ Examples:
285
+ >>> msgs = [
286
+ ... Message.system_message("You are a helpful assistant"),
287
+ ... {"role": "user", "content": "Hello"},
288
+ ... Message.user_message("How are you?")
289
+ ... ]
290
+ >>> formatted = LLM.format_messages(msgs)
291
+ """
292
+ formatted_messages = []
293
+
294
+ for message in messages:
295
+ # Convert Message objects to dictionaries
296
+ if isinstance(message, Message):
297
+ message = message.to_dict()
298
+
299
+ if isinstance(message, dict):
300
+ # If message is a dict, ensure it has required fields
301
+ if "role" not in message:
302
+ raise ValueError("Message dict must contain 'role' field")
303
+
304
+ # Process base64 images if present and model supports images
305
+ if supports_images and message.get("base64_image"):
306
+ # Initialize or convert content to appropriate format
307
+ if not message.get("content"):
308
+ message["content"] = []
309
+ elif isinstance(message["content"], str):
310
+ message["content"] = [
311
+ {"type": "text", "text": message["content"]}
312
+ ]
313
+ elif isinstance(message["content"], list):
314
+ # Convert string items to proper text objects
315
+ message["content"] = [
316
+ (
317
+ {"type": "text", "text": item}
318
+ if isinstance(item, str)
319
+ else item
320
+ )
321
+ for item in message["content"]
322
+ ]
323
+
324
+ # Add the image to content
325
+ message["content"].append(
326
+ {
327
+ "type": "image_url",
328
+ "image_url": {
329
+ "url": f"data:image/jpeg;base64,{message['base64_image']}"
330
+ },
331
+ }
332
+ )
333
+
334
+ # Remove the base64_image field
335
+ del message["base64_image"]
336
+ # If model doesn't support images but message has base64_image, handle gracefully
337
+ elif not supports_images and message.get("base64_image"):
338
+ # Just remove the base64_image field and keep the text content
339
+ del message["base64_image"]
340
+
341
+ if "content" in message or "tool_calls" in message:
342
+ formatted_messages.append(message)
343
+ # else: do not include the message
344
+ else:
345
+ raise TypeError(f"Unsupported message type: {type(message)}")
346
+
347
+ # Validate all messages have required fields
348
+ for msg in formatted_messages:
349
+ if msg["role"] not in ROLE_VALUES:
350
+ raise ValueError(f"Invalid role: {msg['role']}")
351
+
352
+ return formatted_messages
353
+
354
+ @retry(
355
+ wait=wait_random_exponential(min=1, max=60),
356
+ stop=stop_after_attempt(6),
357
+ retry=retry_if_exception_type(
358
+ (OpenAIError, Exception, ValueError)
359
+ ), # Don't retry TokenLimitExceeded
360
+ )
361
+ async def ask(
362
+ self,
363
+ messages: List[Union[dict, Message]],
364
+ system_msgs: Optional[List[Union[dict, Message]]] = None,
365
+ stream: bool = True,
366
+ temperature: Optional[float] = None,
367
+ ) -> str:
368
+ """
369
+ Send a prompt to the LLM and get the response.
370
+
371
+ Args:
372
+ messages: List of conversation messages
373
+ system_msgs: Optional system messages to prepend
374
+ stream (bool): Whether to stream the response
375
+ temperature (float): Sampling temperature for the response
376
+
377
+ Returns:
378
+ str: The generated response
379
+
380
+ Raises:
381
+ TokenLimitExceeded: If token limits are exceeded
382
+ ValueError: If messages are invalid or response is empty
383
+ OpenAIError: If API call fails after retries
384
+ Exception: For unexpected errors
385
+ """
386
+ try:
387
+ # Check if the model supports images
388
+ supports_images = self.model in MULTIMODAL_MODELS
389
+
390
+ # Format system and user messages with image support check
391
+ if system_msgs:
392
+ system_msgs = self.format_messages(system_msgs, supports_images)
393
+ messages = system_msgs + self.format_messages(messages, supports_images)
394
+ else:
395
+ messages = self.format_messages(messages, supports_images)
396
+
397
+ # Calculate input token count
398
+ input_tokens = self.count_message_tokens(messages)
399
+
400
+ # Check if token limits are exceeded
401
+ if not self.check_token_limit(input_tokens):
402
+ error_message = self.get_limit_error_message(input_tokens)
403
+ # Raise a special exception that won't be retried
404
+ raise TokenLimitExceeded(error_message)
405
+
406
+ params = {
407
+ "model": self.model,
408
+ "messages": messages,
409
+ }
410
+
411
+ if self.model in REASONING_MODELS:
412
+ params["max_completion_tokens"] = self.max_tokens
413
+ else:
414
+ params["max_tokens"] = self.max_tokens
415
+ params["temperature"] = (
416
+ temperature if temperature is not None else self.temperature
417
+ )
418
+
419
+ if not stream:
420
+ # Non-streaming request
421
+ response = await self.client.chat.completions.create(
422
+ **params, stream=False
423
+ )
424
+
425
+ if not response.choices or not response.choices[0].message.content:
426
+ raise ValueError("Empty or invalid response from LLM")
427
+
428
+ # Update token counts
429
+ self.update_token_count(
430
+ response.usage.prompt_tokens, response.usage.completion_tokens
431
+ )
432
+
433
+ return response.choices[0].message.content
434
+
435
+ # Streaming request, For streaming, update estimated token count before making the request
436
+ self.update_token_count(input_tokens)
437
+
438
+ response = await self.client.chat.completions.create(**params, stream=True)
439
+
440
+ collected_messages = []
441
+ completion_text = ""
442
+ async for chunk in response:
443
+ chunk_message = chunk.choices[0].delta.content or ""
444
+ collected_messages.append(chunk_message)
445
+ completion_text += chunk_message
446
+ print(chunk_message, end="", flush=True)
447
+
448
+ print() # Newline after streaming
449
+ full_response = "".join(collected_messages).strip()
450
+ if not full_response:
451
+ raise ValueError("Empty response from streaming LLM")
452
+
453
+ # estimate completion tokens for streaming response
454
+ completion_tokens = self.count_tokens(completion_text)
455
+ logger.info(
456
+ f"Estimated completion tokens for streaming response: {completion_tokens}"
457
+ )
458
+ self.total_completion_tokens += completion_tokens
459
+
460
+ return full_response
461
+
462
+ except TokenLimitExceeded:
463
+ # Re-raise token limit errors without logging
464
+ raise
465
+ except ValueError:
466
+ logger.exception(f"Validation error")
467
+ raise
468
+ except OpenAIError as oe:
469
+ logger.exception(f"OpenAI API error")
470
+ if isinstance(oe, AuthenticationError):
471
+ logger.error("Authentication failed. Check API key.")
472
+ elif isinstance(oe, RateLimitError):
473
+ logger.error("Rate limit exceeded. Consider increasing retry attempts.")
474
+ elif isinstance(oe, APIError):
475
+ logger.error(f"API error: {oe}")
476
+ raise
477
+ except Exception:
478
+ logger.exception(f"Unexpected error in ask")
479
+ raise
480
+
481
+ @retry(
482
+ wait=wait_random_exponential(min=1, max=60),
483
+ stop=stop_after_attempt(6),
484
+ retry=retry_if_exception_type(
485
+ (OpenAIError, Exception, ValueError)
486
+ ), # Don't retry TokenLimitExceeded
487
+ )
488
+ async def ask_with_images(
489
+ self,
490
+ messages: List[Union[dict, Message]],
491
+ images: List[Union[str, dict]],
492
+ system_msgs: Optional[List[Union[dict, Message]]] = None,
493
+ stream: bool = False,
494
+ temperature: Optional[float] = None,
495
+ ) -> str:
496
+ """
497
+ Send a prompt with images to the LLM and get the response.
498
+
499
+ Args:
500
+ messages: List of conversation messages
501
+ images: List of image URLs or image data dictionaries
502
+ system_msgs: Optional system messages to prepend
503
+ stream (bool): Whether to stream the response
504
+ temperature (float): Sampling temperature for the response
505
+
506
+ Returns:
507
+ str: The generated response
508
+
509
+ Raises:
510
+ TokenLimitExceeded: If token limits are exceeded
511
+ ValueError: If messages are invalid or response is empty
512
+ OpenAIError: If API call fails after retries
513
+ Exception: For unexpected errors
514
+ """
515
+ try:
516
+ # For ask_with_images, we always set supports_images to True because
517
+ # this method should only be called with models that support images
518
+ if self.model not in MULTIMODAL_MODELS:
519
+ raise ValueError(
520
+ f"Model {self.model} does not support images. Use a model from {MULTIMODAL_MODELS}"
521
+ )
522
+
523
+ # Format messages with image support
524
+ formatted_messages = self.format_messages(messages, supports_images=True)
525
+
526
+ # Ensure the last message is from the user to attach images
527
+ if not formatted_messages or formatted_messages[-1]["role"] != "user":
528
+ raise ValueError(
529
+ "The last message must be from the user to attach images"
530
+ )
531
+
532
+ # Process the last user message to include images
533
+ last_message = formatted_messages[-1]
534
+
535
+ # Convert content to multimodal format if needed
536
+ content = last_message["content"]
537
+ multimodal_content = (
538
+ [{"type": "text", "text": content}]
539
+ if isinstance(content, str)
540
+ else content
541
+ if isinstance(content, list)
542
+ else []
543
+ )
544
+
545
+ # Add images to content
546
+ for image in images:
547
+ if isinstance(image, str):
548
+ multimodal_content.append(
549
+ {"type": "image_url", "image_url": {"url": image}}
550
+ )
551
+ elif isinstance(image, dict) and "url" in image:
552
+ multimodal_content.append({"type": "image_url", "image_url": image})
553
+ elif isinstance(image, dict) and "image_url" in image:
554
+ multimodal_content.append(image)
555
+ else:
556
+ raise ValueError(f"Unsupported image format: {image}")
557
+
558
+ # Update the message with multimodal content
559
+ last_message["content"] = multimodal_content
560
+
561
+ # Add system messages if provided
562
+ if system_msgs:
563
+ all_messages = (
564
+ self.format_messages(system_msgs, supports_images=True)
565
+ + formatted_messages
566
+ )
567
+ else:
568
+ all_messages = formatted_messages
569
+
570
+ # Calculate tokens and check limits
571
+ input_tokens = self.count_message_tokens(all_messages)
572
+ if not self.check_token_limit(input_tokens):
573
+ raise TokenLimitExceeded(self.get_limit_error_message(input_tokens))
574
+
575
+ # Set up API parameters
576
+ params = {
577
+ "model": self.model,
578
+ "messages": all_messages,
579
+ "stream": stream,
580
+ }
581
+
582
+ # Add model-specific parameters
583
+ if self.model in REASONING_MODELS:
584
+ params["max_completion_tokens"] = self.max_tokens
585
+ else:
586
+ params["max_tokens"] = self.max_tokens
587
+ params["temperature"] = (
588
+ temperature if temperature is not None else self.temperature
589
+ )
590
+
591
+ # Handle non-streaming request
592
+ if not stream:
593
+ response = await self.client.chat.completions.create(**params)
594
+
595
+ if not response.choices or not response.choices[0].message.content:
596
+ raise ValueError("Empty or invalid response from LLM")
597
+
598
+ self.update_token_count(response.usage.prompt_tokens)
599
+ return response.choices[0].message.content
600
+
601
+ # Handle streaming request
602
+ self.update_token_count(input_tokens)
603
+ response = await self.client.chat.completions.create(**params)
604
+
605
+ collected_messages = []
606
+ async for chunk in response:
607
+ chunk_message = chunk.choices[0].delta.content or ""
608
+ collected_messages.append(chunk_message)
609
+ print(chunk_message, end="", flush=True)
610
+
611
+ print() # Newline after streaming
612
+ full_response = "".join(collected_messages).strip()
613
+
614
+ if not full_response:
615
+ raise ValueError("Empty response from streaming LLM")
616
+
617
+ return full_response
618
+
619
+ except TokenLimitExceeded:
620
+ raise
621
+ except ValueError as ve:
622
+ logger.error(f"Validation error in ask_with_images: {ve}")
623
+ raise
624
+ except OpenAIError as oe:
625
+ logger.error(f"OpenAI API error: {oe}")
626
+ if isinstance(oe, AuthenticationError):
627
+ logger.error("Authentication failed. Check API key.")
628
+ elif isinstance(oe, RateLimitError):
629
+ logger.error("Rate limit exceeded. Consider increasing retry attempts.")
630
+ elif isinstance(oe, APIError):
631
+ logger.error(f"API error: {oe}")
632
+ raise
633
+ except Exception as e:
634
+ logger.error(f"Unexpected error in ask_with_images: {e}")
635
+ raise
636
+
637
+ @retry(
638
+ wait=wait_random_exponential(min=1, max=60),
639
+ stop=stop_after_attempt(6),
640
+ retry=retry_if_exception_type(
641
+ (OpenAIError, Exception, ValueError)
642
+ ), # Don't retry TokenLimitExceeded
643
+ )
644
+ async def ask_tool(
645
+ self,
646
+ messages: List[Union[dict, Message]],
647
+ system_msgs: Optional[List[Union[dict, Message]]] = None,
648
+ timeout: int = 300,
649
+ tools: Optional[List[dict]] = None,
650
+ tool_choice: TOOL_CHOICE_TYPE = ToolChoice.AUTO, # type: ignore
651
+ temperature: Optional[float] = None,
652
+ **kwargs,
653
+ ) -> ChatCompletionMessage | None:
654
+ """
655
+ Ask LLM using functions/tools and return the response.
656
+
657
+ Args:
658
+ messages: List of conversation messages
659
+ system_msgs: Optional system messages to prepend
660
+ timeout: Request timeout in seconds
661
+ tools: List of tools to use
662
+ tool_choice: Tool choice strategy
663
+ temperature: Sampling temperature for the response
664
+ **kwargs: Additional completion arguments
665
+
666
+ Returns:
667
+ ChatCompletionMessage: The model's response
668
+
669
+ Raises:
670
+ TokenLimitExceeded: If token limits are exceeded
671
+ ValueError: If tools, tool_choice, or messages are invalid
672
+ OpenAIError: If API call fails after retries
673
+ Exception: For unexpected errors
674
+ """
675
+ try:
676
+ # Validate tool_choice
677
+ if tool_choice not in TOOL_CHOICE_VALUES:
678
+ raise ValueError(f"Invalid tool_choice: {tool_choice}")
679
+
680
+ # Check if the model supports images
681
+ supports_images = self.model in MULTIMODAL_MODELS
682
+
683
+ # Format messages
684
+ if system_msgs:
685
+ system_msgs = self.format_messages(system_msgs, supports_images)
686
+ messages = system_msgs + self.format_messages(messages, supports_images)
687
+ else:
688
+ messages = self.format_messages(messages, supports_images)
689
+
690
+ # Calculate input token count
691
+ input_tokens = self.count_message_tokens(messages)
692
+
693
+ # If there are tools, calculate token count for tool descriptions
694
+ tools_tokens = 0
695
+ if tools:
696
+ for tool in tools:
697
+ tools_tokens += self.count_tokens(str(tool))
698
+
699
+ input_tokens += tools_tokens
700
+
701
+ # Check if token limits are exceeded
702
+ if not self.check_token_limit(input_tokens):
703
+ error_message = self.get_limit_error_message(input_tokens)
704
+ # Raise a special exception that won't be retried
705
+ raise TokenLimitExceeded(error_message)
706
+
707
+ # Validate tools if provided
708
+ if tools:
709
+ for tool in tools:
710
+ if not isinstance(tool, dict) or "type" not in tool:
711
+ raise ValueError("Each tool must be a dict with 'type' field")
712
+
713
+ # Set up the completion request
714
+ params = {
715
+ "model": self.model,
716
+ "messages": messages,
717
+ "tools": tools,
718
+ "tool_choice": tool_choice,
719
+ "timeout": timeout,
720
+ **kwargs,
721
+ }
722
+
723
+ if self.model in REASONING_MODELS:
724
+ params["max_completion_tokens"] = self.max_tokens
725
+ else:
726
+ params["max_tokens"] = self.max_tokens
727
+ params["temperature"] = (
728
+ temperature if temperature is not None else self.temperature
729
+ )
730
+
731
+ params["stream"] = False # Always use non-streaming for tool requests
732
+ response: ChatCompletion = await self.client.chat.completions.create(
733
+ **params
734
+ )
735
+
736
+ # Check if response is valid
737
+ if not response.choices or not response.choices[0].message:
738
+ print(response)
739
+ # raise ValueError("Invalid or empty response from LLM")
740
+ return None
741
+
742
+ # Update token counts
743
+ self.update_token_count(
744
+ response.usage.prompt_tokens, response.usage.completion_tokens
745
+ )
746
+
747
+ return response.choices[0].message
748
+
749
+ except TokenLimitExceeded:
750
+ # Re-raise token limit errors without logging
751
+ raise
752
+ except ValueError as ve:
753
+ logger.error(f"Validation error in ask_tool: {ve}")
754
+ raise
755
+ except OpenAIError as oe:
756
+ logger.error(f"OpenAI API error: {oe}")
757
+ if isinstance(oe, AuthenticationError):
758
+ logger.error("Authentication failed. Check API key.")
759
+ elif isinstance(oe, RateLimitError):
760
+ logger.error("Rate limit exceeded. Consider increasing retry attempts.")
761
+ elif isinstance(oe, APIError):
762
+ logger.error(f"API error: {oe}")
763
+ raise
764
+ except Exception as e:
765
+ logger.error(f"Unexpected error in ask_tool: {e}")
766
+ raise
app/logger.py ADDED
@@ -0,0 +1,42 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
import sys
from datetime import datetime
from typing import Optional

from loguru import logger as _logger

from app.config import PROJECT_ROOT


# Console log level; updated by define_log_level() so callers can inspect it.
_print_level = "INFO"


def define_log_level(
    print_level: str = "INFO",
    logfile_level: str = "DEBUG",
    name: Optional[str] = None,
):
    """Configure loguru sinks and return the shared logger.

    Args:
        print_level: Minimum level for the stderr (console) sink.
        logfile_level: Minimum level for the file sink under PROJECT_ROOT/logs.
        name: Optional prefix for the log file name; a timestamp is always
            appended, so each run writes to a fresh file.

    Returns:
        The configured loguru logger instance.
    """
    global _print_level
    _print_level = print_level

    current_date = datetime.now()
    formatted_date = current_date.strftime("%Y%m%d%H%M%S")
    log_name = (
        f"{name}_{formatted_date}" if name else formatted_date
    )  # name a log with prefix name

    # Drop loguru's default sink, then add console + per-run file sinks.
    _logger.remove()
    _logger.add(sys.stderr, level=print_level)
    _logger.add(PROJECT_ROOT / f"logs/{log_name}.log", level=logfile_level)
    return _logger


# Module-level logger, configured with defaults at import time.
logger = define_log_level()


if __name__ == "__main__":
    logger.info("Starting application")
    logger.debug("Debug message")
    logger.warning("Warning message")
    logger.error("Error message")
    logger.critical("Critical message")

    try:
        raise ValueError("Test error")
    except Exception as e:
        logger.exception(f"An error occurred: {e}")
app/mcp/__init__.py ADDED
File without changes
app/mcp/server.py ADDED
@@ -0,0 +1,180 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
import logging
import sys


# Configure root logging to stderr before the remaining imports run --
# presumably so stdout stays free for the MCP stdio transport; TODO confirm
# against the FastMCP stdio protocol.
logging.basicConfig(level=logging.INFO, handlers=[logging.StreamHandler(sys.stderr)])

import argparse
import asyncio
import atexit
import json
from inspect import Parameter, Signature
from typing import Any, Dict, Optional

from mcp.server.fastmcp import FastMCP

from app.logger import logger
from app.tool.base import BaseTool
from app.tool.bash import Bash
from app.tool.browser_use_tool import BrowserUseTool
from app.tool.str_replace_editor import StrReplaceEditor
from app.tool.terminate import Terminate

23
+
24
class MCPServer:
    """MCP Server implementation with tool registration and management.

    Wraps a FastMCP server and exposes each BaseTool as an MCP tool whose
    docstring and signature are synthesized from the tool's parameter schema.
    """

    def __init__(self, name: str = "openmanus"):
        # FastMCP instance that handles the MCP protocol.
        self.server = FastMCP(name)
        # Registry of tool name -> tool instance; populated with the standard set.
        self.tools: Dict[str, BaseTool] = {}

        # Initialize standard tools
        self.tools["bash"] = Bash()
        self.tools["browser"] = BrowserUseTool()
        self.tools["editor"] = StrReplaceEditor()
        self.tools["terminate"] = Terminate()

    def register_tool(self, tool: BaseTool, method_name: Optional[str] = None) -> None:
        """Register a tool with parameter validation and documentation.

        Args:
            tool: The BaseTool instance to expose over MCP.
            method_name: Optional override for the registered tool name;
                defaults to tool.name.
        """
        tool_name = method_name or tool.name
        tool_param = tool.to_param()
        tool_function = tool_param["function"]

        # Define the async function to be registered
        async def tool_method(**kwargs):
            logger.info(f"Executing {tool_name}: {kwargs}")
            result = await tool.execute(**kwargs)

            logger.info(f"Result of {tool_name}: {result}")

            # Handle different types of results (match original logic):
            # pydantic-like objects and dicts are serialized to JSON strings,
            # anything else is returned as-is.
            if hasattr(result, "model_dump"):
                return json.dumps(result.model_dump())
            elif isinstance(result, dict):
                return json.dumps(result)
            return result

        # Set method metadata BEFORE registration so FastMCP picks up the
        # synthesized name, docstring, and signature.
        tool_method.__name__ = tool_name
        tool_method.__doc__ = self._build_docstring(tool_function)
        tool_method.__signature__ = self._build_signature(tool_function)

        # Store parameter schema (important for tools that access it programmatically)
        param_props = tool_function.get("parameters", {}).get("properties", {})
        required_params = tool_function.get("parameters", {}).get("required", [])
        tool_method._parameter_schema = {
            param_name: {
                "description": param_details.get("description", ""),
                "type": param_details.get("type", "any"),
                "required": param_name in required_params,
            }
            for param_name, param_details in param_props.items()
        }

        # Register with server
        self.server.tool()(tool_method)
        logger.info(f"Registered tool: {tool_name}")

    def _build_docstring(self, tool_function: dict) -> str:
        """Build a formatted docstring from tool function metadata.

        Args:
            tool_function: The "function" dict of a tool's OpenAI-style param
                spec (description + JSON-schema parameters).

        Returns:
            str: The tool description followed by a "Parameters:" section
            listing each parameter's type, required/optional flag, and
            description.
        """
        description = tool_function.get("description", "")
        param_props = tool_function.get("parameters", {}).get("properties", {})
        required_params = tool_function.get("parameters", {}).get("required", [])

        # Build docstring (match original format)
        docstring = description
        if param_props:
            docstring += "\n\nParameters:\n"
            for param_name, param_details in param_props.items():
                required_str = (
                    "(required)" if param_name in required_params else "(optional)"
                )
                param_type = param_details.get("type", "any")
                param_desc = param_details.get("description", "")
                docstring += (
                    f"    {param_name} ({param_type}) {required_str}: {param_desc}\n"
                )

        return docstring

    def _build_signature(self, tool_function: dict) -> Signature:
        """Build a function signature from tool function metadata.

        Maps JSON-schema parameter types to Python annotations and marks
        non-required parameters with a default of None.

        Args:
            tool_function: The "function" dict of a tool's param spec.

        Returns:
            Signature: A keyword-only signature matching the tool's schema.
        """
        param_props = tool_function.get("parameters", {}).get("properties", {})
        required_params = tool_function.get("parameters", {}).get("required", [])

        parameters = []

        # Follow original type mapping
        for param_name, param_details in param_props.items():
            param_type = param_details.get("type", "")
            # Required params have no default; optional ones default to None.
            default = Parameter.empty if param_name in required_params else None

            # Map JSON Schema types to Python types (same as original)
            annotation = Any
            if param_type == "string":
                annotation = str
            elif param_type == "integer":
                annotation = int
            elif param_type == "number":
                annotation = float
            elif param_type == "boolean":
                annotation = bool
            elif param_type == "object":
                annotation = dict
            elif param_type == "array":
                annotation = list

            # Create parameter with same structure as original
            param = Parameter(
                name=param_name,
                kind=Parameter.KEYWORD_ONLY,
                default=default,
                annotation=annotation,
            )
            parameters.append(param)

        return Signature(parameters=parameters)

    async def cleanup(self) -> None:
        """Clean up server resources.

        Only the browser tool is cleaned up here (matches original logic);
        other tools are assumed to hold no external resources -- TODO confirm.
        """
        logger.info("Cleaning up resources")
        # Follow original cleanup logic - only clean browser tool
        if "browser" in self.tools and hasattr(self.tools["browser"], "cleanup"):
            await self.tools["browser"].cleanup()

    def register_all_tools(self) -> None:
        """Register all tools with the server."""
        for tool in self.tools.values():
            self.register_tool(tool)

    def run(self, transport: str = "stdio") -> None:
        """Run the MCP server (blocking).

        Args:
            transport: Communication method passed through to FastMCP.
        """
        # Register all tools
        self.register_all_tools()

        # Register cleanup function (match original behavior).
        # NOTE(review): asyncio.run() inside an atexit hook fails if an event
        # loop is still running at interpreter shutdown -- confirm.
        atexit.register(lambda: asyncio.run(self.cleanup()))

        # Start server (with same logging as original)
        logger.info(f"Starting OpenManus server ({transport} mode)")
        self.server.run(transport=transport)
161
+
162
+
163
def parse_args() -> argparse.Namespace:
    """Parse command line arguments for the MCP server.

    Returns:
        argparse.Namespace: Parsed arguments with a `transport` attribute
        (currently only "stdio" is supported).
    """
    parser = argparse.ArgumentParser(description="OpenManus MCP Server")
    parser.add_argument(
        "--transport",
        choices=["stdio"],
        default="stdio",
        # Fixed help text: it previously advertised "stdio or http", but
        # choices only permits stdio.
        help="Communication method: stdio (default: stdio)",
    )
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()

    # Create and run server (maintaining original flow)
    server = MCPServer()
    server.run(transport=args.transport)
180
+ server.run(transport=args.transport)
app/prompt/__init__.py ADDED
File without changes
app/prompt/browser.py ADDED
@@ -0,0 +1,94 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ SYSTEM_PROMPT = """\
2
+ You are an AI agent designed to automate browser tasks. Your goal is to accomplish the ultimate task following the rules.
3
+
4
+ # Input Format
5
+ Task
6
+ Previous steps
7
+ Current URL
8
+ Open Tabs
9
+ Interactive Elements
10
+ [index]<type>text</type>
11
+ - index: Numeric identifier for interaction
12
+ - type: HTML element type (button, input, etc.)
13
+ - text: Element description
14
+ Example:
15
+ [33]<button>Submit Form</button>
16
+
17
+ - Only elements with numeric indexes in [] are interactive
18
+ - elements without [] provide only context
19
+
20
+ # Response Rules
21
+ 1. RESPONSE FORMAT: You must ALWAYS respond with valid JSON in this exact format:
22
+ {{"current_state": {{"evaluation_previous_goal": "Success|Failed|Unknown - Analyze the current elements and the image to check if the previous goals/actions are successful like intended by the task. Mention if something unexpected happened. Shortly state why/why not",
23
+ "memory": "Description of what has been done and what you need to remember. Be very specific. Count here ALWAYS how many times you have done something and how many remain. E.g. 0 out of 10 websites analyzed. Continue with abc and xyz",
24
+ "next_goal": "What needs to be done with the next immediate action"}},
25
+ "action":[{{"one_action_name": {{// action-specific parameter}}}}, // ... more actions in sequence]}}
26
+
27
+ 2. ACTIONS: You can specify multiple actions in the list to be executed in sequence. But always specify only one action name per item. Use maximum {{max_actions}} actions per sequence.
28
+ Common action sequences:
29
+ - Form filling: [{{"input_text": {{"index": 1, "text": "username"}}}}, {{"input_text": {{"index": 2, "text": "password"}}}}, {{"click_element": {{"index": 3}}}}]
30
+ - Navigation and extraction: [{{"go_to_url": {{"url": "https://example.com"}}}}, {{"extract_content": {{"goal": "extract the names"}}}}]
31
+ - Actions are executed in the given order
32
+ - If the page changes after an action, the sequence is interrupted and you get the new state.
33
+ - Only provide the action sequence until an action which changes the page state significantly.
34
+ - Try to be efficient, e.g. fill forms at once, or chain actions where nothing changes on the page
35
+ - only use multiple actions if it makes sense.
36
+
37
+ 3. ELEMENT INTERACTION:
38
+ - Only use indexes of the interactive elements
39
+ - Elements marked with "[]Non-interactive text" are non-interactive
40
+
41
+ 4. NAVIGATION & ERROR HANDLING:
42
+ - If no suitable elements exist, use other functions to complete the task
43
+ - If stuck, try alternative approaches - like going back to a previous page, new search, new tab etc.
44
+ - Handle popups/cookies by accepting or closing them
45
+ - Use scroll to find elements you are looking for
46
+ - If you want to research something, open a new tab instead of using the current tab
47
+ - If captcha pops up, try to solve it - else try a different approach
48
+ - If the page is not fully loaded, use wait action
49
+
50
+ 5. TASK COMPLETION:
51
+ - Use the done action as the last action as soon as the ultimate task is complete
52
+ - Dont use "done" before you are done with everything the user asked you, except you reach the last step of max_steps.
53
+ - If you reach your last step, use the done action even if the task is not fully finished. Provide all the information you have gathered so far. If the ultimate task is completly finished set success to true. If not everything the user asked for is completed set success in done to false!
54
+ - If you have to do something repeatedly for example the task says for "each", or "for all", or "x times", count always inside "memory" how many times you have done it and how many remain. Don't stop until you have completed like the task asked you. Only call done after the last step.
55
+ - Don't hallucinate actions
56
+ - Make sure you include everything you found out for the ultimate task in the done text parameter. Do not just say you are done, but include the requested information of the task.
57
+
58
+ 6. VISUAL CONTEXT:
59
+ - When an image is provided, use it to understand the page layout
60
+ - Bounding boxes with labels on their top right corner correspond to element indexes
61
+
62
+ 7. Form filling:
63
+ - If you fill an input field and your action sequence is interrupted, most often something changed e.g. suggestions popped up under the field.
64
+
65
+ 8. Long tasks:
66
+ - Keep track of the status and subresults in the memory.
67
+
68
+ 9. Extraction:
69
+ - If your task is to find information - call extract_content on the specific pages to get and store the information.
70
+ Your responses must be always JSON with the specified format.
71
+ """
72
+
73
+ NEXT_STEP_PROMPT = """
74
+ What should I do next to achieve my goal?
75
+
76
+ When you see [Current state starts here], focus on the following:
77
+ - Current URL and page title{url_placeholder}
78
+ - Available tabs{tabs_placeholder}
79
+ - Interactive elements and their indices
80
+ - Content above{content_above_placeholder} or below{content_below_placeholder} the viewport (if indicated)
81
+ - Any action results or errors{results_placeholder}
82
+
83
+ For browser interactions:
84
+ - To navigate: browser_use with action="go_to_url", url="..."
85
+ - To click: browser_use with action="click_element", index=N
86
+ - To type: browser_use with action="input_text", index=N, text="..."
87
+ - To extract: browser_use with action="extract_content", goal="..."
88
+ - To scroll: browser_use with action="scroll_down" or "scroll_up"
89
+
90
+ Consider both what's visible and what might be beyond the current viewport.
91
+ Be methodical - remember your progress and what you've learned so far.
92
+
93
+ If you want to stop the interaction at any point, use the `terminate` tool/function call.
94
+ """