hugh2023 committed on
Commit
adec1cb
·
1 Parent(s): 81917a3

Add multi-modal agent system with media analysis, web scraping, and enhanced configuration management

Files changed (10)
  1. README.md +274 -1
  2. SETUP.md +195 -0
  3. api_keys copy.json +12 -0
  4. app.py +971 -84
  5. check_ffmpeg.py +148 -0
  6. config.py +122 -0
  7. prompts.py +61 -0
  8. requirements.txt +23 -1
  9. run.py +138 -0
  10. tools.py +2197 -0
README.md CHANGED
@@ -12,4 +12,277 @@ hf_oauth: true
  hf_oauth_expiration_minutes: 480
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Multi-Modal Agent System
+
+ An intelligent multi-modal agent system built on Hugging Face and LangGraph that can understand videos and images and answer questions with the help of a search engine.
+
+ ## 🚀 Features
+
+ ### 🎥 Video Understanding and Analysis
+ - **Keyframe extraction**: automatically extracts keyframes from a video for analysis
+ - **Video description**: generates a natural-language description of the video content
+ - **Audio analysis**: analyzes the video's audio track
+ - **Metadata**: retrieves basic video information (duration, frame rate, resolution, etc.)
+
+ ### 🖼️ Image Recognition and Description
+ - **Image captioning**: generates natural-language image descriptions with the BLIP model
+ - **Object detection**: detects objects and their locations in an image
+ - **Image classification**: classifies images
+ - **OCR text extraction**: extracts text from images
+ - **Sentiment analysis**: analyzes emotional elements in an image
+
+ ### 📄 PDF Document Processing
+ - **PDF download**: downloads PDF documents from URLs
+ - **Text extraction**: extracts text content from PDFs
+ - **Structure analysis**: analyzes PDF structure and metadata
+ - **Content search**: searches for specific text within a PDF
+ - **Image extraction**: extracts images from PDFs
+ - **Summarization**: automatically summarizes PDF content
+
+ ### 🌐 Web Page Analysis
+ - **Web scraping**: fetches page content and structure
+ - **Text extraction**: extracts plain text from web pages
+ - **Structure analysis**: analyzes page headings, forms, tables, etc.
+ - **Content search**: searches for specific text within a page
+ - **Link extraction**: extracts all links from a page
+ - **Summarization**: automatically summarizes page content
+ - **Accessibility check**: checks pages for accessibility issues
+
+ ### 📺 YouTube Video Processing
+ - **Video info**: retrieves a YouTube video's title, author, duration, view count, etc.
+ - **Video download**: downloads YouTube videos locally
+ - **Audio extraction**: extracts audio from YouTube videos
+ - **Thumbnail download**: downloads video thumbnails
+ - **Video search**: searches YouTube videos
+ - **Comment analysis**: analyzes YouTube video comments
+ - **Playlist handling**: retrieves playlist information and video lists
+
+ ### 📚 Wikipedia Processing
+ - **Page search**: searches Wikipedia pages
+ - **Content retrieval**: fetches the full content of a Wikipedia page
+ - **Summary extraction**: retrieves page summaries
+ - **Category retrieval**: retrieves page categories
+ - **Link extraction**: retrieves related page links
+ - **Search suggestions**: retrieves search suggestions
+ - **English edition**: supports searching English Wikipedia
+ - **Random pages**: fetches a random Wikipedia page
+ - **Geosearch**: searches nearby pages by coordinates
+
+ ### 🔍 Intelligent Search Engine
+ - **Web search**: real-time web search via DuckDuckGo
+ - **Image search**: searches for related images
+ - **Video search**: searches for related videos
+ - **Smart queries**: automatically builds search queries from the question
+
+ ### 🤖 LangGraph Workflow Orchestration
+ - **State management**: manages agent state with AgentState
+ - **Workflow nodes**: media classification → media analysis → information search → tool use → answer synthesis
+ - **Smart routing**: automatically picks a processing path based on the question type
+
+ ### 🛠️ Rich Tool Set
+ - **Text analysis**: sentiment analysis, keyword extraction, summarization
+ - **Translation**: multilingual text translation
+ - **Math**: safe evaluation of mathematical expressions
+ - **Weather**: real-time weather lookup
+
+ ## 📋 System Architecture
+
+ ```
+ User question → Media classification → Media analysis → Information search → Tool use → Answer synthesis → Final answer
+ ```
+
+ Each stage consumes the previous stage's output: the classifier tags the input as text, image, video, PDF, web page, YouTube, or Wikipedia; the matching analyzer extracts information; web search and specialized tools gather supporting facts; and the synthesis step combines everything into a natural-language answer.
+
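The first routing stage can be sketched as a small classifier over the input URL or path. This is an illustrative sketch only; the function name `classify_media` and the category strings are assumptions, not the repository's actual implementation.

```python
from typing import Optional
from urllib.parse import urlparse

# Extension tables used by this hypothetical classifier
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv", ".webm"}

def classify_media(source: Optional[str]) -> str:
    """Route an input to one of the handler categories from the diagram:
    text, image, video, pdf, youtube, wikipedia, or webpage."""
    if not source:
        return "text"
    parsed = urlparse(source)
    host = parsed.netloc.lower()
    path = parsed.path.lower()
    # Host-specific handlers take precedence over file extensions
    if "youtube.com" in host or "youtu.be" in host:
        return "youtube"
    if "wikipedia.org" in host:
        return "wikipedia"
    ext = path[path.rfind("."):] if "." in path else ""
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    if ext == ".pdf":
        return "pdf"
    # Any other http(s) URL is treated as a generic web page
    return "webpage" if parsed.scheme in ("http", "https") else "text"

print(classify_media("https://example.com/photo.jpg"))        # image
print(classify_media("https://www.youtube.com/watch?v=abc"))  # youtube
print(classify_media(None))                                   # text
```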
+ ## 🛠️ Installation and Configuration
+
+ ### 1. Requirements
+ - Python 3.8+
+ - CUDA support (optional, for GPU acceleration)
+
+ ### 2. Install dependencies
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### 3. Environment variables
+ Create a `.env` file and configure the following variables:
+ ```env
+ # OpenAI API configuration
+ OPENAI_API_KEY=your_openai_api_key_here
+
+ # Hugging Face configuration (optional)
+ HUGGINGFACE_API_KEY=your_huggingface_api_key_here
+
+ # Search engine configuration (optional)
+ SERPER_API_KEY=your_serper_api_key_here
+
+ # Debug configuration
+ DEBUG=True
+ LOG_LEVEL=INFO
+ ```
+
+ ### 4. Run the system
+ ```bash
+ python app.py
+ ```
+
+ ## 🎯 Usage Examples
+
+ ### Basic usage
+ ```python
+ from app import MultiModalAgent
+
+ # Initialize the agent
+ agent = MultiModalAgent()
+
+ # Text question
+ answer = agent("What is artificial intelligence?")
+
+ # Image question
+ answer = agent("What is in this picture?", "https://example.com/image.jpg")
+
+ # Video question
+ answer = agent("What is this video about?", "https://youtube.com/watch?v=example")
+
+ # Web page question
+ answer = agent("What is the main content of this page?", "https://example.com")
+
+ # YouTube question
+ answer = agent("What is the info for this YouTube video?", "https://www.youtube.com/watch?v=example")
+
+ # Wikipedia question
+ answer = agent("What does Wikipedia say about artificial intelligence?")
+ ```
+
+ ### Advanced features
+ ```python
+ # Sentiment analysis
+ answer = agent("Analyze the sentiment of this text", "Some text to analyze")
+
+ # Keyword extraction
+ answer = agent("Extract the keywords from this text", "Some text to extract keywords from")
+
+ # Summarization
+ answer = agent("Summarize this text", "A long piece of text that needs summarizing...")
+ ```
+
+ ## 📊 Supported Models
+
+ ### Image processing models
+ - **BLIP**: Salesforce/blip-image-captioning-base
+ - **ResNet**: microsoft/resnet-50
+ - **DETR**: facebook/detr-resnet-50
+ - **GIT**: microsoft/git-base
+
+ ### Text processing models
+ - **Sentiment analysis**: cardiffnlp/twitter-roberta-base-sentiment-latest
+ - **Named entity recognition**: dbmdz/bert-large-cased-finetuned-conll03-english
+ - **Summarization**: facebook/bart-large-cnn
+ - **Translation**: Helsinki-NLP/opus-mt-en-zh
+
+ ### Video processing
+ - **MoviePy**: video editing and processing
+ - **OpenCV**: computer vision processing
+ - **PyTube**: YouTube video download
+
+ ## 🔧 Custom Extensions
+
+ ### Adding a new tool
+ ```python
+ from langchain_core.tools import tool
+ from tools import ToolManager
+
+ class CustomTools:
+     @staticmethod
+     @tool
+     def custom_function(input_text: str) -> str:
+         """A custom tool function."""
+         # Implement your logic here
+         return "result"
+
+ # Register the tool
+ tool_manager = ToolManager()
+ tool_manager.tools["custom_function"] = CustomTools.custom_function
+ ```
+
+ ### Modifying the workflow
+ ```python
+ def _build_workflow(self) -> StateGraph:
+     workflow = StateGraph(AgentState)
+
+     # Add a custom node
+     workflow.add_node("custom_node", self._custom_processing)
+
+     # Reroute the workflow through the new node
+     workflow.add_edge("analyze_media", "custom_node")
+     workflow.add_edge("custom_node", "search_info")
+
+     return workflow.compile()
+ ```
+
+ ## 📈 Performance Optimization
+
+ ### GPU acceleration
+ The system automatically detects CUDA availability and uses the GPU when possible:
+ ```python
+ device = 0 if torch.cuda.is_available() else -1
+ ```
+
+ ### Caching
+ - Model cache: downloaded models are cached automatically
+ - Result cache: analysis results are cached to avoid recomputation
+
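The result cache described above can be as simple as memoizing an analysis function on a digest of its input. The decorator below is an illustrative sketch, not the repository's actual caching code.

```python
import functools
import hashlib

def cached_analysis(func):
    """Memoize an analysis function on a SHA-256 digest of its input,
    so repeated analyses of identical content are free."""
    cache = {}

    @functools.wraps(func)
    def wrapper(data: bytes):
        key = hashlib.sha256(data).hexdigest()
        if key not in cache:
            cache[key] = func(data)
        return cache[key]

    wrapper.cache = cache  # exposed for inspection/clearing
    return wrapper

@cached_analysis
def analyze(data: bytes) -> int:
    # Stand-in for an expensive model call
    return len(data)

analyze(b"frame-1")
analyze(b"frame-1")        # second call is served from the cache
print(len(analyze.cache))  # 1
```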
+ ### Memory optimization
+ - Image size limits: large images are automatically resized
+ - Video frame sampling: keyframes are selected intelligently for analysis
+
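Sampling one frame per second (the strategy `analyze_video` in app.py uses) reduces a long video to a handful of frames. The helper below is a stand-alone sketch of that index computation.

```python
def keyframe_indices(frame_count: int, fps: float) -> list:
    """Pick one frame per second of video: indices 0, fps, 2*fps, ...
    Mirrors the sampling loop in MediaAnalyzer.analyze_video."""
    interval = max(1, int(fps))  # guard against fps reported as 0
    return list(range(0, frame_count, interval))

# A 10-second clip at 30 fps (300 frames) yields 10 keyframes:
print(len(keyframe_indices(300, 30.0)))  # 10
```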
+ ## 🐛 Troubleshooting
+
+ ### Common issues
+
+ 1. **OpenAI API errors**
+    - Check that the API key is correct
+    - Confirm the account has sufficient credit
+
+ 2. **Model download failures**
+    - Check the network connection
+    - Try a mirror source
+
+ 3. **Out of memory**
+    - Reduce the batch size
+    - Run in CPU mode
+
+ 4. **Video processing failures**
+    - Check that the video format is supported
+    - Verify the video file is intact
+
+ ### Debug mode
+ Enable debug mode via environment variables:
+ ```env
+ DEBUG=True
+ LOG_LEVEL=DEBUG
+ ```
+
+ ## 🤝 Contributing
+
+ Issues and pull requests to improve this project are welcome!
+
+ ### Development setup
+ 1. Fork the project
+ 2. Create a feature branch
+ 3. Commit your changes
+ 4. Open a pull request
+
+ ## 📄 License
+
+ This project is released under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+ ## 🙏 Acknowledgements
+
+ - [Hugging Face](https://huggingface.co/) - excellent pretrained models
+ - [LangGraph](https://github.com/langchain-ai/langgraph) - workflow orchestration framework
+ - [LangChain](https://langchain.com/) - LLM application development framework
+ - [Gradio](https://gradio.app/) - rapid web UI building
+
+ ---
+
+ **Note**: This is an educational project; please comply with the terms of use and privacy policies of the relevant APIs.
SETUP.md ADDED
@@ -0,0 +1,195 @@
+ # Multi-Modal Agent System Configuration Guide
+
+ ## 🚀 Quick Start
+
+ ### 1. Install dependencies
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### 2. Configure API keys
+
+ #### Method 1: configuration file (recommended)
+
+ 1. Edit the `api_keys.json` file:
+ ```json
+ {
+   "openai": {
+     "api_key": "sk-your-openai-api-key-here"
+   },
+   "huggingface": {
+     "api_key": "hf-your-huggingface-api-key-here"
+   },
+   "search_engine": {
+     "type": "duckduckgo",
+     "api_key": null
+   }
+ }
+ ```
+
+ 2. Replace `sk-your-openai-api-key-here` with your OpenAI API key
+
+ #### Method 2: environment variables
+ ```bash
+ # Windows
+ set OPENAI_API_KEY=sk-your-openai-api-key-here
+
+ # Linux/Mac
+ export OPENAI_API_KEY=sk-your-openai-api-key-here
+ ```
+
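The two methods can be combined by reading `api_keys.json` first and falling back to the environment variable. A minimal sketch, assuming the JSON layout shown above; the helper name `load_openai_key` is illustrative, not part of the project's code.

```python
import json
import os

def load_openai_key(config_path: str = "api_keys.json") -> str:
    """Return the OpenAI key from api_keys.json, falling back to the
    OPENAI_API_KEY environment variable when the file is absent or empty."""
    try:
        with open(config_path, encoding="utf-8") as f:
            key = json.load(f).get("openai", {}).get("api_key", "")
        if key:
            return key
    except (OSError, json.JSONDecodeError):
        pass  # missing or malformed file: fall through to the env var
    return os.environ.get("OPENAI_API_KEY", "")
```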
+ ### 3. Run the system
+
+ #### Web UI mode
+ ```bash
+ python run.py --mode web
+ ```
+
+ #### Test mode
+ ```bash
+ python run.py --mode test
+ ```
+
+ #### Interactive mode
+ ```bash
+ python run.py --mode interactive
+ ```
+
+ ## 🔑 Getting API Keys
+
+ ### OpenAI API key
+ 1. Visit the [OpenAI platform](https://platform.openai.com/)
+ 2. Sign up or log in
+ 3. Go to the "API Keys" page
+ 4. Click "Create new secret key"
+ 5. Copy the generated key (it starts with `sk-`)
+
+ ### Hugging Face API key (optional)
+ 1. Visit [Hugging Face](https://huggingface.co/)
+ 2. Sign up or log in
+ 3. Go to "Settings" → "Access Tokens"
+ 4. Click "New token"
+ 5. Copy the generated token (it starts with `hf_`)
+
+ ## 🔍 Search Engine Configuration
+
+ ### DuckDuckGo search (default, no API key required)
+ - No API key to configure
+ - Free to use
+ - Supports text, image, and video search
+
+ ### Other search engines (optional)
+ To use a different search engine, edit `api_keys.json`:
+
+ ```json
+ {
+   "search_engine": {
+     "type": "serper",
+     "api_key": "your-serper-api-key"
+   }
+ }
+ ```
+
+ ## ⚙️ Advanced Configuration
+
+ ### Model configuration
+ The models in use can be changed in `config.py`:
+
+ ```python
+ # Image captioning model
+ IMAGE_CAPTION_MODEL = "Salesforce/blip-image-captioning-base"
+
+ # Image classification model
+ IMAGE_CLASSIFICATION_MODEL = "microsoft/resnet-50"
+
+ # Object detection model
+ OBJECT_DETECTION_MODEL = "facebook/detr-resnet-50"
+ ```
+
+ ### System configuration
+ ```python
+ # Debug mode
+ DEBUG = True
+
+ # Log level
+ LOG_LEVEL = "DEBUG"
+
+ # Video processing configuration
+ MAX_VIDEO_DURATION = 300  # maximum video duration (seconds)
+ FRAMES_TO_ANALYZE = 5     # number of frames to analyze
+ ```
+
+ ## 🐛 FAQ
+
+ ### 1. API key errors
+ **Error**: `OpenAI API key not configured`
+ **Fix**:
+ - Check that the `api_keys.json` file exists
+ - Confirm the key format is correct (OpenAI keys start with `sk-`)
+ - Verify the API key is valid
+
+ ### 2. Dependency installation failures
+ **Error**: `ModuleNotFoundError`
+ **Fix**:
+ ```bash
+ # Upgrade pip
+ pip install --upgrade pip
+
+ # Reinstall dependencies
+ pip install -r requirements.txt --force-reinstall
+ ```
+
+ ### 3. Model download failures
+ **Error**: `Model download failed`
+ **Fix**:
+ - Check the network connection
+ - Use a VPN or proxy
+ - Manually download the model into the local cache directory
+
+ ### 4. Out of memory
+ **Error**: `CUDA out of memory`
+ **Fix**:
+ - Reduce the batch size
+ - Run in CPU mode
+ - Close other memory-hungry programs
+
+ ## 📁 File Structure
+
+ ```
+ Final_Assignment_Agent/
+ ├── api_keys.json      # API key configuration file
+ ├── config.py          # system configuration
+ ├── app.py             # main application
+ ├── tools.py           # tool module
+ ├── test_agent.py      # test script
+ ├── run.py             # launcher script
+ ├── requirements.txt   # dependency list
+ ├── README.md          # project description
+ └── SETUP.md           # configuration guide
+ ```
+
+ ## 🔒 Security Notes
+
+ 1. **Do not commit API keys to version control**
+    - Add `api_keys.json` to `.gitignore`
+    - Use environment variables or a configuration file
+
+ 2. **Rotate API keys regularly**
+    - Periodically check that API keys are still valid
+    - Replace expired keys promptly
+
+ 3. **Limit API usage**
+    - Set API usage limits
+    - Monitor API call counts and spending
+
+ ## 📞 Support
+
+ If you run into problems:
+ 1. Check the error logs
+ 2. Check the configuration files
+ 3. Run the test script
+ 4. Consult the FAQ
+
+ ---
+
+ **Note**: Please comply with the terms of use and privacy policies of the relevant APIs.
api_keys copy.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "openai": {
+     "api_key": ""
+   },
+   "huggingface": {
+     "api_key": ""
+   },
+   "search_engine": {
+     "type": "duckduckgo",
+     "api_key": null
+   }
+ }
app.py CHANGED
@@ -1,54 +1,946 @@
  import os
  import gradio as gr
  import requests
- import inspect
  import pandas as pd
 
- # (Keep Constants as is)
- # --- Constants ---
  DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
 
- # --- Basic Agent Definition ---
- # ----- THIS IS WERE YOU CAN BUILD WHAT YOU WANT ------
- class BasicAgent:
      def __init__(self):
-         print("BasicAgent initialized.")
-     def __call__(self, question: str) -> str:
-         print(f"Agent received question (first 50 chars): {question[:50]}...")
-         fixed_answer = "This is a default answer."
-         print(f"Agent returning fixed answer: {fixed_answer}")
-         return fixed_answer
-
- def run_and_submit_all( profile: gr.OAuthProfile | None):
-     """
-     Fetches all questions, runs the BasicAgent on them, submits all answers,
-     and displays the results.
-     """
-     # --- Determine HF Space Runtime URL and Repo URL ---
-     space_id = os.getenv("SPACE_ID") # Get the SPACE_ID for sending link to the code
 
      if profile:
-         username= f"{profile.username}"
          print(f"User logged in: {username}")
      else:
          print("User not logged in.")
          return "Please Login to Hugging Face with the button.", None
 
      api_url = DEFAULT_API_URL
      questions_url = f"{api_url}/questions"
      submit_url = f"{api_url}/submit"
 
-     # 1. Instantiate Agent ( modify this part to create your agent)
      try:
-         agent = BasicAgent()
      except Exception as e:
          print(f"Error instantiating agent: {e}")
          return f"Error initializing agent: {e}", None
-     # In the case of an app running as a hugging Face space, this link points toward your codebase ( usefull for others so please keep it public)
      agent_code = f"https://huggingface.co/spaces/{space_id}/tree/main"
      print(agent_code)
 
-     # 2. Fetch Questions
      print(f"Fetching questions from: {questions_url}")
      try:
          response = requests.get(questions_url, timeout=15)
@@ -58,27 +950,22 @@ def run_and_submit_all( profile: gr.OAuthProfile | None):
          print("Fetched questions list is empty.")
          return "Fetched questions list is empty or invalid format.", None
      print(f"Fetched {len(questions_data)} questions.")
-     except requests.exceptions.RequestException as e:
      print(f"Error fetching questions: {e}")
      return f"Error fetching questions: {e}", None
-     except requests.exceptions.JSONDecodeError as e:
-         print(f"Error decoding JSON response from questions endpoint: {e}")
-         print(f"Response text: {response.text[:500]}")
-         return f"Error decoding server response for questions: {e}", None
-     except Exception as e:
-         print(f"An unexpected error occurred fetching questions: {e}")
-         return f"An unexpected error occurred fetching questions: {e}", None
 
-     # 3. Run your Agent
      results_log = []
      answers_payload = []
      print(f"Running agent on {len(questions_data)} questions...")
      for item in questions_data:
          task_id = item.get("task_id")
          question_text = item.get("question")
          if not task_id or question_text is None:
              print(f"Skipping item with missing task_id or question: {item}")
              continue
          try:
              submitted_answer = agent(question_text)
              answers_payload.append({"task_id": task_id, "submitted_answer": submitted_answer})
@@ -91,12 +978,12 @@ def run_and_submit_all( profile: gr.OAuthProfile | None):
      print("Agent did not produce any answers to submit.")
      return "Agent did not produce any answers to submit.", pd.DataFrame(results_log)
 
-     # 4. Prepare Submission
      submission_data = {"username": username.strip(), "agent_code": agent_code, "answers": answers_payload}
      status_update = f"Agent finished. Submitting {len(answers_payload)} answers for user '{username}'..."
      print(status_update)
 
-     # 5. Submit
      print(f"Submitting {len(answers_payload)} answers to: {submit_url}")
      try:
          response = requests.post(submit_url, json=submission_data, timeout=60)
@@ -112,85 +999,85 @@ def run_and_submit_all( profile: gr.OAuthProfile | None):
          print("Submission successful.")
          results_df = pd.DataFrame(results_log)
          return final_status, results_df
-     except requests.exceptions.HTTPError as e:
-         error_detail = f"Server responded with status {e.response.status_code}."
-         try:
-             error_json = e.response.json()
-             error_detail += f" Detail: {error_json.get('detail', e.response.text)}"
-         except requests.exceptions.JSONDecodeError:
-             error_detail += f" Response: {e.response.text[:500]}"
-         status_message = f"Submission Failed: {error_detail}"
-         print(status_message)
-         results_df = pd.DataFrame(results_log)
-         return status_message, results_df
-     except requests.exceptions.Timeout:
-         status_message = "Submission Failed: The request timed out."
-         print(status_message)
-         results_df = pd.DataFrame(results_log)
-         return status_message, results_df
-     except requests.exceptions.RequestException as e:
-         status_message = f"Submission Failed: Network error - {e}"
-         print(status_message)
-         results_df = pd.DataFrame(results_log)
-         return status_message, results_df
      except Exception as e:
-         status_message = f"An unexpected error occurred during submission: {e}"
          print(status_message)
          results_df = pd.DataFrame(results_log)
          return status_message, results_df
 
- # --- Build Gradio Interface using Blocks ---
  with gr.Blocks() as demo:
-     gr.Markdown("# Basic Agent Evaluation Runner")
      gr.Markdown(
          """
-         **Instructions:**
-
-         1. Please clone this space, then modify the code to define your agent's logic, the tools, the necessary packages, etc ...
-         2. Log in to your Hugging Face account using the button below. This uses your HF username for submission.
-         3. Click 'Run Evaluation & Submit All Answers' to fetch questions, run your agent, submit answers, and see the score.
-
-         ---
-         **Disclaimers:**
-         Once clicking on the "submit button, it can take quite some time ( this is the time for the agent to go through all the questions).
-         This space provides a basic setup and is intentionally sub-optimal to encourage you to develop your own, more robust solution. For instance for the delay process of the submit button, a solution could be to cache the answers and submit in a seperate action or even to answer the questions in async.
          """
      )
 
      gr.LoginButton()
 
-     run_button = gr.Button("Run Evaluation & Submit All Answers")
 
-     status_output = gr.Textbox(label="Run Status / Submission Result", lines=5, interactive=False)
-     # Removed max_rows=10 from DataFrame constructor
-     results_table = gr.DataFrame(label="Questions and Agent Answers", wrap=True)
 
      run_button.click(
          fn=run_and_submit_all,
          outputs=[status_output, results_table]
      )
 
  if __name__ == "__main__":
-     print("\n" + "-"*30 + " App Starting " + "-"*30)
-     # Check for SPACE_HOST and SPACE_ID at startup for information
      space_host_startup = os.getenv("SPACE_HOST")
-     space_id_startup = os.getenv("SPACE_ID") # Get SPACE_ID at startup
 
      if space_host_startup:
          print(f"✅ SPACE_HOST found: {space_host_startup}")
-         print(f"   Runtime URL should be: https://{space_host_startup}.hf.space")
      else:
          print("ℹ️  SPACE_HOST environment variable not found (running locally?).")
 
-     if space_id_startup: # Print repo URLs if SPACE_ID is found
      print(f"✅ SPACE_ID found: {space_id_startup}")
      print(f"   Repo URL: https://huggingface.co/spaces/{space_id_startup}")
-     print(f"   Repo Tree URL: https://huggingface.co/spaces/{space_id_startup}/tree/main")
      else:
-         print("ℹ️  SPACE_ID environment variable not found (running locally?). Repo URL cannot be determined.")
-
-     print("-"*(60 + len(" App Starting ")) + "\n")
 
-     print("Launching Gradio Interface for Basic Agent Evaluation...")
      demo.launch(debug=True, share=False)
  import os
  import gradio as gr
  import requests
  import pandas as pd
+ import json
+ import base64
+ import io
+ from typing import Dict, List, Any, Optional, Union
+ from dataclasses import dataclass
+ from pathlib import Path
+ import tempfile
+ import cv2
+ import numpy as np
+ from PIL import Image
+ import torch
+ from transformers import pipeline, AutoProcessor, AutoModel
+ # import moviepy.editor as mp  # temporarily commented out; requires moviepy
+ # from pytube import YouTube  # temporarily commented out; requires pytube
+ import urllib.request
+ from langgraph.graph import StateGraph, END
+ from langchain_core.messages import HumanMessage, AIMessage
+ from langchain_openai import ChatOpenAI
+ from langchain_community.tools import DuckDuckGoSearchRun
+ from langchain_core.tools import tool
+ import matplotlib.pyplot as plt
+ import seaborn as sns
 
+ # Environment variable setup
+ from dotenv import load_dotenv
+ load_dotenv()
+
+ # Import custom modules
+ from config import Config
+ from tools import ToolManager
+ from prompts import get_answer_prompt, ERROR_ANSWER_TEMPLATE
+
+ # Constants
  DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
 
+ @dataclass
+ class AgentState:
+     """Agent state class."""
+     question: str
+     media_type: Optional[str] = None  # 'image', 'video', 'text'
+     media_path: Optional[str] = None
+     extracted_info: Dict[str, Any] = None
+     search_results: List[str] = None
+     analysis_results: Dict[str, Any] = None
+     workflow_plan: List[Dict[str, Any]] = None  # workflow plan
+     current_step: int = 0  # current execution step
+     final_answer: str = ""
+     error: Optional[str] = None
+
+     def __post_init__(self):
+         if self.extracted_info is None:
+             self.extracted_info = {}
+         if self.search_results is None:
+             self.search_results = []
+         if self.analysis_results is None:
+             self.analysis_results = {}
+         if self.workflow_plan is None:
+             self.workflow_plan = []
+
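The `None`-plus-`__post_init__` pattern above works, but `dataclasses.field(default_factory=...)` expresses the same mutable defaults directly. A minimal sketch under that assumption; `AgentStateAlt` is an illustrative name, not part of the app:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class AgentStateAlt:
    """Equivalent state container using default_factory
    instead of None checks in __post_init__."""
    question: str
    extracted_info: Dict[str, Any] = field(default_factory=dict)
    search_results: List[str] = field(default_factory=list)
    workflow_plan: List[Dict[str, Any]] = field(default_factory=list)
    current_step: int = 0
    final_answer: str = ""

state = AgentStateAlt(question="What is AI?")
print(state.extracted_info)  # {}
```

Each instance gets its own fresh dict/list, so state is never shared between agent runs.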
+ class MediaAnalyzer:
+     """Media analyzer class."""
+
+     def __init__(self):
+         # Initialize the image analysis model
+         self.image_processor = AutoProcessor.from_pretrained("microsoft/git-base")
+         self.image_model = AutoModel.from_pretrained("microsoft/git-base")
+
+         # Initialize the image captioning model
+         self.image_caption_pipeline = pipeline(
+             "image-to-text",
+             model="Salesforce/blip-image-captioning-base",
+             device=0 if torch.cuda.is_available() else -1
+         )
+
+         # Initialize the image classification model
+         self.image_classification_pipeline = pipeline(
+             "image-classification",
+             model="microsoft/resnet-50",
+             device=0 if torch.cuda.is_available() else -1
+         )
+
+         # Initialize the object detection model
+         self.object_detection_pipeline = pipeline(
+             "object-detection",
+             model="facebook/detr-resnet-50",
+             device=0 if torch.cuda.is_available() else -1
+         )
+
+         print("MediaAnalyzer initialized successfully")
+
+     def analyze_image(self, image_path: str) -> Dict[str, Any]:
+         """Analyze image content."""
+         try:
+             # Load the image
+             image = Image.open(image_path)
+
+             # Image caption
+             caption_result = self.image_caption_pipeline(image)
+             caption = caption_result[0]['generated_text']
+
+             # Image classification
+             classification_result = self.image_classification_pipeline(image)
+             top_classes = classification_result[:5]
+
+             # Object detection
+             detection_result = self.object_detection_pipeline(image)
+             detected_objects = []
+             for detection in detection_result:
+                 detected_objects.append({
+                     'label': detection['label'],
+                     'confidence': detection['score'],
+                     'box': detection['box']
+                 })
+
+             # Basic image information
+             image_info = {
+                 'size': image.size,
+                 'mode': image.mode,
+                 'format': image.format
+             }
+
+             return {
+                 'caption': caption,
+                 'classification': top_classes,
+                 'detected_objects': detected_objects,
+                 'image_info': image_info
+             }
+
+         except Exception as e:
+             return {'error': f"Image analysis failed: {str(e)}"}
+
+     def analyze_video(self, video_path: str) -> Dict[str, Any]:
+         """Analyze video content - actually let the vision-language model look at the video."""
+         try:
+             # Analyze the video with OpenCV
+             cap = cv2.VideoCapture(video_path)
+             if not cap.isOpened():
+                 return {'error': "Could not open video file"}
+
+             # Basic video information
+             fps = cap.get(cv2.CAP_PROP_FPS)
+             frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+             duration = frame_count / fps if fps > 0 else 0
+             width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+             height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+
+             print(f"🎬 Starting video analysis: {frame_count} frames, {fps} fps, duration {duration:.1f}s")
+
+             # Extract keyframes for analysis (one frame per second)
+             frames_analyzed = []
+             frame_interval = max(1, int(fps))  # one frame per second
+
+             for i in range(0, frame_count, frame_interval):
+                 cap.set(cv2.CAP_PROP_POS_FRAMES, i)
+                 ret, frame = cap.read()
+                 if ret:
+                     # Convert to a PIL image for analysis
+                     frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+                     pil_image = Image.fromarray(frame_rgb)
+
+                     # Analyze the frame with the vision-language model
+                     try:
+                         caption_result = self.image_caption_pipeline(pil_image)
+                         frame_info = {
+                             "frame_number": i,
+                             "timestamp": i / fps if fps > 0 else 0,
+                             "caption": caption_result[0]['generated_text']
+                         }
+                         frames_analyzed.append(frame_info)
+
+                         print(f"📸 Frame {i//frame_interval} ({i/fps:.1f}s): {frame_info['caption']}")
+
+                     except Exception as e:
+                         print(f"Frame analysis failed: {e}")
+                         frames_analyzed.append({
+                             "frame_number": i,
+                             "timestamp": i / fps if fps > 0 else 0,
+                             "caption": "Could not analyze this frame"
+                         })
+
+             cap.release()
+
+             # Generate a summary of the video content
+             if frames_analyzed:
+                 # Collect all captions
+                 descriptions = [frame['caption'] for frame in frames_analyzed if frame['caption'] != "Could not analyze this frame"]
+                 if descriptions:
+                     # Summarize the video content with the LLM
+                     summary_prompt = f"""
+                     Based on the following video frame captions, summarize the main content of this video:
+
+                     {chr(10).join([f"At {frame['timestamp']:.1f}s: {frame['caption']}" for frame in frames_analyzed[:10]])}
+
+                     Please summarize the main content of this video:
+                     """
+                     try:
+                         from langchain_openai import ChatOpenAI
+                         llm = ChatOpenAI(
+                             model="gpt-3.5-turbo",
+                             temperature=0.7,
+                             api_key=Config.OPENAI_API_KEY
+                         )
+                         summary_response = llm.invoke(summary_prompt)
+                         video_summary = summary_response.content
+                     except:
+                         video_summary = f"The video contains {len(frames_analyzed)} scenes showing a variety of visual content"
+                 else:
+                     video_summary = "Could not analyze the video content"
+             else:
+                 video_summary = "Video analysis failed"
+
+             return {
+                 'type': 'video',
+                 'video_info': {
+                     'duration': duration,
+                     'fps': fps,
+                     'frame_count': frame_count,
+                     'resolution': f"{width}x{height}"
+                 },
+                 'frames_analyzed': frames_analyzed[:10],  # return only the first 10 frames
+                 'video_summary': video_summary,
+                 'analysis_method': 'OpenCV + vision-language model',
+                 'summary': f"Video duration {duration:.1f}s; analyzed {len(frames_analyzed)} keyframes; content: {video_summary}"
+             }
+
+         except Exception as e:
+             return {'error': f"Video analysis failed: {str(e)}"}
+
+     def download_media(self, url: str, media_type: str) -> str:
+         """Download a media file."""
+         try:
+             if media_type == 'video':
+                 # Simplified version: for videos, just return the URL
+                 print("⚠️ Video download requires moviepy and pytube")
+                 return url
+             else:
+                 # Download an image file
+                 temp_path = tempfile.mktemp(suffix='.jpg')
+                 urllib.request.urlretrieve(url, temp_path)
+                 return temp_path
+         except Exception as e:
+             raise Exception(f"Media download failed: {str(e)}")
 
+ class SearchEngine:
249
+ """搜索引擎类"""
250
+
251
+ def __init__(self):
252
+ self.search_tool = DuckDuckGoSearchRun()
253
+
254
+ def search(self, query: str) -> List[str]:
255
+ """执行搜索"""
256
+ try:
257
+ results = self.search_tool.run(query)
258
+ return [results] if isinstance(results, str) else results
259
+ except Exception as e:
260
+ return [f"搜索失败: {str(e)}"]
261
+
262
+ class MultiModalAgent:
263
+ """多模态智能体主类"""
264
+
265
+ def __init__(self):
266
+ # 验证配置
267
+ if not Config.validate():
268
+ raise ValueError("配置验证失败,请检查环境变量")
269
+
270
+ self.media_analyzer = MediaAnalyzer()
271
+ self.search_engine = SearchEngine()
272
+ self.tool_manager = ToolManager()
273
+
274
+ self.llm = ChatOpenAI(
275
+ model=Config.OPENAI_MODEL,
276
+ temperature=Config.OPENAI_TEMPERATURE,
277
+ api_key=Config.OPENAI_API_KEY
278
+ )
279
+
280
+ # 构建LangGraph工作流
281
+ self.workflow = self._build_workflow()
282
+
283
+ print("MultiModalAgent initialized successfully")
284
+
285
+ def _build_workflow(self) -> StateGraph:
286
+ """构建LangGraph工作流"""
287
+
288
+ # 创建状态图
289
+ workflow = StateGraph(AgentState)
290
+
291
+ # 添加节点
292
+ workflow.add_node("plan_workflow", self._plan_workflow)
293
+ workflow.add_node("classify_media", self._classify_media)
294
+ workflow.add_node("analyze_media", self._analyze_media)
295
+ workflow.add_node("search_info", self._search_info)
296
+ workflow.add_node("use_tools", self._use_tools)
297
+ workflow.add_node("synthesize_answer", self._synthesize_answer)
298
+
299
+ # 设置入口点
300
+ workflow.set_entry_point("plan_workflow")
301
+
302
+ # 添加边
303
+ workflow.add_edge("plan_workflow", "classify_media")
304
+ workflow.add_edge("classify_media", "analyze_media")
305
+ workflow.add_edge("analyze_media", "search_info")
306
+ workflow.add_edge("search_info", "use_tools")
307
+ workflow.add_edge("use_tools", "synthesize_answer")
308
+ workflow.add_edge("synthesize_answer", END)
309
+
310
+ return workflow.compile()
311
+
+    def _plan_workflow(self, state: AgentState) -> AgentState:
+        """智能规划工作流"""
+        try:
+            # 使用LLM分析任务并制定工作流计划
+            planning_prompt = f"""
+            你是一个智能工作流规划专家。请分析以下任务,并制定一个详细的工作流计划。
+
+            任务: {state.question}
+
+            请根据任务类型和需求,设计一个合适的工作流。工作流应该包含以下信息:
+            1. 步骤编号
+            2. 步骤名称
+            3. 步骤描述
+            4. 是否需要搜索网络
+            5. 需要使用哪些工具
+            6. 预期输出
+
+            请以JSON格式返回工作流计划,格式如下:
+            {{
+                "workflow": [
+                    {{
+                        "step": 1,
+                        "name": "步骤名称",
+                        "description": "步骤描述",
+                        "needs_search": true/false,
+                        "tools": ["工具1", "工具2"],
+                        "expected_output": "预期输出"
+                    }}
+                ]
+            }}
+
+            请确保工作流是合理的、高效的,并且能够完成任务。
+            """
+
+            # 调用LLM进行工作流规划
+            response = self.llm.invoke(planning_prompt)
+
+            # 解析工作流计划
+            try:
+                import json
+                # 尝试从响应中提取JSON
+                if "```json" in response.content:
+                    json_start = response.content.find("```json") + 7
+                    json_end = response.content.find("```", json_start)
+                    json_str = response.content[json_start:json_end].strip()
+                else:
+                    # 尝试直接解析
+                    json_str = response.content.strip()
+
+                workflow_data = json.loads(json_str)
+                state.workflow_plan = workflow_data.get("workflow", [])
+
+                print(f"🤖 工作流规划完成,共 {len(state.workflow_plan)} 个步骤:")
+                for step in state.workflow_plan:
+                    print(f"  📋 步骤 {step.get('step', '?')}: {step.get('name', 'Unknown')}")
+                    print(f"     {step.get('description', 'No description')}")
+                    if step.get('needs_search', False):
+                        print(f"     🔍 需要搜索: 是")
+                    if step.get('tools'):
+                        print(f"     🛠️ 工具: {', '.join(step['tools'])}")
+                    print()
+
+            except json.JSONDecodeError:
+                # 如果JSON解析失败,使用默认工作流
+                print("⚠️ 工作流规划解析失败,使用默认工作流")
+                state.workflow_plan = [
+                    {
+                        "step": 1,
+                        "name": "媒体分类",
+                        "description": "分析任务中的媒体类型",
+                        "needs_search": False,
+                        "tools": [],
+                        "expected_output": "确定媒体类型"
+                    },
+                    {
+                        "step": 2,
+                        "name": "媒体分析",
+                        "description": "分析媒体内容",
+                        "needs_search": False,
+                        "tools": ["媒体分析工具"],
+                        "expected_output": "提取媒体信息"
+                    },
+                    {
+                        "step": 3,
+                        "name": "信息搜索",
+                        "description": "搜索相关信息",
+                        "needs_search": True,
+                        "tools": ["搜索引擎"],
+                        "expected_output": "搜索结果"
+                    },
+                    {
+                        "step": 4,
+                        "name": "工具使用",
+                        "description": "使用专业工具",
+                        "needs_search": False,
+                        "tools": ["各种专业工具"],
+                        "expected_output": "工具分析结果"
+                    },
+                    {
+                        "step": 5,
+                        "name": "答案合成",
+                        "description": "综合所有信息生成答案",
+                        "needs_search": False,
+                        "tools": [],
+                        "expected_output": "最终答案"
+                    }
+                ]
+
+        except Exception as e:
+            print(f"❌ 工作流规划失败: {e}")
+            # 使用默认工作流
+            state.workflow_plan = [
+                {
+                    "step": 1,
+                    "name": "默认工作流",
+                    "description": "使用默认工作流处理任务",
+                    "needs_search": True,
+                    "tools": [],
+                    "expected_output": "任务完成"
+                }
+            ]
+
+        return state
+
+    def _classify_media(self, state: AgentState) -> AgentState:
+        """分类媒体类型"""
+        question = state.question.lower()
+
+        # 提取URL
+        import re
+        url_pattern = r'https?://[^\s]+'
+        urls = re.findall(url_pattern, state.question)
+
+        # 检测媒体类型(先判断更具体的YouTube,避免被通用的'视频/video'分支抢先匹配)
+        if 'youtube.com' in question or 'youtu.be' in question:
+            state.media_type = 'youtube'
+        elif any(keyword in question for keyword in ['图片', '图像', 'image', 'photo', 'img']):
+            state.media_type = 'image'
+        elif any(keyword in question for keyword in ['视频', 'video', 'movie', 'clip']):
+            state.media_type = 'video'
+        elif any(keyword in question for keyword in ['pdf', '文档', 'document', '报告', 'report']):
+            state.media_type = 'pdf'
+        elif any(keyword in question for keyword in ['网页', '网站', 'webpage', 'website', 'url', 'http', 'https']):
+            state.media_type = 'webpage'
+        elif any(keyword in question for keyword in ['wikipedia', 'wiki', '维基', '百科']):
+            state.media_type = 'wikipedia'
+        else:
+            state.media_type = 'text'
+
+        # 设置媒体路径
+        if urls:
+            state.media_path = urls[0]  # 使用第一个URL
+        else:
+            state.media_path = None
+
+        return state
+
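`_classify_media` 的关键词启发式可以抽出来单独测试。下面是一个独立的最小版本(函数名与关键词子集均为示例假设,关键词表取自上文):

```python
import re


def classify(question: str) -> str:
    """基于关键词/域名的朴素媒体类型分类(示例简化版)"""
    q = question.lower()
    if 'youtube.com' in q or 'youtu.be' in q:   # 先匹配更具体的类型
        return 'youtube'
    if any(k in q for k in ['图片', '图像', 'image', 'photo']):
        return 'image'
    if any(k in q for k in ['视频', 'video', 'movie', 'clip']):
        return 'video'
    if any(k in q for k in ['pdf', '文档', 'document']):
        return 'pdf'
    return 'text'


def extract_urls(question: str) -> list:
    # 与上文相同的URL正则:http(s)后接任意非空白字符
    return re.findall(r'https?://[^\s]+', question)


print(classify("请分析这个视频 https://www.youtube.com/watch?v=abc"))  # → youtube
print(extract_urls("见 https://example.com/a.pdf 第3页"))  # → ['https://example.com/a.pdf']
```

这种关键词启发式便于单测,但对未覆盖的措辞会退回 'text',实际效果取决于关键词表的完备程度。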
+    def _analyze_media(self, state: AgentState) -> AgentState:
+        """分析媒体内容"""
+        if state.media_type == 'image' and state.media_path:
+            state.extracted_info = self.media_analyzer.analyze_image(state.media_path)
+        elif state.media_type == 'video' and state.media_path:
+            state.extracted_info = self.media_analyzer.analyze_video(state.media_path)
+        elif state.media_type == 'pdf' and state.media_path:
+            # PDF分析
+            pdf_info = self.tool_manager.execute_tool('analyze_pdf_structure', pdf_path=state.media_path)
+            pdf_text = self.tool_manager.execute_tool('extract_text_from_pdf', pdf_path=state.media_path)
+            state.extracted_info = {
+                'type': 'pdf',
+                'pdf_info': pdf_info,
+                'text_content': pdf_text[:2000] if len(pdf_text) > 2000 else pdf_text  # 限制文本长度
+            }
+        elif state.media_type == 'webpage' and state.media_path:
+            # 网页分析
+            webpage_content = self.tool_manager.execute_tool('fetch_webpage_content', url=state.media_path)
+            webpage_structure = self.tool_manager.execute_tool('analyze_webpage_structure', url=state.media_path)
+            state.extracted_info = {
+                'type': 'webpage',
+                'webpage_content': webpage_content,
+                'webpage_structure': webpage_structure
+            }
+        elif state.media_type == 'youtube' and state.media_path:
+            # YouTube分析
+            youtube_info = self.tool_manager.execute_tool('get_youtube_info', url=state.media_path)
+            youtube_thumbnail = self.tool_manager.execute_tool('download_youtube_thumbnail', url=state.media_path)
+            state.extracted_info = {
+                'type': 'youtube',
+                'youtube_info': youtube_info,
+                'thumbnail_path': youtube_thumbnail
+            }
+        elif state.media_type == 'wikipedia':
+            # Wikipedia分析 - 从问题中提取搜索词
+            import re
+            # 提取可能的Wikipedia页面标题
+            wiki_pattern = r'(?:wikipedia|wiki|维基|百科)\s*(?:关于|的|页面|词条)?\s*[::]\s*(.+)'
+            match = re.search(wiki_pattern, state.question, re.IGNORECASE)
+            if match:
+                search_term = match.group(1).strip()
+            else:
+                # 如果没有明确格式,尝试提取关键词
+                words = state.question.split()
+                search_term = ' '.join([w for w in words if w not in ['wikipedia', 'wiki', '维基', '百科', '的', '是', '什么', '关于']])
+
+            if search_term:
+                # 搜索Wikipedia
+                wiki_search = self.tool_manager.execute_tool('search_wikipedia', query=search_term, max_results=3)
+                if wiki_search and 'error' not in wiki_search[0]:
+                    # 获取第一个结果的详细信息
+                    first_result = wiki_search[0]
+                    wiki_page = self.tool_manager.execute_tool('get_wikipedia_page', title=first_result['title'])
+                    state.extracted_info = {
+                        'type': 'wikipedia',
+                        'search_term': search_term,
+                        'search_results': wiki_search,
+                        'page_content': wiki_page
+                    }
+                else:
+                    state.extracted_info = {
+                        'type': 'wikipedia',
+                        'search_term': search_term,
+                        'error': '未找到相关Wikipedia页面'
+                    }
+            else:
+                state.extracted_info = {
+                    'type': 'wikipedia',
+                    'error': '无法提取搜索词'
+                }
+        else:
+            state.extracted_info = {'type': 'text', 'content': state.question}
+
+        return state
+
+    def _search_info(self, state: AgentState) -> AgentState:
+        """智能搜索相关信息"""
+        # 根据工作流计划决定是否搜索
+        should_search = False
+
+        # 检查当前步骤是否需要搜索
+        if state.workflow_plan and state.current_step < len(state.workflow_plan):
+            current_step_plan = state.workflow_plan[state.current_step]
+            should_search = current_step_plan.get('needs_search', False)
+
+        # 如果没有工作流计划,使用原来的逻辑
+        if not state.workflow_plan:
+            should_search = self.tool_manager.should_use_search(state.question, {'extracted_info': state.extracted_info})
+
+        if should_search:
+            print(f"🔍 执行搜索 (步骤 {state.current_step + 1})")
+            # 构建搜索查询
+            search_query = state.question
+            if state.extracted_info and 'caption' in state.extracted_info:
+                search_query += f" {state.extracted_info['caption']}"
+
+            state.search_results = self.search_engine.search(search_query)
+            print(f"✅ 搜索完成,找到 {len(state.search_results)} 个结果")
+        else:
+            print(f"⏭️ 跳过搜索 (步骤 {state.current_step + 1})")
+            # 不需要搜索,设置为空
+            state.search_results = []
+
+        # 更新当前步骤
+        state.current_step += 1
+
+        return state
+
+    def _use_tools(self, state: AgentState) -> AgentState:
+        """使用工具进行额外分析"""
+        try:
+            tool_results = {}
+
+            # 根据工作流计划选择工具
+            current_tools = []
+            if state.workflow_plan and state.current_step < len(state.workflow_plan):
+                current_step_plan = state.workflow_plan[state.current_step]
+                current_tools = current_step_plan.get('tools', [])
+                print(f"🛠️ 使用工具 (步骤 {state.current_step + 1}): {', '.join(current_tools) if current_tools else '无'}")
+
+            # 如果没有工作流计划或工具列表为空,使用原来的逻辑
+            if not current_tools:
+                question_lower = state.question.lower()
+
+                # 代码分析工具
+                if any(keyword in question_lower for keyword in ['代码', 'code', 'python', '程序', 'program']):
+                    # 检查是否有代码内容
+                    if '```python' in state.question or 'def ' in state.question or 'import ' in state.question:
+                        # 提取代码块('```python' 长度为9,原来的 +8 会多留一个字符)
+                        code_start = state.question.find('```python')
+                        if code_start != -1:
+                            code_end = state.question.find('```', code_start + 9)
+                            if code_end != -1:
+                                code = state.question[code_start + 9:code_end].strip()
+                            else:
+                                code = state.question[code_start + 9:].strip()
+                        else:
+                            # 尝试提取代码片段
+                            lines = state.question.split('\n')
+                            code_lines = []
+                            for line in lines:
+                                if line.strip().startswith(('def ', 'import ', 'class ', 'if ', 'for ', 'while ')):
+                                    code_lines.append(line)
+                            code = '\n'.join(code_lines)
+
+                        if code.strip():
+                            # 分析代码
+                            tool_results['code_analysis'] = self.tool_manager.execute_tool(
+                                'analyze_python_code',
+                                code=code
+                            )
+
+                            # 解释代码
+                            tool_results['code_explanation'] = self.tool_manager.execute_tool(
+                                'explain_code',
+                                code=code
+                            )
+
+                            # 如果需要执行代码
+                            if any(keyword in question_lower for keyword in ['运行', '执行', 'execute', 'run']):
+                                tool_results['code_execution'] = self.tool_manager.execute_tool(
+                                    'execute_python_code',
+                                    code=code
+                                )
+
+                # 视频内容分析
+                if state.media_type == 'video' and state.media_path:
+                    if any(keyword in question_lower for keyword in ['视频', 'video', '内容', 'content']):
+                        tool_results['video_analysis'] = self.tool_manager.execute_tool(
+                            'analyze_video_content',
+                            video_path=state.media_path
+                        )
+
+                # PDF内容分析
+                if state.media_type == 'pdf' and state.media_path:
+                    if any(keyword in question_lower for keyword in ['pdf', '文档', 'document', '内容', 'content', '总结', 'summary']):
+                        tool_results['pdf_summary'] = self.tool_manager.execute_tool(
+                            'summarize_pdf_content',
+                            pdf_path=state.media_path
+                        )
+
+                    # PDF文本搜索
+                    if any(keyword in question_lower for keyword in ['搜索', '查找', 'search', 'find']):
+                        # 尝试从问题中提取搜索词
+                        search_terms = []
+                        for word in question_lower.split():
+                            if len(word) > 2 and word not in ['搜索', '查找', 'search', 'find', 'pdf', '文档']:
+                                search_terms.append(word)
+
+                        if search_terms:
+                            search_term = ' '.join(search_terms[:3])  # 最多3个词
+                            tool_results['pdf_search'] = self.tool_manager.execute_tool(
+                                'search_text_in_pdf',
+                                pdf_path=state.media_path,
+                                search_term=search_term
+                            )
+
+                    # PDF图像提取
+                    if any(keyword in question_lower for keyword in ['图像', '图片', 'image', '图', '图表']):
+                        tool_results['pdf_images'] = self.tool_manager.execute_tool(
+                            'extract_images_from_pdf',
+                            pdf_path=state.media_path
+                        )
+
+                # 网页内容分析
+                if state.media_type == 'webpage' and state.media_path:
+                    if any(keyword in question_lower for keyword in ['网页', '网站', 'webpage', 'website', '内容', 'content', '总结', 'summary']):
+                        tool_results['webpage_summary'] = self.tool_manager.execute_tool(
+                            'summarize_webpage_content',
+                            url=state.media_path
+                        )
+
+                    # 网页文本搜索
+                    if any(keyword in question_lower for keyword in ['搜索', '查找', 'search', 'find']):
+                        # 尝试从问题中提取搜索词
+                        search_terms = []
+                        for word in question_lower.split():
+                            if len(word) > 2 and word not in ['搜索', '查找', 'search', 'find', '网页', '网站']:
+                                search_terms.append(word)
+
+                        if search_terms:
+                            search_term = ' '.join(search_terms[:3])  # 最多3个词
+                            tool_results['webpage_search'] = self.tool_manager.execute_tool(
+                                'search_content_in_webpage',
+                                url=state.media_path,
+                                search_term=search_term
+                            )
+
+                    # 网页链接提取
+                    if any(keyword in question_lower for keyword in ['链接', 'link', 'url', '地址']):
+                        tool_results['webpage_links'] = self.tool_manager.execute_tool(
+                            'extract_links_from_webpage',
+                            url=state.media_path
+                        )
+
+                    # 网页可访问性检查
+                    if any(keyword in question_lower for keyword in ['可访问性', 'accessibility', '无障碍', '检查']):
+                        tool_results['webpage_accessibility'] = self.tool_manager.execute_tool(
+                            'check_webpage_accessibility',
+                            url=state.media_path
+                        )
+
710
+
711
+ # YouTube内容分析
712
+ if state.media_type == 'youtube' and state.media_path:
713
+ if any(keyword in question_lower for keyword in ['youtube', '视频', 'video', '内容', 'content', '信息', 'info']):
714
+ # 获取YouTube信息已经在_analyze_media中完成
715
+ pass
716
+
717
+ # YouTube视频下载
718
+ if any(keyword in question_lower for keyword in ['下载', 'download', '保存', 'save']):
719
+ tool_results['youtube_download'] = self.tool_manager.execute_tool(
720
+ 'download_youtube_video',
721
+ url=state.media_path
722
+ )
723
+
724
+ # YouTube音频提取
725
+ if any(keyword in question_lower for keyword in ['音频', 'audio', '声音', 'sound', '提取', 'extract']):
726
+ tool_results['youtube_audio'] = self.tool_manager.execute_tool(
727
+ 'extract_youtube_audio',
728
+ url=state.media_path
729
+ )
730
+
731
+ # YouTube评论分析
732
+ if any(keyword in question_lower for keyword in ['评论', 'comment', '反馈', 'feedback']):
733
+ tool_results['youtube_comments'] = self.tool_manager.execute_tool(
734
+ 'analyze_youtube_comments',
735
+ url=state.media_path
736
+ )
737
+
738
+ # Wikipedia内容分析
739
+ if state.media_type == 'wikipedia':
740
+ if any(keyword in question_lower for keyword in ['wikipedia', 'wiki', '维基', '百科', '搜索', 'search']):
741
+ # Wikipedia搜索已经在_analyze_media中完成
742
+ pass
743
+
744
+ # Wikipedia页面分类
745
+ if any(keyword in question_lower for keyword in ['分类', 'category', '类别']):
746
+ if state.extracted_info and 'page_content' in state.extracted_info and 'title' in state.extracted_info['page_content']:
747
+ tool_results['wikipedia_categories'] = self.tool_manager.execute_tool(
748
+ 'get_wikipedia_categories',
749
+ title=state.extracted_info['page_content']['title']
750
+ )
751
+
752
+ # Wikipedia页面链接
753
+ if any(keyword in question_lower for keyword in ['链接', 'link', '相关', 'related']):
754
+ if state.extracted_info and 'page_content' in state.extracted_info and 'title' in state.extracted_info['page_content']:
755
+ tool_results['wikipedia_links'] = self.tool_manager.execute_tool(
756
+ 'get_wikipedia_links',
757
+ title=state.extracted_info['page_content']['title']
758
+ )
759
+
760
+ # Wikipedia搜索建议
761
+ if any(keyword in question_lower for keyword in ['建议', 'suggestion', '推荐', 'recommend']):
762
+ if state.extracted_info and 'search_term' in state.extracted_info:
763
+ tool_results['wikipedia_suggestions'] = self.tool_manager.execute_tool(
764
+ 'get_wikipedia_suggestions',
765
+ query=state.extracted_info['search_term']
766
+ )
767
+
768
+ # 英文Wikipedia搜索
769
+ if any(keyword in question_lower for keyword in ['英文', 'english', '英文版']):
770
+ if state.extracted_info and 'search_term' in state.extracted_info:
771
+ tool_results['wikipedia_english_search'] = self.tool_manager.execute_tool(
772
+ 'search_wikipedia_english',
773
+ query=state.extracted_info['search_term']
774
+ )
775
+
776
+ # 随机Wikipedia页面
777
+ if any(keyword in question_lower for keyword in ['随机', 'random', '随便', '任意']):
778
+ tool_results['wikipedia_random'] = self.tool_manager.execute_tool(
779
+ 'get_wikipedia_random_page'
780
+ )
781
+
+                # 文本分析工具
+                if any(keyword in question_lower for keyword in ['情感', '情绪', 'sentiment', 'emotion']):
+                    if state.extracted_info and 'caption' in state.extracted_info:
+                        tool_results['sentiment'] = self.tool_manager.execute_tool(
+                            'analyze_text_sentiment',
+                            text=state.extracted_info['caption']
+                        )
+
+                # 关键词提取
+                if any(keyword in question_lower for keyword in ['关键词', '关键', 'keywords', 'key']):
+                    tool_results['keywords'] = self.tool_manager.execute_tool(
+                        'extract_keywords',
+                        text=state.question
+                    )
+
+                # 文本摘要
+                if any(keyword in question_lower for keyword in ['摘要', '总结', 'summary', 'summarize']):
+                    if state.search_results:
+                        combined_text = " ".join(state.search_results)
+                        tool_results['summary'] = self.tool_manager.execute_tool(
+                            'summarize_text',
+                            text=combined_text
+                        )
+
+                # 图像文本提取
+                if state.media_type == 'image' and state.media_path:
+                    if any(keyword in question_lower for keyword in ['文字', '文本', 'text', 'ocr']):
+                        tool_results['ocr_text'] = self.tool_manager.execute_tool(
+                            'extract_text_from_image',
+                            image_path=state.media_path
+                        )
+
+                # 视频音频分析
+                if state.media_type == 'video' and state.media_path:
+                    if any(keyword in question_lower for keyword in ['音频', '声音', 'audio', 'sound']):
+                        tool_results['audio_info'] = self.tool_manager.execute_tool(
+                            'extract_video_audio',
+                            video_path=state.media_path
+                        )
+
+                # 数学计算
+                if any(keyword in question_lower for keyword in ['计算', 'calculate', 'math', '数学']):
+                    # 尝试提取数学表达式
+                    import re
+                    math_pattern = r'[\d\+\-\*\/\(\)\.\s]+'
+                    math_matches = re.findall(math_pattern, state.question)
+                    for match in math_matches:
+                        if len(match.strip()) > 3:  # 至少4个字符
+                            try:
+                                tool_results['math_calculation'] = self.tool_manager.execute_tool(
+                                    'calculate_math_expression',
+                                    expression=match.strip()
+                                )
+                                break
+                            except Exception:
+                                continue
+
+                # 翻译
+                if any(keyword in question_lower for keyword in ['翻译', 'translate']):
+                    # 提取需要翻译的文本
+                    text_to_translate = state.question
+                    if '翻译' in text_to_translate:
+                        text_to_translate = text_to_translate.split('翻译')[-1].strip()
+                    elif 'translate' in text_to_translate:
+                        text_to_translate = text_to_translate.split('translate')[-1].strip()
+
+                    if text_to_translate and len(text_to_translate) > 2:
+                        tool_results['translation'] = self.tool_manager.execute_tool(
+                            'translate_text',
+                            text=text_to_translate
+                        )
+
+            state.analysis_results = tool_results
+
+        except Exception as e:
+            state.error = f"工具使用失败: {str(e)}"
+            state.analysis_results = {}
+
+        return state
+
+    def _synthesize_answer(self, state: AgentState) -> AgentState:
+        """综合生成答案"""
+        try:
+            # 使用提示词函数生成提示
+            prompt = get_answer_prompt(
+                question=state.question,
+                media_analysis=json.dumps(state.extracted_info, ensure_ascii=False, indent=2),
+                search_results=json.dumps(state.search_results, ensure_ascii=False, indent=2),
+                tool_analysis=json.dumps(state.analysis_results, ensure_ascii=False, indent=2)
+            )
+
+            # 使用LLM生成答案
+            response = self.llm.invoke([HumanMessage(content=prompt)])
+            state.final_answer = response.content
+
+        except Exception as e:
+            state.error = f"答案生成失败: {str(e)}"
+            state.final_answer = ERROR_ANSWER_TEMPLATE
+
+        return state
+
+    def __call__(self, question: str, media_url: Optional[str] = None) -> str:
+        """主调用方法"""
+        try:
+            # 初始化状态
+            state = AgentState(question=question)
+
+            # 如果有媒体URL,下载并设置路径
+            if media_url:
+                if any(ext in media_url.lower() for ext in ['.pdf']):
+                    media_type = 'pdf'
+                    state.media_path = self.tool_manager.execute_tool('download_pdf_from_url', url=media_url)
+                elif 'youtube.com' in media_url.lower() or 'youtu.be' in media_url.lower():
+                    media_type = 'youtube'
+                    state.media_path = media_url  # 直接使用URL
+                elif any(ext in media_url.lower() for ext in ['.mp4', '.avi', '.mov']):
+                    media_type = 'video'
+                    state.media_path = self.media_analyzer.download_media(media_url, media_type)
+                elif any(prefix in media_url.lower() for prefix in ['http://', 'https://', 'www.']):
+                    media_type = 'webpage'
+                    state.media_path = media_url  # 直接使用URL
+                else:
+                    media_type = 'image'
+                    state.media_path = self.media_analyzer.download_media(media_url, media_type)
+                state.media_type = media_type
+
+            # 执行工作流
+            final_state = self.workflow.invoke(state)
+
+            # LangGraph返回的是字典,因此使用键来访问
+            return final_state['final_answer']
+
+        except Exception as e:
+            return f"智能体执行失败: {str(e)}"
+
+def run_and_submit_all(profile: gr.OAuthProfile | None):
+    """运行评估并提交所有答案"""
+
+    # 获取用户信息
     if profile:
+        username = f"{profile.username}"
         print(f"User logged in: {username}")
     else:
         print("User not logged in.")
         return "Please Login to Hugging Face with the button.", None

+    space_id = os.getenv("SPACE_ID")
     api_url = DEFAULT_API_URL
     questions_url = f"{api_url}/questions"
     submit_url = f"{api_url}/submit"

+    # 初始化多模态智能体
     try:
+        agent = MultiModalAgent()
     except Exception as e:
         print(f"Error instantiating agent: {e}")
         return f"Error initializing agent: {e}", None
+
     agent_code = f"https://huggingface.co/spaces/{space_id}/tree/main"
     print(agent_code)

+    # 获取问题
     print(f"Fetching questions from: {questions_url}")
     try:
         response = requests.get(questions_url, timeout=15)
             print("Fetched questions list is empty.")
             return "Fetched questions list is empty or invalid format.", None
         print(f"Fetched {len(questions_data)} questions.")
+    except Exception as e:
         print(f"Error fetching questions: {e}")
         return f"Error fetching questions: {e}", None

+    # 运行智能体
     results_log = []
     answers_payload = []
     print(f"Running agent on {len(questions_data)} questions...")
+
     for item in questions_data:
         task_id = item.get("task_id")
         question_text = item.get("question")
         if not task_id or question_text is None:
             print(f"Skipping item with missing task_id or question: {item}")
             continue
+
         try:
             submitted_answer = agent(question_text)
             answers_payload.append({"task_id": task_id, "submitted_answer": submitted_answer})
         print("Agent did not produce any answers to submit.")
         return "Agent did not produce any answers to submit.", pd.DataFrame(results_log)

+    # 准备提交
     submission_data = {"username": username.strip(), "agent_code": agent_code, "answers": answers_payload}
     status_update = f"Agent finished. Submitting {len(answers_payload)} answers for user '{username}'..."
     print(status_update)

+    # 提交答案
     print(f"Submitting {len(answers_payload)} answers to: {submit_url}")
     try:
         response = requests.post(submit_url, json=submission_data, timeout=60)
         print("Submission successful.")
         results_df = pd.DataFrame(results_log)
         return final_status, results_df
     except Exception as e:
+        status_message = f"Submission Failed: {e}"
         print(status_message)
         results_df = pd.DataFrame(results_log)
         return status_message, results_df

+def test_agent(question: str, media_url: str = ""):
+    """测试智能体功能"""
+    try:
+        agent = MultiModalAgent()
+        answer = agent(question, media_url if media_url else None)
+        return answer
+    except Exception as e:
+        return f"测试失败: {str(e)}"

+# 构建Gradio界面
 with gr.Blocks() as demo:
+    gr.Markdown("# 多模态智能体系统")
     gr.Markdown(
         """
+        **功能特性:**
+        - 🎥 视频理解与分析
+        - 🖼️ 图像识别与描述
+        - 🔍 智能搜索引擎
+        - 🤖 LangGraph工作流编排
+        - 🧠 多模态信息融合
+
+        **使用说明:**
+        1. 登录你的Hugging Face账户
+        2. 在测试区域输入问题(可选媒体URL)
+        3. 点击"运行评估"进行批量测试
         """
     )

     gr.LoginButton()

+    with gr.Tab("智能体测试"):
+        with gr.Row():
+            with gr.Column():
+                test_question = gr.Textbox(label="问题", placeholder="请输入你的问题...")
+                test_media_url = gr.Textbox(label="媒体URL(可选)", placeholder="图片或视频URL...")
+                test_button = gr.Button("测试智能体")
+
+            with gr.Column():
+                test_output = gr.Textbox(label="智能体回答", lines=10)

+        test_button.click(
+            fn=test_agent,
+            inputs=[test_question, test_media_url],
+            outputs=test_output
+        )

+    with gr.Tab("批量评估"):
+        run_button = gr.Button("运行评估 & 提交所有答案")
+        status_output = gr.Textbox(label="运行状态 / 提交结果", lines=5, interactive=False)
+        results_table = gr.DataFrame(label="问题和智能体答案", wrap=True)
         run_button.click(
             fn=run_and_submit_all,
             outputs=[status_output, results_table]
         )

 if __name__ == "__main__":
+    print("\n" + "-"*30 + " 多模态智能体系统启动 " + "-"*30)
+
     space_host_startup = os.getenv("SPACE_HOST")
+    space_id_startup = os.getenv("SPACE_ID")

     if space_host_startup:
         print(f"✅ SPACE_HOST found: {space_host_startup}")
+        print(f"   Runtime URL: https://{space_host_startup}.hf.space")
     else:
         print("ℹ️ SPACE_HOST environment variable not found (running locally?).")

+    if space_id_startup:
         print(f"✅ SPACE_ID found: {space_id_startup}")
         print(f"   Repo URL: https://huggingface.co/spaces/{space_id_startup}")
     else:
+        print("ℹ️ SPACE_ID environment variable not found (running locally?).")

+    print("-"*(60 + len(" 多模态智能体系统启动 ")) + "\n")
+    print("启动多模态智能体系统...")
     demo.launch(debug=True, share=False)
check_ffmpeg.py ADDED
@@ -0,0 +1,148 @@
+ #!/usr/bin/env python3
+ """
+ 检查ffmpeg安装情况
+ """
+
+ import subprocess
+ import os
+ import sys
+
+ def check_ffmpeg():
+     """检查ffmpeg是否可用"""
+     print("🔍 检查ffmpeg安装情况...")
+
+     # 方法1: 检查系统PATH中的ffmpeg
+     try:
+         result = subprocess.run(['ffmpeg', '-version'],
+                                 capture_output=True, text=True, timeout=10)
+         if result.returncode == 0:
+             print("✅ ffmpeg在系统PATH中可用")
+             # f-string表达式中不能直接写 '\n'(Python 3.12 之前是语法错误),先取首行
+             version_line = result.stdout.splitlines()[0] if result.stdout else ""
+             print(f"   版本信息: {version_line}")
+             return True
+     except (subprocess.TimeoutExpired, FileNotFoundError, subprocess.CalledProcessError):
+         print("❌ ffmpeg不在系统PATH中")
+
+     # 方法2: 检查conda环境中的ffmpeg
+     try:
+         conda_prefix = os.environ.get('CONDA_PREFIX')
+         if conda_prefix:
+             ffmpeg_path = os.path.join(conda_prefix, 'bin', 'ffmpeg')
+             if os.path.exists(ffmpeg_path):
+                 print(f"✅ 在conda环境中找到ffmpeg: {ffmpeg_path}")
+                 return True
+             else:
+                 print(f"❌ conda环境中没有ffmpeg: {ffmpeg_path}")
+     except Exception as e:
+         print(f"❌ 检查conda环境失败: {e}")
+
+     # 方法3: 检查常见的ffmpeg安装路径
+     common_paths = [
+         r"C:\ffmpeg\bin\ffmpeg.exe",
+         r"C:\Program Files\ffmpeg\bin\ffmpeg.exe",
+         r"C:\Program Files (x86)\ffmpeg\bin\ffmpeg.exe",
+         os.path.expanduser(r"~\ffmpeg\bin\ffmpeg.exe")
+     ]
+
+     for path in common_paths:
+         if os.path.exists(path):
+             print(f"✅ 找到ffmpeg: {path}")
+             return True
+
+     print("❌ 未找到ffmpeg")
+     return False
+
+ def install_ffmpeg_conda():
+     """通过conda安装ffmpeg"""
+     print("\n📦 尝试通过conda安装ffmpeg...")
+     try:
+         result = subprocess.run(['conda', 'install', '-c', 'conda-forge', 'ffmpeg', '-y'],
+                                 capture_output=True, text=True, timeout=60)
+         if result.returncode == 0:
+             print("✅ ffmpeg安装成功")
+             return True
+         else:
+             print(f"❌ ffmpeg安装失败: {result.stderr}")
+             return False
+     except Exception as e:
+         print(f"❌ conda安装失败: {e}")
+         return False
+
+ def install_ffmpeg_pip():
+     """通过pip安装ffmpeg-python"""
+     print("\n📦 尝试通过pip安装ffmpeg-python...")
+     try:
+         result = subprocess.run([sys.executable, '-m', 'pip', 'install', 'ffmpeg-python'],
+                                 capture_output=True, text=True, timeout=60)
+         if result.returncode == 0:
+             print("✅ ffmpeg-python安装成功")
+             return True
+         else:
+             print(f"❌ ffmpeg-python安装失败: {result.stderr}")
+             return False
+     except Exception as e:
+         print(f"❌ pip安装失败: {e}")
+         return False
+
+ def test_audio_without_ffmpeg():
+     """测试不使用ffmpeg的音频处理"""
+     print("\n🎵 测试不使用ffmpeg的音频处理...")
+
+     try:
+         import yt_dlp
+         print("✅ yt-dlp可用")
+
+         # 测试下载音频(不转换格式)
+         test_url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
+
+         ydl_opts = {
+             'format': 'bestaudio/best',
+             'outtmpl': 'downloads/test_audio.%(ext)s',
+             'quiet': True
+         }
+
+         with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+             info = ydl.extract_info(test_url, download=True)
+             audio_path = ydl.prepare_filename(info)
+
+         if os.path.exists(audio_path):
+             print(f"✅ 音频下载成功: {audio_path}")
+             print(f"   文件大小: {os.path.getsize(audio_path)} bytes")
+             return True
+         else:
+             print("❌ 音频下载失败")
+             return False
+
+     except Exception as e:
+         print(f"❌ 音频处理测试失败: {e}")
+         return False
+
+ def main():
+     print("🔧 ffmpeg检查和安装工具")
+     print("="*50)
+
+     # 检查ffmpeg
+     ffmpeg_available = check_ffmpeg()
+
+     if not ffmpeg_available:
+         print("\n📋 解决方案:")
+         print("1. 通过conda安装ffmpeg")
+         print("2. 手动下载ffmpeg并添加到PATH")
+         print("3. 使用不依赖ffmpeg的音频处理方法")
+
+         choice = input("\n选择解决方案 (1/2/3): ").strip()
+
+         if choice == "1":
+             install_ffmpeg_conda()
+         elif choice == "2":
+             print("请手动下载ffmpeg并添加到系统PATH")
+             print("下载地址: https://ffmpeg.org/download.html")
+         elif choice == "3":
+             test_audio_without_ffmpeg()
+         else:
+             print("无效选择")
+     else:
+         print("\n✅ ffmpeg已可用,可以正常进行音频处理")
+         test_audio_without_ffmpeg()
+
+ if __name__ == "__main__":
+     main()
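`check_ffmpeg` 里按 PATH 探测的那一步也可以用标准库 `shutil.which` 一行完成(它按 PATH 查找可执行文件,跨平台且自动处理 Windows 的 `.exe` 后缀);这只是一个补充示意,不替代上面的 conda 环境和常见安装路径检查:

```python
import shutil


def ffmpeg_on_path() -> bool:
    """ffmpeg 在 PATH 上可用时返回 True,否则返回 False"""
    return shutil.which("ffmpeg") is not None


print(ffmpeg_on_path())
```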
config.py ADDED
@@ -0,0 +1,122 @@
+ """
+ 多模态智能体系统配置文件
+ """
+ import os
+ import json
+ from typing import Optional, Dict, Any
+ from pathlib import Path
+
+ class Config:
+     """系统配置类"""
+
+     # API配置文件路径
+     API_KEYS_FILE: str = "api_keys.json"
+
+     # OpenAI配置
+     OPENAI_API_KEY: Optional[str] = None
+     OPENAI_MODEL: str = "gpt-4o"
+     OPENAI_TEMPERATURE: float = 0.7
+
+     # Hugging Face配置
+     HUGGINGFACE_API_KEY: Optional[str] = None
+
+     # 搜索引擎配置
+     SEARCH_ENGINE_TYPE: str = "duckduckgo"
+     SEARCH_ENGINE_API_KEY: Optional[str] = None
+
+     # 模型配置
+     IMAGE_CAPTION_MODEL: str = "Salesforce/blip-image-captioning-base"
+     IMAGE_CLASSIFICATION_MODEL: str = "microsoft/resnet-50"
+     OBJECT_DETECTION_MODEL: str = "facebook/detr-resnet-50"
+     GIT_MODEL: str = "microsoft/git-base"
+
+     # 系统配置
+     DEBUG: bool = os.getenv("DEBUG", "False").lower() == "true"
+     LOG_LEVEL: str = os.getenv("LOG_LEVEL", "INFO")
+
+     # 媒体处理配置
+     MAX_VIDEO_DURATION: int = 300  # 最大视频时长(秒)
+     FRAMES_TO_ANALYZE: int = 5  # 视频分析帧数
+     MAX_IMAGE_SIZE: int = 1024  # 最大图像尺寸
+
+     # 缓存配置
+     CACHE_DIR: str = "./cache"
+     TEMP_DIR: str = "./temp"
+
+     @classmethod
+     def load_api_keys(cls) -> bool:
+         """从文件加载API密钥"""
+         try:
+             api_file = Path(cls.API_KEYS_FILE)
+             if not api_file.exists():
+                 print(f"⚠️ API配置文件 {cls.API_KEYS_FILE} 不存在")
+                 print("请创建该文件并配置你的API密钥")
+                 return False
+
+             with open(api_file, 'r', encoding='utf-8') as f:
+                 api_config = json.load(f)
+
+             # 加载OpenAI配置
+             if 'openai' in api_config and api_config['openai'].get('api_key'):
+                 cls.OPENAI_API_KEY = api_config['openai']['api_key']
+                 print("✅ OpenAI API密钥已加载")
+             else:
+                 print("⚠️ OpenAI API密钥未配置")
+
+             # 加载Hugging Face配置
+             if 'huggingface' in api_config and api_config['huggingface'].get('api_key'):
+                 cls.HUGGINGFACE_API_KEY = api_config['huggingface']['api_key']
+                 print("✅ Hugging Face API密钥已加载")
+
+             # 加载搜索引擎配置
+             if 'search_engine' in api_config:
+                 search_config = api_config['search_engine']
+                 cls.SEARCH_ENGINE_TYPE = search_config.get('type', 'duckduckgo')
+                 cls.SEARCH_ENGINE_API_KEY = search_config.get('api_key')
+                 print(f"✅ 搜索引擎类型: {cls.SEARCH_ENGINE_TYPE}")
+
+             return True
+
+         except json.JSONDecodeError as e:
+             print(f"❌ API配置文件格式错误: {e}")
+             return False
+         except Exception as e:
+             print(f"❌ 加载API配置失败: {e}")
+             return False
+
+     @classmethod
+     def validate(cls) -> bool:
+         """验证配置是否完整"""
+         # 首先尝试从文件加载API密钥
+         cls.load_api_keys()
+
+         # 如果文件加载失败,尝试从环境变量加载
+         if not cls.OPENAI_API_KEY:
+             cls.OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
+
+         if not cls.HUGGINGFACE_API_KEY:
+             cls.HUGGINGFACE_API_KEY = os.getenv("HUGGINGFACE_API_KEY")
+
+         # 验证必要的配置
+         if not cls.OPENAI_API_KEY:
+             print("❌ 缺少OpenAI API密钥")
+             print("请在 api_keys.json 文件中配置或设置环境变量 OPENAI_API_KEY")
+             return False
+
+         return True
+
+     @classmethod
+     def print_config(cls):
+         """打印当前配置"""
+         print("=== 多模态智能体系统配置 ===")
+         print(f"OpenAI模型: {cls.OPENAI_MODEL}")
+         print(f"OpenAI温度: {cls.OPENAI_TEMPERATURE}")
+         print(f"OpenAI API密钥: {'已配置' if cls.OPENAI_API_KEY else '未配置'}")
+         print(f"Hugging Face API密钥: {'已配置' if cls.HUGGINGFACE_API_KEY else '未配置'}")
+         print(f"搜索引擎类型: {cls.SEARCH_ENGINE_TYPE}")
+         print(f"图像描述模型: {cls.IMAGE_CAPTION_MODEL}")
+         print(f"图像分类模型: {cls.IMAGE_CLASSIFICATION_MODEL}")
+         print(f"对象检测模型: {cls.OBJECT_DETECTION_MODEL}")
+         print(f"调试模式: {cls.DEBUG}")
+         print(f"日志级别: {cls.LOG_LEVEL}")
+         print("=" * 30)
prompts.py ADDED
@@ -0,0 +1,61 @@
+ """
+ Prompt configuration file
+ Contains the system prompt and assorted prompt templates
+ """
+ 
+ # System prompt - format rules for the agent's answers
+ SYSTEM_PROMPT = """You are a helpful assistant tasked with answering questions using a set of tools.
+ Now, I will ask you a question. Report your thoughts, and finish your answer with the following template:
+ [YOUR FINAL ANSWER].
+ YOUR FINAL ANSWER should be a number OR as few words as possible OR a comma separated list of numbers and/or strings. If you are asked for a number, don't use a comma in your number or units such as $ or percent signs unless specified otherwise. If you are asked for a string, don't use articles or abbreviations (e.g. for cities), and write digits in plain text unless specified otherwise. If you are asked for a comma separated list, apply the rules above to each element (number or string), and ensure there is exactly one space after each comma.
+ Your answer should only contain the final answer without any prefix or additional text.
+ 
+ IMPORTANT: Only provide the final answer without any explanations, reasoning, or additional text."""
+ 
+ # Answer generation template
+ ANSWER_GENERATION_TEMPLATE = f"""{SYSTEM_PROMPT}
+ 
+ Answer the question based on the following information:
+ 
+ Question: {{question}}
+ 
+ Media analysis results: {{media_analysis}}
+ 
+ Search results: {{search_results}}
+ 
+ Tool analysis results: {{tool_analysis}}
+ 
+ Analyze the information above, then give the final answer directly in the required format. Do not include any explanation or reasoning."""
+ 
+ # Fallback answer template
+ ERROR_ANSWER_TEMPLATE = "Sorry, I was unable to generate an answer."
+ 
+ # Tool usage prompt
+ TOOL_USAGE_PROMPT = """You are an intelligent assistant that can use a variety of tools to answer questions.
+ Choose the appropriate tool based on the question type and the available information.
+ Remember that the final answer must be concise and carry no prefix."""
+ 
+ # Media analysis prompt
+ MEDIA_ANALYSIS_PROMPT = """Analyze the provided media (image or video) and extract the key information.
+ Focus on:
+ - Descriptions of the visual content
+ - Any text content
+ - Object recognition
+ - Scene understanding
+ - Any relevant numbers or textual details"""
+ 
+ # Search prompt
+ SEARCH_PROMPT = """Use the search engine to find information relevant to the question.
+ Search queries should be:
+ - Short and specific
+ - Built from the key terms of the question
+ - Neither too broad nor too narrow"""
+ 
+ def get_answer_prompt(question: str, media_analysis: str, search_results: str, tool_analysis: str) -> str:
+     """Build the answer prompt."""
+     return ANSWER_GENERATION_TEMPLATE.format(
+         question=question,
+         media_analysis=media_analysis,
+         search_results=search_results,
+         tool_analysis=tool_analysis
+     )
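The template trick in prompts.py is worth noting: a module-level f-string bakes `SYSTEM_PROMPT` in once at import time, while doubled braces (`{{question}}`) survive as `.format()` placeholders filled per request. A reduced, self-contained sketch of the same pattern (the abbreviated system prompt here is a stand-in):

```python
SYSTEM_PROMPT = "You are a helpful assistant."  # abbreviated stand-in

# The f-string substitutes {SYSTEM_PROMPT} now; {{question}} becomes the
# literal "{question}" placeholder, filled later by .format().
ANSWER_GENERATION_TEMPLATE = f"""{SYSTEM_PROMPT}

Question: {{question}}

Search results: {{search_results}}"""

def get_answer_prompt(question: str, search_results: str) -> str:
    return ANSWER_GENERATION_TEMPLATE.format(
        question=question, search_results=search_results
    )

prompt = get_answer_prompt("What is 2+2?", "none")
print(prompt.splitlines()[0])  # → You are a helpful assistant.
```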
requirements.txt CHANGED
@@ -1,2 +1,24 @@
  gradio
- requests
+ requests
+ langgraph
+ langchain
+ langchain-community
+ langchain-openai
+ transformers
+ torch
+ torchvision
+ pillow
+ opencv-python
+ duckduckgo-search
+ python-dotenv
+ numpy
+ pandas
+ matplotlib
+ seaborn
+ PyPDF2
+ PyMuPDF
+ pdf2image
+ beautifulsoup4
+ pytube
+ yt-dlp
+ wikipedia-api
run.py ADDED
@@ -0,0 +1,138 @@
+ #!/usr/bin/env python3
+ """
+ Launch script for the multi-modal agent system
+ """
+ import os
+ import sys
+ import argparse
+ from pathlib import Path
+ 
+ # Add the project root to the Python path
+ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
+ 
+ from config import Config
+ 
+ def check_environment():
+     """Check the runtime environment."""
+     print("🔍 Checking runtime environment...")
+ 
+     # Check the Python version
+     if sys.version_info < (3, 8):
+         print("❌ Python version too old; Python 3.8+ is required")
+         return False
+ 
+     print(f"✅ Python version: {sys.version}")
+ 
+     # Check the required configuration
+     if not Config.validate():
+         print("❌ Configuration is incomplete")
+         print("Please set the following environment variable:")
+         print("  - OPENAI_API_KEY")
+         return False
+ 
+     print("✅ Configuration is valid")
+ 
+     # Check required packages
+     try:
+         import torch
+         import transformers
+         import langchain
+         import langgraph
+         import gradio
+         print("✅ Core dependencies are installed")
+     except ImportError as e:
+         print(f"❌ Missing dependency: {e}")
+         print("Please run: pip install -r requirements.txt")
+         return False
+ 
+     return True
+ 
+ def run_web_interface():
+     """Run the web interface."""
+     print("🌐 Launching web interface...")
+     from app import demo
+     demo.launch(debug=Config.DEBUG, share=False)
+ 
+ def run_test():
+     """Run the test suite."""
+     print("🧪 Running system tests...")
+     from test_agent import main as test_main
+     test_main()
+ 
+ def run_interactive():
+     """Run interactive mode."""
+     print("💬 Starting interactive mode...")
+     from app import MultiModalAgent
+ 
+     agent = MultiModalAgent()
+     print("Agent initialized; type 'quit' to exit")
+ 
+     while True:
+         try:
+             question = input("\nEnter a question: ").strip()
+             if question.lower() in ['quit', 'exit', 'q']:
+                 break
+ 
+             if not question:
+                 continue
+ 
+             print("🤖 Processing...")
+             answer = agent(question)
+             print(f"Answer: {answer}")
+ 
+         except KeyboardInterrupt:
+             print("\n👋 Goodbye!")
+             break
+         except Exception as e:
+             print(f"❌ Error: {str(e)}")
+ 
+ def main():
+     """Entry point."""
+     parser = argparse.ArgumentParser(description="Multi-modal agent system")
+     parser.add_argument(
+         "--mode",
+         choices=["web", "test", "interactive"],
+         default="web",
+         help="Run mode: web (web interface), test (tests), interactive (REPL)"
+     )
+     parser.add_argument(
+         "--debug",
+         action="store_true",
+         help="Enable debug mode"
+     )
+ 
+     args = parser.parse_args()
+ 
+     # Enable debug mode
+     if args.debug:
+         os.environ["DEBUG"] = "True"
+         os.environ["LOG_LEVEL"] = "DEBUG"
+ 
+     print("🚀 Multi-Modal Agent System")
+     print("=" * 40)
+ 
+     # Check the environment
+     if not check_environment():
+         sys.exit(1)
+ 
+     # Print the configuration
+     Config.print_config()
+ 
+     # Dispatch on the selected mode
+     try:
+         if args.mode == "web":
+             run_web_interface()
+         elif args.mode == "test":
+             run_test()
+         elif args.mode == "interactive":
+             run_interactive()
+     except KeyboardInterrupt:
+         print("\n👋 Interrupted by user")
+     except Exception as e:
+         print(f"❌ Runtime error: {str(e)}")
+         if Config.DEBUG:
+             import traceback
+             traceback.print_exc()
+ 
+ if __name__ == "__main__":
+     main()
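The dependency check in `check_environment` imports each heavy package eagerly just to see whether it is installed. The same check can be sketched data-driven with `importlib`, which avoids repeating the try/except per package (module names here are illustrative; `definitely_not_installed_pkg` is a deliberately missing placeholder):

```python
import importlib

def check_dependencies(modules):
    """Return the subset of `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# Stdlib modules stand in for the heavy dependencies listed in run.py.
missing = check_dependencies(["json", "argparse", "definitely_not_installed_pkg"])
print(missing)  # → ['definitely_not_installed_pkg']
```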
tools.py ADDED
@@ -0,0 +1,2197 @@
+ """
+ Tool module for the multi-modal agent
+ """
+ import os
+ import json
+ import requests
+ import tempfile
+ import ast
+ import subprocess
+ import sys
+ from typing import Dict, List, Any, Optional
+ from pathlib import Path
+ import cv2
+ import numpy as np
+ from PIL import Image
+ import torch
+ from transformers import pipeline
+ from langchain_core.tools import tool
+ from langchain_community.tools import DuckDuckGoSearchRun
+ from config import Config
+ 
+ # PDF processing imports
+ try:
+     import PyPDF2
+     import fitz  # PyMuPDF
+     from pdf2image import convert_from_path
+     PDF_AVAILABLE = True
+ except ImportError:
+     PDF_AVAILABLE = False
+     print("⚠️ PDF processing requires: pip install PyPDF2 PyMuPDF pdf2image")
+ 
+ # Web scraping imports
+ try:
+     import requests
+     from bs4 import BeautifulSoup
+     import urllib.parse
+     from urllib.parse import urljoin, urlparse
+     import re
+     import time
+     WEB_AVAILABLE = True
+ except ImportError:
+     WEB_AVAILABLE = False
+     print("⚠️ Web scraping requires: pip install beautifulsoup4 requests")
+ 
+ # YouTube processing imports
+ try:
+     from pytube import YouTube
+     YOUTUBE_AVAILABLE = True
+     YT_DLP_AVAILABLE = False
+     try:
+         import yt_dlp
+         YT_DLP_AVAILABLE = True
+     except ImportError:
+         pass
+ except ImportError:
+     YOUTUBE_AVAILABLE = False
+     YT_DLP_AVAILABLE = False
+     print("⚠️ YouTube processing requires: pip install pytube")
+ 
+ # Audio processing imports
+ try:
+     import speech_recognition as sr
+     from pydub import AudioSegment
+     AUDIO_PROCESSING_AVAILABLE = True
+ except ImportError:
+     AUDIO_PROCESSING_AVAILABLE = False
+     print("⚠️ Audio processing requires: pip install SpeechRecognition pydub")
+ 
+ # Wikipedia processing imports
+ try:
+     import wikipediaapi
+     import requests
+     from bs4 import BeautifulSoup
+     WIKIPEDIA_AVAILABLE = True
+ except ImportError:
+     WIKIPEDIA_AVAILABLE = False
+     print("⚠️ Wikipedia processing requires: pip install wikipedia-api requests beautifulsoup4")
+ 
+ class WebTools:
+     """Tools for analyzing web page content"""
+ 
+     @staticmethod
+     @tool
+     def fetch_webpage_content(url: str) -> Dict[str, Any]:
+         """Fetch the content of a web page."""
+         try:
+             if not WEB_AVAILABLE:
+                 return {"error": "Web scraping support is not installed; run: pip install beautifulsoup4 requests"}
+ 
+             # Set request headers to mimic a browser
+             headers = {
+                 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
+             }
+ 
+             # Send the request
+             response = requests.get(url, headers=headers, timeout=30)
+             response.raise_for_status()
+ 
+             # Parse the HTML
+             soup = BeautifulSoup(response.content, 'html.parser')
+ 
+             # Extract basic information
+             title = soup.find('title')
+             title_text = title.get_text().strip() if title else "No title"
+ 
+             # Extract the main text content:
+             # remove script and style tags first
+             for script in soup(["script", "style"]):
+                 script.decompose()
+ 
+             # Collect and normalize the text content
+             text_content = soup.get_text()
+             lines = (line.strip() for line in text_content.splitlines())
+             chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
+             text_content = ' '.join(chunk for chunk in chunks if chunk)
+ 
+             # Extract links
+             links = []
+             for link in soup.find_all('a', href=True):
+                 href = link.get('href')
+                 text = link.get_text().strip()
+                 if href and text:
+                     full_url = urljoin(url, href)
+                     links.append({
+                         'url': full_url,
+                         'text': text[:100]  # cap the text length
+                     })
+ 
+             # Extract images
+             images = []
+             for img in soup.find_all('img', src=True):
+                 src = img.get('src')
+                 alt = img.get('alt', '')
+                 if src:
+                     full_url = urljoin(url, src)
+                     images.append({
+                         'url': full_url,
+                         'alt': alt[:100]
+                     })
+ 
+             # Extract metadata
+             meta_data = {}
+             for meta in soup.find_all('meta'):
+                 name = meta.get('name') or meta.get('property')
+                 content = meta.get('content')
+                 if name and content:
+                     meta_data[name] = content
+ 
+             return {
+                 'url': url,
+                 'title': title_text,
+                 'text_content': text_content[:5000],  # cap the text length
+                 'links_count': len(links),
+                 'images_count': len(images),
+                 'links': links[:20],    # cap the number of links
+                 'images': images[:10],  # cap the number of images
+                 'meta_data': meta_data,
+                 'status_code': response.status_code,
+                 'content_type': response.headers.get('content-type', ''),
+                 'encoding': response.encoding
+             }
+ 
+         except Exception as e:
+             return {"error": f"Failed to fetch web page content: {str(e)}"}
+ 
+     @staticmethod
+     @tool
+     def extract_text_from_webpage(url: str) -> str:
+         """Extract plain text from a web page."""
+         try:
+             if not WEB_AVAILABLE:
+                 return "Web scraping support is not installed; run: pip install beautifulsoup4 requests"
+ 
+             headers = {
+                 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
+             }
+ 
+             response = requests.get(url, headers=headers, timeout=30)
+             response.raise_for_status()
+ 
+             soup = BeautifulSoup(response.content, 'html.parser')
+ 
+             # Remove unwanted tags
+             for tag in soup(['script', 'style', 'nav', 'footer', 'header']):
+                 tag.decompose()
+ 
+             # Extract and normalize the text
+             text = soup.get_text()
+             lines = (line.strip() for line in text.splitlines())
+             chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
+             text = ' '.join(chunk for chunk in chunks if chunk)
+ 
+             return text if text.strip() else "No text content found on the page"
+ 
+         except Exception as e:
+             return f"Text extraction failed: {str(e)}"
+ 
+     @staticmethod
+     @tool
+     def analyze_webpage_structure(url: str) -> Dict[str, Any]:
+         """Analyze the structure of a web page."""
+         try:
+             if not WEB_AVAILABLE:
+                 return {"error": "Web scraping support is not installed; run: pip install beautifulsoup4 requests"}
+ 
+             headers = {
+                 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
+             }
+ 
+             response = requests.get(url, headers=headers, timeout=30)
+             response.raise_for_status()
+ 
+             soup = BeautifulSoup(response.content, 'html.parser')
+ 
+             # Analyze the page structure
+             structure = {
+                 'url': url,
+                 'title': soup.find('title').get_text().strip() if soup.find('title') else "No title",
+                 'headings': {},
+                 'sections': [],
+                 'forms': [],
+                 'tables': [],
+                 'lists': []
+             }
+ 
+             # Count the heading levels
+             for i in range(1, 7):
+                 headings = soup.find_all(f'h{i}')
+                 structure['headings'][f'h{i}'] = len(headings)
+ 
+             # Analyze the main regions
+             main_sections = soup.find_all(['main', 'article', 'section', 'div'], class_=re.compile(r'main|content|article|post'))
+             for section in main_sections[:5]:  # cap the count
+                 section_text = section.get_text().strip()[:200]
+                 structure['sections'].append({
+                     'tag': section.name,
+                     'class': section.get('class', []),
+                     'text_preview': section_text
+                 })
+ 
+             # Analyze forms
+             forms = soup.find_all('form')
+             for form in forms[:3]:
+                 inputs = form.find_all('input')
+                 structure['forms'].append({
+                     'action': form.get('action', ''),
+                     'method': form.get('method', ''),
+                     'input_count': len(inputs)
+                 })
+ 
+             # Analyze tables
+             tables = soup.find_all('table')
+             for table in tables[:3]:
+                 rows = table.find_all('tr')
+                 structure['tables'].append({
+                     'row_count': len(rows),
+                     'has_header': bool(table.find('th'))
+                 })
+ 
+             # Analyze lists
+             lists = soup.find_all(['ul', 'ol'])
+             for lst in lists[:5]:
+                 items = lst.find_all('li')
+                 structure['lists'].append({
+                     'type': lst.name,
+                     'item_count': len(items)
+                 })
+ 
+             return structure
+ 
+         except Exception as e:
+             return {"error": f"Web page structure analysis failed: {str(e)}"}
+ 
+     @staticmethod
+     @tool
+     def search_content_in_webpage(url: str, search_term: str) -> List[Dict[str, Any]]:
+         """Search a web page for a specific term."""
+         try:
+             if not WEB_AVAILABLE:
+                 return [{"error": "Web scraping support is not installed; run: pip install beautifulsoup4 requests"}]
+ 
+             headers = {
+                 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
+             }
+ 
+             response = requests.get(url, headers=headers, timeout=30)
+             response.raise_for_status()
+ 
+             soup = BeautifulSoup(response.content, 'html.parser')
+ 
+             # Remove scripts and styles
+             for script in soup(["script", "style"]):
+                 script.decompose()
+ 
+             text = soup.get_text()
+ 
+             # Search for matches
+             search_results = []
+             lines = text.split('\n')
+ 
+             for i, line in enumerate(lines):
+                 if search_term.lower() in line.lower():
+                     # Grab the surrounding context
+                     start = max(0, i - 1)
+                     end = min(len(lines), i + 2)
+                     context = '\n'.join(lines[start:end])
+ 
+                     search_results.append({
+                         'line_number': i + 1,
+                         'matched_text': line.strip(),
+                         'context': context.strip()
+                     })
+ 
+                 if len(search_results) >= 10:  # cap the result count
+                     break
+ 
+             return search_results
+ 
+         except Exception as e:
+             return [{"error": f"Web page content search failed: {str(e)}"}]
+ 
+     @staticmethod
+     @tool
+     def extract_links_from_webpage(url: str) -> List[Dict[str, str]]:
+         """Extract all links from a web page."""
+         try:
+             if not WEB_AVAILABLE:
+                 return [{"error": "Web scraping support is not installed; run: pip install beautifulsoup4 requests"}]
+ 
+             headers = {
+                 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
+             }
+ 
+             response = requests.get(url, headers=headers, timeout=30)
+             response.raise_for_status()
+ 
+             soup = BeautifulSoup(response.content, 'html.parser')
+ 
+             links = []
+             for link in soup.find_all('a', href=True):
+                 href = link.get('href')
+                 text = link.get_text().strip()
+ 
+                 if href and text:
+                     full_url = urljoin(url, href)
+                     parsed_url = urlparse(full_url)
+ 
+                     links.append({
+                         'url': full_url,
+                         'text': text[:100],
+                         'domain': parsed_url.netloc,
+                         'path': parsed_url.path
+                     })
+ 
+             return links[:50]  # cap the number of links
+ 
+         except Exception as e:
+             return [{"error": f"Link extraction failed: {str(e)}"}]
+ 
+     @staticmethod
+     @tool
+     def summarize_webpage_content(url: str) -> str:
+         """Summarize the content of a web page."""
+         try:
+             if not WEB_AVAILABLE:
+                 return "Web scraping support is not installed; run: pip install beautifulsoup4 requests"
+ 
+             # Fetch the page content
+             content_result = WebTools.fetch_webpage_content(url)
+             if "error" in content_result:
+                 return content_result["error"]
+ 
+             # Extract the text content
+             text_content = content_result.get('text_content', '')
+             if not text_content:
+                 return "No summarizable content found on the page"
+ 
+             # Summarize the content with an LLM
+             from langchain_openai import ChatOpenAI
+             from langchain_core.messages import HumanMessage
+ 
+             llm = ChatOpenAI(
+                 model=Config.OPENAI_MODEL,
+                 temperature=0.3,
+                 api_key=Config.OPENAI_API_KEY
+             )
+ 
+             # Truncate overly long text
+             if len(text_content) > 4000:
+                 text_content = text_content[:4000] + "..."
+ 
+             prompt = f"""
+             Summarize the main content of the following web page:
+ 
+             Title: {content_result.get('title', 'No title')}
+             URL: {url}
+ 
+             Content:
+             {text_content}
+ 
+             Please provide:
+             1. The page's main topic
+             2. Key information points
+             3. A summary of the important content
+             4. The page type and purpose
+             """
+ 
+             response = llm.invoke([HumanMessage(content=prompt)])
+             return response.content
+ 
+         except Exception as e:
+             return f"Web page summarization failed: {str(e)}"
+ 
+     @staticmethod
+     @tool
+     def check_webpage_accessibility(url: str) -> Dict[str, Any]:
+         """Check a web page for accessibility issues."""
+         try:
+             if not WEB_AVAILABLE:
+                 return {"error": "Web scraping support is not installed; run: pip install beautifulsoup4 requests"}
+ 
+             headers = {
+                 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
+             }
+ 
+             response = requests.get(url, headers=headers, timeout=30)
+             response.raise_for_status()
+ 
+             soup = BeautifulSoup(response.content, 'html.parser')
+ 
+             accessibility_report = {
+                 'url': url,
+                 'status_code': response.status_code,
+                 'load_time': response.elapsed.total_seconds(),
+                 'issues': [],
+                 'recommendations': []
+             }
+ 
+             # Check the title
+             title = soup.find('title')
+             if not title or not title.get_text().strip():
+                 accessibility_report['issues'].append("Missing page title")
+                 accessibility_report['recommendations'].append("Add a meaningful page title")
+ 
+             # Check image alt attributes
+             images = soup.find_all('img')
+             images_without_alt = [img for img in images if not img.get('alt')]
+             if images_without_alt:
+                 accessibility_report['issues'].append(f"Found {len(images_without_alt)} images missing alt attributes")
+                 accessibility_report['recommendations'].append("Add alt attributes to all images")
+ 
+             # Check link text
+             links = soup.find_all('a', href=True)
+             empty_links = [link for link in links if not link.get_text().strip()]
+             if empty_links:
+                 accessibility_report['issues'].append(f"Found {len(empty_links)} empty links")
+                 accessibility_report['recommendations'].append("Add descriptive text to all links")
+ 
+             # Check form labels
+             forms = soup.find_all('form')
+             for form in forms:
+                 inputs = form.find_all('input')
+                 for input_field in inputs:
+                     if input_field.get('type') in ['text', 'email', 'password']:
+                         if not input_field.get('id') or not soup.find('label', {'for': input_field.get('id')}):
+                             accessibility_report['issues'].append("Form input field is missing a label")
+                             accessibility_report['recommendations'].append("Add label tags to form fields")
+                             break
+ 
+             # Check color contrast (simplified)
+             style_tags = soup.find_all('style')
+             if not style_tags:
+                 accessibility_report['recommendations'].append("Consider adding CSS styles to improve readability")
+ 
+             return accessibility_report
+ 
+         except Exception as e:
+             return {"error": f"Accessibility check failed: {str(e)}"}
+ 
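The text-cleanup idiom that WebTools applies after `soup.get_text()` (strip each line, split within lines, rejoin the non-empty chunks) is easy to see in isolation, without any network access:

```python
# Normalize whitespace the way the WebTools methods do:
# strip each line, split lines into chunks, keep only non-empty ones.
raw = "  Title  \n\n   Some   body text \n\t\n links  "

lines = (line.strip() for line in raw.splitlines())
chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
text = ' '.join(chunk for chunk in chunks if chunk)

print(text)  # → Title Some body text links
```

Because every stage is a generator, nothing is materialized until the final `join`, which keeps the pass cheap even on large pages.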
+ class PDFTools:
+     """PDF processing tools"""
+ 
+     @staticmethod
+     @tool
+     def download_pdf_from_url(url: str) -> str:
+         """Download a PDF file from a URL."""
+         try:
+             if not PDF_AVAILABLE:
+                 return "PDF support is not installed; run: pip install PyPDF2 PyMuPDF pdf2image"
+ 
+             # Create a temporary file
+             temp_path = tempfile.mktemp(suffix='.pdf')
+ 
+             # Download the PDF
+             response = requests.get(url, stream=True, timeout=30)
+             response.raise_for_status()
+ 
+             with open(temp_path, 'wb') as f:
+                 for chunk in response.iter_content(chunk_size=8192):
+                     f.write(chunk)
+ 
+             return temp_path
+ 
+         except Exception as e:
+             return f"PDF download failed: {str(e)}"
+ 
+     @staticmethod
+     @tool
+     def extract_text_from_pdf(pdf_path: str) -> str:
+         """Extract text from a PDF."""
+         try:
+             if not PDF_AVAILABLE:
+                 return "PDF support is not installed; run: pip install PyPDF2 PyMuPDF pdf2image"
+ 
+             # Extract text with PyMuPDF
+             doc = fitz.open(pdf_path)
+             text = ""
+ 
+             for page_num in range(len(doc)):
+                 page = doc.load_page(page_num)
+                 text += page.get_text()
+ 
+             doc.close()
+ 
+             return text if text.strip() else "No text content found in the PDF"
+ 
+         except Exception as e:
+             return f"PDF text extraction failed: {str(e)}"
+ 
+     @staticmethod
+     @tool
+     def extract_images_from_pdf(pdf_path: str) -> List[str]:
+         """Extract images from a PDF."""
+         try:
+             if not PDF_AVAILABLE:
+                 return ["PDF support is not installed; run: pip install PyPDF2 PyMuPDF pdf2image"]
+ 
+             # Render PDF pages to images with pdf2image
+             images = convert_from_path(pdf_path, dpi=200)
+             image_paths = []
+ 
+             for i, image in enumerate(images):
+                 temp_path = tempfile.mktemp(suffix=f'_page_{i+1}.jpg')
+                 image.save(temp_path, 'JPEG')
+                 image_paths.append(temp_path)
+ 
+             return image_paths
+ 
+         except Exception as e:
+             return [f"PDF image extraction failed: {str(e)}"]
+ 
+     @staticmethod
+     @tool
+     def analyze_pdf_structure(pdf_path: str) -> Dict[str, Any]:
+         """Analyze the structure of a PDF."""
+         try:
+             if not PDF_AVAILABLE:
+                 return {"error": "PDF support is not installed; run: pip install PyPDF2 PyMuPDF pdf2image"}
+ 
+             # Analyze the PDF structure with PyPDF2
+             with open(pdf_path, 'rb') as file:
+                 pdf_reader = PyPDF2.PdfReader(file)
+ 
+                 # Collect basic information
+                 info = {
+                     "page_count": len(pdf_reader.pages),
+                     "title": pdf_reader.metadata.get('/Title', 'Unknown'),
+                     "author": pdf_reader.metadata.get('/Author', 'Unknown'),
+                     "subject": pdf_reader.metadata.get('/Subject', 'Unknown'),
+                     "creator": pdf_reader.metadata.get('/Creator', 'Unknown'),
+                     "producer": pdf_reader.metadata.get('/Producer', 'Unknown'),
+                     "creation_date": pdf_reader.metadata.get('/CreationDate', 'Unknown'),
+                     "modification_date": pdf_reader.metadata.get('/ModDate', 'Unknown')
+                 }
+ 
+                 # Inspect each page
+                 pages_info = []
+                 for i, page in enumerate(pdf_reader.pages):
+                     page_text = page.extract_text()
+                     pages_info.append({
+                         "page_number": i + 1,
+                         "text_length": len(page_text),
+                         "has_text": bool(page_text.strip()),
+                         "rotation": page.get('/Rotate', 0)
+                     })
+ 
+                 info["pages_info"] = pages_info
+                 return info
+ 
+         except Exception as e:
+             return {"error": f"PDF structure analysis failed: {str(e)}"}
+ 
+     @staticmethod
+     @tool
+     def search_text_in_pdf(pdf_path: str, search_term: str) -> List[Dict[str, Any]]:
+         """Search a PDF for specific text."""
+         try:
+             if not PDF_AVAILABLE:
+                 return [{"error": "PDF support is not installed; run: pip install PyPDF2 PyMuPDF pdf2image"}]
+ 
+             # Search the text with PyMuPDF
+             doc = fitz.open(pdf_path)
+             search_results = []
+ 
+             for page_num in range(len(doc)):
+                 page = doc.load_page(page_num)
+                 text_instances = page.search_for(search_term)
+ 
+                 for inst in text_instances:
+                     search_results.append({
+                         "page_number": page_num + 1,
+                         "text": search_term,
+                         "bbox": inst,
+                         "context": page.get_text("text", clip=inst)
+                     })
+ 
+             doc.close()
+             return search_results
+ 
+         except Exception as e:
+             return [{"error": f"PDF text search failed: {str(e)}"}]
+ 
+     @staticmethod
+     @tool
+     def summarize_pdf_content(pdf_path: str) -> str:
+         """Summarize the content of a PDF."""
+         try:
+             if not PDF_AVAILABLE:
+                 return "PDF support is not installed; run: pip install PyPDF2 PyMuPDF pdf2image"
+ 
+             # Extract the text
+             doc = fitz.open(pdf_path)
+             text = ""
+ 
+             for page_num in range(len(doc)):
+                 page = doc.load_page(page_num)
+                 text += page.get_text()
+ 
+             doc.close()
+ 
+             if not text.strip():
+                 return "No text content found in the PDF"
+ 
+             # Summarize the content with an LLM
+             from langchain_openai import ChatOpenAI
+             from langchain_core.messages import HumanMessage
+ 
+             llm = ChatOpenAI(
+                 model=Config.OPENAI_MODEL,
+                 temperature=0.3,
+                 api_key=Config.OPENAI_API_KEY
+             )
+ 
+             # Truncate overly long text
+             if len(text) > 4000:
+                 text = text[:4000] + "..."
+ 
+             prompt = f"""
+             Summarize the main content of the following PDF document:
+ 
+             {text}
+ 
+             Please provide:
+             1. The document's main topic
+             2. Key points
+             3. A summary of the important information
+             4. The document type and purpose
+             """
+ 
+             response = llm.invoke([HumanMessage(content=prompt)])
+             return response.content
+ 
+         except Exception as e:
+             return f"PDF summarization failed: {str(e)}"
+ 
+class MediaTools:
+    """媒体处理工具类"""
+
+    @staticmethod
+    @tool
+    def extract_text_from_image(image_path: str) -> str:
+        """从图像中提取文本"""
+        try:
+            # 使用OCR模型提取文本
+            ocr_pipeline = pipeline(
+                "image-to-text",
+                model="microsoft/trocr-base-handwritten",
+                device=0 if torch.cuda.is_available() else -1
+            )
+
+            image = Image.open(image_path)
+            result = ocr_pipeline(image)
+            return result[0]['generated_text']
+        except Exception as e:
+            return f"文本提取失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def analyze_image_emotion(image_path: str) -> Dict[str, Any]:
+        """分析图像中的情感"""
+        try:
+            # 使用图像分类模型
+            # 注意:原先填的 microsoft/DialoGPT-medium 是文本对话模型,不能做图像分类;
+            # 这里换成通用的ViT分类器,如需真正的情感识别请替换为专用人脸情感模型
+            emotion_pipeline = pipeline(
+                "image-classification",
+                model="google/vit-base-patch16-224",
+                device=0 if torch.cuda.is_available() else -1
+            )
+
+            image = Image.open(image_path)
+            result = emotion_pipeline(image)
+            return {
+                "emotions": result[:3],  # 返回前3个最可能的情感
+                "confidence": result[0]['score'] if result else 0.0
+            }
+        except Exception as e:
+            return {"error": f"情感分析失败: {str(e)}"}
+
+    @staticmethod
+    @tool
+    def extract_video_audio(video_path: str) -> str:
+        """从视频中提取音频信息"""
+        try:
+            # 简化版本:返回提示信息
+            return "视频音频分析功能需要安装moviepy包"
+        except Exception as e:
+            return f"音频提取失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def analyze_video_content(video_path: str) -> Dict[str, Any]:
+        """分析视频内容"""
+        try:
+            # 使用OpenCV分析视频
+            cap = cv2.VideoCapture(video_path)
+            if not cap.isOpened():
+                return {"error": "无法打开视频文件"}
+
+            # 获取视频基本信息
+            fps = cap.get(cv2.CAP_PROP_FPS)
+            frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+            duration = frame_count / fps if fps > 0 else 0
+            width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+            height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+
+            # 使用图像描述模型(在循环外只加载一次,避免每帧重复初始化)
+            caption_pipeline = pipeline(
+                "image-to-text",
+                model="Salesforce/blip-image-captioning-base",
+                device=0 if torch.cuda.is_available() else -1
+            )
+
+            # 分析前几帧
+            frames_analyzed = []
+            frame_interval = max(1, frame_count // 10)  # 分析10帧
+
+            for i in range(0, min(frame_count, 10)):
+                cap.set(cv2.CAP_PROP_POS_FRAMES, i * frame_interval)
+                ret, frame = cap.read()
+                if ret:
+                    # 转换为PIL图像进行分析
+                    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+                    pil_image = Image.fromarray(frame_rgb)
+
+                    caption_result = caption_pipeline(pil_image)
+                    frames_analyzed.append({
+                        "frame_number": i * frame_interval,
+                        "caption": caption_result[0]['generated_text']
+                    })
+
+            cap.release()
+
+            return {
+                "video_info": {
+                    "duration": duration,
+                    "fps": fps,
+                    "frame_count": frame_count,
+                    "resolution": f"{width}x{height}"
+                },
+                "frames_analyzed": frames_analyzed,
+                "analysis_method": "OpenCV + BLIP"
+            }
+
+        except Exception as e:
+            return {"error": f"视频分析失败: {str(e)}"}
+
+class CodeAnalysisTools:
+    """代码分析工具类"""
+
+    @staticmethod
+    @tool
+    def analyze_python_code(code: str) -> Dict[str, Any]:
+        """分析Python代码"""
+        try:
+            # 语法检查
+            try:
+                ast.parse(code)
+                syntax_valid = True
+                syntax_error = None
+            except SyntaxError as e:
+                syntax_valid = False
+                syntax_error = str(e)
+
+            # 代码复杂度分析
+            tree = ast.parse(code) if syntax_valid else None
+            if tree:
+                functions = [node for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)]
+                classes = [node for node in ast.walk(tree) if isinstance(node, ast.ClassDef)]
+                imports = [node for node in ast.walk(tree) if isinstance(node, (ast.Import, ast.ImportFrom))]
+
+                # 计算圈复杂度(简化版)
+                complexity = 0
+                for node in ast.walk(tree):
+                    if isinstance(node, (ast.If, ast.While, ast.For, ast.ExceptHandler)):
+                        complexity += 1
+
+                analysis = {
+                    "syntax_valid": syntax_valid,
+                    "syntax_error": syntax_error,
+                    "function_count": len(functions),
+                    "class_count": len(classes),
+                    "import_count": len(imports),
+                    "complexity": complexity,
+                    "functions": [f.name for f in functions],
+                    "classes": [c.name for c in classes]
+                }
+            else:
+                analysis = {
+                    "syntax_valid": syntax_valid,
+                    "syntax_error": syntax_error
+                }
+
+            return analysis
+
+        except Exception as e:
+            return {"error": f"代码分析失败: {str(e)}"}
+
+    @staticmethod
+    @tool
+    def execute_python_code(code: str) -> str:
+        """执行Python代码"""
+        try:
+            # 创建临时文件
+            with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
+                f.write(code)
+                temp_file = f.name
+
+            # 执行代码
+            result = subprocess.run(
+                [sys.executable, temp_file],
+                capture_output=True,
+                text=True,
+                timeout=30  # 30秒超时
+            )
+
+            # 清理临时文件
+            os.unlink(temp_file)
+
+            if result.returncode == 0:
+                return f"执行成功:\n{result.stdout}"
+            else:
+                return f"执行失败:\n{result.stderr}"
+
+        except subprocess.TimeoutExpired:
+            return "代码执行超时"
+        except Exception as e:
+            return f"代码执行失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def explain_code(code: str) -> str:
+        """解释代码功能"""
+        try:
+            # 使用LLM解释代码
+            from langchain_openai import ChatOpenAI
+            from langchain_core.messages import HumanMessage
+
+            llm = ChatOpenAI(
+                model=Config.OPENAI_MODEL,
+                temperature=0.3,
+                api_key=Config.OPENAI_API_KEY
+            )
+
+            prompt = f"""
+            请分析以下Python代码的功能和作用:
+
+            ```python
+            {code}
+            ```
+
+            请提供:
+            1. 代码的主要功能
+            2. 关键部分的解释
+            3. 可能的改进建议
+            """
+
+            response = llm.invoke([HumanMessage(content=prompt)])
+            return response.content
+
+        except Exception as e:
+            return f"代码解释失败: {str(e)}"
+
+class SearchTools:
+    """搜索工具类"""
+
+    def __init__(self):
+        # 使用DuckDuckGo搜索,无需API密钥
+        self.search_tool = DuckDuckGoSearchRun()
+        print("✅ DuckDuckGo搜索引擎已初始化")
+
+    # 注意:@tool 不能直接装饰带 self 的实例方法(self 会被当成工具参数),
+    # 因此以下工具改为静态方法,并在函数内部创建搜索器
+    @staticmethod
+    @tool
+    def web_search(query: str) -> str:
+        """执行网络搜索"""
+        try:
+            print(f"🔍 搜索查询: {query}")
+            results = DuckDuckGoSearchRun().run(query)
+            return results if isinstance(results, str) else str(results)
+        except Exception as e:
+            print(f"❌ 搜索失败: {str(e)}")
+            return f"搜索失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def search_images(query: str) -> List[str]:
+        """搜索相关图像"""
+        try:
+            search_query = f"{query} images"
+            print(f"🖼️ 图像搜索查询: {search_query}")
+            results = DuckDuckGoSearchRun().run(search_query)
+            # 简单返回搜索结果,实际应用中需要解析图像URL
+            return [results] if isinstance(results, str) else results
+        except Exception as e:
+            print(f"❌ 图像搜索失败: {str(e)}")
+            return [f"图像搜索失败: {str(e)}"]
+
+    @staticmethod
+    @tool
+    def search_videos(query: str) -> List[str]:
+        """搜索相关视频"""
+        try:
+            search_query = f"{query} videos"
+            print(f"🎥 视频搜索查询: {search_query}")
+            results = DuckDuckGoSearchRun().run(search_query)
+            return [results] if isinstance(results, str) else results
+        except Exception as e:
+            print(f"❌ 视频搜索失败: {str(e)}")
+            return [f"视频搜索失败: {str(e)}"]
+
+    @staticmethod
+    @tool
+    def search_pdfs(query: str) -> List[str]:
+        """搜索PDF文档"""
+        try:
+            search_query = f"{query} filetype:pdf"
+            print(f"📄 PDF搜索查询: {search_query}")
+            results = DuckDuckGoSearchRun().run(search_query)
+            return [results] if isinstance(results, str) else results
+        except Exception as e:
+            print(f"❌ PDF搜索失败: {str(e)}")
+            return [f"PDF搜索失败: {str(e)}"]
+
+class AnalysisTools:
+    """分析工具类"""
+
+    @staticmethod
+    @tool
+    def analyze_text_sentiment(text: str) -> Dict[str, Any]:
+        """分析文本情感"""
+        try:
+            # 使用情感分析模型
+            sentiment_pipeline = pipeline(
+                "sentiment-analysis",
+                model="cardiffnlp/twitter-roberta-base-sentiment-latest",
+                device=0 if torch.cuda.is_available() else -1
+            )
+
+            result = sentiment_pipeline(text)
+            return {
+                "sentiment": result[0]['label'],
+                "confidence": result[0]['score'],
+                "text": text
+            }
+        except Exception as e:
+            return {"error": f"情感分析失败: {str(e)}"}
+
+    @staticmethod
+    @tool
+    def extract_keywords(text: str) -> List[str]:
+        """提取关键词"""
+        try:
+            # 使用关键词提取模型
+            keyword_pipeline = pipeline(
+                "token-classification",
+                model="dbmdz/bert-large-cased-finetuned-conll03-english",
+                device=0 if torch.cuda.is_available() else -1
+            )
+
+            result = keyword_pipeline(text)
+            keywords = []
+            for item in result:
+                if item['entity'] in ['B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC']:
+                    keywords.append(item['word'])
+
+            return list(set(keywords)) if keywords else ["无关键词"]
+        except Exception as e:
+            return [f"关键词提取失败: {str(e)}"]
+
+    @staticmethod
+    @tool
+    def summarize_text(text: str, max_length: int = 150) -> str:
+        """文本摘要"""
+        try:
+            # 使用摘要模型
+            summarizer = pipeline(
+                "summarization",
+                model="facebook/bart-large-cnn",
+                device=0 if torch.cuda.is_available() else -1
+            )
+
+            # 如果文本太长,分段处理
+            if len(text) > 1000:
+                chunks = [text[i:i+1000] for i in range(0, len(text), 1000)]
+                summaries = []
+                for chunk in chunks[:3]:  # 只处理前3段
+                    result = summarizer(chunk, max_length=max_length//3, min_length=30, do_sample=False)
+                    summaries.append(result[0]['summary_text'])
+                return " ".join(summaries)
+            else:
+                result = summarizer(text, max_length=max_length, min_length=30, do_sample=False)
+                return result[0]['summary_text']
+        except Exception as e:
+            return f"摘要生成失败: {str(e)}"
+
+class UtilityTools:
+    """实用工具类"""
+
+    @staticmethod
+    @tool
+    def get_current_weather(location: str) -> str:
+        """获取当前天气"""
+        try:
+            # 这里可以集成天气API
+            return f"天气查询功能需要配置天气API密钥,查询位置: {location}"
+        except Exception as e:
+            return f"天气查询失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def translate_text(text: str, target_language: str = "中文") -> str:
+        """翻译文本"""
+        try:
+            # 使用翻译模型
+            translator = pipeline(
+                "translation",
+                model="Helsinki-NLP/opus-mt-en-zh" if target_language == "中文" else "Helsinki-NLP/opus-mt-en-fr",
+                device=0 if torch.cuda.is_available() else -1
+            )
+
+            result = translator(text)
+            return result[0]['translation_text']
+        except Exception as e:
+            return f"翻译失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def calculate_math_expression(expression: str) -> str:
+        """计算数学表达式"""
+        try:
+            # 安全地计算数学表达式
+            # 注意:__builtins__ 在模块中可能是 module 而不是 dict,
+            # 所以显式从 builtins 模块取白名单函数
+            import builtins
+            allowed_names = {
+                k: getattr(builtins, k)
+                for k in ['abs', 'round', 'min', 'max', 'sum', 'pow']
+            }
+            allowed_names.update({
+                'sin': lambda x: np.sin(x),
+                'cos': lambda x: np.cos(x),
+                'tan': lambda x: np.tan(x),
+                'sqrt': lambda x: np.sqrt(x),
+                'log': lambda x: np.log(x),
+                'pi': np.pi,
+                'e': np.e
+            })
+
+            result = eval(expression, {"__builtins__": {}}, allowed_names)
+            return str(result)
+        except Exception as e:
+            return f"计算失败: {str(e)}"
+
+class WikipediaTools:
+    """Wikipedia处理工具类"""
+
+    @staticmethod
+    @tool
+    def search_wikipedia(query: str, max_results: int = 5) -> List[Dict[str, Any]]:
+        """搜索Wikipedia页面"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return [{"error": "Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"}]
+
+            # 创建Wikipedia API实例
+            wiki = wikipediaapi.Wikipedia(
+                language='zh',
+                user_agent='MultiModalAgent/1.0 (https://github.com/your-repo; your-email@example.com)'
+            )
+
+            # wikipediaapi 本身不提供搜索接口,先用 wikipedia 库搜索标题
+            wikipedia.set_lang("zh")
+            search_results = wikipedia.search(query, results=max_results)
+
+            results = []
+            for title in search_results:
+                try:
+                    # 获取页面
+                    page = wiki.page(title)
+                    if page.exists():
+                        results.append({
+                            'title': page.title,
+                            'url': page.fullurl,
+                            'summary': page.summary[:300] + "..." if len(page.summary) > 300 else page.summary,
+                            'page_id': page.pageid
+                        })
+                    else:
+                        results.append({
+                            'title': title,
+                            'url': f"https://zh.wikipedia.org/wiki/{title.replace(' ', '_')}",
+                            'summary': "页面不存在",
+                            'page_id': None
+                        })
+                except Exception as e:
+                    # 如果获取页面失败,只返回标题
+                    results.append({
+                        'title': title,
+                        'url': f"https://zh.wikipedia.org/wiki/{title.replace(' ', '_')}",
+                        'summary': f"无法获取摘要: {str(e)}",
+                        'page_id': None
+                    })
+
+            return results
+
+        except Exception as e:
+            return [{"error": f"Wikipedia搜索失败: {str(e)}"}]
+
+    @staticmethod
+    @tool
+    def get_wikipedia_page(title: str) -> Dict[str, Any]:
+        """获取Wikipedia页面内容"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return {"error": "Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"}
+
+            # 创建Wikipedia API实例
+            wiki = wikipediaapi.Wikipedia(
+                language='zh',
+                user_agent='MultiModalAgent/1.0 (https://github.com/your-repo; your-email@example.com)'
+            )
+
+            # 获取页面
+            page = wiki.page(title)
+
+            if not page.exists():
+                return {"error": f"Wikipedia页面 '{title}' 不存在"}
+
+            # 获取页面信息
+            page_info = {
+                'title': page.title,
+                'url': page.fullurl,
+                'summary': page.summary,
+                'content': page.text[:5000] + "..." if len(page.text) > 5000 else page.text,  # 限制内容长度
+                'page_id': page.pageid,
+                'categories': list(page.categories.keys())[:10],  # 限制分类数量
+                'links': list(page.links.keys())[:20],  # 限制链接数量
+                'content_length': len(page.text)
+            }
+
+            return page_info
+
+        except Exception as e:
+            return {"error": f"Wikipedia页面获取失败: {str(e)}"}
+
+    @staticmethod
+    @tool
+    def get_wikipedia_summary(title: str) -> str:
+        """获取Wikipedia页面摘要"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return "Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"
+
+            # 设置语言为中文
+            wikipedia.set_lang("zh")
+
+            # 获取页面摘要
+            summary = wikipedia.summary(title, sentences=5, auto_suggest=False)
+
+            return summary
+
+        except Exception as e:
+            return f"Wikipedia摘要获取失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def get_wikipedia_random_page() -> Dict[str, Any]:
+        """获取随机Wikipedia页面"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return {"error": "Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"}
+
+            # 设置语言为中文
+            wikipedia.set_lang("zh")
+
+            # 获取随机页面(pages=1 时返回的是单个标题字符串)
+            random_title = wikipedia.random(pages=1)
+            if random_title:
+                title = random_title if isinstance(random_title, str) else random_title[0]
+                # get_wikipedia_page 被 @tool 装饰后需通过 invoke 调用
+                return WikipediaTools.get_wikipedia_page.invoke({"title": title})
+            else:
+                return {"error": "无法获取随机页面"}
+
+        except Exception as e:
+            return {"error": f"随机Wikipedia页面获取失败: {str(e)}"}
+
+    @staticmethod
+    @tool
+    def search_wikipedia_english(query: str, max_results: int = 5) -> List[Dict[str, Any]]:
+        """搜索英文Wikipedia页面"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return [{"error": "Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"}]
+
+            # 设置语言为英文
+            wikipedia.set_lang("en")
+
+            # 搜索Wikipedia页面
+            search_results = wikipedia.search(query, results=max_results)
+
+            results = []
+            for title in search_results:
+                try:
+                    # 获取页面摘要
+                    page = wikipedia.page(title, auto_suggest=False)
+                    results.append({
+                        'title': title,
+                        'url': page.url,
+                        'summary': page.summary[:300] + "..." if len(page.summary) > 300 else page.summary,
+                        'page_id': page.pageid
+                    })
+                except Exception as e:
+                    # 如果获取页面失败,只返回标题
+                    results.append({
+                        'title': title,
+                        'url': f"https://en.wikipedia.org/wiki/{title.replace(' ', '_')}",
+                        'summary': f"无法获取摘要: {str(e)}",
+                        'page_id': None
+                    })
+
+            return results
+
+        except Exception as e:
+            return [{"error": f"英文Wikipedia搜索失败: {str(e)}"}]
+
+    @staticmethod
+    @tool
+    def get_wikipedia_page_english(title: str) -> Dict[str, Any]:
+        """获取英文Wikipedia页面内容"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return {"error": "Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"}
+
+            # 设置语言为英文
+            wikipedia.set_lang("en")
+
+            # 获取页面
+            page = wikipedia.page(title, auto_suggest=False)
+
+            # 获取页面内容
+            content = page.content
+
+            # 获取页面信息
+            page_info = {
+                'title': page.title,
+                'url': page.url,
+                'summary': page.summary,
+                'content': content[:5000] + "..." if len(content) > 5000 else content,  # 限制内容长度
+                'page_id': page.pageid,
+                'categories': page.categories[:10],  # 限制分类数量
+                'links': page.links[:20],  # 限制链接数量
+                'references': page.references[:10] if hasattr(page, 'references') else [],  # 限制引用数量
+                'images': page.images[:10] if hasattr(page, 'images') else [],  # 限制图片数量
+                'content_length': len(content)
+            }
+
+            return page_info
+
+        except Exception as e:
+            return {"error": f"英文Wikipedia页面获取失败: {str(e)}"}
+
+    @staticmethod
+    @tool
+    def get_wikipedia_suggestions(query: str) -> List[str]:
+        """获取Wikipedia搜索建议"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return ["Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"]
+
+            # 设置语言为中文
+            wikipedia.set_lang("zh")
+
+            # 获取搜索建议
+            suggestions = wikipedia.search(query, results=10)
+
+            return suggestions
+
+        except Exception as e:
+            return [f"Wikipedia搜索建议获取失败: {str(e)}"]
+
+    @staticmethod
+    @tool
+    def get_wikipedia_categories(title: str) -> List[str]:
+        """获取Wikipedia页面分类"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return ["Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"]
+
+            # 设置语言为中文
+            wikipedia.set_lang("zh")
+
+            # 获取页面
+            page = wikipedia.page(title, auto_suggest=False)
+
+            # 获取分类
+            categories = page.categories
+
+            return categories[:20]  # 限制分类数量
+
+        except Exception as e:
+            return [f"Wikipedia分类获取失败: {str(e)}"]
+
+    @staticmethod
+    @tool
+    def get_wikipedia_links(title: str) -> List[str]:
+        """获取Wikipedia页面链接"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return ["Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"]
+
+            # 设置语言为中文
+            wikipedia.set_lang("zh")
+
+            # 获取页面
+            page = wikipedia.page(title, auto_suggest=False)
+
+            # 获取链接
+            links = page.links
+
+            return links[:30]  # 限制链接数量
+
+        except Exception as e:
+            return [f"Wikipedia链接获取失败: {str(e)}"]
+
+    @staticmethod
+    @tool
+    def get_wikipedia_geosearch(latitude: float, longitude: float, radius: int = 1000) -> List[Dict[str, Any]]:
+        """根据地理坐标搜索附近的Wikipedia页面"""
+        try:
+            if not WIKIPEDIA_AVAILABLE:
+                return [{"error": "Wikipedia处理功能未安装,请运行: pip install wikipedia-api requests beautifulsoup4"}]
+
+            # 设置语言为中文
+            wikipedia.set_lang("zh")
+
+            # 地理搜索(geosearch 返回的是标题字符串列表,需要逐个取页面)
+            nearby_titles = wikipedia.geosearch(latitude, longitude, radius=radius)
+
+            results = []
+            for title in nearby_titles:
+                try:
+                    page = wikipedia.page(title, auto_suggest=False)
+                    results.append({
+                        'title': page.title,
+                        'url': page.url,
+                        'summary': page.summary[:200] + "..." if len(page.summary) > 200 else page.summary,
+                        'distance': None,  # wikipedia 库不返回距离信息
+                        'coordinates': page.coordinates if hasattr(page, 'coordinates') else None
+                    })
+                except Exception as e:
+                    results.append({
+                        'title': title,
+                        'url': f"https://zh.wikipedia.org/wiki/{title.replace(' ', '_')}",
+                        'summary': f"无法获取摘要: {str(e)}",
+                        'distance': None,
+                        'coordinates': None
+                    })
+
+            return results
+
+        except Exception as e:
+            return [{"error": f"Wikipedia地理搜索失败: {str(e)}"}]
+
+class YouTubeTools:
+    """YouTube视频处理工具类"""
+
+    @staticmethod
+    @tool
+    def download_youtube_video(url: str) -> str:
+        """下载YouTube视频"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return "YouTube处理功能未安装,请运行: pip install pytube"
+
+            if not YT_DLP_AVAILABLE:
+                return "YouTube视频下载需要安装yt-dlp,请运行: pip install yt-dlp"
+
+            # 使用yt-dlp下载视频(更稳定)
+            ydl_opts = {
+                'format': 'best[height<=720]',  # 限制分辨率
+                'outtmpl': '%(title)s.%(ext)s',
+                'quiet': True,
+                'no_warnings': True
+            }
+
+            with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                info = ydl.extract_info(url, download=True)
+                video_path = ydl.prepare_filename(info)
+
+            return video_path
+
+        except Exception as e:
+            return f"YouTube视频下载失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def get_youtube_info(url: str) -> Dict[str, Any]:
+        """获取YouTube视频信息"""
+        try:
+            # 提取视频ID
+            import re
+            video_id_match = re.search(r'(?:youtube\.com\/watch\?v=|youtu\.be\/)([^&\n?#]+)', url)
+            if not video_id_match:
+                return {"error": "无效的YouTube URL"}
+
+            video_id = video_id_match.group(1)
+
+            # 首先尝试使用yt-dlp(更稳定)
+            if YT_DLP_AVAILABLE:
+                try:
+                    import yt_dlp
+                    ydl_opts = {
+                        'quiet': True,
+                        'no_warnings': True,
+                        'extract_flat': True
+                    }
+
+                    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                        info = ydl.extract_info(url, download=False)
+
+                    video_info = {
+                        'title': info.get('title', f'YouTube视频 {video_id}'),
+                        'author': info.get('uploader', 'Unknown'),
+                        'length': info.get('duration', 0),
+                        'views': info.get('view_count', 0),
+                        'description': info.get('description', '')[:500] + "..." if len(info.get('description', '')) > 500 else info.get('description', ''),
+                        'publish_date': str(info.get('upload_date', 'Unknown')),
+                        'rating': info.get('average_rating', 0),
+                        'keywords': info.get('tags', []),
+                        'thumbnail_url': info.get('thumbnail', f"https://img.youtube.com/vi/{video_id}/maxresdefault.jpg"),
+                        'video_id': video_id,
+                        'url': url,
+                        'method': 'yt-dlp'
+                    }
+
+                    return video_info
+
+                except Exception as e:
+                    print(f"yt-dlp获取失败: {e}")
+
+            # 如果yt-dlp失败,尝试使用pytube
+            if YOUTUBE_AVAILABLE:
+                try:
+                    from pytube import YouTube
+                    yt = YouTube(url)
+
+                    # 获取视频信息
+                    video_info = {
+                        'title': yt.title,
+                        'author': yt.author,
+                        'length': yt.length,  # 秒
+                        'views': yt.views,
+                        'description': yt.description[:500] + "..." if len(yt.description) > 500 else yt.description,
+                        'publish_date': str(yt.publish_date) if yt.publish_date else "Unknown",
+                        'rating': yt.rating,
+                        'keywords': yt.keywords,
+                        'thumbnail_url': yt.thumbnail_url,
+                        'video_id': video_id,
+                        'url': url,
+                        'method': 'pytube'
+                    }
+
+                    return video_info
+
+                except Exception as e:
+                    print(f"pytube获取失败: {e}")
+
+            # 如果都失败了,返回基本信息
+            return {
+                'title': f"YouTube视频 {video_id}",
+                'author': "Unknown",
+                'length': 0,
+                'views': 0,
+                'description': "无法获取详细信息,可能需要更新YouTube处理库",
+                'publish_date': "Unknown",
+                'rating': 0,
+                'keywords': [],
+                'thumbnail_url': f"https://img.youtube.com/vi/{video_id}/maxresdefault.jpg",
+                'video_id': video_id,
+                'url': url,
+                'note': "所有YouTube处理库都失败,建议更新pytube或安装yt-dlp"
+            }
+
+        except Exception as e:
+            return {"error": f"YouTube信息获取失败: {str(e)}"}
+
+    @staticmethod
+    @tool
+    def extract_youtube_audio(url: str) -> str:
+        """提取YouTube视频音频"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return "YouTube处理功能未安装,请运行: pip install pytube"
+
+            if not YT_DLP_AVAILABLE:
+                return "YouTube音频提取需要安装yt-dlp,请运行: pip install yt-dlp"
+
+            # 使用yt-dlp提取音频
+            ydl_opts = {
+                'format': 'bestaudio/best',
+                'postprocessors': [{
+                    'key': 'FFmpegExtractAudio',
+                    'preferredcodec': 'mp3',
+                    'preferredquality': '192',
+                }],
+                'outtmpl': '%(title)s.%(ext)s',
+                'quiet': True,
+                'no_warnings': True
+            }
+
+            with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                info = ydl.extract_info(url, download=True)
+                audio_path = ydl.prepare_filename(info).replace('.webm', '.mp3').replace('.m4a', '.mp3')
+
+            return audio_path
+
+        except Exception as e:
+            return f"YouTube音频提取失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def download_youtube_thumbnail(url: str) -> str:
+        """下载YouTube视频缩略图"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return "YouTube处理功能未安装,请运行: pip install pytube"
+
+            # 提取视频ID
+            import re
+            video_id_match = re.search(r'(?:youtube\.com\/watch\?v=|youtu\.be\/)([^&\n?#]+)', url)
+            if not video_id_match:
+                return "无效的YouTube URL"
+
+            video_id = video_id_match.group(1)
+
+            # 尝试使用pytube获取缩略图URL
+            try:
+                yt = YouTube(url)
+                thumbnail_url = yt.thumbnail_url
+            except Exception as e:
+                # 如果pytube失败,使用标准缩略图URL
+                thumbnail_url = f"https://img.youtube.com/vi/{video_id}/maxresdefault.jpg"
+
+            # 下载缩略图
+            import tempfile
+            import urllib.request
+
+            temp_path = tempfile.mktemp(suffix='.jpg')
+            urllib.request.urlretrieve(thumbnail_url, temp_path)
+
+            return temp_path
+
+        except Exception as e:
+            return f"YouTube缩略图下载失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def search_youtube_videos(query: str, max_results: int = 5) -> List[Dict[str, Any]]:
+        """搜索YouTube视频"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return [{"error": "YouTube处理功能未安装,请运行: pip install pytube"}]
+
+            # 使用DuckDuckGo搜索YouTube视频
+            from duckduckgo_search import DDGS
+
+            try:
+                with DDGS() as ddgs:
+                    search_results = list(ddgs.text(f"{query} site:youtube.com", max_results=max_results))
+
+                videos = []
+                for result in search_results:
+                    if result and 'youtube.com/watch' in result.get('link', ''):
+                        videos.append({
+                            'title': result.get('title', 'Unknown'),
+                            'url': result.get('link', ''),
+                            'duration': 0,
+                            'view_count': 0,
+                            'uploader': 'Unknown',
+                            'thumbnail': '',
+                            'description': result.get('body', '')[:200] + "..." if len(result.get('body', '')) > 200 else result.get('body', '')
+                        })
+
+                return videos
+            except Exception as search_error:
+                return [{"error": f"DuckDuckGo搜索失败: {str(search_error)}"}]
+
+        except Exception as e:
+            return [{"error": f"YouTube搜索失败: {str(e)}"}]
+
+    @staticmethod
+    @tool
+    def analyze_youtube_comments(url: str, max_comments: int = 10) -> List[Dict[str, Any]]:
+        """分析YouTube视频评论"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return [{"error": "YouTube处理功能未安装,请运行: pip install pytube yt-dlp"}]
+
+            # 使用yt-dlp获取评论
+            ydl_opts = {
+                'quiet': True,
+                'no_warnings': True,
+                'extract_flat': False,
+                'writecomments': True
+            }
+
+            with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                info = ydl.extract_info(url, download=False)
+
+            comments = []
+            if 'comments' in info:
+                for comment in info['comments'][:max_comments]:
+                    comments.append({
+                        'author': comment.get('author', 'Unknown'),
+                        'text': comment.get('text', ''),
+                        'like_count': comment.get('like_count', 0),
+                        'time': comment.get('time', ''),
+                        'reply_count': comment.get('reply_count', 0)
+                    })
+
+            return comments
+
+        except Exception as e:
+            return [{"error": f"YouTube评论分析失败: {str(e)}"}]
+
+    @staticmethod
+    @tool
+    def get_youtube_playlist_info(playlist_url: str) -> Dict[str, Any]:
+        """获取YouTube播放列表信息"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return {"error": "YouTube处理功能未安装,请运行: pip install pytube"}
+
+            if not YT_DLP_AVAILABLE:
+                return {"error": "YouTube播放列表功能需要安装yt-dlp,请运行: pip install yt-dlp"}
+
+            # 使用yt-dlp获取播放列表信息
+            ydl_opts = {
+                'quiet': True,
+                'no_warnings': True,
+                'extract_flat': True,
+                'playlist_items': '1-10'  # 只获取前10个视频
+            }
+
+            with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                info = ydl.extract_info(playlist_url, download=False)
+
+            playlist_info = {
+                'title': info.get('title', 'Unknown'),
+                'description': info.get('description', '')[:500] + "..." if len(info.get('description', '')) > 500 else info.get('description', ''),
+                'video_count': info.get('playlist_count', 0),
+                'uploader': info.get('uploader', 'Unknown'),
+                'videos': []
+            }
+
+            if 'entries' in info:
+                for entry in info['entries']:
+                    if entry:
+                        playlist_info['videos'].append({
+                            'title': entry.get('title', 'Unknown'),
+                            'url': entry.get('url', ''),
+                            'duration': entry.get('duration', 0),
+                            'uploader': entry.get('uploader', 'Unknown')
+                        })
+
+            return playlist_info
+
+        except Exception as e:
+            return {"error": f"YouTube播放列表信息获取失败: {str(e)}"}
+
+    @staticmethod
+    @tool
+    def download_youtube_video_for_watching(url: str, quality: str = "720p") -> str:
+        """下载YouTube视频用于观看"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return "YouTube处理功能未安装,请运行: pip install pytube"
+
+            if not YT_DLP_AVAILABLE:
+                return "YouTube视频下载需要安装yt-dlp,请运行: pip install yt-dlp"
+
+            # 设置下载选项
+            ydl_opts = {
+                'format': f'best[height<={quality.replace("p", "")}]',
+                'outtmpl': 'downloads/%(title)s.%(ext)s',
+                'quiet': False,
+                'no_warnings': False,
+                'progress_hooks': [lambda d: print(f"下载进度: {d.get('_percent_str', '0%')}") if d['status'] == 'downloading' else None]
+            }
+
+            # 创建下载目录
+            import os
+            os.makedirs('downloads', exist_ok=True)
+
+            with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                info = ydl.extract_info(url, download=True)
+                video_path = ydl.prepare_filename(info)
+
+            return f"视频已下载到: {video_path}"
+
+        except Exception as e:
+            return f"YouTube视频下载失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def extract_youtube_audio_for_listening(url: str, format: str = "mp3") -> str:
+        """提取YouTube视频音频用于听取"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return "YouTube处理功能未安装,请运行: pip install pytube"
+
+            if not YT_DLP_AVAILABLE:
+                return "YouTube音频提取需要安装yt-dlp,请运行: pip install yt-dlp"
+
+            # 设置下载选项(不使用ffmpeg后处理)
+            ydl_opts = {
+                'format': 'bestaudio/best',
+                'outtmpl': 'downloads/%(title)s.%(ext)s',
+                'quiet': False,
+                'no_warnings': False
+            }
+
+            # 创建下载目录
+            import os
+            os.makedirs('downloads', exist_ok=True)
+
+            with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                info = ydl.extract_info(url, download=True)
+                audio_path = ydl.prepare_filename(info)
+
+            return f"音频已提取到: {audio_path} (原始格式,可用播放器播放)"
+
+        except Exception as e:
+            return f"YouTube音频提取失败: {str(e)}"
+
+    @staticmethod
+    @tool
+    def transcribe_youtube_video(url: str) -> str:
+        """Transcribe a YouTube video to text"""
+        try:
+            if not YOUTUBE_AVAILABLE:
+                return "YouTube support is not installed; run: pip install pytube"
+
+            if not YT_DLP_AVAILABLE:
+                return "YouTube transcription requires yt-dlp; run: pip install yt-dlp"
+
+            if not AUDIO_PROCESSING_AVAILABLE:
+                return "Audio transcription requires SpeechRecognition and pydub; run: pip install SpeechRecognition pydub"
+
+            # First download the audio
+            ydl_opts = {
+                'format': 'bestaudio/best',
+                'outtmpl': 'downloads/%(title)s.%(ext)s',
+                'quiet': True,
+                'no_warnings': True
+            }
+
+            import os
+            os.makedirs('downloads', exist_ok=True)
+
+            with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                info = ydl.extract_info(url, download=True)
+                audio_path = ydl.prepare_filename(info)
+
+            # Convert to WAV for speech recognition (splitext handles any source extension)
+            audio = AudioSegment.from_file(audio_path)
+            wav_path = os.path.splitext(audio_path)[0] + '.wav'
+            audio.export(wav_path, format="wav")
+
+            # Speech recognition
+            recognizer = sr.Recognizer()
+            with sr.AudioFile(wav_path) as source:
+                audio_data = recognizer.record(source)
+                text = recognizer.recognize_google(audio_data, language='zh-CN')
+
+            # Clean up the temporary file
+            os.remove(wav_path)
+
+            return f"Transcription result:\n{text}"
+
+        except Exception as e:
+            return f"YouTube video transcription failed: {str(e)}"
+
+    @staticmethod
+    @tool
+    def analyze_youtube_video_content(url: str) -> Dict[str, Any]:
+        """Analyze YouTube video content: let the vision-language model actually watch and listen to the video"""
+        try:
+            # Fetch video metadata
+            video_info = YouTubeTools.get_youtube_info(url)
+            if 'error' in video_info:
+                return video_info
+
+            analysis_result = {
+                'video_info': video_info,
+                'visual_analysis': "Visual analysis is unavailable",
+                'audio_analysis': "Audio analysis is unavailable",
+                'transcription': "Audio transcription is unavailable"
+            }
+
+            # 1. Download the video for visual analysis
+            if YT_DLP_AVAILABLE:
+                try:
+                    ydl_opts = {
+                        'format': 'best[height<=720]',  # cap the resolution
+                        'outtmpl': 'downloads/%(title)s.%(ext)s',
+                        'quiet': True,
+                        'no_warnings': True
+                    }
+
+                    import os
+                    os.makedirs('downloads', exist_ok=True)
+
+                    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                        info = ydl.extract_info(url, download=True)
+                        video_path = ydl.prepare_filename(info)
+
+                    # 2. Extract key frames for visual analysis
+                    try:
+                        import cv2
+                        from PIL import Image
+
+                        cap = cv2.VideoCapture(video_path)
+                        frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+                        fps = cap.get(cv2.CAP_PROP_FPS)
+                        duration = frame_count / fps if fps > 0 else 0
+
+                        # Extract key frames (one per second)
+                        key_frames = []
+                        frame_interval = max(1, int(fps))
+
+                        for i in range(0, frame_count, frame_interval):
+                            cap.set(cv2.CAP_PROP_POS_FRAMES, i)
+                            ret, frame = cap.read()
+                            if ret:
+                                # Convert to a PIL image
+                                frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+                                pil_image = Image.fromarray(frame_rgb)
+
+                                # Save the key frame
+                                frame_path = f"downloads/frame_{i//frame_interval:03d}.jpg"
+                                pil_image.save(frame_path, "JPEG", quality=85)
+                                key_frames.append({
+                                    'frame_number': i,
+                                    'timestamp': i / fps if fps > 0 else 0,
+                                    'path': frame_path
+                                })
+
+                        cap.release()
+
+                        # 3. Analyze the key frames with a vision-language model
+                        try:
+                            from transformers import pipeline
+
+                            # Image captioning model
+                            image_to_text = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
+
+                            visual_descriptions = []
+                            for frame_info in key_frames[:10]:  # limit analysis to the first 10 frames
+                                try:
+                                    description = image_to_text(frame_info['path'])[0]['generated_text']
+                                    visual_descriptions.append({
+                                        'timestamp': frame_info['timestamp'],
+                                        'description': description
+                                    })
+                                except Exception as e:
+                                    print(f"Frame analysis failed: {e}")
+
+                            analysis_result['visual_analysis'] = {
+                                'video_path': video_path,
+                                'duration': duration,
+                                'fps': fps,
+                                'frame_count': frame_count,
+                                'key_frames_analyzed': len(visual_descriptions),
+                                'visual_descriptions': visual_descriptions,
+                                'summary': f"The video contains {len(visual_descriptions)} key scenes"
+                            }
+
+                        except Exception as e:
+                            analysis_result['visual_analysis'] = f"Vision-language model analysis failed: {str(e)}"
+
+                    except Exception as e:
+                        analysis_result['visual_analysis'] = f"Frame extraction failed: {str(e)}"
+
+                except Exception as e:
+                    analysis_result['visual_analysis'] = f"Video download failed: {str(e)}"
+
+            # 4. Audio analysis and transcription (no hard ffmpeg dependency)
+            if YT_DLP_AVAILABLE:
+                try:
+                    # Download the audio
+                    ydl_opts = {
+                        'format': 'bestaudio/best',
+                        'outtmpl': 'downloads/%(title)s_audio.%(ext)s',
+                        'quiet': True,
+                        'no_warnings': True
+                    }
+
+                    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
+                        info = ydl.extract_info(url, download=True)
+                        audio_path = ydl.prepare_filename(info)
+
+                    # Transcribe the audio, trying several methods
+                    try:
+                        # Method 1: try whisper (preferred)
+                        try:
+                            import whisper
+                            print("🎤 Transcribing audio with whisper...")
+                            model = whisper.load_model("base")
+                            result = model.transcribe(audio_path)
+                            transcription_text = result["text"]
+
+                            analysis_result['transcription'] = transcription_text
+                            analysis_result['audio_analysis'] = {
+                                'audio_path': audio_path,
+                                'duration': result.get('duration', 0),
+                                'transcription': transcription_text,
+                                'method': 'whisper',
+                                'summary': f"Audio is {result.get('duration', 0):.1f}s long and has been transcribed"
+                            }
+                            print("✅ whisper transcription succeeded")
+
+                        except ImportError:
+                            print("⚠️ whisper is not installed, trying other methods...")
+                            # Method 2: pydub + speech_recognition (requires ffmpeg)
+                            try:
+                                from pydub import AudioSegment
+                                import speech_recognition as sr
+
+                                # Check whether ffmpeg is available
+                                import subprocess
+                                try:
+                                    subprocess.run(['ffmpeg', '-version'], capture_output=True, check=True)
+                                    ffmpeg_available = True
+                                    print("✅ ffmpeg is available, using pydub + speech_recognition")
+                                except (subprocess.CalledProcessError, FileNotFoundError):
+                                    ffmpeg_available = False
+                                    print("❌ ffmpeg is not available")
+
+                                if ffmpeg_available:
+                                    import os
+
+                                    # Convert to WAV (splitext handles any source extension)
+                                    audio = AudioSegment.from_file(audio_path)
+                                    wav_path = os.path.splitext(audio_path)[0] + '.wav'
+                                    audio.export(wav_path, format="wav")
+
+                                    # Speech recognition
+                                    recognizer = sr.Recognizer()
+                                    with sr.AudioFile(wav_path) as source:
+                                        audio_data = recognizer.record(source)
+                                        transcription_text = recognizer.recognize_google(audio_data, language='zh-CN')
+
+                                    analysis_result['transcription'] = transcription_text
+                                    analysis_result['audio_analysis'] = {
+                                        'audio_path': audio_path,
+                                        'duration': len(audio) / 1000,  # seconds
+                                        'transcription': transcription_text,
+                                        'method': 'pydub+speech_recognition',
+                                        'summary': f"Audio is {len(audio)/1000:.1f}s long and has been transcribed"
+                                    }
+
+                                    # Clean up the temporary file
+                                    if os.path.exists(wav_path):
+                                        os.remove(wav_path)
+                                else:
+                                    # Method 3: report the audio file only, without transcription
+                                    analysis_result['transcription'] = "Transcription requires whisper or ffmpeg"
+                                    analysis_result['audio_analysis'] = {
+                                        'audio_path': audio_path,
+                                        'duration': 'unknown',
+                                        'transcription': 'ffmpeg or whisper is required for transcription',
+                                        'method': 'audio_only',
+                                        'summary': f"Audio downloaded to: {audio_path}; install whisper or ffmpeg to transcribe it"
+                                    }
+
+                            except Exception as e:
+                                print(f"❌ pydub + speech_recognition failed: {e}")
+                                analysis_result['transcription'] = f"Audio transcription failed: {str(e)}"
+                                analysis_result['audio_analysis'] = {
+                                    'audio_path': audio_path,
+                                    'duration': 'unknown',
+                                    'transcription': f'Transcription failed: {str(e)}',
+                                    'method': 'failed',
+                                    'summary': f"Audio downloaded, but transcription failed: {str(e)}"
+                                }
+
+                    except Exception as e:
+                        analysis_result['transcription'] = f"Audio transcription failed: {str(e)}"
+                        analysis_result['audio_analysis'] = {
+                            'audio_path': audio_path,
+                            'duration': 'unknown',
+                            'transcription': f'Transcription failed: {str(e)}',
+                            'method': 'failed',
+                            'summary': f"Audio downloaded, but transcription failed: {str(e)}"
+                        }
+
+                except Exception as e:
+                    analysis_result['audio_analysis'] = f"Audio download failed: {str(e)}"
+
+            # 5. Aggregate the results
+            analysis_result['summary'] = f"This is a video about {video_info.get('title', 'an unknown topic')}, {video_info.get('length', 0)}s long"
+            analysis_result['key_points'] = [
+                "Title: " + video_info.get('title', 'Unknown'),
+                "Author: " + video_info.get('author', 'Unknown'),
+                "Length: " + str(video_info.get('length', 0)) + "s",
+                "Views: " + str(video_info.get('views', 0)),
+                "Visual analysis: " + ("completed" if isinstance(analysis_result['visual_analysis'], dict) else "failed"),
+                "Audio analysis: " + ("completed" if isinstance(analysis_result['audio_analysis'], dict) else "failed")
+            ]
+
+            return analysis_result
+
+        except Exception as e:
+            return {"error": f"YouTube video content analysis failed: {str(e)}"}
+
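The per-second frame sampling in step 2 above reduces to simple index arithmetic over the frame count and fps. A minimal standalone sketch (the function name is illustrative, not part of `tools.py`):

```python
# Illustrative sketch of the key-frame sampling above: pick one frame
# per second of video, derived only from fps and the total frame count.
def key_frame_indices(frame_count: int, fps: float) -> list:
    # Sample every int(fps) frames; for fps < 1 fall back to every frame.
    interval = max(1, int(fps))
    return list(range(0, frame_count, interval))

print(key_frame_indices(100, 30.0))  # [0, 30, 60, 90]
```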
+class ToolManager:
+    """Tool manager"""
+
+    def __init__(self):
+        self.media_tools = MediaTools()
+        self.code_tools = CodeAnalysisTools()
+        self.pdf_tools = PDFTools()
+        self.search_tools = SearchTools()
+        self.analysis_tools = AnalysisTools()
+        self.utility_tools = UtilityTools()
+        self.web_tools = WebTools()  # register WebTools with the manager
+        self.youtube_tools = YouTubeTools()  # register YouTubeTools with the manager
+        self.wikipedia_tools = WikipediaTools()  # register WikipediaTools with the manager
+
+        # Register every tool
+        self.tools = {
+            # PDF tools
+            'download_pdf_from_url': self.pdf_tools.download_pdf_from_url,
+            'extract_text_from_pdf': self.pdf_tools.extract_text_from_pdf,
+            'extract_images_from_pdf': self.pdf_tools.extract_images_from_pdf,
+            'analyze_pdf_structure': self.pdf_tools.analyze_pdf_structure,
+            'search_text_in_pdf': self.pdf_tools.search_text_in_pdf,
+            'summarize_pdf_content': self.pdf_tools.summarize_pdf_content,
+
+            # Media tools
+            'extract_text_from_image': self.media_tools.extract_text_from_image,
+            'analyze_image_emotion': self.media_tools.analyze_image_emotion,
+            'extract_video_audio': self.media_tools.extract_video_audio,
+            'analyze_video_content': self.media_tools.analyze_video_content,
+
+            # Code tools
+            'analyze_python_code': self.code_tools.analyze_python_code,
+            'execute_python_code': self.code_tools.execute_python_code,
+            'explain_code': self.code_tools.explain_code,
+
+            # Search tools
+            'web_search': self.search_tools.web_search,
+            'search_images': self.search_tools.search_images,
+            'search_videos': self.search_tools.search_videos,
+            'search_pdfs': self.search_tools.search_pdfs,
+
+            # Analysis tools
+            'analyze_text_sentiment': self.analysis_tools.analyze_text_sentiment,
+            'extract_keywords': self.analysis_tools.extract_keywords,
+            'summarize_text': self.analysis_tools.summarize_text,
+
+            # Utility tools
+            'get_current_weather': self.utility_tools.get_current_weather,
+            'translate_text': self.utility_tools.translate_text,
+            'calculate_math_expression': self.utility_tools.calculate_math_expression,
+
+            # Web tools
+            'fetch_webpage_content': self.web_tools.fetch_webpage_content,
+            'extract_text_from_webpage': self.web_tools.extract_text_from_webpage,
+            'analyze_webpage_structure': self.web_tools.analyze_webpage_structure,
+            'search_content_in_webpage': self.web_tools.search_content_in_webpage,
+            'extract_links_from_webpage': self.web_tools.extract_links_from_webpage,
+            'summarize_webpage_content': self.web_tools.summarize_webpage_content,
+            'check_webpage_accessibility': self.web_tools.check_webpage_accessibility,
+
+            # YouTube tools
+            'download_youtube_video': self.youtube_tools.download_youtube_video,
+            'get_youtube_info': self.youtube_tools.get_youtube_info,
+            'extract_youtube_audio': self.youtube_tools.extract_youtube_audio,
+            'download_youtube_thumbnail': self.youtube_tools.download_youtube_thumbnail,
+            'search_youtube_videos': self.youtube_tools.search_youtube_videos,
+            'analyze_youtube_comments': self.youtube_tools.analyze_youtube_comments,
+            'get_youtube_playlist_info': self.youtube_tools.get_youtube_playlist_info,
+            'download_youtube_video_for_watching': self.youtube_tools.download_youtube_video_for_watching,
+            'extract_youtube_audio_for_listening': self.youtube_tools.extract_youtube_audio_for_listening,
+            'transcribe_youtube_video': self.youtube_tools.transcribe_youtube_video,
+            'analyze_youtube_video_content': self.youtube_tools.analyze_youtube_video_content,
+
+            # Wikipedia tools
+            'search_wikipedia': self.wikipedia_tools.search_wikipedia,
+            'get_wikipedia_page': self.wikipedia_tools.get_wikipedia_page,
+            'get_wikipedia_summary': self.wikipedia_tools.get_wikipedia_summary,
+            'get_wikipedia_random_page': self.wikipedia_tools.get_wikipedia_random_page,
+            'search_wikipedia_english': self.wikipedia_tools.search_wikipedia_english,
+            'get_wikipedia_page_english': self.wikipedia_tools.get_wikipedia_page_english,
+            'get_wikipedia_suggestions': self.wikipedia_tools.get_wikipedia_suggestions,
+            'get_wikipedia_categories': self.wikipedia_tools.get_wikipedia_categories,
+            'get_wikipedia_links': self.wikipedia_tools.get_wikipedia_links,
+            'get_wikipedia_geosearch': self.wikipedia_tools.get_wikipedia_geosearch,
+        }
+
+    def get_tool(self, tool_name: str):
+        """Return a registered tool by name"""
+        return self.tools.get(tool_name)
+
+    def list_tools(self) -> List[str]:
+        """List all available tools"""
+        return list(self.tools.keys())
+
+    def execute_tool(self, tool_name: str, **kwargs) -> Any:
+        """Execute a tool by name"""
+        tool = self.get_tool(tool_name)
+        if tool:
+            # Call the tool's underlying function directly
+            if hasattr(tool, 'func'):
+                # @tool-decorated functions expose the original callable as .func
+                return tool.func(**kwargs)
+            elif hasattr(tool, '__wrapped__'):
+                # Fallback for plain wrapped callables
+                return tool.__wrapped__(**kwargs)
+            else:
+                # Last resort: try the run method
+                return tool.run(**kwargs)
+        else:
+            raise ValueError(f"Tool '{tool_name}' does not exist")
+
+    def should_use_search(self, question: str, context: Dict[str, Any]) -> bool:
+        """Decide whether answering the question requires a web search"""
+        question_lower = question.lower()
+
+        # Cases that do not need a search
+        no_search_keywords = [
+            '计算', 'calculate', 'math', '数学',
+            '代码', 'code', 'python', 'program',
+            '翻译', 'translate',
+            '天气', 'weather',
+            '情感', 'sentiment', 'emotion',
+            '关键词', 'keywords',
+            '摘要', 'summary', 'summarize',
+            'pdf', '文档', 'document'
+        ]
+
+        # Cases that do need a search
+        search_keywords = [
+            '最新', 'latest', 'news', '新闻',
+            '什么是', 'what is', 'how to', '如何',
+            '价格', 'price', 'cost',
+            '地点', 'location', 'where',
+            '时间', 'time', 'when',
+            '比较', 'compare', 'vs',
+            '推荐', 'recommend', 'best'
+        ]
+
+        # Check the question type
+        for keyword in no_search_keywords:
+            if keyword in question_lower:
+                return False
+
+        for keyword in search_keywords:
+            if keyword in question_lower:
+                return True
+
+        # Questions mentioning specific dates or needing real-time information use search
+        if any(word in question_lower for word in ['2024', '2023', 'today', 'now', 'current']):
+            return True
+
+        # Default to no search, unless the question is long or complex
+        return len(question) > 50
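The routing heuristic above can be exercised in isolation. A trimmed, dependency-free sketch with abbreviated keyword lists (the full lists live in `should_use_search`):

```python
# Abbreviated sketch of ToolManager.should_use_search: local-task
# keywords veto search, lookup/freshness keywords trigger it, and
# long questions default to search.
def should_use_search(question: str) -> bool:
    q = question.lower()
    no_search = ['calculate', '计算', 'translate', 'weather', 'pdf']
    search = ['latest', 'news', 'what is', 'price', 'compare']
    if any(k in q for k in no_search):
        return False
    if any(k in q for k in search):
        return True
    if any(w in q for w in ['2024', '2023', 'today', 'now', 'current']):
        return True
    return len(question) > 50

print(should_use_search("Please calculate 2 + 2"))    # False
print(should_use_search("What is the latest news?"))  # True
```

Note that the veto list is checked first, so a question matching both lists (e.g. "translate the latest news") is answered locally.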