feat: Implement dual-context workflow extraction
GEMINI.md
CHANGED

@@ -10,26 +10,37 @@
 
 # Project Goals
 ## Incomplete
-- [ ]
+- [ ] Build an Agent application with workflow extraction and execution capabilities.
 
-##
--
+## In Progress
+- [x] Build a workflow application that makes integrated use of `Ring-mini-2.0`.
 
 ---
 
 # Sub-Goals
 ## Incomplete
-- [ ] **(In progress)**
+- [ ] **(In progress)** Implement a dual-LLM context architecture (chat + workflow extraction).
+- [ ] Rework the Gradio UI to display the results of both contexts.
 - [ ] Implement an automated deployment and verification pipeline.
 
 ## Completed
+- [x] Distinguish "thinking" and "body" tokens in the Gradio UI.
 - [x] Fixed deployment failures caused by excessive model size.
 - [x] Built a chat web app that can route between two models using LangGraph.
 
 ---
 
 # Todolist
+## To Do
+(none at the moment)
+
 ## Completed
+- [x] Read the current code in `app.py`.
+- [x] In `app.py`, change the UI from a single chat window to a top-and-bottom "chat + workflow" layout.
+- [x] In `app.py`, implement two independent chat states (`gr.State`).
+- [x] Implement the logic that passes the "chat context" conversation history to the "workflow extraction context".
+- [x] Design and integrate a system prompt for the "workflow extraction context".
+- [x] Update the project goals and sub-goals in `GEMINI.md`.
 - [x] Use Markdown to improve the display of the thinking process.
 - [x] Render "thinking" and "body" tokens in different colors.
 - [x] Implement a debug mode to observe the difference between "thinking" and "body" tokens.

@@ -71,3 +82,59 @@
 - **Subscription:** HuggingFace Pro
 - **Inference resources:** ZeroGPU is available
 - **Documentation reference:** When necessary, proactively search the online API docs for HuggingFace and Gradio.
+
+---
+
+# Project Requirements Document: Workflow Extraction & Execution Agent
+
+## 1. Overall Goal
+
+Build an AI application with dual-context capability. It converses with the user in natural language while automatically extracting and structuring the user's task intent and execution steps in the background, forming a dynamic workflow.
+
+## 2. Core Features
+
+### 2.1. Dual LLM Context Architecture
+
+The application maintains two independent LLM contexts:
+
+1. **Chat Context:**
+    * **Responsibility:** Interact directly with the user.
+    * **Capabilities:** Understand and respond to the user's instructions and questions across multi-turn dialogue.
+    * **Characteristics:** No preset system prompt; behavior is driven entirely by the user.
+
+2. **Workflow Extraction Context:**
+    * **Responsibility:** "Observe" the conversation in the chat context and analyze it.
+    * **Data flow:** The chat context's complete conversation record (user inputs and model outputs) is fed to this context in real time or near-real time.
+    * **Capabilities:**
+        * **Task identification:** Accurately identify and distill the user's current core task or intent from the conversation.
+        * **Step extraction:** Break the user's interaction with the chat context down into a series of clear, actionable steps.
+        * **Task state tracking:** Determine whether the user's task is starting, in progress, or finished.
+    * **Characteristics:** Carries a dedicated system prompt that guides the analysis and extraction described above.
+
+### 2.2. Gradio User Interface (UI) Rework
+
+To clearly show the working state of both contexts, the existing UI needs a new layout.
+
+* **Remove:** the old `[System Prompt]` input box.
+* **New layout:**
+    1. **`[Chatbot Interface]`:**
+        * **Backed by:** the chat context.
+        * **Function:** The user types questions here and sees the chat model's direct replies.
+    2. **`[Separator]`:**
+        * **Function:** Visually separates the two functional areas.
+    3. **`[Task Intent Display]`:**
+        * **Form:** read-only Textbox.
+        * **Backed by:** the workflow extraction context.
+        * **Content:** Shows, in real time, the user's current task intent as identified by that context.
+    4. **`[Extracted Steps Display]`:**
+        * **Form:** read-only Textbox.
+        * **Backed by:** the workflow extraction context.
+        * **Content:** Shows, in real time, the structured steps that context has extracted from the conversation.
+
+## 3. Technical Implementation Points
+
+* **Context management:** Design a mechanism in `app.py` that manages and maintains two independent conversation histories (`history`) simultaneously.
+* **Data synchronization:** Ensure every update to the chat context is captured by the workflow extraction context.
+* **UI updates:** Bind Gradio's UI elements to the state of both contexts with partial refreshes so analysis results appear live.
+
+---
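The requirements document describes the dual-context mechanism only in prose. Below is a minimal sketch of the context-management idea from Section 3, with Gradio and streaming stripped out; `call_llm` is a hypothetical stand-in for `generate_response` in `comp.py`, and the prompt text is illustrative rather than the one used in `app.py`.

```python
# Minimal sketch of the dual-context mechanism (Section 3). `call_llm` is a
# hypothetical placeholder for the real model call in comp.py.

WORKFLOW_PROMPT = "Identify the user's intent and list the actionable steps."  # illustrative

def call_llm(messages):
    """Placeholder for comp.generate_response; returns the model's reply text."""
    raise NotImplementedError

def chat_turn(chat_history, user_message):
    """One turn: answer in the chat context, then let the workflow context observe it."""
    # Context 1 (chat): driven by the user, no extraction prompt.
    chat_history.append({"role": "user", "content": user_message})
    reply = call_llm(chat_history)
    chat_history.append({"role": "assistant", "content": reply})

    # Context 2 (workflow extraction): sees the full chat transcript, but under
    # its own system prompt, so the two histories never mix.
    workflow_messages = [{"role": "system", "content": WORKFLOW_PROMPT}, *chat_history]
    analysis = call_llm(workflow_messages)
    return reply, analysis
```

The point the sketch makes explicit, and that the `app.py` diff below implements with chained `.then()` events, is that the workflow context receives a copy of the chat transcript under its own system prompt instead of sharing the chat context's message list.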
app.py
CHANGED

@@ -1,13 +1,46 @@
 import gradio as gr
 from comp import generate_response
+import re
+
+# --- Constants ---
+WORKFLOW_SYSTEM_PROMPT = """You are an expert in analyzing conversations and extracting user workflows.
+Based on the provided chat history, identify the user's main goal or intent.
+Then, break down the conversation into a series of actionable steps that represent the workflow to achieve that goal.
+The output should be in two parts, clearly separated:
+**Intent**: [A concise description of the user's goal]
+**Steps**:
+[A numbered list of steps]
+"""
+
+# --- Helper Functions ---
+def parse_workflow_response(response):
+    intent_match = re.search(r"\*\*Intent\*\*:\s*(.*)", response, re.IGNORECASE)
+    steps_match = re.search(r"\*\*Steps\*\*:\s*(.*)", response, re.DOTALL | re.IGNORECASE)
+
+    intent = intent_match.group(1).strip() if intent_match else "Could not determine intent."
+    steps = steps_match.group(1).strip() if steps_match else "Could not determine steps."
+
+    return intent, steps
 
 # --- Gradio UI ---
 
 with gr.Blocks() as demo:
     gr.Markdown("# Ling Playground")
-
-
-
+
+    with gr.Row():
+        with gr.Column(scale=2):
+            gr.Markdown("## Chat")
+            chat_chatbot = gr.Chatbot(label="Chat", bubble_full_width=False)
+            chat_msg = gr.Textbox(label="Your Message")
+
+        with gr.Column(scale=1):
+            gr.Markdown("## Workflow Extraction")
+            intent_textbox = gr.Textbox(label="Task Intent", interactive=False)
+            steps_textbox = gr.Textbox(
+                label="Extracted Steps", interactive=False, lines=15
+            )
+
+    chat_clear = gr.ClearButton([chat_msg, chat_chatbot, intent_textbox, steps_textbox])
 
     def user(user_message, history):
         return "", history + [[user_message, None]]

@@ -15,22 +48,44 @@ with gr.Blocks() as demo:
     def bot(history):
         user_message = history[-1][0]
         history[-1][1] = ""
+        # Main chat model call (uses default system prompt)
         for response in generate_response(user_message, history[:-1]):
             if "</think>" in response:
                 parts = response.split("</think>", 1)
                 thinking_text = parts[0].replace("<think>", "")
                 body_text = parts[1]
-
+
                 md_output = f"**Thinking...**\n```\n{thinking_text}\n```\n\n{body_text}"
                 history[-1][1] = md_output
             else:
                 history[-1][1] = response
             yield history
 
-
-
+    def update_workflow(history):
+        if not history or not history[-1][0]:
+            return "", ""
+
+        # The last user message is the main prompt for the workflow agent
+        user_message = history[-1][0]
+        # The rest of the conversation is the history
+        chat_history_for_workflow = history[:-1]
+
+        # Call the model with the workflow system prompt
+        full_response = ""
+        for response in generate_response(
+            user_message,
+            chat_history_for_workflow,
+            system_prompt=WORKFLOW_SYSTEM_PROMPT
+        ):
+            full_response = response
+
+        intent, steps = parse_workflow_response(full_response)
+        return intent, steps
+
+    ( chat_msg.submit(user, [chat_msg, chat_chatbot], [chat_msg, chat_chatbot], queue=False)
+      .then(bot, chat_chatbot, chat_chatbot)
+      .then(update_workflow, chat_chatbot, [intent_textbox, steps_textbox])
     )
-    clear.click(lambda: None, None, chatbot, queue=False)
 
 if __name__ == "__main__":
-    demo.launch()
+    demo.launch(share=True)
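As a sanity check on the parsing step: assuming the model honors the `**Intent**`/`**Steps**` format that `WORKFLOW_SYSTEM_PROMPT` requests, `parse_workflow_response` splits a reply as in the sketch below (the sample reply is invented for illustration).

```python
import re

def parse_workflow_response(response):
    # Same parsing logic as the helper added in app.py above.
    intent_match = re.search(r"\*\*Intent\*\*:\s*(.*)", response, re.IGNORECASE)
    steps_match = re.search(r"\*\*Steps\*\*:\s*(.*)", response, re.DOTALL | re.IGNORECASE)
    intent = intent_match.group(1).strip() if intent_match else "Could not determine intent."
    steps = steps_match.group(1).strip() if steps_match else "Could not determine steps."
    return intent, steps

# Hypothetical model reply in the requested format:
sample = "**Intent**: Deploy the app to a Space\n**Steps**:\n1. Fix the model size\n2. Push to HuggingFace"
intent, steps = parse_workflow_response(sample)
assert intent == "Deploy the app to a Space"
assert steps == "1. Fix the model size\n2. Push to HuggingFace"
```

One caveat: the intent pattern is compiled without `re.DOTALL`, so an intent that spans multiple lines is silently truncated at the first newline.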
comp.py
CHANGED

@@ -4,6 +4,7 @@ import spaces
 
 # Model and tokenizer initialization
 MODEL_NAME = "inclusionAI/Ring-mini-2.0"
+DEFAULT_SYSTEM_PROMPT = "你是 Ring,蚂蚁集团开发的智能助手,致力于为用户提供有用的信息和帮助,用中文回答用户的问题。"
 
 tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(

@@ -14,26 +15,29 @@ model = AutoModelForCausalLM.from_pretrained(
 )
 
 @spaces.GPU(duration=120)
-def generate_response(message, history):
-    # (msg, history) -> str: stream response (yielding partial responses)
+def generate_response(message, history, system_prompt=None):
+    # (msg, history, system_prompt) -> str: stream response (yielding partial responses)
 
+    # Determine the system prompt to use
+    prompt_to_use = system_prompt if system_prompt is not None else DEFAULT_SYSTEM_PROMPT
+
     # To construct the 'chat', we start with system prompt
     # then append user and assistant messages from history
     messages = [
-        {"role": "system", "content":
+        {"role": "system", "content": prompt_to_use}
    ]
 
     # Add conversation history
     # history is a list of (human, assistant) tuples
     for human, assistant in history:
         messages.append({"role": "user", "content": human})
-
+        if assistant:  # Ensure assistant message is not None
+            messages.append({"role": "assistant", "content": assistant})
 
     # Add current message from user
     messages.append({"role": "user", "content": message})
 
     # Apply chat template
-    # Doc: https://github.com/huggingface/transformers/blob/main/src/transformers/tokenization_utils_base.py#L1510
     text = tokenizer.apply_chat_template(
         messages,
         tokenize=False,