Commit ·
a5e74de
1
Parent(s): ed02112
worked
- .gitignore +1 -0
- browser_agent_data/browseruse_agent_data/extracted_content_0.md +0 -154
- browser_agent_data/browseruse_agent_data/todo.md +0 -10
- github_pricing_header.png +3 -0
- pyproject.toml +1 -0
- src/_agents.py +652 -91
- src/utils/chrome_playwright.py +142 -0
- uv.lock +33 -0
.gitignore
CHANGED
|
@@ -11,3 +11,4 @@ wheels/
|
|
| 11 |
.env
|
| 12 |
|
| 13 |
images/
|
|
|
|
|
|
| 11 |
.env
|
| 12 |
|
| 13 |
images/
|
| 14 |
+
tempimgs/
|
browser_agent_data/browseruse_agent_data/extracted_content_0.md
DELETED
|
@@ -1,154 +0,0 @@
|
|
| 1 |
-
<url>
|
| 2 |
-
https://firebase.google.com/pricing
|
| 3 |
-
</url>
|
| 4 |
-
<query>
|
| 5 |
-
Extract all pricing plan details, including plan names, features, and costs.
|
| 6 |
-
</query>
|
| 7 |
-
<result>
|
| 8 |
-
**Pricing Plans:**
|
| 9 |
-
|
| 10 |
-
**1. No-cost (Spark plan)**
|
| 11 |
-
* **Features:** Generous no-cost usage limits, no payment method needed.
|
| 12 |
-
* **Products & Costs:**
|
| 13 |
-
* **A/B Testing:** No-cost
|
| 14 |
-
* **Analytics:** No-cost
|
| 15 |
-
* **App Check:** No-cost, subject to quotas and limits that vary based on attestation provider.
|
| 16 |
-
* **App Distribution:** No-cost
|
| 17 |
-
* **App Hosting:**
|
| 18 |
-
* Outgoing bandwidth (Uncached/Cached): Not applicable
|
| 19 |
-
* Storage: Not applicable
|
| 20 |
-
* Cloud Products (Cloud Run, Cloud Build, Artifact Registry, Cloud Logging, Cloud Secrets Manager): Not applicable
|
| 21 |
-
* **Authentication:**
|
| 22 |
-
* Phone Auth - All regions: Not applicable
|
| 23 |
-
* Other Authentication services: Included
|
| 24 |
-
* With Identity Platform (Monthly active users): 50K MAUs
|
| 25 |
-
* With Identity Platform (Monthly active users - SAML/OIDC): 50 MAUs
|
| 26 |
-
* **Cloud Firestore (Standard edition):**
|
| 27 |
-
* Stored data: 1 GiB total
|
| 28 |
-
* Network egress: 10 GiB/month
|
| 29 |
-
* Document writes: 20K writes/day
|
| 30 |
-
* Document reads: 50K reads/day
|
| 31 |
-
* Document deletes: 20K deletes/day
|
| 32 |
-
* **Cloud Firestore (Enterprise edition):**
|
| 33 |
-
* Stored data: 1 GiB total
|
| 34 |
-
* Network egress: 10 GiB/month
|
| 35 |
-
* Document writes - includes writes and deletes: 40K writes/day
|
| 36 |
-
* Document reads: 50K reads/day
|
| 37 |
-
* **Cloud Functions:** Not applicable for Invocations, GB-seconds, CPU-seconds, Outbound networking, Cloud Build minutes, Container storage in Artifact Registry.
|
| 38 |
-
* **Cloud Messaging (FCM):** No-cost
|
| 39 |
-
* **Cloud Storage (`*.appspot.com` legacy buckets):**
|
| 40 |
-
* GB stored: 5 GB
|
| 41 |
-
* GB downloaded: 1 GB/day
|
| 42 |
-
* Upload operations: 20K/day
|
| 43 |
-
* Download operations: 50K/day
|
| 44 |
-
* Multiple buckets per project: Not included
|
| 45 |
-
* **Cloud Storage (`*.firebasestorage.app` and any additional buckets):** Not applicable for GB stored, GB downloaded, Upload operations, Download operations, Multiple buckets per project.
|
| 46 |
-
* **Crashlytics:** No-cost
|
| 47 |
-
* **Data Connect:** Not applicable for Network egress, Operation count, Cloud SQL for PostgreSQL.
|
| 48 |
-
* **Hosting:**
|
| 49 |
-
* Storage: 10 GB
|
| 50 |
-
* Data transfer: 360 MB/day
|
| 51 |
-
* Custom domain & SSL: Included
|
| 52 |
-
* Multiple sites per project: Included
|
| 53 |
-
* **In-App Messaging:** No-cost
|
| 54 |
-
* **Firebase ML:**
|
| 55 |
-
* Custom Model Deployment: Included
|
| 56 |
-
* Cloud Vision APIs: Not included
|
| 57 |
-
* **Performance Monitoring:** No-cost
|
| 58 |
-
* **Realtime Database:**
|
| 59 |
-
* Simultaneous connections: 100
|
| 60 |
-
* GB stored: 1 GB
|
| 61 |
-
* GB downloaded: 10 GB/month
|
| 62 |
-
* Multiple databases per project: Not included
|
| 63 |
-
* **Remote Config:** No-cost
|
| 64 |
-
* **Test Lab:**
|
| 65 |
-
* Virtual Device Tests: 10 tests/day
|
| 66 |
-
* Physical Device Tests: 5 tests/day
|
| 67 |
-
* Android Device Streaming: 30 no-cost minutes per project, per month
|
| 68 |
-
* **Firebase AI Logic client SDKs:** Included
|
| 69 |
-
* **Google Cloud (BigQuery):** Included (sandbox limits)
|
| 70 |
-
* **Google Cloud (Other IaaS):** Not included
|
| 71 |
-
* **Gemini in Firebase:** No-cost for individuals or groups not using Google Workspace. Google Workspace users require a valid Gemini Code Assist subscription.
|
| 72 |
-
* **Firebase Studio:** No-cost for three workspaces. Google Developer Program members can create: Standard (no-cost): 10 workspaces; Premium: 30 workspaces and an increased Gemini quota for the App Prototyping agent.
|
| 73 |
-
|
| 74 |
-
**2. Pay as you go (Blaze plan)**
|
| 75 |
-
* **Features:** Eligible developers can claim $300 of credits to get started, no-cost usage limits from Spark plan included*.
|
| 76 |
-
* **Products & Costs:**
|
| 77 |
-
* **A/B Testing:** No-cost
|
| 78 |
-
* **Analytics:** No-cost
|
| 79 |
-
* **App Check:** No-cost, subject to quotas and limits that vary based on attestation provider.
|
| 80 |
-
* **App Distribution:** No-cost
|
| 81 |
-
* **App Hosting:** (Starting August 1, 2025)
|
| 82 |
-
* Outgoing bandwidth (Uncached): No-cost up to 10 GiB/month, then $0.20/GiB
|
| 83 |
-
* Outgoing bandwidth (Cached): No-cost up to 10 GiB/month, then $0.15/GiB
|
| 84 |
-
* Storage: No-cost up to 5 GB, then $0.10/GB
|
| 85 |
-
* Cloud Products (Cloud Run, Cloud Build, Artifact Registry, Cloud Logging, Cloud Secrets Manager): Billed at Google Cloud pricing (links provided for each).
|
| 86 |
-
* **Authentication:**
|
| 87 |
-
* Phone Auth - All regions: Billed per SMS sent (see current rates)
|
| 88 |
-
* Other Authentication services: Included
|
| 89 |
-
* With Identity Platform (Monthly active users): No-cost up to 50K MAUs, then Google Cloud pricing
|
| 90 |
-
* With Identity Platform (Monthly active users - SAML/OIDC): No-cost up to 50 MAUs, then Google Cloud pricing
|
| 91 |
-
* **Cloud Firestore (Standard edition):**
|
| 92 |
-
* Stored data: No-cost up to 1 GiB total, then Google Cloud pricing
|
| 93 |
-
* Network egress: No-cost up to 10 GiB/month, then Google Cloud pricing
|
| 94 |
-
* Document writes: No-cost up to 20K writes/day, then Google Cloud pricing
|
| 95 |
-
* Document reads: No-cost up to 50K reads/day, then Google Cloud pricing
|
| 96 |
-
* Document deletes: No-cost up to 20K deletes/day, then Google Cloud standard edition pricing
|
| 97 |
-
* **Cloud Firestore (Enterprise edition):**
|
| 98 |
-
* Stored data: No-cost up to 1 GiB total, then Google Cloud enterprise edition pricing
|
| 99 |
-
* Network egress: No-cost up to 10 GiB/month, then Google Cloud enterprise edition pricing
|
| 100 |
-
* Document writes - includes writes and deletes: No-cost up to 40K writes/day, then Google Cloud enterprise edition pricing
|
| 101 |
-
* Document reads: No-cost up to 50K reads/day, then Google Cloud enterprise edition pricing
|
| 102 |
-
* **Cloud Functions:**
|
| 103 |
-
* Invocations: No-cost up to 2M/month, then $0.40/million
|
| 104 |
-
* GB-seconds: No-cost up to 400K/month, then Google Cloud pricing
|
| 105 |
-
* CPU-seconds: No-cost up to 200K/month, then Google Cloud pricing
|
| 106 |
-
* Outbound networking: No-cost up to 5 GB/month, then $0.12/GB
|
| 107 |
-
* Cloud Build minutes: No-cost up to 120 min/day, then $0.003/min
|
| 108 |
-
* Container storage in Artifact Registry: No-cost up to 500MB of storage, then Google Cloud pricing (pricing varies based on location)
|
| 109 |
-
* **Cloud Messaging (FCM):** No-cost
|
| 110 |
-
* **Cloud Storage (`*.appspot.com` legacy buckets):**
|
| 111 |
-
* GB stored: No-cost up to 5 GB, then $0.026/GB
|
| 112 |
-
* GB downloaded: No-cost up to 1 GB/day, then $0.12/GB
|
| 113 |
-
* Upload operations: No-cost up to 20K/day, then $0.05/10K
|
| 114 |
-
* Download operations: No-cost up to 50K/day, then $0.004/10K
|
| 115 |
-
* Multiple buckets per project: Included
|
| 116 |
-
* **Cloud Storage (`*.firebasestorage.app` and any additional buckets):** (No-cost quotas only for `us-central1`, `us-west1`, `us-east1`)
|
| 117 |
-
* GB stored: No-cost up to 5 GB-months, then Cloud Storage pricing
|
| 118 |
-
* GB downloaded: No-cost up to 100 GB/month, then Cloud Storage pricing
|
| 119 |
-
* Upload operations: No-cost up to 5K/month, then Cloud Storage pricing
|
| 120 |
-
* Download operations: No-cost up to 50K/month, then Cloud Storage pricing
|
| 121 |
-
* Multiple buckets per project: Included
|
| 122 |
-
* **Crashlytics:** No-cost
|
| 123 |
-
* **Data Connect:**
|
| 124 |
-
* Network egress: No-cost up to 10 GiB/month, then Google Cloud Internet Data Transfer Rate Premium Tier pricing
|
| 125 |
-
* Operation count: No-cost up to 250K operations per month, then $4.00 per million operations
|
| 126 |
-
* Cloud SQL for PostgreSQL: 3 month no-cost trial for the first default Cloud SQL instance, then starting as low as $9.37/month (pricing varies based on regions and configurations, see Google Cloud pricing).
|
| 127 |
-
* **Hosting:**
|
| 128 |
-
* Storage: No-cost up to 10 GB, then $0.026/GB
|
| 129 |
-
* Data transfer: No-cost up to 360 MB/day, then $0.15/GB
|
| 130 |
-
* Custom domain & SSL: Included
|
| 131 |
-
* Multiple sites per project: Included
|
| 132 |
-
* **In-App Messaging:** No-cost
|
| 133 |
-
* **Firebase ML:** (First 1000 Cloud Vision API calls/month have no costs)
|
| 134 |
-
* Custom Model Deployment: Included
|
| 135 |
-
* Cloud Vision APIs: $1.50/K (see Cloud Vision pricing)
|
| 136 |
-
* **Performance Monitoring:** No-cost
|
| 137 |
-
* **Realtime Database:**
|
| 138 |
-
* Simultaneous connections: 200K per database
|
| 139 |
-
* GB stored: No-cost up to 1 GB, then $5/GB
|
| 140 |
-
* GB downloaded: No-cost up to 10 GB/month, then $1/GB
|
| 141 |
-
* Multiple databases per project: Included
|
| 142 |
-
* **Remote Config:** No-cost
|
| 143 |
-
* **Test Lab:** (Charged for testing time only, rounded up to the nearest minute)
|
| 144 |
-
* Virtual Device Tests: No-cost up to 60 min/day, then $1/device/hour
|
| 145 |
-
* Physical Device Tests: No-cost up to 30 min/day, then $5/device/hour
|
| 146 |
-
* Android Device Streaming: 30 no-cost minutes per project, per month, then 15 cents for each additional minute
|
| 147 |
-
* **Firebase AI Logic client SDKs:** Billed according to current Google Cloud or Gemini Developer API pricing
|
| 148 |
-
* **Google Cloud (BigQuery):** Included
|
| 149 |
-
* **Google Cloud (Other IaaS):** Included
|
| 150 |
-
* **Gemini in Firebase:** No-cost for individuals or groups not using Google Workspace. Google Workspace users require a valid Gemini Code Assist subscription.
|
| 151 |
-
* **Firebase Studio:** No-cost for three workspaces. Google Developer Program members can create: Standard (no-cost): 10 workspaces; Premium: 30 workspaces and an increased Gemini quota for the App Prototyping agent.
|
| 152 |
-
|
| 153 |
-
*Note: No-cost usage on Blaze plan is calculated daily. Details differ slightly for Cloud Functions, Firebase ML, Phone Auth, and Test Lab. No-cost usage quotas apply at the project-level, not at the app-level or for individual resources.*
|
| 154 |
-
</result>
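Most Blaze-plan entries in the extracted content above follow the same pattern: "No-cost up to N, then $X per unit". As a minimal sketch of that arithmetic (the helper name `overage_cost` and the usage figures are illustrative assumptions, not part of the repository), the charge for usage beyond a no-cost quota could be computed like this:

```python
from decimal import Decimal


def overage_cost(usage, free_quota, price_per_block, block_size=1):
    """Cost of usage beyond a no-cost quota, billed per block of `block_size` units.

    Hypothetical helper: it ignores whether the quota is metered per day or
    per month, which the extracted note says differs between products.
    """
    billable = max(Decimal(usage) - Decimal(free_quota), Decimal(0))
    return billable / Decimal(block_size) * Decimal(price_per_block)


# Cloud Functions invocations: no-cost up to 2M/month, then $0.40/million.
# The 5M usage figure is made up for illustration.
invocations_cost = overage_cost(5_000_000, 2_000_000, "0.40", 1_000_000)

# Realtime Database storage: no-cost up to 1 GB, then $5/GB (3 GB assumed).
rtdb_storage_cost = overage_cost(3, 1, "5")

print(invocations_cost, rtdb_storage_cost)
```

`Decimal` is used rather than floats so that small prices such as $0.004/10K do not pick up binary rounding noise. Entries that say "then Google Cloud pricing" have no rate in the extracted text, so they cannot be computed from this data alone.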
|
browser_agent_data/browseruse_agent_data/todo.md
CHANGED
|
@@ -1,10 +0,0 @@
|
|
| 1 |
-
# Firebase Pricing and Brand Identity Extraction
|
| 2 |
-
|
| 3 |
-
## Goal: Extract pricing plan content and brand identity assets for a LinkedIn post.
|
| 4 |
-
|
| 5 |
-
## Tasks:
|
| 6 |
-
- [ ] Navigate to https://firebase.google.com/pricing
|
| 7 |
-
- [x] Extract content related to pricing plans.
|
| 8 |
-
- [x] Extract brand's visual identity (primary/secondary colors, full palette, typography, design system elements, social media brand kit details).
|
| 9 |
-
- [ ] Format and return the extracted data in the specified JSON schema.
|
| 10 |
-
- [ ] Call done action.
|
github_pricing_header.png
ADDED
|
Git LFS Details
|
pyproject.toml
CHANGED
|
@@ -23,6 +23,7 @@ dependencies = [
|
|
| 23 |
"openai-agents>=0.2.8",
|
| 24 |
"pathlib>=1.0.1",
|
| 25 |
"pillow>=11.3.0",
|
|
|
|
| 26 |
"pydantic>=2.11.7",
|
| 27 |
"pydantic-ai[logfire]>=1.0.1",
|
| 28 |
"urljoin>=1.0.0",
|
|
|
|
| 23 |
"openai-agents>=0.2.8",
|
| 24 |
"pathlib>=1.0.1",
|
| 25 |
"pillow>=11.3.0",
|
| 26 |
+
"playwright>=1.55.0",
|
| 27 |
"pydantic>=2.11.7",
|
| 28 |
"pydantic-ai[logfire]>=1.0.1",
|
| 29 |
"urljoin>=1.0.0",
|
src/_agents.py
CHANGED
|
@@ -1,33 +1,37 @@
|
|
| 1 |
# type: ignore
|
| 2 |
-
from agents import Agent, RunContextWrapper
|
| 3 |
-
from model import get_model
|
| 4 |
import os
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
|
|
|
|
|
|
| 9 |
import re
|
| 10 |
import requests
|
| 11 |
-
|
| 12 |
from requests.exceptions import RequestException
|
| 13 |
-
from
|
| 14 |
from bs4 import BeautifulSoup
|
| 15 |
from urllib.parse import urljoin
|
|
|
|
| 16 |
from langchain_core.output_parsers import JsonOutputParser
|
| 17 |
-
import json
|
| 18 |
-
import time
|
| 19 |
-
import fal_client
|
| 20 |
-
from PIL import Image
|
| 21 |
-
from io import BytesIO
|
| 22 |
-
from IPython.display import display
|
| 23 |
from google import genai
|
| 24 |
-
import
|
| 25 |
-
import
|
| 26 |
-
from
|
| 27 |
-
from browser_use import Agent as AgentBrowser, ChatGoogle, ChatOpenAI as ChatOpenAIBrowserUse,
|
| 28 |
-
from
|
| 29 |
-
|
| 30 |
-
|
| 31 |
# anchor_client = Anchorbrowser(
|
| 32 |
# api_key=os.getenv("ANCHOR_API_KEY")
|
| 33 |
# )
|
|
@@ -87,9 +91,6 @@ content_agent = Agent(
|
|
| 87 |
|
| 88 |
|
| 89 |
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
|
| 93 |
post_schema = """
|
| 94 |
{
|
| 95 |
"meta": {
|
|
@@ -301,7 +302,7 @@ You are Media Agent, a professional and specialized in creating social media for
|
|
| 301 |
|
| 302 |
Your task:
|
| 303 |
1. Receive a high-level user brief describing a social media post idea.
|
| 304 |
-
2. Generate a detailed DesignSpec (JSON structured specification) from the brief using 'generate_designSpec_from_brief', including platform, style, content, visuals, colors, typography, composition, lighting, mood, and finishing
|
| 305 |
3. Using the generated DesignSpec, create a high-quality, brand-aligned social media image using 'generate_post_image' tool, (Don't change the schema use same as generated)
|
| 306 |
|
| 307 |
Be concise, professional, and strictly follow the structured DesignSpec and design guidelines provided.
|
|
@@ -481,106 +482,538 @@ WebInspectorAgent = Agent(
|
|
| 481 |
|
| 482 |
llm = ChatGoogle(model="gemini-2.5-flash", api_key=os.getenv("GEMINI_API_KEY"))
|
| 483 |
|
| 484 |
-
|
| 485 |
-
|
| 486 |
-
|
| 487 |
-
|
| 488 |
-
|
| 489 |
|
| 490 |
|
| 491 |
|
| 492 |
|
| 493 |
-
import asyncio
|
| 494 |
-
from datetime import datetime
|
| 495 |
-
from pathlib import Path
|
| 496 |
|
| 497 |
|
| 498 |
|
| 499 |
-
|
| 500 |
-
|
| 501 |
-
|
| 502 |
-
# from playwright.async_api import Page
|
| 503 |
-
import os
|
| 504 |
-
|
| 505 |
-
# Reuse the same Tools instance
|
| 506 |
tools = Tools()
|
| 507 |
|
|
|
|
|
|
|
| 508 |
class ElementScreenshotParams(BaseModel):
|
| 509 |
-
|
| 510 |
-
...,
|
|
|
|
| 511 |
)
|
| 512 |
filename: str = Field(
|
| 513 |
-
default="element_screenshot.png",
|
| 514 |
)
|
| 515 |
|
| 516 |
@tools.action(
|
| 517 |
-
description="
|
|
|
|
| 518 |
)
|
| 519 |
async def element_screenshot(params: ElementScreenshotParams, browser_session: BrowserSession) -> ActionResult:
|
| 520 |
try:
|
| 521 |
-
|
| 522 |
-
|
| 523 |
-
|
| 524 |
-
|
| 525 |
-
|
| 526 |
-
|
| 527 |
|
| 528 |
-
|
| 529 |
|
| 530 |
-
|
| 531 |
return ActionResult(
|
| 532 |
extracted_content=success_msg,
|
| 533 |
include_in_memory=True,
|
| 534 |
-
long_term_memory=f"Element screenshot taken: {
|
| 535 |
-
vision_content=[{"type": "image", "path": output_path}]
|
| 536 |
)
|
|
|
|
|
|
|
| 537 |
except Exception as e:
|
| 538 |
-
return ActionResult(error=f"Element screenshot failed: {str(e)}
|
| 539 |
|
|
|
|
|
|
|
|
|
|
| 540 |
|
| 541 |
|
| 542 |
|
| 543 |
|
| 544 |
|
| 545 |
-
|
| 546 |
You are a Browser Intelligence Agent specialized in extracting website content and brand identity assets.
|
| 547 |
-
Your
|
| 548 |
|
| 549 |
Follow these steps strictly:
|
| 550 |
|
| 551 |
-
1.
|
|
|
|
|
|
|
|
|
|
| 552 |
|
| 553 |
2. Content Extraction:
|
| 554 |
-
- If
|
| 555 |
-
•
|
| 556 |
-
•
|
| 557 |
-
|
| 558 |
-
|
| 559 |
-
• Extract the full visible text from the landing page only.
|
| 560 |
|
| 561 |
3. Brand & Design Extraction:
|
| 562 |
-
-
|
| 563 |
-
|
| 564 |
-
|
| 565 |
-
|
| 566 |
-
|
| 567 |
-
|
| 568 |
-
|
| 569 |
-
4. Screenshots (
|
| 570 |
-
-
|
| 571 |
-
-
|
| 572 |
-
-
|
| 573 |
|
| 574 |
5. Output:
|
| 575 |
-
-
|
|
|
|
| 576 |
|
| 577 |
Today is {datetime.now().strftime('%Y-%m-%d')}
|
| 578 |
|
| 579 |
-
User's query: Go to https://
|
| 580 |
"""
|
| 581 |
|
| 582 |
|
| 583 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 584 |
|
| 585 |
class PageVisited(BaseModel):
|
| 586 |
url: str
|
|
@@ -639,21 +1072,149 @@ class BrowserAgentOutput(BaseModel):
|
|
| 639 |
|
| 640 |
|
| 641 |
async def run_search() -> None:
|
| 642 |
-
print('
|
| 643 |
-
|
| 644 |
-
|
| 645 |
-
|
| 646 |
-
|
| 647 |
-
|
| 648 |
-
|
| 649 |
-
|
| 650 |
-
|
| 651 |
-
|
| 652 |
-
|
| 653 |
-
|
| 654 |
-
|
| 655 |
|
| 656 |
-
|
| 657 |
|
| 658 |
|
| 659 |
|
|
|
|
| 1 |
# type: ignore
|
|
|
|
|
|
|
| 2 |
import os
|
| 3 |
+
import sys
|
| 4 |
+
import time
|
| 5 |
+
import json
|
| 6 |
+
import logging
|
| 7 |
+
import asyncio
|
| 8 |
+
|
| 9 |
import re
|
| 10 |
+
from playwright.async_api import TimeoutError as PlaywrightTimeoutError
|
| 11 |
+
import aiohttp
|
| 12 |
+
from typing import Any, Optional, Dict
|
| 13 |
+
from datetime import datetime
|
| 14 |
+
from pathlib import Path
|
| 15 |
+
from dotenv import load_dotenv
|
| 16 |
+
from pydantic import BaseModel, Field, conint
|
| 17 |
+
from PIL import Image
|
| 18 |
+
from io import BytesIO
|
| 19 |
+
from IPython.display import display
|
| 20 |
import requests
|
| 21 |
+
import base64
|
| 22 |
from requests.exceptions import RequestException
|
| 23 |
+
from markdownify import markdownify
|
| 24 |
from bs4 import BeautifulSoup
|
| 25 |
from urllib.parse import urljoin
|
| 26 |
+
from langchain_community.tools import DuckDuckGoSearchResults
|
| 27 |
from langchain_core.output_parsers import JsonOutputParser
|
| 28 |
from google import genai
|
| 29 |
+
import fal_client
|
| 30 |
+
from agents import Agent, AsyncOpenAI, Runner, function_tool, RunContextWrapper, AgentHooks, RunHooks, TContext
|
| 31 |
+
from model import get_model
|
| 32 |
+
from browser_use import Agent as AgentBrowser, ChatGoogle, ChatOpenAI as ChatOpenAIBrowserUse, Tools, ActionResult
|
| 33 |
+
from browser_use.browser import BrowserSession, BrowserProfile
|
| 34 |
+
from utils.chrome_playwright import start_chrome_with_debug_port, connect_playwright_to_cdp
|
|
|
|
| 35 |
# anchor_client = Anchorbrowser(
|
| 36 |
# api_key=os.getenv("ANCHOR_API_KEY")
|
| 37 |
# )
|
|
|
|
| 91 |
|
| 92 |
|
| 93 |
|
|
|
|
|
|
|
|
|
|
| 94 |
post_schema = """
|
| 95 |
{
|
| 96 |
"meta": {
|
|
|
|
| 302 |
|
| 303 |
Your task:
|
| 304 |
1. Receive a high-level user brief describing a social media post idea.
|
| 305 |
+
2. Generate a detailed DesignSpec (JSON structured specification) from the brief using 'generate_designSpec_from_brief', including platform, style, content, visuals, colors, typography, composition, lighting, mood, and finishing requirements.
|
| 306 |
3. Using the generated DesignSpec, create a high-quality, brand-aligned social media image using 'generate_post_image' tool, (Don't change the schema use same as generated)
|
| 307 |
|
| 308 |
Be concise, professional, and strictly follow the structured DesignSpec and design guidelines provided.
|
|
|
|
| 482 |
|
| 483 |
llm = ChatGoogle(model="gemini-2.5-flash", api_key=os.getenv("GEMINI_API_KEY"))
|
| 484 |
|
| 485 |
+
llm_browser = ChatOpenAIBrowserUse(
|
| 486 |
+
model='openai/gpt-4.1',
|
| 487 |
+
base_url='https://openrouter.ai/api/v1',
|
| 488 |
+
api_key=os.getenv('OPENROUTER_API_KEY'),
|
| 489 |
+
)
|
| 490 |
|
| 491 |
|
| 492 |
|
| 493 |
|
|
|
|
|
|
|
|
|
|
| 494 |
|
| 495 |
|
| 496 |
|
| 497 |
+
# Global Playwright variables and Tools instance
|
| 498 |
+
playwright_browser = None
|
| 499 |
+
playwright_page = None
|
|
|
|
|
|
|
|
|
|
|
|
|
| 500 |
tools = Tools()
|
| 501 |
|
| 502 |
+
|
| 503 |
+
|
| 504 |
class ElementScreenshotParams(BaseModel):
|
| 505 |
+
selectors: list[str] = Field(
|
| 506 |
+
...,
|
| 507 |
+
description="A list of CSS selectors to try for locating the element(s). The first valid selector will be used."
|
| 508 |
)
|
| 509 |
filename: str = Field(
|
| 510 |
+
default="element_screenshot.png",
|
| 511 |
+
description="Output filename for the screenshot."
|
| 512 |
+
)
|
| 513 |
+
highlight: bool = Field(
|
| 514 |
+
default=True,
|
| 515 |
+
description="If True, draw a red border around the element before taking the screenshot."
|
| 516 |
+
)
|
| 517 |
+
padding: conint(ge=0) = Field(
|
| 518 |
+
default=10,
|
| 519 |
+
description="Padding (in pixels) to add around the element in the screenshot."
|
| 520 |
+
)
|
| 521 |
+
scroll_if_needed: bool = Field(
|
| 522 |
+
default=True,
|
| 523 |
+
description="If True, scroll the element into view before taking the screenshot."
|
| 524 |
+
)
|
| 525 |
+
fallback_to_full_page: bool = Field(
|
| 526 |
+
default=True,
|
| 527 |
+
description="If no element is found, fallback to taking a full page screenshot."
|
| 528 |
)
|
| 529 |
|
| 530 |
@tools.action(
|
| 531 |
+
description="Captures a screenshot of one or more elements on a page using CSS selectors, with options for highlighting, padding, and scrolling. It can try multiple selectors and fall back to a full-page screenshot.",
|
| 532 |
+
param_model=ElementScreenshotParams,
|
| 533 |
)
|
| 534 |
async def element_screenshot(params: ElementScreenshotParams, browser_session: BrowserSession) -> ActionResult:
|
| 535 |
+
"""
|
| 536 |
+
A robust tool to capture screenshots of web elements.
|
| 537 |
+
- It can use JavaScript-based targeting for selectors.
|
| 538 |
+
- Tries multiple selectors to find the target element.
|
| 539 |
+
- Adds padding to provide context around the element.
|
| 540 |
+
|
| 541 |
+
"""
|
| 542 |
+
print("-----------------browser_session_---------")
|
| 543 |
+
page = await browser_session.get_current_page()
|
| 544 |
+
|
| 545 |
+
# Prefer a session-owned file system path if the BrowserSession provides one
|
| 546 |
try:
|
| 547 |
+
session_base = getattr(browser_session, 'file_system_path', None)
|
| 548 |
+
if session_base:
|
| 549 |
+
base_path = os.path.abspath(session_base)
|
| 550 |
+
else:
|
| 551 |
+
base_path = os.path.abspath(".")
|
| 552 |
+
|
| 553 |
+
# Create a unique directory for screenshots from this website and session
|
| 554 |
+
from urllib.parse import urlparse
|
| 555 |
+
import time
|
| 556 |
+
|
| 557 |
+
parsed_url = urlparse(await page.get_url())
|
| 558 |
+
# Sanitize website name to be filesystem-friendly
|
| 559 |
+
website_name = parsed_url.netloc.replace('www.', '').replace('.', '_').replace(':', '_')
|
| 560 |
+
timestamp = int(time.time())
|
| 561 |
+
|
| 562 |
+
screenshot_dir = os.path.join(base_path, "tempImgs", f"{website_name}-{timestamp}")
|
| 563 |
+
|
| 564 |
+
os.makedirs(screenshot_dir, exist_ok=True)
|
| 565 |
+
|
| 566 |
+
output_path = os.path.join(screenshot_dir, params.filename)
|
| 567 |
+
except Exception as e:
|
| 568 |
+
print(e)
|
| 569 |
+
# Fallback to current working directory if there's an issue creating the new one
|
| 570 |
+
output_path = os.path.join(os.path.abspath('.'), params.filename)
|
| 571 |
+
|
| 572 |
+
|
| 573 |
+
element = None
|
| 574 |
+
used_selector = None
|
| 575 |
+
error_messages = []
|
| 576 |
+
print("Trying to find element :", params)
|
| 577 |
+
for selector in params.selectors:
|
| 578 |
+
try:
|
| 579 |
+
print(selector)
|
| 580 |
+
loc = await page.evaluate("""
|
| 581 |
+
(selector, padding) => {
|
| 582 |
+
const el = document.querySelector(selector);
|
| 583 |
+
if (!el) {
|
| 584 |
+
return {
|
| 585 |
+
clip: { x: null, y: null, width: null, height: null },
|
| 586 |
+
tag: null,
|
| 587 |
+
selector: selector,
|
| 588 |
+
id: null,
|
| 589 |
+
classList: [],
|
| 590 |
+
};
|
| 591 |
+
}
|
| 592 |
+
const rect = el.getBoundingClientRect();
|
| 593 |
+
return {
|
| 594 |
+
clip: {
|
| 595 |
+
x: rect.x - padding,
|
| 596 |
+
y: rect.y - padding,
|
| 597 |
+
width: rect.width + 2 * padding,
|
| 598 |
+
height: rect.height + 2 * padding
|
| 599 |
+
},
|
| 600 |
+
tag: el.tagName,
|
| 601 |
+
selector: selector,
|
| 602 |
+
id: el.id || null,
|
| 603 |
+
classList: Array.from(el.classList || []),
|
| 604 |
|
| 605 |
+
};
|
| 606 |
+
}
|
| 607 |
+
""", selector, params.padding)
|
| 608 |
+
element = json.loads(loc)
|
| 609 |
+
# if await loc.count() > 0:
|
| 610 |
+
# element = loc.first() # Use the first element if multiple are found
|
| 611 |
+
# used_selector = selector
|
| 612 |
+
# await element.wait_for(state="attached", timeout=3000)
|
| 613 |
+
# break
|
| 614 |
+
# else:
|
| 615 |
+
# error_messages.append(f"Selector '{selector}' found no elements.")
|
| 616 |
+
except Exception as e:
|
| 617 |
+
error_messages.append(f"Error with selector '{selector}': {str(e)}")
|
| 618 |
+
print('Element found:', element)
|
| 619 |
+
print('at 1')
|
| 620 |
+
if not element:
|
| 621 |
+
# Full-page fallback screenshot disabled — prefer explicit errors instead of taking full-page screenshots.
|
| 622 |
+
# If you want to re-enable the fallback, uncomment the lines below.
|
| 623 |
+
# if params.fallback_to_full_page:
|
| 624 |
+
# try:
|
| 625 |
+
# await page.screenshot(path=output_path, full_page=True)
|
| 626 |
+
# fallback_msg = f"No element found for selectors {params.selectors}. Fell back to full-page screenshot at: {output_path}"
|
| 627 |
+
# return ActionResult(
|
| 628 |
+
# extracted_content=fallback_msg,
|
| 629 |
+
# long_term_memory=fallback_msg,
|
| 630 |
+
# vision_content=[{"type": "image", "path": output_path}]
|
| 631 |
+
# )
|
| 632 |
+
# except Exception as e:
|
| 633 |
+
# return ActionResult(error=f"Element not found and full-page screenshot failed: {str(e)}")
|
| 634 |
+
|
| 635 |
+
return ActionResult(error=f"Could not find any element using selectors: {params.selectors}. Errors: {'; '.join(error_messages)}")
|
| 636 |
+
|
| 637 |
+
print('at 2')
|
| 638 |
+
print(type(element))
|
| 639 |
+
try:
|
| 640 |
+
# Scroll element into view if needed
|
| 641 |
+
# if params.scroll_if_needed:
|
| 642 |
+
# await element.scroll_into_view_if_needed(timeout=5000)
|
| 643 |
|
| 644 |
+
# Wait for the element to be stable and visible
|
| 645 |
+
# await element.wait_for(state="visible", timeout=5000)
|
| 646 |
+
# await element.wait_for(state="attached", timeout=5000)
|
| 647 |
+
|
| 648 |
+
# # Highlight the element with a red border
|
| 649 |
+
# original_style = ""
|
| 650 |
+
# if params.highlight:
|
| 651 |
+
# original_style = await element.get_attribute("style") or ""
|
| 652 |
+
# print('evaluaiton 1')
|
| 653 |
+
# await element.evaluate("el => el.style.border = '3px solid red'")
|
| 654 |
+
|
| 655 |
+
# print('evaluaiton 2')
|
| 656 |
+
# Get bounding box and take screenshot with padding
|
| 657 |
+
clip_obj = dict(element).get('clip')
|
| 658 |
+
|
| 659 |
+
if not clip_obj or clip_obj.get('x') is None:
|
| 660 |
+
raise Exception("Could not get bounding box for the element.")
|
| 661 |
+
|
| 662 |
+
try:
|
| 663 |
+
# Get session id and client from the page wrapper
|
| 664 |
+
session_id = await page.session_id
|
| 665 |
+
client = page._client
|
| 666 |
+
|
| 667 |
+
params = {
|
| 668 |
+
'format': 'png',
|
| 669 |
+
'clip': {
|
| 670 |
+
'x': float(clip_obj['x']),
|
| 671 |
+
'y': float(clip_obj['y']),
|
| 672 |
+
'width': float(clip_obj['width']),
|
| 673 |
+
'height': float(clip_obj['height']),
|
| 674 |
+
'scale': 1,
|
| 675 |
+
},
|
| 676 |
+
}
|
| 677 |
+
result = await client.send.Page.captureScreenshot(params, session_id=session_id)
|
| 678 |
+
img_b64 = result.get('data')
|
| 679 |
+
if not img_b64:
|
| 680 |
+
raise Exception('CDP captureScreenshot returned no data')
|
| 681 |
+
with open(output_path, 'wb') as f:
|
| 682 |
+
f.write(base64.b64decode(img_b64))
|
| 683 |
+
except Exception as e:
|
| 684 |
+
# Re-raise with context
|
| 685 |
+
raise Exception(f'Failed to take clipped screenshot via CDP: {e}')
|
| 686 |
+
|
| 687 |
+
|
| 688 |
+
|
| 689 |
+
success_msg = f"Element screenshot saved at: {output_path} (selector: '{used_selector}')"
|
| 690 |
return ActionResult(
|
| 691 |
extracted_content=success_msg,
|
| 692 |
include_in_memory=True,
|
| 693 |
+
long_term_memory=f"Element screenshot taken: {used_selector} -> {output_path}",
|
| 694 |
+
vision_content=[{"type": "image", "path": output_path}]
|
| 695 |
)
|
| 696 |
+
except PlaywrightTimeoutError:
|
| 697 |
+
return ActionResult(error=f"Element screenshot failed: Timeout waiting for element '{used_selector}' to be visible or stable.")
|
| 698 |
except Exception as e:
|
| 699 |
+
return ActionResult(error=f"Element screenshot failed for selector '{used_selector}': {str(e)}")
|
| 700 |
+
|
| 701 |
+
|
| 702 |
+
# ------------------------ Custom helper tools ------------------------
|
| 703 |
+
|
| 704 |
+
|
| 705 |
+
@tools.action(
|
| 706 |
+
description="Finds a web page element using a natural language prompt and returns its selector, backend node id, and the element object.",
|
| 707 |
+
# param_model=,
|
| 708 |
+
)
|
| 709 |
+
async def find_element_by_prompt(query: str, browser_session: BrowserSession) -> dict:
|
| 710 |
+
"""
|
| 711 |
+
Use the page's must_get_element_by_prompt (LLM-powered) to robustly locate an element matching the query.
|
| 712 |
|
| 713 |
+
Args:
|
| 714 |
+
query (str): Natural language description of the element to find (e.g., "footer section", "pricing table").
|
| 715 |
+
browser_session (BrowserSession): The active browser session object.
|
| 716 |
|
| 717 |
+
Returns:
|
| 718 |
+
dict: {
|
| 719 |
+
"selector": <css selector or None>,
|
| 720 |
+
"backend_node_id": int,
|
| 721 |
+
"element": <element object or None>,
|
| 722 |
+
"reason": <string>
|
| 723 |
+
}
|
| 724 |
+
- selector: CSS selector string for the matched element, or None if not found.
|
| 725 |
+
- backend_node_id: Unique backend node id for direct reference (int or None).
|
| 726 |
+
- element: The matched element object, or None if not found.
|
| 727 |
+
- reason: Reason for match or error (string).
|
| 728 |
+
"""
|
| 729 |
+
page = await browser_session.get_current_page()
|
| 730 |
+
try:
|
| 731 |
+
# Use the LLM-powered method to get the element
|
| 732 |
+
element = await page.must_get_element_by_prompt(query)
|
| 733 |
+
# Try to build a selector from id/class/tag
|
| 734 |
+
selector = None
|
| 735 |
+
if hasattr(element, 'id') and element.id:
|
| 736 |
+
selector = f"#{element.id}"
|
| 737 |
+
elif hasattr(element, 'class_name') and element.class_name:
|
| 738 |
+
first_cls = element.class_name.split()[0]
|
| 739 |
+
selector = f".{first_cls}"
|
| 740 |
+
elif hasattr(element, 'tag_name') and element.tag_name:
|
| 741 |
+
selector = element.tag_name.lower()
|
| 742 |
+
# Always return backend_node_id for direct reference
|
| 743 |
+
backend_node_id = getattr(element, 'backend_node_id', None)
|
| 744 |
+
return {
|
| 745 |
+
"selector": selector,
|
| 746 |
+
"backend_node_id": backend_node_id,
|
| 747 |
+
"element": element,
|
| 748 |
+
"reason": "llm_match"
|
| 749 |
+
}
|
| 750 |
+
except Exception as e:
|
| 751 |
+
return {
|
| 752 |
+
"selector": None,
|
| 753 |
+
"backend_node_id": None,
|
| 754 |
+
"element": None,
|
| 755 |
+
"reason": f"llm_error: {e}"
|
| 756 |
+
}
|
| 757 |
|
| 758 |
|
| 759 |
|
| 760 |
|
| 761 |
+
@tools.action(
|
| 762 |
+
description="Injects or removes a visible red outline around the element identified by selector or selector dict for browser agent visual verification.",
|
| 763 |
+
)
|
| 764 |
+
async def highlight_element(selector_or_obj: str | dict, browser_session: BrowserSession) -> dict:
|
| 765 |
+
"""
|
| 766 |
+
Inject or remove a visible red outline around the element identified by selector (or dict{selector}).
|
| 767 |
+
|
| 768 |
+
Args:
|
| 769 |
+
selector_or_obj (str | dict): CSS selector string or dict with 'selector' key to identify the element.
|
| 770 |
+
browser_session (BrowserSession): The active browser session object.
|
| 771 |
+
selector_or_obj['remove'] (bool, optional): Only read when selector_or_obj is a dict. If True, removes the highlight. If False or omitted, adds the highlight.
|
| 772 |
+
|
| 773 |
+
Returns:
|
| 774 |
+
dict: {ok: True/False, selector: used_selector, reason: str}
|
| 775 |
+
"""
|
| 776 |
+
page = await browser_session.get_current_page()
|
| 777 |
+
remove = False
|
| 778 |
+
# Support dict with 'remove' key
|
| 779 |
+
if isinstance(selector_or_obj, dict):
|
| 780 |
+
selector = selector_or_obj.get('selector')
|
| 781 |
+
remove = selector_or_obj.get('remove', False)
|
| 782 |
+
else:
|
| 783 |
+
selector = selector_or_obj
|
| 784 |
+
|
| 785 |
+
if remove:
|
| 786 |
+
js = """
|
| 787 |
+
(sel) => {
|
| 788 |
+
const el = document.querySelector(sel);
|
| 789 |
+
if (!el) return { ok: false, reason: 'not_found', selector: sel };
|
| 790 |
+
if (el.dataset.__highlighted === '1') {
|
| 791 |
+
el.style.outline = el.dataset.__orig_outline || '';
|
| 792 |
+
delete el.dataset.__highlighted;
|
| 793 |
+
delete el.dataset.__orig_outline;
|
| 794 |
+
return { ok: true, selector: sel, reason: 'highlight_removed' };
|
| 795 |
+
}
|
| 796 |
+
return { ok: false, selector: sel, reason: 'no_highlight_to_remove' };
|
| 797 |
+
}
|
| 798 |
+
"""
|
| 799 |
+
else:
|
| 800 |
+
js = """
|
| 801 |
+
(sel) => {
|
| 802 |
+
const el = document.querySelector(sel);
|
| 803 |
+
if (!el) return { ok: false, reason: 'not_found', selector: sel };
|
| 804 |
+
// store original outline to restore later
|
| 805 |
+
el.dataset.__orig_outline = el.style.outline || '';
|
| 806 |
+
el.style.outline = '3px solid red';
|
| 807 |
+
el.dataset.__highlighted = '1';
|
| 808 |
+
return { ok: true, selector: sel, reason: 'highlight_applied' };
|
| 809 |
+
}
|
| 810 |
+
"""
|
| 811 |
+
|
| 812 |
+
try:
|
| 813 |
+
raw = await page.evaluate(js, selector)
|
| 814 |
+
return json.loads(raw)
|
| 815 |
+
except Exception as e:
|
| 816 |
+
return {"ok": False, "reason": str(e), "selector": selector}
|
| 817 |
+
|
| 818 |
+
|
| 819 |
+
@tools.action(
|
| 820 |
+
description="Returns the bounding box (x, y, width, height) for a given CSS selector or selector dict on the current page. Useful for element positioning, cropping, or screenshot tasks.",
|
| 821 |
+
)
|
| 822 |
+
async def get_bounding_box(selector_or_obj: str | dict, browser_session: BrowserSession) -> dict:
|
| 823 |
+
"""
|
| 824 |
+
Description:
|
| 825 |
+
Returns the bounding box for a given CSS selector or selector dict on the current page.
|
| 826 |
+
|
| 827 |
+
Args:
|
| 828 |
+
selector_or_obj (str | dict): CSS selector string or dict with 'selector' key to identify the element.
|
| 829 |
+
browser_session (BrowserSession): The active browser session object.
|
| 830 |
+
|
| 831 |
+
Returns:
|
| 832 |
+
dict: {x: float or None, y: float or None, width: float or None, height: float or None, error: str (optional)}
|
| 833 |
+
- x, y: Top-left coordinates of the element (relative to viewport)
|
| 834 |
+
- width, height: Size of the element
|
| 835 |
+
- error: Error message if bounding box could not be retrieved
|
| 836 |
+
"""
|
| 837 |
+
page = await browser_session.get_current_page()
|
| 838 |
+
if isinstance(selector_or_obj, dict):
|
| 839 |
+
selector = selector_or_obj.get('selector')
|
| 840 |
+
else:
|
| 841 |
+
selector = selector_or_obj
|
| 842 |
+
|
| 843 |
+
js = """
|
| 844 |
+
(sel) => {
|
| 845 |
+
const el = document.querySelector(sel);
|
| 846 |
+
if (!el) return { x: null, y: null, width: null, height: null };
|
| 847 |
+
const r = el.getBoundingClientRect();
|
| 848 |
+
return { x: r.x, y: r.y, width: r.width, height: r.height };
|
| 849 |
+
}
|
| 850 |
+
"""
|
| 851 |
+
|
| 852 |
+
try:
|
| 853 |
+
raw = await page.evaluate(js, selector)
|
| 854 |
+
return json.loads(raw)
|
| 855 |
+
except Exception as e:
|
| 856 |
+
return {"x": None, "y": None, "width": None, "height": None, "error": str(e)}
|
| 857 |
+
|
| 858 |
+
|
| 859 |
+
@tools.action(
|
| 860 |
+
description="Takes a screenshot of a specific region (clip) of the current page, defined by x, y, width, height. Returns the saved image path and status.",
|
| 861 |
+
)
|
| 862 |
+
async def element_screenshot_clip(clip: dict, filename: str = 'element_clip.png', browser_session: BrowserSession = None) -> dict:
|
| 863 |
+
"""
|
| 864 |
+
Description:
|
| 865 |
+
Takes a screenshot of a specific region (clip) of the current page, defined by x, y, width, height.
|
| 866 |
+
|
| 867 |
+
Args:
|
| 868 |
+
clip (dict): Dictionary with keys 'x', 'y', 'width', 'height' (all float/int) specifying the region to capture.
|
| 869 |
+
filename (str, optional): Output filename for the screenshot. Defaults to 'element_clip.png'.
|
| 870 |
+
browser_session (BrowserSession): The active browser session object. Defaults to None in the signature, but the call fails with an error dict if it is not provided.
|
| 871 |
+
|
| 872 |
+
Returns:
|
| 873 |
+
dict: {ok: True/False, path: str (if ok), error: str (if not ok)}
|
| 874 |
+
- ok: True if screenshot was successful, False otherwise
|
| 875 |
+
- path: Absolute path to the saved screenshot image (if ok)
|
| 876 |
+
- error: Error message if screenshot failed
|
| 877 |
+
"""
|
| 878 |
+
if browser_session is None:
|
| 879 |
+
return {"ok": False, "error": "browser_session required"}
|
| 880 |
+
|
| 881 |
+
page = await browser_session.get_current_page()
|
| 882 |
+
try:
|
| 883 |
+
session_id = await page.session_id
|
| 884 |
+
client = page._client
|
| 885 |
+
params = {
|
| 886 |
+
'format': 'png',
|
| 887 |
+
'clip': {
|
| 888 |
+
'x': float(clip['x']),
|
| 889 |
+
'y': float(clip['y']),
|
| 890 |
+
'width': float(clip['width']),
|
| 891 |
+
'height': float(clip['height']),
|
| 892 |
+
'scale': 1,
|
| 893 |
+
},
|
| 894 |
+
}
|
| 895 |
+
result = await client.send.Page.captureScreenshot(params, session_id=session_id)
|
| 896 |
+
img_b64 = result.get('data')
|
| 897 |
+
if not img_b64:
|
| 898 |
+
return {"ok": False, "error": 'no_data'}
|
| 899 |
+
|
| 900 |
+
# Save relative to the current working directory (os.path.abspath resolves against the CWD, not the script location)
|
| 901 |
+
out_path = os.path.abspath(filename)
|
| 902 |
+
with open(out_path, 'wb') as f:
|
| 903 |
+
f.write(base64.b64decode(img_b64))
|
| 904 |
+
|
| 905 |
+
return {"ok": True, "path": out_path}
|
| 906 |
+
except Exception as e:
|
| 907 |
+
return {"ok": False, "error": str(e)}
|
| 908 |
+
|
| 909 |
+
|
| 910 |
+
@function_tool
|
| 911 |
+
async def verify_element_visual(query: str, screenshot_path: str, browser_session: BrowserSession, tolerance: int = 20) -> dict:
|
| 912 |
+
"""Verify that the screenshot corresponds to the element found for `query`.
|
| 913 |
+
|
| 914 |
+
Strategy: find element by prompt, get bounding box, compare image size to bbox within tolerance.
|
| 915 |
+
Returns {verified: bool, selector: str or None, screenshot: path, details: ...}
|
| 916 |
+
"""
|
| 917 |
+
# 1) locate element
|
| 918 |
+
found = await find_element_by_prompt(query, browser_session)
|
| 919 |
+
selector = found.get('selector')
|
| 920 |
+
if not selector:
|
| 921 |
+
return {"verified": False, "selector": None, "screenshot": screenshot_path, "details": "could_not_find_element"}
|
| 922 |
+
|
| 923 |
+
# 2) get bbox
|
| 924 |
+
bbox = await get_bounding_box(selector, browser_session)
|
| 925 |
+
if not bbox or bbox.get('width') is None:
|
| 926 |
+
return {"verified": False, "selector": selector, "screenshot": screenshot_path, "details": "could_not_get_bbox"}
|
| 927 |
+
|
| 928 |
+
# 3) load screenshot and compare sizes
|
| 929 |
+
try:
|
| 930 |
+
img = Image.open(screenshot_path)
|
| 931 |
+
w, h = img.size
|
| 932 |
+
except Exception as e:
|
| 933 |
+
return {"verified": False, "selector": selector, "screenshot": screenshot_path, "details": f"could_not_open_image: {e}"}
|
| 934 |
+
|
| 935 |
+
# Compare pixel sizes to bbox width/height
|
| 936 |
+
bw = int(round(bbox['width']))
|
| 937 |
+
bh = int(round(bbox['height']))
|
| 938 |
+
|
| 939 |
+
if abs(bw - w) <= tolerance and abs(bh - h) <= tolerance:
|
| 940 |
+
return {"verified": True, "selector": selector, "screenshot": screenshot_path, "details": "size_match"}
|
| 941 |
+
else:
|
| 942 |
+
return {"verified": False, "selector": selector, "screenshot": screenshot_path, "details": {"bbox": bbox, "image_size": [w, h]}}
|
| 943 |
+
|
| 944 |
+
|
| 945 |
+
|
| 946 |
+
|
| 947 |
+
|
| 948 |
+
|
| 949 |
+
|
| 950 |
+
task_old_1 = f"""
|
| 951 |
You are a Browser Intelligence Agent specialized in extracting website content and brand identity assets.
|
| 952 |
+
Your goal is to visit the given website URL and return a structured, comprehensive extraction.
|
| 953 |
|
| 954 |
Follow these steps strictly:
|
| 955 |
|
| 956 |
+
1. Website Navigation:
|
| 957 |
+
- Open the provided URL.
|
| 958 |
+
- If a user query is provided, search across multiple related internal pages (navigation links, relevant subpages) that may contain information about the query.
|
| 959 |
+
- If no query is provided, focus on the landing page only.
|
| 960 |
|
| 961 |
2. Content Extraction:
|
| 962 |
+
- If a query is provided:
|
| 963 |
+
• Extract and summarize text relevant to the query from all visited pages.
|
| 964 |
+
• Provide a coherent summary that highlights key points across pages.
|
| 965 |
+
- If no query:
|
| 966 |
+
• Extract the full visible text from the landing page.
|
|
|
|
| 967 |
|
| 968 |
3. Brand & Design Extraction:
|
| 969 |
+
- Identify and extract the brand’s visual identity, including:
|
| 970 |
+
• Primary and secondary colors (hex codes).
|
| 971 |
+
• Extended color palette if available.
|
| 972 |
+
• Typography (fonts, weights, styles).
|
| 973 |
+
• Design system or style guide elements.
|
| 974 |
+
• Social media brand kit details (logos, icons, button styles, heading styles).
|
| 975 |
+
|
| 976 |
+
4. Screenshots (via custom tools):
|
| 977 |
+
- Capture screenshots of **topic-related content** (e.g., pricing tables, signup buttons, hero sections if the query is “pricing plans”).
|
| 978 |
+
- Capture screenshots of **brand identity elements** (e.g., color swatches, typography samples, buttons, logos, icons, headings).
|
| 979 |
+
- Save screenshots with clear, descriptive filenames (e.g., `pricing_table.png`, `signup_button.png`, `primary_colors.png`, `typography_styles.png`).
|
| 980 |
|
| 981 |
5. Output:
|
| 982 |
+
- Return the extracted content, brand identity data, and screenshot metadata in a clean and structured JSON format.
|
| 983 |
+
- Do not include free text or commentary outside the JSON.
|
| 984 |
|
| 985 |
Today is {datetime.now().strftime('%Y-%m-%d')}
|
| 986 |
|
| 987 |
+
User's query: Go to https://github.com/pricing and extract content, brand identity assets, and screenshots for a LinkedIn post. The topic is pricing plans.
|
| 988 |
"""
|
| 989 |
|
| 990 |
+
task_old_2="""
|
| 991 |
+
|
| 992 |
+
### Selector Discovery, Verification & Screenshot Instructions
|
| 993 |
+
|
| 994 |
+
When identifying selectors for element or section screenshots:
|
| 995 |
+
Verify each selector's element or section, then capture its screenshot immediately after successful verification.
|
| 996 |
+
|
| 997 |
+
1. **Analyze** the HTML DOM structure of the page to identify potential selectors for the target elements or sections based on the query.
|
| 998 |
+
2. **Generate** a list of possible selectors that could uniquely identify each target element.
|
| 999 |
+
3. **Locate the Target Section or Element:**
|
| 1000 |
+
- Identify the element or section that visually and contextually matches the target.
|
| 1001 |
+
- Focus on the most relevant container or element that directly represents the intended target — not its parent or unrelated siblings.
|
| 1002 |
+
4. For each candidate selector:
|
| 1003 |
+
- Use the `"execute_js"` tool to verify that the selector matches exactly the target.
|
| 1004 |
+
- **Highlight** the matched element by injecting a visible red border (`2px solid red`) or a temporary background color.
|
| 1005 |
+
5. **Validate the Finalized Selector Against the Query:**
|
| 1006 |
+
- Once a selector is finalized, confirm that it accurately represents the element or section described in the query.
|
| 1007 |
+
- Ensure it precisely corresponds to the query intent and does not include unrelated, broader, or nested regions.
|
| 1008 |
+
6. **Remove injected visual styles or modifications** from the DOM to restore the page to its original state before proceeding to the next selector.
|
| 1009 |
+
7. **After verification**, immediately **capture a screenshot** of the verified element or section.
|
| 1010 |
+
8. Continue this process until **all target selectors** have been verified and their screenshots captured.
|
| 1011 |
|
| 1012 |
|
| 1013 |
+
After successful verification, remove all injected visual styles or temporary DOM modifications.
|
| 1014 |
+
User's query: Go to https://github.com/pricing and take screenshot of header and pricing details
|
| 1015 |
+
"""
|
| 1016 |
+
|
| 1017 |
|
| 1018 |
class PageVisited(BaseModel):
|
| 1019 |
url: str
|
|
|
|
| 1072 |
|
| 1073 |
|
| 1074 |
async def run_search() -> None:
|
| 1075 |
+
print('====================================================')
|
| 1076 |
+
print('Starting run_search() function')
|
| 1077 |
+
print('====================================================')
|
| 1078 |
+
|
| 1079 |
+
# Check installed packages that might be relevant
|
| 1080 |
+
try:
|
| 1081 |
+
import importlib
|
| 1082 |
+
packages = ['browser_use', 'playwright', 'aiohttp']
|
| 1083 |
+
for package in packages:
|
| 1084 |
+
try:
|
| 1085 |
+
mod = importlib.import_module(package)
|
| 1086 |
+
print(f"✅ {package} is installed: {getattr(mod, '__version__', 'unknown version')}")
|
| 1087 |
+
except ImportError:
|
| 1088 |
+
print(f"❌ {package} is NOT installed")
|
| 1089 |
+
except Exception as e:
|
| 1090 |
+
print(f"Error checking packages: {e}")
|
| 1091 |
+
|
| 1092 |
+
# Check environment variables (redacted for security)
|
| 1093 |
+
for key in ['GEMINI_API_KEY', 'OPENROUTER_API_KEY']:
|
| 1094 |
+
if os.environ.get(key):
|
| 1095 |
+
print(f"✅ {key} environment variable is set")
|
| 1096 |
+
else:
|
| 1097 |
+
print(f"❌ {key} environment variable is NOT set")
|
| 1098 |
+
|
| 1099 |
+
chrome_process = None
|
| 1100 |
+
browser_session = None
|
| 1101 |
+
|
| 1102 |
+
try:
|
| 1103 |
+
# Launch the browser via BrowserSession so only the agent opens a window.
|
| 1104 |
+
print('🔄 Launching browser via BrowserSession (agent-managed launch)')
|
| 1105 |
+
browser_profile = BrowserProfile(
|
| 1106 |
+
is_local=True,
|
| 1107 |
+
headless=False,
|
| 1108 |
+
launch_args=[
|
| 1109 |
+
'--no-first-run',
|
| 1110 |
+
'--no-default-browser-check',
|
| 1111 |
+
'--disable-extensions',
|
| 1112 |
+
'--disable-background-networking',
|
| 1113 |
+
'--disable-background-timer-throttling',
|
| 1114 |
+
'--disable-backgrounding-occluded-windows',
|
| 1115 |
+
'--disable-popup-blocking',
|
| 1116 |
+
'--disable-renderer-backgrounding',
|
| 1117 |
+
'--force-color-profile=srgb',
|
| 1118 |
+
'--metrics-recording-only',
|
| 1119 |
+
'--mute-audio',
|
| 1120 |
+
],
|
| 1121 |
+
)
|
| 1122 |
|
| 1123 |
+
print('Creating BrowserSession (this will launch Chrome once, managed by browser-use)')
|
| 1124 |
+
browser_session = BrowserSession(browser_profile=browser_profile)
|
| 1125 |
+
print(f"✅ Browser session created successfully: {browser_session}")
|
| 1126 |
+
|
| 1127 |
+
# Build the Browser Agent using the created session. Skip internal launch to avoid duplicates.
|
| 1128 |
+
print('🔄 Creating Browser Agent with provided BrowserSession...')
|
| 1129 |
+
browser_agent = AgentBrowser(
|
| 1130 |
+
task=task,
|
| 1131 |
+
llm=llm_browser,
|
| 1132 |
+
use_vision=True,
|
| 1133 |
+
generate_gif=False,
|
| 1134 |
+
max_failures=3,
|
| 1135 |
+
file_system_path="./browser_agent_data",
|
| 1136 |
+
tools=tools,
|
| 1137 |
+
output_model_schema=BrowserAgentOutput,
|
| 1138 |
+
browser_session=browser_session,
|
| 1139 |
+
skip_browser_launch=True,
|
| 1140 |
+
)
|
| 1141 |
+
print('✅ Browser Agent created with provided session')
|
| 1142 |
+
|
| 1143 |
+
print('🚀 Running browser agent...')
|
| 1144 |
+
try:
|
| 1145 |
+
print("Starting browser agent.run() with max_steps=15")
|
| 1146 |
+
history = await browser_agent.run(max_steps=15)
|
| 1147 |
+
print("-------------Agent run completed---------------")
|
| 1148 |
+
print("Steps executed:", len(history.steps) if hasattr(history, 'steps') else "Unknown")
|
| 1149 |
+
print("-------------Final result---------------")
|
| 1150 |
+
print(history.final_result())
|
| 1151 |
+
except Exception as run_error:
|
| 1152 |
+
print(f'❌ Error during browser agent run: {type(run_error).__name__}: {run_error}')
|
| 1153 |
+
import traceback
|
| 1154 |
+
print("Detailed traceback:")
|
| 1155 |
+
traceback.print_exc()
|
| 1156 |
+
raise
|
| 1157 |
+
except Exception as e:
|
| 1158 |
+
print(f'❌ Error: {e}')
|
| 1159 |
+
raise
|
| 1160 |
+
finally:
|
| 1161 |
+
# Clean up resources in proper order
|
| 1162 |
+
print('🧹 Cleaning up resources...')
|
| 1163 |
+
|
| 1164 |
+
# First close the browser session which will close its page
|
| 1165 |
+
try:
|
| 1166 |
+
if browser_session:
|
| 1167 |
+
print(f"Attempting to close browser session: {browser_session}")
|
| 1168 |
+
await browser_session.close()
|
| 1169 |
+
print('✅ Closed browser session')
|
| 1170 |
+
else:
|
| 1171 |
+
print('ℹ️ No browser session was created')
|
| 1172 |
+
except Exception as e:
|
| 1173 |
+
print(f'⚠️ Error closing browser session: {type(e).__name__}: {e}')
|
| 1174 |
+
import traceback
|
| 1175 |
+
traceback.print_exc()
|
| 1176 |
+
|
| 1177 |
+
# Then close the playwright browser
|
| 1178 |
+
if playwright_browser:
|
| 1179 |
+
try:
|
| 1180 |
+
print(f"Attempting to close Playwright browser: {playwright_browser}")
|
| 1181 |
+
await playwright_browser.close()
|
| 1182 |
+
print('✅ Closed Playwright browser')
|
| 1183 |
+
except Exception as e:
|
| 1184 |
+
print(f'⚠️ Error closing Playwright browser: {type(e).__name__}: {e}')
|
| 1185 |
+
import traceback
|
| 1186 |
+
traceback.print_exc()
|
| 1187 |
+
|
| 1188 |
+
# Finally terminate the Chrome process
|
| 1189 |
+
if chrome_process:
|
| 1190 |
+
try:
|
| 1191 |
+
print(f"Attempting to terminate Chrome process (PID: {chrome_process.pid})")
|
| 1192 |
+
chrome_process.terminate()
|
| 1193 |
+
print("Waiting for Chrome process to exit (timeout: 5s)")
|
| 1194 |
+
await asyncio.wait_for(chrome_process.wait(), 5)
|
| 1195 |
+
print('✅ Terminated Chrome process')
|
| 1196 |
+
except asyncio.TimeoutError:
|
| 1197 |
+
print('⚠️ Chrome process did not exit after 5s timeout, forcing kill')
|
| 1198 |
+
chrome_process.kill()
|
| 1199 |
+
print("Sent SIGKILL to Chrome process")
|
| 1200 |
+
except Exception as e:
|
| 1201 |
+
print(f'⚠️ Error terminating Chrome process: {type(e).__name__}: {e}')
|
| 1202 |
+
import traceback
|
| 1203 |
+
traceback.print_exc()
|
| 1204 |
+
|
| 1205 |
+
# Check if Chrome is still running via CDP
|
| 1206 |
+
try:
|
| 1207 |
+
print("Checking if Chrome CDP is still accessible...")
|
| 1208 |
+
async with aiohttp.ClientSession() as session:
|
| 1209 |
+
async with session.get('http://localhost:9222/json/version', timeout=aiohttp.ClientTimeout(total=1)) as response:
|
| 1210 |
+
if response.status == 200:
|
| 1211 |
+
print('⚠️ WARNING: Chrome with CDP is still running after cleanup!')
|
| 1212 |
+
else:
|
| 1213 |
+
print('✅ Chrome CDP no longer accessible (status code != 200)')
|
| 1214 |
+
except Exception:
|
| 1215 |
+
print('✅ Chrome CDP no longer accessible (connection failed)')
|
| 1216 |
+
|
| 1217 |
+
print('✅ All cleanup complete')
|
| 1218 |
|
| 1219 |
|
| 1220 |
|
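Note on `verify_element_visual` above: the check compares the screenshot's pixel size with the element's bounding box, within a tolerance. A minimal standalone sketch of that comparison follows. The helper name `sizes_match` is hypothetical and not part of this commit, and the sketch assumes CSS pixels map one-to-one to image pixels.

```python
def sizes_match(bbox: dict, image_size: tuple, tolerance: int = 20) -> bool:
    """Return True when the image size is within `tolerance` pixels of the bbox size."""
    # A missing dimension means the bounding box lookup failed.
    if bbox.get("width") is None or bbox.get("height") is None:
        return False
    bw = int(round(bbox["width"]))
    bh = int(round(bbox["height"]))
    w, h = image_size
    return abs(bw - w) <= tolerance and abs(bh - h) <= tolerance


if __name__ == "__main__":
    # Hypothetical header-sized element versus two candidate screenshots.
    print(sizes_match({"width": 1280.4, "height": 96.0}, (1280, 100)))
    print(sizes_match({"width": 1280.4, "height": 96.0}, (640, 100)))
```

A size match is only a weak signal: a different element with similar dimensions would also pass, and on high-DPI displays the captured image may be scaled relative to the bounding box.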
src/utils/chrome_playwright.py
ADDED
|
@@ -0,0 +1,142 @@
|
| 1 |
+
import os
|
| 2 |
+
import tempfile
|
| 3 |
+
import asyncio
|
| 4 |
+
import aiohttp
|
| 5 |
+
from playwright.async_api import async_playwright
|
| 6 |
+
|
| 7 |
+
async def start_chrome_with_debug_port(port: int = 9222):
|
| 8 |
+
"""
|
| 9 |
+
Start Chrome with remote debugging enabled.
|
| 10 |
+
Returns the Chrome process.
|
| 11 |
+
"""
|
| 12 |
+
user_data_dir = tempfile.mkdtemp(prefix='chrome_cdp_')
|
| 13 |
+
print(f"Created temp user data dir: {user_data_dir}")
|
| 14 |
+
|
| 15 |
+
chrome_paths = [
|
| 16 |
+
r'C:\Program Files\Google\Chrome\Application\chrome.exe',
|
| 17 |
+
'chrome.exe',
|
| 18 |
+
'chrome',
|
| 19 |
+
]
|
| 20 |
+
|
| 21 |
+
chrome_exe = None
|
| 22 |
+
print(f"Looking for Chrome executable in these locations: {chrome_paths}")
|
| 23 |
+
for path in chrome_paths:
|
| 24 |
+
if os.path.exists(path):
|
| 25 |
+
print(f"Found Chrome at: {path}")
|
| 26 |
+
try:
|
| 27 |
+
print(f"Testing executable: {path}")
|
| 28 |
+
test_proc = await asyncio.create_subprocess_exec(
|
| 29 |
+
path, '--version', stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE
|
| 30 |
+
)
|
| 31 |
+
stdout, stderr = await test_proc.communicate()
|
| 32 |
+
if test_proc.returncode == 0:
|
| 33 |
+
version = stdout.decode().strip() if stdout else "Unknown version"
|
| 34 |
+
print(f"Chrome executable works! Version: {version}")
|
| 35 |
+
chrome_exe = path
|
| 36 |
+
break
|
| 37 |
+
else:
|
| 38 |
+
error = stderr.decode().strip() if stderr else "Unknown error"
|
| 39 |
+
print(f"Chrome executable test failed: {error}")
|
| 40 |
+
except Exception as e:
|
| 41 |
+
print(f"Error testing Chrome executable {path}: {e}")
|
| 42 |
+
continue
|
| 43 |
+
elif path in ['chrome', 'chromium', 'chrome.exe']:
|
| 44 |
+
print(f"Checking PATH for {path}")
|
| 45 |
+
try:
|
| 46 |
+
test_proc = await asyncio.create_subprocess_exec(
|
| 47 |
+
path, '--version', stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE
|
| 48 |
+
)
|
| 49 |
+
stdout, stderr = await test_proc.communicate()
|
| 50 |
+
if test_proc.returncode == 0:
|
| 51 |
+
version = stdout.decode().strip() if stdout else "Unknown version"
|
| 52 |
+
print(f"Chrome executable works via PATH! Version: {version}")
|
| 53 |
+
chrome_exe = path
|
| 54 |
+
break
|
| 55 |
+
else:
|
| 56 |
+
error = stderr.decode().strip() if stderr else "Unknown error"
|
| 57 |
+
print(f"Chrome executable test via PATH failed: {error}")
|
| 58 |
+
except Exception as e:
|
| 59 |
+
print(f"Error testing Chrome executable via PATH {path}: {e}")
|
| 60 |
+
continue
|
| 61 |
+
|
| 62 |
+
if not chrome_exe:
|
| 63 |
+
raise RuntimeError('❌ Chrome not found. Please install Chrome or Chromium.')
|
| 64 |
+
|
| 65 |
+
cmd = [
|
| 66 |
+
chrome_exe,
|
| 67 |
+
f'--remote-debugging-port={port}',
|
| 68 |
+
f'--user-data-dir={user_data_dir}',
|
| 69 |
+
'--no-first-run',
|
| 70 |
+
'--no-default-browser-check',
|
| 71 |
+
'--disable-extensions',
|
| 72 |
+
'--disable-background-networking',
|
| 73 |
+
'--disable-background-timer-throttling',
|
| 74 |
+
'--disable-backgrounding-occluded-windows',
|
| 75 |
+
'--disable-breakpad',
|
| 76 |
+
'--disable-component-extensions-with-background-pages',
|
| 77 |
+
'--disable-features=TranslateUI,BlinkGenPropertyTrees',
|
| 78 |
+
'--disable-ipc-flooding-protection',
|
| 79 |
+
'--disable-popup-blocking',
|
| 80 |
+
'--disable-prompt-on-repost',
|
| 81 |
+
'--disable-renderer-backgrounding',
|
| 82 |
+
'--force-color-profile=srgb',
|
| 83 |
+
'--metrics-recording-only',
|
| 84 |
+
'--mute-audio',
|
| 85 |
+
'about:blank',
|
| 86 |
+
]
|
| 87 |
+
|
| 88 |
+
print(f"Starting Chrome with command: {cmd}")
|
| 89 |
+
process = await asyncio.create_subprocess_exec(*cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE)
|
| 90 |
+
print(f"Chrome process started with PID: {process.pid}")
|
| 91 |
+
|
| 92 |
+
print(f"Waiting for Chrome CDP to be available at http://localhost:{port}/json/version...")
|
| 93 |
+
cdp_ready = False
|
| 94 |
+
for attempt in range(20):
|
| 95 |
+
try:
|
| 96 |
+
async with aiohttp.ClientSession() as session:
|
| 97 |
+
print(f"CDP check attempt {attempt+1}/20...")
|
| 98 |
+
async with session.get(
|
| 99 |
+
f'http://localhost:{port}/json/version', timeout=aiohttp.ClientTimeout(total=1)
|
| 100 |
+
) as response:
|
| 101 |
+
if response.status == 200:
|
| 102 |
+
data = await response.json()
|
| 103 |
+
print(f"CDP connected successfully! Chrome version: {data.get('Browser', 'Unknown')}")
|
| 104 |
+
cdp_ready = True
|
| 105 |
+
break
|
| 106 |
+
else:
|
| 107 |
+
print(f"CDP check failed with status: {response.status}")
|
| 108 |
+
except Exception as e:
|
| 109 |
+
print(f"CDP check failed with error: {type(e).__name__}: {e}")
|
| 110 |
+
await asyncio.sleep(1)
|
| 111 |
+
|
| 112 |
+
if not cdp_ready:
|
| 113 |
+
print(f"ERROR: Chrome DevTools Protocol not available after timeout on port {port}")
|
| 114 |
+
process.terminate()
|
| 115 |
+
stdout_data, stderr_data = await process.communicate()
|
| 116 |
+
print(f"Chrome STDOUT: {stdout_data.decode('utf-8', errors='ignore')}")
|
| 117 |
+
print(f"Chrome STDERR: {stderr_data.decode('utf-8', errors='ignore')}")
|
| 118 |
+
raise RuntimeError('❌ Chrome failed to start with CDP')
|
| 119 |
+
|
| 120 |
+
return process
|
| 121 |
+
|
| 122 |
+
async def connect_playwright_to_cdp(cdp_url: str):
|
| 123 |
+
"""
|
| 124 |
+
Connect Playwright to the same Chrome instance Browser-Use is using.
|
| 125 |
+
Returns the Playwright browser and page.
|
| 126 |
+
"""
|
| 127 |
+
print(f"Connecting Playwright to CDP URL: {cdp_url}")
|
| 128 |
+
playwright = await async_playwright().start()
|
| 129 |
+
playwright_browser = await playwright.chromium.connect_over_cdp(cdp_url)
|
| 130 |
+
print(f"Playwright connected to browser")
|
| 131 |
+
|
| 132 |
+
if playwright_browser and playwright_browser.contexts and playwright_browser.contexts[0].pages:
|
| 133 |
+
playwright_page = playwright_browser.contexts[0].pages[0]
|
| 134 |
+
print(f"Using existing page: {await playwright_page.title()}")
|
| 135 |
+
elif playwright_browser:
|
| 136 |
+
print("No existing pages found, creating a new context and page")
|
| 137 |
+
context = await playwright_browser.new_context()
|
| 138 |
+
playwright_page = await context.new_page()
|
| 139 |
+
else:
|
| 140 |
+
playwright_page = None
|
| 141 |
+
print(f"Playwright page setup complete")
|
| 142 |
+
return playwright_browser, playwright_page
|
uv.lock
CHANGED
|
@@ -2175,6 +2175,25 @@ wheels = [
|
|
| 2175 |
{ url = "https://files.pythonhosted.org/packages/34/e7/ae39f538fd6844e982063c3a5e4598b8ced43b9633baa3a85ef33af8c05c/pillow-11.3.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:c84d689db21a1c397d001aa08241044aa2069e7587b398c8cc63020390b1c1b8", size = 6984598 },
|
| 2176 |
]
|
| 2177 |
|
| 2178 |
[[package]]
|
| 2179 |
name = "portalocker"
|
| 2180 |
version = "2.10.1"
|
|
@@ -2606,6 +2625,18 @@ wheels = [
|
|
| 2606 |
{ url = "https://files.pythonhosted.org/packages/58/f0/427018098906416f580e3cf1366d3b1abfb408a0652e9f31600c24a1903c/pydantic_settings-2.10.1-py3-none-any.whl", hash = "sha256:a60952460b99cf661dc25c29c0ef171721f98bfcb52ef8d9ea4c943d7c8cc796", size = 45235 },
|
| 2607 |
]
|
| 2608 |
|
| 2609 |
[[package]]
|
| 2610 |
name = "pygments"
|
| 2611 |
version = "2.19.2"
|
|
@@ -5733,6 +5764,7 @@ dependencies = [
|
|
| 5733 |
{ name = "openai-agents" },
|
| 5734 |
{ name = "pathlib" },
|
| 5735 |
{ name = "pillow" },
|
| 5736 |
{ name = "pydantic" },
|
| 5737 |
{ name = "pydantic-ai" },
|
| 5738 |
{ name = "urljoin" },
|
|
@@ -5758,6 +5790,7 @@ requires-dist = [
|
|
| 5758 |
{ name = "openai-agents", specifier = ">=0.2.8" },
|
| 5759 |
{ name = "pathlib", specifier = ">=1.0.1" },
|
| 5760 |
{ name = "pillow", specifier = ">=11.3.0" },
|
| 5761 |
{ name = "pydantic", specifier = ">=2.11.7" },
|
| 5762 |
{ name = "pydantic-ai", extras = ["logfire"], specifier = ">=1.0.1" },
|
| 5763 |
{ name = "urljoin", specifier = ">=1.0.0" },
|
|
|
|
| 2175 |
{ url = "https://files.pythonhosted.org/packages/34/e7/ae39f538fd6844e982063c3a5e4598b8ced43b9633baa3a85ef33af8c05c/pillow-11.3.0-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:c84d689db21a1c397d001aa08241044aa2069e7587b398c8cc63020390b1c1b8", size = 6984598 },
|
| 2176 |
]
|
| 2177 |
|
| 2178 |
+
[[package]]
|
| 2179 |
+
name = "playwright"
|
| 2180 |
+
version = "1.55.0"
|
| 2181 |
+
source = { registry = "https://pypi.org/simple" }
|
| 2182 |
+
dependencies = [
|
| 2183 |
+
{ name = "greenlet" },
|
| 2184 |
+
{ name = "pyee" },
|
| 2185 |
+
]
|
| 2186 |
+
wheels = [
|
| 2187 |
+
{ url = "https://files.pythonhosted.org/packages/80/3a/c81ff76df266c62e24f19718df9c168f49af93cabdbc4608ae29656a9986/playwright-1.55.0-py3-none-macosx_10_13_x86_64.whl", hash = "sha256:d7da108a95001e412effca4f7610de79da1637ccdf670b1ae3fdc08b9694c034", size = 40428109 },
|
| 2188 |
+
{ url = "https://files.pythonhosted.org/packages/cf/f5/bdb61553b20e907196a38d864602a9b4a461660c3a111c67a35179b636fa/playwright-1.55.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:8290cf27a5d542e2682ac274da423941f879d07b001f6575a5a3a257b1d4ba1c", size = 38687254 },
|
| 2189 |
+
{ url = "https://files.pythonhosted.org/packages/4a/64/48b2837ef396487807e5ab53c76465747e34c7143fac4a084ef349c293a8/playwright-1.55.0-py3-none-macosx_11_0_universal2.whl", hash = "sha256:25b0d6b3fd991c315cca33c802cf617d52980108ab8431e3e1d37b5de755c10e", size = 40428108 },
|
| 2190 |
+
{ url = "https://files.pythonhosted.org/packages/08/33/858312628aa16a6de97839adc2ca28031ebc5391f96b6fb8fdf1fcb15d6c/playwright-1.55.0-py3-none-manylinux1_x86_64.whl", hash = "sha256:c6d4d8f6f8c66c483b0835569c7f0caa03230820af8e500c181c93509c92d831", size = 45905643 },
|
| 2191 |
+
{ url = "https://files.pythonhosted.org/packages/83/83/b8d06a5b5721931aa6d5916b83168e28bd891f38ff56fe92af7bdee9860f/playwright-1.55.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:29a0777c4ce1273acf90c87e4ae2fe0130182100d99bcd2ae5bf486093044838", size = 45296647 },
|
| 2192 |
+
{ url = "https://files.pythonhosted.org/packages/06/2e/9db64518aebcb3d6ef6cd6d4d01da741aff912c3f0314dadb61226c6a96a/playwright-1.55.0-py3-none-win32.whl", hash = "sha256:29e6d1558ad9d5b5c19cbec0a72f6a2e35e6353cd9f262e22148685b86759f90", size = 35476046 },
|
| 2193 |
+
{ url = "https://files.pythonhosted.org/packages/46/4f/9ba607fa94bb9cee3d4beb1c7b32c16efbfc9d69d5037fa85d10cafc618b/playwright-1.55.0-py3-none-win_amd64.whl", hash = "sha256:7eb5956473ca1951abb51537e6a0da55257bb2e25fc37c2b75af094a5c93736c", size = 35476048 },
|
| 2194 |
+
{ url = "https://files.pythonhosted.org/packages/21/98/5ca173c8ec906abde26c28e1ecb34887343fd71cc4136261b90036841323/playwright-1.55.0-py3-none-win_arm64.whl", hash = "sha256:012dc89ccdcbd774cdde8aeee14c08e0dd52ddb9135bf10e9db040527386bd76", size = 31225543 },
|
| 2195 |
+
]
|
| 2196 |
+
|
| 2197 |
[[package]]
|
| 2198 |
name = "portalocker"
|
| 2199 |
version = "2.10.1"
|
|
|
|
| 2625 |
{ url = "https://files.pythonhosted.org/packages/58/f0/427018098906416f580e3cf1366d3b1abfb408a0652e9f31600c24a1903c/pydantic_settings-2.10.1-py3-none-any.whl", hash = "sha256:a60952460b99cf661dc25c29c0ef171721f98bfcb52ef8d9ea4c943d7c8cc796", size = 45235 },
|
| 2626 |
]
|
| 2627 |
|
| 2628 |
+
[[package]]
|
| 2629 |
+
name = "pyee"
|
| 2630 |
+
version = "13.0.0"
|
| 2631 |
+
source = { registry = "https://pypi.org/simple" }
|
| 2632 |
+
dependencies = [
|
| 2633 |
+
{ name = "typing-extensions" },
|
| 2634 |
+
]
|
| 2635 |
+
sdist = { url = "https://files.pythonhosted.org/packages/95/03/1fd98d5841cd7964a27d729ccf2199602fe05eb7a405c1462eb7277945ed/pyee-13.0.0.tar.gz", hash = "sha256:b391e3c5a434d1f5118a25615001dbc8f669cf410ab67d04c4d4e07c55481c37", size = 31250 }
|
| 2636 |
+
wheels = [
|
| 2637 |
+
{ url = "https://files.pythonhosted.org/packages/9b/4d/b9add7c84060d4c1906abe9a7e5359f2a60f7a9a4f67268b2766673427d8/pyee-13.0.0-py3-none-any.whl", hash = "sha256:48195a3cddb3b1515ce0695ed76036b5ccc2ef3a9f963ff9f77aec0139845498", size = 15730 },
|
| 2638 |
+
]
|
| 2639 |
+
|
| 2640 |
[[package]]
|
| 2641 |
name = "pygments"
|
| 2642 |
version = "2.19.2"
|
|
|
|
| 5764 |
{ name = "openai-agents" },
|
| 5765 |
{ name = "pathlib" },
|
| 5766 |
{ name = "pillow" },
|
| 5767 |
+
{ name = "playwright" },
|
| 5768 |
{ name = "pydantic" },
|
| 5769 |
{ name = "pydantic-ai" },
|
| 5770 |
{ name = "urljoin" },
|
|
|
|
| 5790 |
{ name = "openai-agents", specifier = ">=0.2.8" },
|
| 5791 |
{ name = "pathlib", specifier = ">=1.0.1" },
|
| 5792 |
{ name = "pillow", specifier = ">=11.3.0" },
|
| 5793 |
+
{ name = "playwright", specifier = ">=1.55.0" },
|
| 5794 |
{ name = "pydantic", specifier = ">=2.11.7" },
|
| 5795 |
{ name = "pydantic-ai", extras = ["logfire"], specifier = ">=1.0.1" },
|
| 5796 |
{ name = "urljoin", specifier = ">=1.0.0" },
|