thisisam committed
Commit · faf508c
1 Parent(s): a4a4f9a
Enable vision-language capabilities with transformers format
- LOCAL_TESTING.md +0 -110
- README.md +111 -41
- app.py +357 -158
- requirements.txt +2 -1
LOCAL_TESTING.md
DELETED
@@ -1,110 +0,0 @@
-# Local Testing Instructions
-
-## Test Your Space Locally Before Deploying
-
-Before deploying to Hugging Face, you can test the app on your local machine.
-
-### Prerequisites
-
-1. Python 3.8 or higher installed
-2. Your Hugging Face token ready
-
-### Steps
-
-#### 1. Install Dependencies
-
-Open PowerShell/Terminal and navigate to this folder:
-
-```bash
-cd "c:/Users/Amir/OneDrive - Digital Health CRC Limited/Projects/url2md/fara-7b-space"
-```
-
-Install required packages:
-
-```bash
-pip install -r requirements.txt
-```
-
-#### 2. Set Your HuggingFace Token
-
-Create a `.env` file in this folder (it's already in .gitignore, so it won't be committed):
-
-```bash
-# PowerShell command to create .env file
-echo "HF_TOKEN=your_token_here" > .env
-```
-
-Replace `your_token_here` with your actual Hugging Face token.
-
-#### 3. Update app.py to Load .env (Temporary)
-
-For local testing only, add these lines at the top of `app.py`:
-
-```python
-from dotenv import load_dotenv
-load_dotenv() # Load .env file
-```
-
-And install python-dotenv:
-```bash
-pip install python-dotenv
-```
-
-#### 4. Run the App Locally
-
-```bash
-python app.py
-```
-
-You should see output like:
-```
-Running on local URL: http://127.0.0.1:7860
-```
-
-Open that URL in your browser to test!
-
-#### 5. Test the Chat
-
-- Type a message
-- Verify you get responses from Fara-7B
-- Test different temperatures and max_tokens settings
-- Check if streaming works properly
-
-### Important Notes
-
-⚠️ **Before Deploying:**
-- Remove the `load_dotenv()` code from `app.py` (Spaces use secrets, not .env)
-- Don't commit your `.env` file (already in .gitignore)
-- The Space will use the `HF_TOKEN` secret instead
-
-### Troubleshooting Local Testing
-
-**Import Error for dotenv:**
-```bash
-pip install python-dotenv
-```
-
-**Token Error:**
-- Check your token is correct in `.env`
-- Ensure no extra spaces or quotes
-- Verify token has inference permissions
-
-**Port Already in Use:**
-```bash
-# Kill the process or run on different port
-python app.py --server-port 7861
-```
-
-### Alternative: Quick Test Without .env
-
-You can also temporarily hardcode your token (FOR TESTING ONLY):
-
-```python
-client = InferenceClient(token="your_token_here") # TEMPORARY - REMOVE BEFORE DEPLOYING
-```
-
-⚠️ **NEVER commit hardcoded tokens to git!**
-
----
-
-Once local testing works, you're ready to deploy to Hugging Face Spaces! See `DEPLOYMENT_GUIDE.md` for deployment instructions.
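The deleted guide relies on a temporary `load_dotenv()` edit that has to be removed again before deploying. A minimal sketch of a loader that could stay in place in both environments, assuming `python-dotenv` may or may not be installed; `load_hf_token` is a hypothetical helper name and is not part of the `app.py` in this commit:

```python
import os


def load_hf_token(env_file=".env"):
    """Return HF_TOKEN from the environment, reading `env_file` first when possible."""
    try:
        # Optional dependency: typically installed locally, absent on Spaces.
        from dotenv import load_dotenv
    except ImportError:
        pass  # On Spaces the secret is already exposed as an environment variable.
    else:
        load_dotenv(env_file)  # Does nothing if the file is missing.

    token = os.getenv("HF_TOKEN", "")
    # Strip stray whitespace and quotes, the token problems the guide warns about.
    token = token.strip().strip("\"'")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to .env or to the Space secrets.")
    return token
```

For the port clash in the troubleshooting list, note that the `app.py` in this commit calls `demo.launch()` without parsing command-line arguments. Gradio does read the `GRADIO_SERVER_PORT` environment variable, so setting it to `7861` before running `python app.py` is one way to change the port.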
README.md
CHANGED
|
@@ -1,5 +1,5 @@
|
|
| 1 |
---
|
| 2 |
-
title: Fara-7B
|
| 3 |
emoji: 🤖
|
| 4 |
colorFrom: purple
|
| 5 |
colorTo: blue
|
|
@@ -8,75 +8,145 @@ sdk_version: 5.0.2
|
|
| 8 |
app_file: app.py
|
| 9 |
pinned: false
|
| 10 |
license: mit
|
| 11 |
-
short_description: Chat interface for Microsoft Fara-7B
|
| 12 |
---
|
| 13 |
|
| 14 |
-
# Fara-7B:
|
| 15 |
|
| 16 |
-
This Space provides a chat interface to interact with **Microsoft Fara-7B**,
|
| 17 |
|
| 18 |
## π Features
|
| 19 |
|
| 20 |
-
- **
|
| 21 |
-
- **
|
| 22 |
-
- **
|
| 23 |
-
- **
|
| 24 |
|
| 25 |
## π About Fara-7B
|
| 26 |
|
| 27 |
-
Fara-7B is Microsoft's
|
| 28 |
|
| 29 |
-
-
|
| 30 |
-
-
|
| 31 |
-
-
|
| 32 |
-
-
|
| 33 |
-
-
|
| 34 |
|
| 35 |
-
##
|
| 36 |
|
| 37 |
-
|
| 38 |
|
| 39 |
-
|
| 40 |
-
2. Go to **Settings** → **Variables and secrets**
|
| 41 |
-
3. Add a new secret:
|
| 42 |
-
- **Name**: `HF_TOKEN`
|
| 43 |
-
- **Value**: Your Hugging Face token (get it from [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens))
|
| 44 |
-
4. Ensure your token has **inference** permissions
|
| 45 |
-
5. Restart the Space
|
| 46 |
|
| 47 |
-
###
|
| 48 |
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
|
|
|
|
| 53 |
|
| 54 |
-
##
|
| 55 |
|
| 56 |
-
|
| 57 |
|
| 58 |
-
|
| 59 |
-
-
|
| 60 |
-
- "
|
|
|
|
| 61 |
|
| 62 |
-
|
| 63 |
|
| 64 |
## π Resources
|
| 65 |
|
| 66 |
-
-
|
| 67 |
-
-
|
| 68 |
-
- [
|
| 69 |
|
| 70 |
-
##
|
| 71 |
|
| 72 |
-
|
| 73 |
|
| 74 |
## π€ Credits
|
| 75 |
|
| 76 |
- **Model**: Microsoft Research
|
| 77 |
- **Interface**: Built with Gradio
|
| 78 |
-
- **
|
| 79 |
|
| 80 |
---
|
| 81 |
|
| 82 |
-
*
|
|
|
|
 ---
+title: Fara-7B Chat
 emoji: 🤖
 colorFrom: purple
 colorTo: blue
@@ -8,75 +8,145 @@ sdk_version: 5.0.2
 app_file: app.py
 pinned: false
 license: mit
+short_description: Chat interface for Microsoft Fara-7B web automation agent
 ---

+# Fara-7B: Web Automation Agent Chat Interface

+This Space provides a chat interface to interact with **Microsoft Fara-7B**, a 7B parameter vision-language model designed for web automation and computer use.

 ## Features

+- **Vision-Language Model**: Upload browser screenshots with your tasks
+- **Web Automation Planning**: Describes step-by-step actions for web tasks
+- **Safety-First**: Stops at "Critical Points" (checkout, personal info)
+- **Flexible Usage**: Works with or without screenshots

 ## About Fara-7B

+Fara-7B is Microsoft's specialized agentic model for computer use. With 7 billion parameters, it can:

+- 📸 Understand browser screenshots
+- 🎯 Plan multi-step web automation tasks
+- 🔧 Use browser tools (click, type, scroll)
+- Stop before sensitive actions (Critical Points)
+- 💡 Handle tasks like shopping, travel, research, and more

+### Key Capabilities

+- **Shopping automation**: Find products, add to cart
+- ✈️ **Travel booking**: Search flights and hotels
+- 🍽️ **Restaurant search**: Find dining options
+- **Information extraction**: Research and data gathering
+- 🏛️ **Government portals**: Navigate and extract grant/funding info

+## 🎯 How to Use

+### Simple Text Tasks
+Just describe what you want to accomplish:
+- "Find healthcare grants on the NSW government website"
+- "Search for running shoes under $100"
+- "Look up Italian restaurants in Seattle with 4+ stars"

+### Advanced: With Screenshots
+1. Take a screenshot of the browser/website you're working with
+2. Upload the screenshot
+3. Describe your task
+4. Fara-7B will analyze the screenshot and plan the next actions

+## ⚙️ Setup

+### For This Space

+1. **Request Model Access**:
+   - Visit [microsoft/Fara-7B](https://huggingface.co/microsoft/Fara-7B)
+   - Click "Request access" if it's gated
+   - Wait for approval

+2. **Set HF_TOKEN** (Space owners only):
+   - Go to Space Settings → Variables and secrets
+   - Add secret: `HF_TOKEN` = your HuggingFace token
+   - Get token from [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
+
+### Use Locally with Transformers
+
+```python
+from transformers import pipeline
+
+pipe = pipeline("image-text-to-text", model="microsoft/Fara-7B")
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "url": "screenshot.jpg"},
+            {"type": "text", "text": "Find running shoes under $100"}
+        ]
+    },
+]
+result = pipe(text=messages)
+```
+
+### Full Browser Automation (vLLM + CLI)
+
+For actual browser control with live automation:
+
+```bash
+# 1. Clone repository
+git clone https://github.com/microsoft/fara.git
+cd fara
+
+# 2. Setup environment
+python3 -m venv .venv
+source .venv/bin/activate # Windows: .venv\Scripts\activate
+pip install -e .
+playwright install
+
+# 3. Host the model
+vllm serve "microsoft/Fara-7B" --port 5000 --dtype auto
+
+# 4. Run tasks (in another terminal)
+fara-cli --task "your web automation task"
+```
+
+**System Requirements**:
+- GPU with 16GB+ VRAM
+- Or use `--tensor-parallel-size 2` if limited memory

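`vllm serve` exposes an OpenAI-compatible HTTP API, so the hosted model can also be queried directly rather than through `fara-cli`. A minimal sketch, assuming the server from the command above is listening on port 5000; `build_chat_payload` and `post_chat` are hypothetical helper names, and the payload omits the system prompt and tool definitions that the Fara repository supplies for real automation runs:

```python
import base64
import json
import urllib.request


def build_chat_payload(task, image_bytes=None, mime="image/png",
                       model="microsoft/Fara-7B"):
    """Build an OpenAI-style chat request with an optional inline screenshot."""
    content = []
    if image_bytes is not None:
        # Screenshots travel as base64 data URLs in the OpenAI chat format.
        encoded = base64.b64encode(image_bytes).decode("ascii")
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:{mime};base64,{encoded}"},
        })
    content.append({"type": "text", "text": task})
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": 512,
    }


def post_chat(payload, base_url="http://localhost:5000/v1"):
    """Send the payload to the local server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `post_chat(build_chat_payload("Find running shoes under $100", open("screenshot.png", "rb").read()))` would return the model's raw text reply. Turning that reply into browser actions is what the `fara-cli` tooling is for.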
 ## Resources

+- **Model Card**: [microsoft/Fara-7B](https://huggingface.co/microsoft/Fara-7B)
+- **GitHub Repository**: [microsoft/fara](https://github.com/microsoft/fara)
+- **Microsoft Research**: [Research Page](https://www.microsoft.com/en-us/research/)

+## ⚠️ Important Notes

+### Inference API Limitations
+
+This Space attempts to use the HuggingFace Inference API, but:
+- The API may not be fully available for Fara-7B
+- If unavailable, demo responses will be provided instead
+- For full functionality, host locally with vLLM (see above)
+
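The fallback described above depends on recognising why the API call failed. A minimal sketch that mirrors the substring checks used in this commit's `app.py`; `classify_api_error` and its return labels are hypothetical names introduced only for illustration:

```python
def classify_api_error(error):
    """Map an exception from the Inference API call to a coarse failure category."""
    text = str(error).lower()
    # Same substrings, checked in the same order, as in app.py's chat_with_fara.
    if "no api" in text or "not found" in text or "404" in text:
        return "unavailable"   # app.py answers with a demo response
    if "401" in text or "unauthorized" in text:
        return "auth"          # app.py explains how to set HF_TOKEN
    if "403" in text or "forbidden" in text:
        return "forbidden"     # app.py explains how to request model access
    return "unknown"           # app.py shows the error plus a demo response
```

Matching on message text is brittle, since a string such as `404` can also appear inside a URL. If the raised exception carries an HTTP response object, checking its status code would likely be more robust.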
+### Critical Points
+
+Fara-7B is designed to stop at "Critical Points":
+- **Checkout/Purchase**: Stops before payment
+- **Booking**: Stops before entering personal info
+- **Account Creation**: Stops before submitting sensitive data
+- **Communication**: Stops before making calls or sending emails
+
+This ensures safety and gives you control over sensitive actions.

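A chat front end could warn the user up front when a task is likely to end at a Critical Point. A minimal sketch, assuming the verb list from this commit's system prompt (Checkout, Book, Purchase, Call, Email, Order); `critical_verbs_in` is a hypothetical helper, and this keyword check is not how Fara-7B itself decides where to stop:

```python
import re

# Task verbs named as Critical Point triggers in the system prompt of app.py.
CRITICAL_VERBS = ("checkout", "book", "purchase", "call", "email", "order")

# Whole-word, case-insensitive match so that "Facebook" does not count as "book".
_PATTERN = re.compile(r"\b(" + "|".join(CRITICAL_VERBS) + r")\b", re.IGNORECASE)


def critical_verbs_in(task):
    """Return the trigger verbs found in `task`, lowercased, in order of appearance."""
    return [match.group(1).lower() for match in _PATTERN.finditer(task)]
```

An empty result means the task reads like pure research, such as the grants example. A non-empty result could prompt a note that the final step will be left to the user.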
 ## Credits

 - **Model**: Microsoft Research
 - **Interface**: Built with Gradio
+- **Infrastructure**: HuggingFace Spaces
+
+## License
+
+MIT License - See [microsoft/Fara-7B](https://huggingface.co/microsoft/Fara-7B) for model license details.

 ---

+*Experience web automation AI with Fara-7B. For production use cases requiring actual browser control, integrate with the full vLLM setup or use the Magentic-UI framework.*
app.py
CHANGED
|
@@ -1,211 +1,410 @@
|
|
| 1 |
import gradio as gr
|
| 2 |
from huggingface_hub import InferenceClient
|
| 3 |
import os
|
| 4 |
-
import
|
|
|
|
|
|
|
| 5 |
|
| 6 |
# Initialize the Inference Client
|
| 7 |
client = InferenceClient(token=os.getenv("HF_TOKEN"))
|
| 8 |
|
| 9 |
-
def
|
| 10 |
"""
|
| 11 |
-
|
| 12 |
"""
|
| 13 |
try:
|
| 14 |
-
# Build the
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
messages = [
|
| 20 |
-
{"role": "system", "content": system_prompt}
|
| 21 |
-
{"role": "user", "content": message}
|
| 22 |
]
|
| 23 |
|
| 24 |
-
#
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
# Extract the response
|
| 33 |
-
if hasattr(response, 'generated_text'):
|
| 34 |
-
return response.generated_text
|
| 35 |
-
elif isinstance(response, str):
|
| 36 |
-
return response
|
| 37 |
-
else:
|
| 38 |
-
return str(response)
|
| 39 |
-
|
| 40 |
-
except Exception as e:
|
| 41 |
-
error_msg = f"❌ Error: {str(e)}"
|
| 42 |
-
|
| 43 |
-
# Provide specific guidance based on common errors
|
| 44 |
-
if "401" in str(e):
|
| 45 |
-
error_msg += "\n\nπ Authentication failed. Please check:"
|
| 46 |
-
error_msg += "\n- Your HF_TOKEN is set in Space secrets"
|
| 47 |
-
error_msg += "\n- You have requested access to microsoft/Fara-7B"
|
| 48 |
-
error_msg += "\n- Your token has the necessary permissions"
|
| 49 |
-
elif "404" in str(e):
|
| 50 |
-
error_msg += "\n\nπ Model not found. The model might be:"
|
| 51 |
-
error_msg += "\n- Private and requiring access request"
|
| 52 |
-
error_msg += "\n- Temporarily unavailable"
|
| 53 |
-
elif "403" in str(e):
|
| 54 |
-
error_msg += "\n\n🚫 Access forbidden. You need to:"
|
| 55 |
-
error_msg += "\n- Visit https://huggingface.co/microsoft/Fara-7B"
|
| 56 |
-
error_msg += "\n- Click 'Access repository' to request access"
|
| 57 |
-
error_msg += "\n- Wait for approval from Microsoft"
|
| 58 |
-
|
| 59 |
-
return error_msg
|
| 60 |
-
|
| 61 |
-
# Alternative: Use text generation with proper formatting
|
| 62 |
-
def chat_with_fara_text_generation(message, history):
|
| 63 |
-
"""
|
| 64 |
-
Alternative approach using text generation with proper prompt formatting
|
| 65 |
-
"""
|
| 66 |
-
try:
|
| 67 |
-
# Format prompt for agent tasks
|
| 68 |
-
prompt = f"""<|system|>
|
| 69 |
-
You are Fara, a web automation agent designed to help users with web-based tasks.
|
| 70 |
-
|
| 71 |
-
When responding:
|
| 72 |
-
1. Break down complex web tasks into steps
|
| 73 |
-
2. Suggest specific actions that could be automated
|
| 74 |
-
3. Identify potential challenges in web automation
|
| 75 |
-
4. Provide practical guidance for browser automation
|
| 76 |
-
|
| 77 |
-
<|user|>
|
| 78 |
-
{message}
|
| 79 |
-
<|assistant|>
|
| 80 |
-
"""
|
| 81 |
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
max_new_tokens=500,
|
| 86 |
-
temperature=0.7,
|
| 87 |
-
do_sample=True
|
| 88 |
-
)
|
| 89 |
|
| 90 |
-
#
|
| 91 |
-
|
| 92 |
-
response = response.split("<|assistant|>")[-1].strip()
|
| 93 |
|
| 94 |
-
|
|
|
|
|
|
|
|
|
|
| 95 |
|
| 96 |
except Exception as e:
|
| 97 |
-
return f"β
|
| 98 |
|
| 99 |
-
|
| 100 |
-
def fallback_chat(message, history):
|
| 101 |
"""
|
| 102 |
-
|
| 103 |
"""
|
| 104 |
-
fallback_responses = {
|
| 105 |
-
"web automation": "For web automation tasks like the NSW grants search, you would typically:\n\n1. Navigate to https://www.nsw.gov.au/grants-and-funding\n2. Use search functionality to filter for 'healthcare' grants\n3. Extract the list of available funding opportunities\n4. Provide summaries with eligibility criteria and deadlines",
|
| 106 |
-
|
| 107 |
-
"general": "I'd be happy to help with web automation tasks! For tasks like finding grants on government websites, the process involves:\n- Website navigation\n- Search and filtering\n- Data extraction\n- Result organization"
|
| 108 |
-
}
|
| 109 |
-
|
| 110 |
-
# Simple keyword-based fallback
|
| 111 |
message_lower = message.lower()
|
| 112 |
-
|
| 113 |
-
|
| 114 |
-
|
| 115 |
-
return
|
| 116 |
|
| 117 |
-
|
| 118 |
-
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
|
| 128 |
|
| 129 |
-
#
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
|
| 135 |
-
|
| 136 |
|
| 137 |
-
#
|
| 138 |
-
|
| 139 |
|
| 140 |
# Create the Gradio interface
|
| 141 |
-
with gr.Blocks(theme=gr.themes.Soft()) as demo:
|
| 142 |
gr.Markdown(
|
| 143 |
"""
|
| 144 |
-
# 🤖 Fara-7B Web Automation
|
| 145 |
|
| 146 |
-
**Microsoft's specialized
|
| 147 |
|
| 148 |
-
|
| 149 |
-
- Web navigation and automation
|
| 150 |
-
- Task planning for browser actions
|
| 151 |
-
- Step-by-step guidance for web tasks
|
| 152 |
|
| 153 |
-
|
| 154 |
"""
|
| 155 |
)
|
| 156 |
|
| 157 |
-
|
| 158 |
-
with gr.Accordion("π Access Requirements", open=False):
|
| 159 |
gr.Markdown("""
|
| 160 |
-
|
| 161 |
-
|
| 162 |
-
|
| 163 |
-
|
|
|
|
|
|
|
|
|
|
| 164 |
|
| 165 |
-
|
| 166 |
""")
|
| 167 |
|
| 168 |
chatbot = gr.Chatbot(
|
| 169 |
height=500,
|
| 170 |
-
label="
|
| 171 |
-
show_label=True
|
|
|
|
| 172 |
)
|
| 173 |
|
| 174 |
with gr.Row():
|
| 175 |
-
|
| 176 |
-
|
| 177 |
-
|
| 178 |
-
|
| 179 |
-
|
| 180 |
-
|
| 181 |
-
|
| 182 |
|
| 183 |
with gr.Row():
|
|
|
|
| 184 |
clear_btn = gr.Button("Clear Chat")
|
| 185 |
-
method_btn = gr.Button("Check Access Status")
|
| 186 |
|
| 187 |
-
|
|
|
|
| 188 |
|
| 189 |
-
|
| 190 |
-
|
| 191 |
-
|
| 192 |
-
|
| 193 |
|
| 194 |
-
|
| 195 |
-
|
| 196 |
-
|
| 197 |
-
|
| 198 |
-
|
| 199 |
-
|
| 200 |
-
|
| 201 |
-
|
| 202 |
-
|
| 203 |
-
|
| 204 |
-
|
| 205 |
-
|
| 206 |
-
|
| 207 |
|
| 208 |
if __name__ == "__main__":
|
| 209 |
-
# On HuggingFace Spaces, we don't need to specify server settings
|
| 210 |
-
# The platform handles this automatically
|
| 211 |
demo.launch()
|
|
|
|
 import gradio as gr
 from huggingface_hub import InferenceClient
 import os
+from PIL import Image
+import requests
+from io import BytesIO

 # Initialize the Inference Client
 client = InferenceClient(token=os.getenv("HF_TOKEN"))

+def create_demo_screenshot(task_type="general"):
     """
+    Create a simple placeholder screenshot for demo purposes
+    In actual use, this would be a real browser screenshot
+    """
+    # For now, return None - we'll use text-only mode
+    return None
+
+def chat_with_fara(message, history, image=None):
+    """
+    Interact with Fara-7B using the vision-language model API
     """
     try:
+        # Build the proper message format for Fara-7B
+        system_prompt = """You are a web automation agent that performs actions on websites to fulfill user requests by calling various tools.
+You should stop execution at Critical Points. A Critical Point occurs in tasks like:
+- Checkout, Book, Purchase, Call, Email, Order
+
+A Critical Point requires the user's permission or personal/sensitive information (name, email, credit card, address, payment information, resume, etc.) to complete a transaction (purchase, reservation, sign-up, etc.), or to communicate as a human would (call, email, apply to a job, etc.).
+
+Guideline: Solve the task as far as possible up until a Critical Point.
+
+Examples:
+- If the task is to "call a restaurant to make a reservation," do not actually make the call. Instead, navigate to the restaurant's page and find the phone number.
+- If the task is to "order new size 12 running shoes," do not place the order. Instead, search for the right shoes that meet the criteria and add them to the cart.
+
+Some tasks, like answering questions, may not encounter a Critical Point at all."""
+
+        # Prepare messages in the format expected by Fara-7B
         messages = [
+            {"role": "system", "content": system_prompt}
         ]

+        # Add history
+        if history:
+            for h in history:
+                if h["role"] in ["user", "assistant"]:
+                    messages.append(h)
+
+        # Add current message
+        user_content = []

+        # Add image if provided
+        if image is not None:
+            user_content.append({"type": "image", "image": image})

+        # Add text
+        user_content.append({"type": "text", "text": message})

+        messages.append({
+            "role": "user",
+            "content": user_content if len(user_content) > 1 else message
+        })

+        # Try to use the Inference API
+        try:
+            response = client.chat_completion(
+                messages=messages,
+                model="microsoft/Fara-7B",
+                max_tokens=512,
+                temperature=0.7,
+            )
+
+            # Extract the response
+            if hasattr(response, 'choices') and len(response.choices) > 0:
+                return response.choices[0].message.content
+            else:
+                raise Exception("Unexpected response format")
+
+        except Exception as api_error:
+            error_str = str(api_error).lower()
+
+            # Check for specific errors
+            if "no api" in error_str or "not found" in error_str or "404" in error_str:
+                # Model doesn't have Inference API - provide helpful demo response
+                return generate_demo_response(message)
+            elif "401" in error_str or "unauthorized" in error_str:
+                return """❌ **Authentication Error**
+
+Please check:
+1. Your `HF_TOKEN` is set in Space secrets
+2. You have requested access to [microsoft/Fara-7B](https://huggingface.co/microsoft/Fara-7B)
+3. Your token has read permissions
+
+To use Fara-7B locally instead:
+```bash
+git clone https://github.com/microsoft/fara.git
+cd fara
+pip install -e .
+playwright install
+vllm serve "microsoft/Fara-7B" --port 5000
+```
+"""
+            elif "403" in error_str or "forbidden" in error_str:
+                return """❌ **Access Forbidden**
+
+You need to request access to the model:
+1. Visit: https://huggingface.co/microsoft/Fara-7B
+2. Click "Request access to this repository"
+3. Wait for Microsoft to approve your request
+
+Once approved, make sure your `HF_TOKEN` is set in Space secrets.
+"""
+            else:
+                # Unknown error - try demo mode
+                return f"⚠️ API Error: {str(api_error)}\n\n**Demo Response:**\n\n" + generate_demo_response(message)
+
     except Exception as e:
+        return f"❌ Error: {str(e)}\n\nPlease check the Space logs for more details."

+def generate_demo_response(message):
     """
+    Generate a helpful demo response when the API is not available
     """
     message_lower = message.lower()
+
+    # Shopping/E-commerce tasks
+    if any(word in message_lower for word in ['buy', 'shop', 'purchase', 'order', 'cart', 'shoes', 'product']):
+        return """**Task: Shopping/Purchase**

+**Action Plan:**
+1. Navigate to e-commerce website
+2. Search for: [extracted product from your query]
+3. Apply filters: price, rating, availability
+4. ✅ Select best match
+5. Add to cart
+6. **STOP** - Critical Point: Checkout requires payment info
+
+**What I would do with a screenshot:**
+- Identify search bar location
+- Read product listings
+- Click appropriate buttons
+- Navigate to cart
+
+**Next steps for you:**
+- Review cart
+- Complete checkout manually
+
+💡 *Note: The Inference API may not be available for this model. For full functionality, host locally with vLLM.*
+"""

+    # Travel/booking tasks
+    elif any(word in message_lower for word in ['flight', 'hotel', 'travel', 'book', 'trip']):
+        return """✈️ **Task: Travel Booking**
+
+**Action Plan:**
+1. Navigate to travel site
+2. 📅 Enter dates and destination
+3. Search options
+4. 💰 Sort by price/rating
+5. Compare top results
+6. **STOP** - Critical Point: Booking requires personal info
+
+**What I would do with a screenshot:**
+- Find date pickers
+- Enter search criteria
+- Click search button
+- Read results table
+
+**Next steps for you:**
+- Review options
+- Complete booking manually
+
+💡 *Note: The Inference API may not be available for this model. For full functionality, host locally with vLLM.*
+"""

+    # Restaurant tasks
+    elif any(word in message_lower for word in ['restaurant', 'food', 'dining', 'reservation']):
+        return """🍽️ **Task: Restaurant Search**
+
+**Action Plan:**
+1. Search for restaurants
+2. Filter by location and cuisine
+3. Check ratings and reviews
+4. Find contact info
+5. **STOP** - Critical Point: Reservation requires personal info
+
+**What I would do with a screenshot:**
+- Identify search results
+- Read restaurant details
+- Extract phone number
+- Locate reservation link
+
+**Next steps for you:**
+- Call or book reservation manually
+
+💡 *Note: The Inference API may not be available for this model. For full functionality, host locally with vLLM.*
+"""
+
+    # Government/grants (your specific use case!)
+    elif any(word in message_lower for word in ['grant', 'funding', 'government', 'nsw', 'healthcare']):
+        return """🏛️ **Task: Government Grants Research**
+
+**Action Plan:**
+1. Navigate to government grants portal
+2. Use search functionality
+3. Filter by: healthcare, eligibility, deadline
+4. Extract grant information
+5. ✅ **COMPLETE** - No Critical Point
+
+**What I would do with a screenshot:**
+- Locate search bar
+- Read grant listings
+- Extract key details:
+  - Grant title
+  - Funding amount
+  - Eligibility criteria
+  - Application deadline
+  - Contact information
+
+**Example output:**
+```
+Grant: Healthcare Innovation Fund
+Amount: $50,000 - $500,000
+Eligibility: Registered healthcare providers
+Deadline: March 31, 2024
+Link: [grant URL]
+```
+
+💡 *Note: The Inference API may not be available for this model. For full functionality, host locally with vLLM.*
+"""
+
+    # General response
+    else:
+        return """🤖 **Fara-7B Web Automation Agent**
+
+I help with web automation tasks! I can:
+
+✅ Shopping & e-commerce
+✅ Travel & booking
+✅ Restaurant search
+✅ Information extraction
+✅ Government portals & grants
+✅ Account navigation
+
+**How I work:**
+1. 📸 Analyze browser screenshot (when provided)
+2. 🎯 Understand your goal
+3. Plan step-by-step actions
+4. 🔧 Use browser tools (click, type, scroll)
+5. Stop at Critical Points (checkout, personal info)
+
+**Example tasks:**
+- "Find running shoes under $100"
+- "Search for flights to Tokyo"
+- "Find healthcare grants on the NSW government website"
+- "Look up Italian restaurants in Seattle"
+
+**To use with screenshots:**
+Upload a browser screenshot and describe your task!
+
+💡 *Note: The Inference API may not be available for this model. For full functionality, host locally with vLLM:*
+```bash
+vllm serve "microsoft/Fara-7B" --port 5000 --dtype auto
+```
+"""

| 268 |
# Create the Gradio interface
|
| 269 |
+
with gr.Blocks(theme=gr.themes.Soft(), title="Fara-7B Chat") as demo:
|
| 270 |
gr.Markdown(
|
| 271 |
"""
|
| 272 |
+
# π€ Fara-7B Web Automation Agent
|
| 273 |
|
| 274 |
+
**Microsoft's specialized vision-language model for web automation**
|
| 275 |
|
| 276 |
+
Fara-7B can analyze browser screenshots and plan web automation tasks.
|
|
|
|
|
|
|
|
|
|
| 277 |
|
| 278 |
+
π‘ **How to use:**
|
| 279 |
+
- Upload a browser screenshot (optional)
|
| 280 |
+
- Describe your web automation task
|
| 281 |
+
- Fara-7B will plan the actions needed
|
| 282 |
+
|
| 283 |
+
β οΈ **Note**: The Inference API may not be fully available for this model. For complete functionality including actual browser control, host locally with vLLM (see instructions below).
|
| 284 |
"""
|
| 285 |
)
|
| 286 |
+    with gr.Accordion("📚 About Fara-7B & Setup Instructions", open=False):
         gr.Markdown("""
+        ### What is Fara-7B?
+
+        Fara-7B is a 7B parameter vision-language model designed for computer use. It can:
+        - Understand browser screenshots
+        - Plan multi-step web automation tasks
+        - Use tools (click, type, scroll, etc.)
+        - Stop at "Critical Points" for safety

+        ### Using Transformers Library (Colab/Local)
+
+        ```python
+        from transformers import pipeline
+
+        pipe = pipeline("image-text-to-text", model="microsoft/Fara-7B")
+        messages = [
+            {
+                "role": "user",
+                "content": [
+                    {"type": "image", "url": "screenshot.jpg"},
+                    {"type": "text", "text": "Find running shoes"}
+                ]
+            },
+        ]
+        result = pipe(text=messages)
+        ```
+
+        ### Full Browser Automation (Local)
+
+        ```bash
+        # Clone repository
+        git clone https://github.com/microsoft/fara.git
+        cd fara
+
+        # Setup environment
+        python3 -m venv .venv
+        source .venv/bin/activate
+        pip install -e .
+        playwright install
+
+        # Host model
+        vllm serve "microsoft/Fara-7B" --port 5000 --dtype auto
+
+        # Run tasks
+        fara-cli --task "your task here"
+        ```
+
+        **Resources:**
+        - Model: https://huggingface.co/microsoft/Fara-7B
+        - GitHub: https://github.com/microsoft/fara
         """)

     chatbot = gr.Chatbot(
         height=500,
+        label="Chat",
+        show_label=True,
+        type="messages"
     )

     with gr.Row():
+        with gr.Column(scale=3):
+            msg = gr.Textbox(
+                label="Task Description",
+                placeholder="Example: Find healthcare grants on the NSW government website...",
+                lines=2
+            )
+        with gr.Column(scale=1):
+            image_input = gr.Image(
+                label="Browser Screenshot (Optional)",
+                type="pil",
+                height=100
+            )

     with gr.Row():
+        send_btn = gr.Button("Send", variant="primary")
         clear_btn = gr.Button("Clear Chat")

+    gr.Markdown("""
+    ### 💡 Tips for Best Results

+    - **With screenshot**: Upload a browser screenshot and describe what you want to accomplish
+    - **Without screenshot**: Describe the web task, and Fara-7B will plan the approach
+    - **Be specific**: Include details like website, search criteria, budget, etc.
+    - **Critical Points**: Fara-7B will stop before checkout, booking, or entering personal info

+    ### 🎯 Example Tasks
+
+    - "Find healthcare grants for digital health projects in Australia"
+    - "Search for running shoes under $100 on this e-commerce page"
+    - "Look up restaurants in Seattle with 4+ stars for Italian food"
+    - "Find the contact information on this website"
+    """)
+
+    def respond(message, image, chat_history):
+        if not message.strip():
+            return chat_history, None
+
+        # Add user message to history
+        user_msg = {"role": "user", "content": message}
+        chat_history.append(user_msg)
+
+        # Get response from Fara
+        response = chat_with_fara(message, chat_history, image)
+
+        # Add assistant response to history
+        assistant_msg = {"role": "assistant", "content": response}
+        chat_history.append(assistant_msg)
+
+        return chat_history, None
+
+    def clear_chat():
+        return [], None
+
+    msg.submit(respond, [msg, image_input, chatbot], [chatbot, image_input]).then(
+        lambda: ("", None), None, [msg, image_input]
+    )
+    send_btn.click(respond, [msg, image_input, chatbot], [chatbot, image_input]).then(
+        lambda: ("", None), None, [msg, image_input]
+    )
+    clear_btn.click(clear_chat, outputs=[chatbot, image_input])

 if __name__ == "__main__":
     demo.launch()
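Review note: the `respond` handler in this diff builds the chat history as a list of `{"role", "content"}` dicts, which is the shape `gr.Chatbot(type="messages")` expects. The sketch below isolates that logic so it can be exercised without Gradio or the model. It is only an illustration under stated assumptions: `fake_model` is a hypothetical stand-in for `chat_with_fara`, the `model` parameter is added here for testability, and, unlike the diff, the model receives only the earlier turns instead of a history that already contains the current message.

```python
from typing import Callable, Optional


def respond(message: str, image: Optional[object], chat_history: list,
            model: Callable[[str, list, Optional[object]], str]):
    """Append one user/assistant exchange to a messages-format history."""
    # Ignore empty or whitespace-only input, as the handler in the diff does.
    if not message or not message.strip():
        return chat_history, None

    # Snapshot the earlier turns so the model is not handed the current
    # message twice (an illustrative choice, not what the diff does).
    prior_turns = list(chat_history)

    reply = model(message, prior_turns, image)

    new_history = prior_turns + [
        {"role": "user", "content": message},
        {"role": "assistant", "content": reply},
    ]
    # The second return value resets the image input, matching the
    # [chatbot, image_input] outputs wired up in the diff.
    return new_history, None


def fake_model(message, history, image):
    # Hypothetical stand-in for chat_with_fara; returns a canned reply.
    return f"plan for: {message} ({len(history)} earlier messages)"


if __name__ == "__main__":
    history, cleared_image = respond(
        "Find running shoes under $100", None, [], fake_model
    )
    print(history)
```

Returning a new list rather than mutating `chat_history` in place keeps the function easy to test; either approach should work with the event wiring shown above.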
requirements.txt
CHANGED
@@ -1,2 +1,3 @@
 gradio==5.0.2
-huggingface-hub==0.26.2
+huggingface-hub==0.26.2
+Pillow
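Review note: the app text recommends hosting the model with `vllm serve "microsoft/Fara-7B" --port 5000`, and `Pillow` is added here because the screenshot input uses `type="pil"`. As a rough sketch of how a screenshot and task could be packaged for such a server, the helper below builds an OpenAI-style chat payload with the image as a base64 data URL. The function name `build_chat_payload` is hypothetical, the payload shape assumes vLLM's OpenAI-compatible `/v1/chat/completions` endpoint, and any system prompt or tool schema that Fara-7B may require is not covered. A PIL image would first need to be serialized to PNG bytes, for example with `image.save(buffer, format="PNG")` on an `io.BytesIO` buffer.

```python
import base64
import json
from typing import Optional


def build_chat_payload(task: str, png_bytes: Optional[bytes] = None,
                       model: str = "microsoft/Fara-7B") -> dict:
    """Build an OpenAI-style chat request body with an optional screenshot."""
    content = []
    if png_bytes is not None:
        # Embed the screenshot inline as a base64 data URL.
        encoded = base64.b64encode(png_bytes).decode("ascii")
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{encoded}"},
        })
    # The task description always goes last, after the optional image.
    content.append({"type": "text", "text": task})
    return {"model": model, "messages": [{"role": "user", "content": content}]}


if __name__ == "__main__":
    # Placeholder bytes for illustration only; not a valid PNG file.
    payload = build_chat_payload("Find running shoes under $100", b"\x89PNG")
    print(json.dumps(payload, indent=2))
```

The resulting dict could then be sent as JSON to `http://localhost:5000/v1/chat/completions`, assuming the server was started on port 5000 as in the command above.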