---
license: apache-2.0
base_model: ByteDance-Seed/UI-TARS-1.5-7B
tags:
- vision
- web-agents
- browser-automation
- websight
library_name: transformers
pipeline_tag: image-text-to-text
---

# Websight-7B (Merged)

This is a merged version of the Websight-7B model, ready for deployment and inference.

## Model Details

- **Base Model**: ByteDance-Seed/UI-TARS-1.5-7B
- **Source PEFT Model**: Asanshay/websight-7B (previous model saved here)
- **Model Type**: Vision-Language Model for Web Agent Tasks
- **License**: Apache 2.0

## Usage

```python
from transformers import pipeline

# Load the model
pipe = pipeline("image-text-to-text", model="tanvirb/websight-7B")

# Use for web agent tasks
result = pipe(text="Click the login button", images=[screenshot])
```

## Deployment

This model is ready for:
- Hugging Face Inference Endpoints
- Local inference
- Integration with web automation pipelines

## Training

This model was fine-tuned using PEFT (Parameter Efficient Fine-Tuning) techniques on web interaction data.