websight-7B / README.md

tanvirb

bring back readme

0559a8f 5 months ago

preview code

raw

history blame contribute delete

1.03 kB

metadata

license: apache-2.0
base_model: ByteDance-Seed/UI-TARS-1.5-7B
tags:
  - vision
  - web-agents
  - browser-automation
  - websight
library_name: transformers
pipeline_tag: image-text-to-text

Websight-7B (Merged)

This is a merged version of the Websight-7B model, ready for deployment and inference.

Model Details

Base Model: ByteDance-Seed/UI-TARS-1.5-7B
Source PEFT Model: Asanshay/websight-7B (previous model saved here)
Model Type: Vision-Language Model for Web Agent Tasks
License: Apache 2.0

Usage

from transformers import pipeline

# Load the model
pipe = pipeline("image-text-to-text", model="tanvirb/websight-7B")

# Use for web agent tasks
result = pipe(text="Click the login button", images=[screenshot])

Deployment

This model is ready for:

Hugging Face Inference Endpoints
Local inference
Integration with web automation pipelines

Training

This model was fine-tuned using PEFT (Parameter Efficient Fine-Tuning) techniques on web interaction data.