---
license: apache-2.0
base_model: ByteDance-Seed/UI-TARS-1.5-7B
tags:
- vision
- web-agents
- browser-automation
- websight
library_name: transformers
pipeline_tag: image-text-to-text
---

# Websight-7B (Merged)

This is a merged version of the Websight-7B model, ready for deployment and inference.

## Model Details

- **Base Model**: ByteDance-Seed/UI-TARS-1.5-7B
- **Source PEFT Model**: Asanshay/websight-7B (previous model saved here)
- **Model Type**: Vision-Language Model for Web Agent Tasks
- **License**: Apache 2.0

## Usage

```python
from PIL import Image
from transformers import pipeline

# Load the merged model
pipe = pipeline("image-text-to-text", model="tanvirb/websight-7B")

# Load a browser screenshot (the file path is a placeholder)
screenshot = Image.open("screenshot.png")

# Use for web agent tasks
result = pipe(text="Click the login button", images=[screenshot])
```

## Deployment

This model is ready for:

- Hugging Face Inference Endpoints
- Local inference
- Integration with web automation pipelines

## Training

This model was fine-tuned on web interaction data using PEFT (Parameter-Efficient Fine-Tuning) techniques. The resulting adapter was then merged into the base model weights.
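
To illustrate what "merged" means, here is a minimal sketch that assumes a LoRA-style adapter. This card only says "PEFT", so the adapter type is an assumption, and the function names and toy values below are hypothetical. The sketch folds a low-rank update into a base weight matrix so that no separate adapter is needed at inference time.

```python
# Illustrative only: assumes a LoRA-style adapter, which this card does not confirm.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]

def matvec(weight, x):
    """Compute weight @ x for a vector x given as a flat list."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

# Toy 2x2 base weight with a rank-1 adapter (all values are made up).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]       # shape (rank, in_features)
B = [[0.5], [0.25]]    # shape (out_features, rank)

merged = merge_lora(W, A, B, alpha=2.0, rank=1)
output = matvec(merged, [1.0, 1.0])
```

With the `peft` library, the same step is usually done by loading the adapter with `PeftModel.from_pretrained` and calling `merge_and_unload()`. Whether this model was produced that way is not stated here.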