title: Action2Vision
emoji: 🤖
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false

Action2Vision: InstructPix2Pix Fine-tuning for Robotic Action Frame Prediction

GitHub: https://github.com/yutengzhang03/Action2Vision

Example

To use InstructPix2Pix, install diffusers from the main branch for now. The pipeline will be available in the next release.

pip install diffusers accelerate safetensors transformers

import PIL.Image  # import the submodule explicitly; a bare "import PIL" does not load PIL.Image by itself
import requests
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler

model_id = "yutengz/Action2Vision"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16, safety_checker=None)
pipe.to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)


# Download an image, convert it to RGB, and resize it to the 256x256 input size used in this example.
def download_image(url):
    image = PIL.Image.open(requests.get(url, stream=True).raw).convert("RGB").resize((256, 256))
    return image

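# "?raw=true" makes GitHub serve the image bytes; the plain blob URL returns an HTML page that PIL cannot open.
url = "https://github.com/yutengzhang03/Action2Vision/blob/main/img/source.png?raw=true"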
image = download_image(url)
prompt = "There is a hammer and a block in the middle of the table. If the block is closer to the left robotic arm, it uses the left arm to pick up the hammer and strike the block; otherwise, it does the opposite."
images = pipe(prompt, image=image).images
images[0]
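
The call above relies on the pipeline's default sampling settings. As a minimal sketch, the wrapper below makes the usual InstructPix2Pix knobs explicit: `num_inference_steps`, `guidance_scale` (how strongly the text instruction is followed), and `image_guidance_scale` (how closely the output stays to the input frame). The function name `predict_frame` and the chosen values are illustrative assumptions, not settings reported by the author.

```python
def predict_frame(pipe, image, prompt, steps=20, text_scale=7.5, image_scale=1.5):
    """Run one instruction-conditioned edit and return the first generated frame.

    `pipe` is expected to be a StableDiffusionInstructPix2PixPipeline (or any
    callable with the same interface). The default values here are illustrative
    assumptions, not tuned for this model.
    """
    result = pipe(
        prompt,
        image=image,
        num_inference_steps=steps,         # more steps: slower, often cleaner output
        guidance_scale=text_scale,         # higher: follow the instruction more strongly
        image_guidance_scale=image_scale,  # higher: stay closer to the input image
    )
    # The pipeline output exposes the generated PIL images as a list.
    return result.images[0]
```

With the `pipe`, `image`, and `prompt` defined in the example above, `predict_frame(pipe, image, prompt).save("prediction.png")` would write the predicted frame to disk; the output filename is hypothetical.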