
OpenTryOn API Adapters

The tryon.api module provides adapters for various virtual try-on and image generation APIs. These adapters offer a unified interface for interacting with different cloud-based AI services, making it easy to switch between providers or use multiple services in your applications.

Table of Contents

Overview
Virtual Try-On APIs
- Amazon Nova Canvas
- Kling AI
- Segmind
Image Generation APIs
- Nano Banana (Gemini 2.5 Flash Image)
- Nano Banana Pro (Gemini 3 Pro Image Preview)
Installation
Quick Start
API Comparison
Common Patterns
Error Handling
Best Practices

Overview

The API adapters in this module follow a consistent design pattern:

Unified Interface: All adapters provide similar methods (generate(), generate_and_decode()) for consistency
Flexible Input: Support for file paths, URLs, PIL Images, file-like objects, and base64 strings
Automatic Handling: Automatic image format conversion and validation
Error Handling: Comprehensive error messages and validation
Environment Variables: Support for configuration via environment variables
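To make the "Flexible Input" point concrete, here is a minimal sketch of how several input kinds could be normalized to raw image bytes. It is not the module's actual implementation: `load_image_bytes` is a hypothetical helper, PIL images are detected by duck typing so the snippet runs with the standard library alone, and the order of checks is an assumption.

```python
import base64
import binascii
import io
import os
from urllib.request import urlopen


def load_image_bytes(source):
    """Return encoded image bytes for several input kinds (hypothetical helper)."""
    if isinstance(source, (bytes, bytearray)):
        return bytes(source)
    # Duck-typed PIL.Image.Image: re-encode it as PNG in memory.
    if hasattr(source, "save") and hasattr(source, "size"):
        buffer = io.BytesIO()
        source.save(buffer, format="PNG")
        return buffer.getvalue()
    # File-like object opened in binary mode.
    if hasattr(source, "read"):
        return source.read()
    if isinstance(source, str):
        if source.startswith(("http://", "https://")):
            with urlopen(source, timeout=30) as response:
                return response.read()
        if os.path.isfile(source):
            with open(source, "rb") as handle:
                return handle.read()
        # Anything else is treated as a base64 string.
        try:
            return base64.b64decode(source, validate=True)
        except (binascii.Error, ValueError) as exc:
            raise ValueError(f"Unrecognized image input: {source[:40]!r}") from exc
    raise TypeError(f"Unsupported image input type: {type(source).__name__}")
```

Strings are checked as URL, then existing file, then base64, so a mistyped file path fails with a `ValueError` rather than being silently decoded.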

Virtual Try-On APIs

Amazon Nova Canvas

Amazon Nova Canvas provides virtual try-on capabilities through AWS Bedrock, allowing you to combine a source image (person) with a reference image (garment) to create realistic try-on results.

Features:

Automatic garment detection and masking
Custom mask image support
Multiple garment classes (Upper body, Lower body, Full body, Footwear)
AWS region support (us-east-1, ap-northeast-1, eu-west-1)
Maximum image size: 4.1M pixels (2048x2048)

Reference: AWS Blog Post

Prerequisites:

AWS account with Bedrock access
Nova Canvas model enabled in AWS Bedrock console
AWS credentials configured (via .env or AWS CLI)

Example:

from tryon.api import AmazonNovaCanvasVTONAdapter

# Initialize adapter
adapter = AmazonNovaCanvasVTONAdapter(region="us-east-1")

# Generate virtual try-on
images = adapter.generate_and_decode(
    source_image="person.jpg",
    reference_image="shirt.jpg",
    mask_type="GARMENT",  # Options: "GARMENT", "IMAGE"
    garment_class="UPPER_BODY"  # Options: "UPPER_BODY", "LOWER_BODY", "FULL_BODY", "FOOTWEAR"
)

# Save results
for idx, image in enumerate(images):
    image.save(f"result_{idx}.png")

Environment Variables:

AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AMAZON_NOVA_REGION=us-east-1  # Optional
AMAZON_NOVA_MODEL_ID=amazon.nova-canvas-v1:0  # Optional

Kling AI

Kling AI provides virtual try-on capabilities through their Kolors API, combining a human image with a cloth image to generate realistic try-on results with automatic asynchronous processing.

Features:

Asynchronous task processing with automatic polling
Multiple model versions (v1, v1-5)
Maximum image size: 16M pixels (4096x4096)
Webhook support for async results
Regional endpoint support

Reference: Kling AI API Documentation

Prerequisites:

Kling AI account
API key and secret key from Kling AI Developer Portal
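The API key and secret key pair, together with the PyJWT dependency listed under Installation, suggests that requests are authenticated with a signed JWT. The sketch below shows what such a token could look like. The claim names (`iss`, `exp`, `nbf`), the HS256 algorithm, and the 30-minute lifetime are assumptions, not taken from this module, and `build_hs256_jwt` is a hypothetical helper. It uses only the standard library so it runs without extra packages; PyJWT's `jwt.encode` is the usual way to build an equivalent token.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_hs256_jwt(api_key, secret_key, ttl_seconds=1800, now=None):
    """Build an HS256-signed JWT (hypothetical helper; claims are assumptions)."""
    issued = int(time.time()) if now is None else int(now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "iss": api_key,                # assumed: the API key identifies the issuer
        "exp": issued + ttl_seconds,   # assumed: token expires after ttl_seconds
        "nbf": issued - 5,             # assumed: small allowance for clock skew
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret_key.encode("utf-8"),
        signing_input.encode("ascii"),
        hashlib.sha256,
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

A token built this way would typically be sent in an `Authorization: Bearer <token>` header and regenerated before it expires.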

Example:

from tryon.api import KlingAIVTONAdapter

# Initialize adapter
adapter = KlingAIVTONAdapter()

# Generate virtual try-on (automatically polls until completion)
images = adapter.generate_and_decode(
    source_image="person.jpg",
    reference_image="shirt.jpg",
    model="kolors-virtual-try-on-v1-5"  # Optional
)

# Save results
images[0].save("result.png")

Environment Variables:

KLING_AI_API_KEY=your_api_key
KLING_AI_SECRET_KEY=your_secret_key
KLING_AI_BASE_URL=https://api-singapore.klingai.com  # Optional

Model Versions:

kolors-virtual-try-on-v1: Original model version
kolors-virtual-try-on-v1-5: Enhanced version (recommended)

Segmind

Segmind provides virtual try-on capabilities through their Try-On Diffusion API, combining a model image (person) with a cloth image (garment) to create realistic try-on results.

Features:

Fast synchronous processing
Multiple garment categories (Upper body, Lower body, Dress)
Customizable inference parameters (steps, guidance scale, seed)
Simple API key authentication

Reference: Segmind Try-On Diffusion API

Prerequisites:

Segmind account
API key from Segmind API Portal

Example:

from tryon.api import SegmindVTONAdapter

# Initialize adapter
adapter = SegmindVTONAdapter()

# Generate virtual try-on
images = adapter.generate_and_decode(
    model_image="person.jpg",
    cloth_image="shirt.jpg",
    category="Upper body",  # Options: "Upper body", "Lower body", "Dress"
    num_inference_steps=35,  # Optional: 20-100, default: 25
    guidance_scale=2.5,  # Optional: 1-25, default: 2
    seed=42  # Optional: -1 to 999999999999999, default: -1
)

# Save results
images[0].save("result.png")

Environment Variables:

SEGMIND_API_KEY=your_api_key

Garment Categories:

"Upper body": Tops, shirts, jackets, hoodies (default)
"Lower body": Pants, skirts, shorts
"Dress": Dresses, jumpsuits

Inference Parameters:

num_inference_steps: Number of denoising steps (20-100, default: 25)
guidance_scale: Classifier-free guidance scale (1-25, default: 2)
seed: Random seed for reproducibility (-1 for random, default: -1)
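The ranges above can be checked locally before a request is sent, which avoids spending a call on parameters the API would reject. This is a minimal sketch: `validate_segmind_params` is a hypothetical helper rather than part of the adapter, and it only encodes the limits listed in this section.

```python
def validate_segmind_params(num_inference_steps=25, guidance_scale=2.0, seed=-1):
    """Check Segmind inference parameters against the documented ranges."""
    errors = []
    if not 20 <= num_inference_steps <= 100:
        errors.append("num_inference_steps must be between 20 and 100")
    if not 1 <= guidance_scale <= 25:
        errors.append("guidance_scale must be between 1 and 25")
    if not -1 <= seed <= 999_999_999_999_999:
        errors.append("seed must be between -1 and 999999999999999")
    if errors:
        raise ValueError("; ".join(errors))
    return {
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "seed": seed,
    }
```

The returned dictionary can be unpacked into a call, for example `adapter.generate_and_decode(model_image=..., cloth_image=..., **params)`.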

Image Generation APIs

Nano Banana (Gemini 2.5 Flash Image)

Nano Banana is Google's fast and efficient image generation model, optimized for high-volume, low-latency tasks. It generates images at 1024px resolution.

Features:

Text-to-image generation
Image editing (image + text to image)
Multi-image composition and style transfer
Batch generation support
Multiple aspect ratios (10 options)
Fast generation times

Reference: Gemini Image Generation Documentation

Prerequisites:

Google Gemini API key
google-genai Python package

Example:

from tryon.api.nano_banana import NanoBananaAdapter

# Initialize adapter
adapter = NanoBananaAdapter()

# Text-to-image
images = adapter.generate_text_to_image(
    prompt="A nano banana dish in a fancy restaurant with a Gemini theme",
    aspect_ratio="16:9"  # Optional
)

# Image editing
images = adapter.generate_image_edit(
    image="cat.jpg",
    prompt="Add a nano-banana to the scene"
)

# Multi-image composition
images = adapter.generate_multi_image(
    images=["image1.jpg", "image2.jpg"],
    prompt="Combine these images with a Gemini theme"
)

# Batch generation
results = adapter.generate_batch([
    "Prompt 1",
    "Prompt 2",
    "Prompt 3"
])

# Save results
images[0].save("result.png")

Environment Variables:

GEMINI_API_KEY=your_api_key

Supported Aspect Ratios:

"1:1" (1024x1024)
"2:3" (832x1248)
"3:2" (1248x832)
"3:4" (864x1184)
"4:3" (1184x864)
"4:5" (896x1152)
"5:4" (1152x896)
"9:16" (768x1344)
"16:9" (1344x768)
"21:9" (1536x672)

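When editing an existing image, it can help to pick the supported aspect ratio closest to the input's own shape. The sketch below does that using the ratios and sizes from the list above. `closest_aspect_ratio` is a hypothetical helper, and comparing ratios on a logarithmic scale is a design choice made here, not something the adapter is known to do.

```python
import math

# Aspect ratio labels and output sizes copied from the list above.
ASPECT_RATIO_SIZES = {
    "1:1": (1024, 1024),
    "2:3": (832, 1248),
    "3:2": (1248, 832),
    "3:4": (864, 1184),
    "4:3": (1184, 864),
    "4:5": (896, 1152),
    "5:4": (1152, 896),
    "9:16": (768, 1344),
    "16:9": (1344, 768),
    "21:9": (1536, 672),
}


def closest_aspect_ratio(width, height):
    """Return the supported ratio label nearest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(label):
        w, h = (int(part) for part in label.split(":"))
        # Log scale treats "twice as wide" and "twice as tall" symmetrically.
        return abs(math.log(w / h) - target)

    return min(ASPECT_RATIO_SIZES, key=distance)
```

For a 1920x1080 input, the helper selects `"16:9"`, whose listed output size is 1344x768.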
Nano Banana Pro (Gemini 3 Pro Image Preview)

Nano Banana Pro is Google's advanced image generation model designed for professional asset production. It features real-world grounding using Google Search and a default "Thinking" process, and it can generate images at up to 4K resolution.

Features:

Text-to-image generation with 1K/2K/4K resolution support
Image editing (image + text to image)
Multi-image composition and style transfer
Batch generation support
Search grounding (real-world grounding using Google Search)
High-fidelity text rendering
Up to 4K resolution output

Reference: Gemini Image Generation Documentation

Prerequisites:

Google Gemini API key
google-genai Python package

Example:

from tryon.api.nano_banana import NanoBananaProAdapter

# Initialize adapter
adapter = NanoBananaProAdapter()

# Text-to-image with 4K resolution
images = adapter.generate_text_to_image(
    prompt="A professional nano banana dish in a fancy restaurant",
    resolution="4K",  # Options: "1K", "2K", "4K"
    aspect_ratio="16:9",
    use_search_grounding=True  # Optional: Use Google Search for real-world grounding
)

# Image editing with 2K resolution
images = adapter.generate_image_edit(
    image="cat.jpg",
    prompt="Add a nano-banana to the scene",
    resolution="2K"
)

# Multi-image composition
images = adapter.generate_multi_image(
    images=["image1.jpg", "image2.jpg"],
    prompt="Combine these images with a Gemini theme",
    resolution="2K"
)

# Batch generation
results = adapter.generate_batch(
    prompts=["Prompt 1", "Prompt 2", "Prompt 3"],
    resolution="2K"
)

# Save results
images[0].save("result.png")

Environment Variables:

GEMINI_API_KEY=your_api_key

Supported Resolutions:

"1K": Standard resolution (varies by aspect ratio)
"2K": High resolution (varies by aspect ratio)
"4K": Ultra-high resolution (varies by aspect ratio)

Supported Aspect Ratios: Same as Nano Banana (10 options), with resolution-specific dimensions.

Installation

Core Dependencies

pip install pillow requests

Provider-Specific Dependencies

Amazon Nova Canvas:

pip install boto3

Kling AI:

pip install PyJWT

Nano Banana / Nano Banana Pro:

pip install google-genai

All Dependencies

pip install pillow requests boto3 PyJWT google-genai python-dotenv

Quick Start

1. Set Up Environment Variables

Create a .env file in your project root:

# Amazon Nova Canvas
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AMAZON_NOVA_REGION=us-east-1

# Kling AI
KLING_AI_API_KEY=your_api_key
KLING_AI_SECRET_KEY=your_secret_key

# Segmind
SEGMIND_API_KEY=your_api_key

# Nano Banana / Nano Banana Pro
GEMINI_API_KEY=your_api_key
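A quick check at startup can report which variables are missing before any adapter is created. This is a minimal sketch: `missing_env_vars` and the provider labels are hypothetical, and only the variable names come from the listing above. AWS credentials may also come from the AWS CLI configuration, so a missing AWS variable is not necessarily an error.

```python
import os

# Variable names from the .env listing; provider labels are illustrative.
REQUIRED_ENV_VARS = {
    "amazon_nova_canvas": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"),
    "kling_ai": ("KLING_AI_API_KEY", "KLING_AI_SECRET_KEY"),
    "segmind": ("SEGMIND_API_KEY",),
    "nano_banana": ("GEMINI_API_KEY",),
}


def missing_env_vars(provider, env=None):
    """Return the required variables that are unset or empty for a provider."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS[provider] if not env.get(name)]
```

Calling `missing_env_vars("segmind")` after `load_dotenv()` returns an empty list when the key is configured.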

2. Basic Usage

from dotenv import load_dotenv
load_dotenv()

from tryon.api import (
    AmazonNovaCanvasVTONAdapter,
    KlingAIVTONAdapter,
    SegmindVTONAdapter
)
from tryon.api.nano_banana import NanoBananaAdapter, NanoBananaProAdapter

# Virtual Try-On
adapter = SegmindVTONAdapter()
images = adapter.generate_and_decode(
    model_image="person.jpg",
    cloth_image="shirt.jpg",
    category="Upper body"
)
images[0].save("vton_result.png")

# Image Generation
image_adapter = NanoBananaAdapter()
images = image_adapter.generate_text_to_image(
    prompt="A nano banana dish in a fancy restaurant"
)
images[0].save("generated_image.png")

API Comparison

| Feature | Amazon Nova Canvas | Kling AI | Segmind | Nano Banana | Nano Banana Pro |
|---|---|---|---|---|---|
| Type | Virtual Try-On | Virtual Try-On | Virtual Try-On | Image Generation | Image Generation |
| Processing | Synchronous | Asynchronous | Synchronous | Synchronous | Synchronous |
| Max Resolution | 2048x2048 | 4096x4096 | Varies | 1024px | 4K (4096px+) |
| Mask Support | Yes (GARMENT/IMAGE) | No | No | N/A | N/A |
| Garment Classes | Yes (4 types) | No | Yes (3 types) | N/A | N/A |
| Batch Support | No | No | No | Yes | Yes |
| Image Editing | No | No | No | Yes | Yes |
| Multi-Image | No | No | No | Yes | Yes |
| Search Grounding | No | No | No | No | Yes |
| Cost Model | AWS Bedrock | Token-based | Per request | Token-based | Token-based |
| Latency | Medium | Medium-High | Low | Low | Medium |
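The comparison can also be expressed as data, which is handy when an application needs to choose a provider by capability at runtime. The sketch below encodes a few rows of the table. The dictionary layout and the `providers_with` helper are illustrative and not part of the module.

```python
# A subset of the comparison table, keyed by provider name.
PROVIDER_FEATURES = {
    "Amazon Nova Canvas": {"type": "virtual_try_on", "mask_support": True,
                           "garment_classes": True, "batch": False,
                           "search_grounding": False},
    "Kling AI": {"type": "virtual_try_on", "mask_support": False,
                 "garment_classes": False, "batch": False,
                 "search_grounding": False},
    "Segmind": {"type": "virtual_try_on", "mask_support": False,
                "garment_classes": True, "batch": False,
                "search_grounding": False},
    "Nano Banana": {"type": "image_generation", "batch": True,
                    "search_grounding": False},
    "Nano Banana Pro": {"type": "image_generation", "batch": True,
                        "search_grounding": True},
}


def providers_with(**required):
    """Return provider names whose features match every required value."""
    return [
        name
        for name, features in PROVIDER_FEATURES.items()
        if all(features.get(key) == value for key, value in required.items())
    ]
```

For example, `providers_with(type="virtual_try_on", garment_classes=True)` narrows the choice to the try-on providers that accept a garment class.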

Common Patterns

Pattern 1: Try Multiple Providers

from tryon.api import (
    AmazonNovaCanvasVTONAdapter,
    KlingAIVTONAdapter,
    SegmindVTONAdapter
)

def try_multiple_providers(person_img, garment_img):
    results = {}
    
    # Try Segmind (fastest)
    try:
        adapter = SegmindVTONAdapter()
        results['segmind'] = adapter.generate_and_decode(
            model_image=person_img,
            cloth_image=garment_img,
            category="Upper body"
        )
    except Exception as e:
        print(f"Segmind failed: {e}")
    
    # Try Kling AI (best quality)
    try:
        adapter = KlingAIVTONAdapter()
        results['kling'] = adapter.generate_and_decode(
            source_image=person_img,
            reference_image=garment_img
        )
    except Exception as e:
        print(f"Kling AI failed: {e}")
    
    return results

Pattern 2: Image Preprocessing Pipeline

from tryon.api import SegmindVTONAdapter
from PIL import Image

def preprocess_and_generate(person_path, garment_path):
    # Load and preprocess images
    person_img = Image.open(person_path)
    garment_img = Image.open(garment_path)
    
    # Resize to fixed working sizes (this does not preserve aspect ratio)
    person_img = person_img.resize((512, 768))
    garment_img = garment_img.resize((512, 512))
    
    # Generate
    adapter = SegmindVTONAdapter()
    images = adapter.generate_and_decode(
        model_image=person_img,
        cloth_image=garment_img,
        category="Upper body"
    )
    
    return images

Pattern 3: Batch Processing with Error Handling

import time

from tryon.api.nano_banana import NanoBananaAdapter

def batch_generate_with_retry(prompts, max_retries=3):
    adapter = NanoBananaAdapter()
    results = []
    
    for prompt in prompts:
        for attempt in range(max_retries):
            try:
                images = adapter.generate_text_to_image(prompt)
                results.append(images)
                break
            except Exception as e:
                if attempt == max_retries - 1:
                    print(f"Failed after {max_retries} attempts: {e}")
                    results.append(None)
                else:
                    time.sleep(2 ** attempt)  # Exponential backoff
    
    return results

Pattern 4: Iterative Image Refinement

from tryon.api.nano_banana import NanoBananaProAdapter

def refine_image(initial_prompt, refinement_steps):
    adapter = NanoBananaProAdapter()

    # Generate the starting image from the initial prompt
    images = adapter.generate_text_to_image(prompt=initial_prompt)
    current_image = images[0]

    # Apply every refinement prompt, in order, to the latest result
    for step in refinement_steps:
        images = adapter.generate_image_edit(
            image=current_image,
            prompt=step
        )
        current_image = images[0]

    return current_image

Error Handling

Adapters raise standard Python exceptions, so validation problems can be handled separately from API and network failures:

from tryon.api import SegmindVTONAdapter

try:
    adapter = SegmindVTONAdapter()
    images = adapter.generate_and_decode(
        model_image="person.jpg",
        cloth_image="shirt.jpg"
    )
except ValueError as e:
    # Validation errors (missing API key, invalid parameters, etc.)
    print(f"Validation error: {e}")
except Exception as e:
    # API errors, network errors, etc.
    print(f"API error: {e}")

Common Error Types:

ValueError: Invalid parameters, missing credentials, validation errors
ImportError: Missing required dependencies
RuntimeError: API errors, network errors, timeout errors
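Because these error types signal different things, a retry loop can treat them differently. A `ValueError` or `ImportError` will not go away on a second attempt, while a `RuntimeError` from the network might. The sketch below retries only the latter. `call_with_retry` is a hypothetical helper, and it assumes the adapters follow the mapping listed above.

```python
import time

# Only these exception types are considered transient (assumption based on
# the error types listed above).
RETRYABLE = (RuntimeError,)


def call_with_retry(func, max_retries=3, backoff=2.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on transient errors."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return func()
        except RETRYABLE:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error
            sleep(backoff ** attempt)  # 1s, 2s, 4s, ... with the defaults
```

Validation and import errors are not caught, so they propagate on the first attempt. A call can be wrapped as `call_with_retry(lambda: adapter.generate_and_decode(...))` with the arguments for the chosen adapter.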

Best Practices

1. Use Environment Variables

Always use environment variables for API keys and credentials:

from tryon.api import SegmindVTONAdapter

# Good: Use environment variable
adapter = SegmindVTONAdapter()  # Reads from SEGMIND_API_KEY

# Bad: Hardcode API key
adapter = SegmindVTONAdapter(api_key="hardcoded_key")

2. Validate Images Before Processing

from PIL import Image
from tryon.api import AmazonNovaCanvasVTONAdapter

def validate_and_process(person_path, garment_path):
    # Validate images exist and are valid
    person_img = Image.open(person_path)
    garment_img = Image.open(garment_path)
    
    # Check dimensions
    adapter = AmazonNovaCanvasVTONAdapter()
    adapter.validate_image_size(person_img)
    adapter.validate_image_size(garment_img)
    
    # Process
    images = adapter.generate_and_decode(
        source_image=person_img,
        reference_image=garment_img
    )
    
    return images
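If an image exceeds a provider's pixel limit, it can be downscaled before submission instead of being rejected. The sketch below computes target dimensions that fit a pixel budget while keeping the aspect ratio. `fit_within_pixel_budget` is a hypothetical helper, and the budget constants are derived from the 2048x2048 and 4096x4096 limits quoted in the provider sections. How the adapters enforce those limits is not shown in this README.

```python
import math

# Pixel budgets derived from the limits quoted in the provider sections.
NOVA_CANVAS_MAX_PIXELS = 2048 * 2048
KLING_AI_MAX_PIXELS = 4096 * 4096


def fit_within_pixel_budget(width, height, max_pixels):
    """Return (width, height) scaled down, if needed, to fit max_pixels."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width * height <= max_pixels:
        return width, height
    # Scale both sides by the same factor so the aspect ratio is preserved.
    scale = math.sqrt(max_pixels / (width * height))
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
```

With Pillow, the result can be applied as `image.resize(fit_within_pixel_budget(*image.size, NOVA_CANVAS_MAX_PIXELS))`.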

3. Handle Asynchronous Processing

For Kling AI, be aware of asynchronous processing:

from tryon.api import KlingAIVTONAdapter

adapter = KlingAIVTONAdapter()

# This automatically polls until completion (default timeout: 5 minutes)
images = adapter.generate_and_decode(
    source_image="person.jpg",
    reference_image="shirt.jpg"
)

# For custom polling, use generate() and poll manually
task_id = adapter.generate(...)  # Returns task_id
# Custom polling logic...
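A custom polling loop could look like the sketch below. It is generic rather than tied to the Kling AI adapter. `fetch_status` is a hypothetical callable that wraps whatever status lookup is available for the task and returns a `(state, payload)` pair. The state names `"succeeded"` and `"failed"` are placeholders, and the 300-second default mirrors the 5 minutes mentioned above.

```python
import time


def poll_until_complete(fetch_status, timeout=300.0, interval=2.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the task finishes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        state, payload = fetch_status()
        if state == "succeeded":
            return payload
        if state == "failed":
            raise RuntimeError(f"Task failed: {payload}")
        if clock() >= deadline:
            raise TimeoutError(f"Task still {state!r} after {timeout} seconds")
        sleep(interval)  # Wait before asking again
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays.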

4. Optimize for Cost

Use Nano Banana (Flash) for high-volume, low-latency tasks
Use Nano Banana Pro only when you need 4K resolution or search grounding
Cache results when possible
Batch requests when supported
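Caching can be as simple as keying results by the request parameters, so an identical request is not billed twice. The sketch below is an in-memory example. `ResultCache` is a hypothetical class, and it assumes the parameters are simple values such as prompts, file paths, and option strings that serialize to a stable key.

```python
import hashlib
import json


class ResultCache:
    """In-memory cache keyed by provider name and request parameters."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def make_key(provider, **params):
        # Sort keys so the same parameters always produce the same hash.
        canonical = json.dumps(
            {"provider": provider, "params": params},
            sort_keys=True,
            default=str,
        )
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_generate(self, provider, generate, **params):
        """Return a cached result, calling generate(**params) only on a miss."""
        key = self.make_key(provider, **params)
        if key not in self._store:
            self._store[key] = generate(**params)
        return self._store[key]
```

Usage could look like `cache.get_or_generate("nano_banana", adapter.generate_text_to_image, prompt="...")`. Note that a fixed key only makes sense when repeated requests are expected to return an equivalent result.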

5. Error Recovery

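import time

from tryon.api import SegmindVTONAdapter

def generate_with_retry(adapter, max_retries=3, backoff=2, **kwargs):
    # kwargs are passed through to generate_and_decode()
    for attempt in range(max_retries):
        try:
            return adapter.generate_and_decode(**kwargs)
        except Exception:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error
            time.sleep(backoff ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...

# Usage
adapter = SegmindVTONAdapter()
images = generate_with_retry(
    adapter,
    model_image="person.jpg",
    cloth_image="shirt.jpg"
)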

Additional Resources

License

All material is made available under Creative Commons BY-NC 4.0.

Made with ❤️ by TryOn Labs
