Spaces:
Runtime error
Runtime error
A newer version of the Gradio SDK is available:
6.2.0
metadata
title: JoyCaption Reliable
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
π JoyCaption Reliable
Ultra-optimized JoyCaption for ZeroGPU - No more stuck generations!
This is a streamlined version of JoyCaption designed specifically for reliable performance on Hugging Face's ZeroGPU infrastructure. It prioritizes consistency and speed over advanced features.
β Key Optimizations
- 45-second GPU limit - Prevents ZeroGPU timeouts
- Aggressive memory cleanup - Immediate model deletion after each generation
- Fast loading - Optimized with
low_cpu_mem_usage=True - Progress tracking - Timestamps show exactly where processing is at
- Emergency cleanup - Graceful error handling with memory clearing
π― Features
- Multiple Styles: Engaging, Descriptive, SEO-Friendly, Creative
- Length Control: Short (100 tokens), Medium (200 tokens), Long (300 tokens)
- Fast Processing: Typically completes in 15-25 seconds
- No Freezing: Designed to avoid the common ZeroGPU stuck generation issue
π Performance
- Loading: 5-10 seconds
- Generation: 10-20 seconds
- Total Time: 15-30 seconds
- Memory Usage: Aggressively cleaned after each request
π‘ Why This Version is More Reliable
Unlike complex dual-model setups that can timeout or freeze, this version:
- Uses only the JoyCaption model (no secondary Venice model)
- Limits GPU duration to prevent ZeroGPU timeouts
- Performs immediate cleanup to prevent memory issues
- Has simplified prompts for faster processing
- Includes progress timestamps to track performance
π§ Technical Details
- Model:
fancyfeast/llama-joycaption-beta-one-hf-llava - Framework: Transformers + PyTorch
- Optimization:
torch.bfloat16,device_map="auto" - GPU Duration: 45 seconds maximum
- Token Limits: 100-300 based on length setting
π Trade-offs
Gained:
- β Consistent, reliable performance
- β Fast loading and generation
- β No stuck generations or timeouts
- β Predictable timing
Sacrificed:
- β No secondary Venice model integration
- β No advanced keyword injection
- β No complex correction systems
- β Reduced maximum output length
This version is perfect if you want reliable, fast captions without the complexity and potential issues of multi-model systems.
π¨ Caption Styles
- Engaging: Creative, captivating descriptions that avoid "A photo of"
- Descriptive: Focused on people, poses, clothing, and setting details
- SEO-Friendly: Optimized for search with engaging language
- Creative: Witty captions with interesting, unique language
Perfect for content creators, social media managers, and anyone who needs consistent, quality image captions without waiting or worrying about system freezes!