torchao `Int8WeightOnlyConfig` is already working flawlessly in our tests:

```python
import spaces
from diffusers import FluxPipeline
from torchao.quantization.quant_api import Int8WeightOnlyConfig, quantize_

pipeline = FluxPipeline.from_pretrained(...).to('cuda')
quantize_(pipeline.transformer, Int8WeightOnlyConfig())  # Or any other component(s)

@spaces.GPU
def generate(prompt: str):
    return pipeline(prompt).images[0]
```

medium size is now available as a power-user feature

Spaces still default to large (70GB VRAM), but this paves the way for:

- size-based quotas (medium will offer significantly more usage than large)
- the xlarge size (141GB VRAM)

Available sizes:

- auto (future default)
- medium
- large (current default)
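For intuition on what the quantization snippet above does to the weights, here is a minimal pure-Python sketch of symmetric int8 weight-only quantization. This illustrates the general technique behind configs like `Int8WeightOnlyConfig`, not torchao's actual implementation (which is per-channel and fused into the kernels); the function names here are our own.

```python
# Sketch of symmetric int8 weight-only quantization: weights are stored
# as int8 values plus a single float scale, and dequantized on use.
# Not torchao's implementation, just the underlying arithmetic.

def quantize_int8(weights):
    """Map floats to int8 values with one symmetric per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid 0 for all-zero weights
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [qi * scale for qi in q]

w = [0.5, -1.27, 0.03, 1.0]
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
# Round-trip error is bounded by half a quantization step.
assert all(abs(a - b) <= s / 2 for a, b in zip(w, w_hat))
```

The memory win is that each weight shrinks from 4 bytes (fp32) or 2 bytes (bf16) to 1 byte plus a shared scale, at the cost of a small, bounded rounding error.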