TheStageAI/Elastic-stable-diffusion-3.5-large

Text-to-Image

psynote123 committed on Oct 14, 2025

Commit 3c0eac4 (verified) · 1 parent: bbbadd5

Update README.md

README.md +125 -3

@@ -1,3 +1,125 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+base_model:
+- stabilityai/stable-diffusion-3.5-large
+base_model_relation: quantized
+pipeline_tag: text-to-image
+---
+# Elastic model: Fastest self-serving models. Stable Diffusion 3.5 Large.
+Elastic models are produced by TheStage AI ANNA (Automated Neural Networks Accelerator). ANNA lets you trade off model size, latency and quality by moving a single slider. For each model, ANNA produces a series of optimized models:
+* __XL__: Mathematically equivalent neural network, optimized with our DNN compiler.
+* __L__: Near lossless model, with less than 1% degradation obtained on corresponding benchmarks.
+* __M__: Faster model, with accuracy degradation less than 1.5%.
+* __S__: The fastest model, with accuracy degradation less than 2%.
+__Goals of Elastic Models:__
+* Provide the fastest models and service for self-hosting.
+* Provide flexibility in cost vs quality selection for inference.
+* Provide clear quality and latency benchmarks.
+* Provide the interface of the HF libraries (transformers and diffusers) with a single-line code change.
+* Provide models supported on a wide range of hardware, which are pre-compiled and require no JIT.
+> Note that the actual quality degradation varies from model to model. For instance, an S model can show as little as 0.5% degradation.
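As a minimal sketch of how the tiers could drive a cost-versus-quality choice, the helper below returns the fastest tier whose nominal degradation bound fits a given budget. `pick_mode` and `TIER_BOUNDS` are hypothetical names used for illustration only, not part of `elastic_models`, and the bounds are the nominal figures from the list above rather than measured values.

```python
# Nominal upper bounds on quality degradation (percent), copied from the tier
# list above and ordered fastest first. Real degradation varies per model.
TIER_BOUNDS = [("S", 2.0), ("M", 1.5), ("L", 1.0)]


def pick_mode(max_degradation_pct: float) -> str:
    """Return the fastest tier whose nominal bound fits the budget.

    Hypothetical helper for illustration; not an elastic_models API.
    """
    if max_degradation_pct < 0:
        raise ValueError("degradation budget must be non-negative")
    for mode, bound in TIER_BOUNDS:
        if bound <= max_degradation_pct:
            return mode
    # No lossy tier fits: fall back to the mathematically equivalent model.
    return "XL"


print(pick_mode(1.7))  # M
```

The returned string could then be passed as the `mode` argument when loading a pipeline.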
+-----
+## Inference
+Currently, our demo model supports 1024x1024 resolution and batch sizes from 1 to 8. This will be updated in the near future.
+To run inference with our models, replace the `diffusers` import with `elastic_models.diffusers`:
+```python
+import torch
+from elastic_models.diffusers import StableDiffusion3Pipeline
+
+model_name = 'stabilityai/stable-diffusion-3.5-large'
+hf_token = ''  # your Hugging Face access token
+device = torch.device("cuda")
+
+# mode picks the Elastic tier (S here; the tiers are listed above)
+pipeline = StableDiffusion3Pipeline.from_pretrained(
+    model_name,
+    torch_dtype=torch.bfloat16,
+    token=hf_token,
+    mode='S'
+)
+pipeline.to(device)
+
+prompts = ["A cat holding a sign that says hello world"]
+output = pipeline(prompt=prompts)
+
+# Save each image under a file name derived from its prompt
+for prompt, output_image in zip(prompts, output.images):
+    output_image.save(prompt.replace(' ', '_') + '.png')
+```
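Because the demo model is stated to support batch sizes 1-8, a longer prompt list has to be split before it reaches the pipeline. The sketch below shows one way to do that. `batched` and `MAX_BATCH_SIZE` are hypothetical names introduced here, not part of `elastic_models`.

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 8  # upper limit stated for the demo model


def batched(prompts: List[str], batch_size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size prompts.

    Hypothetical helper for illustration; not an elastic_models API.
    """
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(prompts), batch_size):
        yield prompts[start:start + batch_size]


prompts = [f"A cat holding sign number {i}" for i in range(19)]
batch_sizes = [len(batch) for batch in batched(prompts)]
print(batch_sizes)  # [8, 8, 3]
```

Each yielded chunk can be passed as `pipeline(prompt=batch)` in the same way as the single-prompt list in the snippet above.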
+### Installation
+__System requirements:__
+* GPUs: H100, B200
+* CPU: AMD, Intel
+* Python: 3.10-3.12
+To work with our models, run these commands in your terminal:
+```shell
+pip install thestage
+pip install 'thestage-elastic-models[nvidia]' --extra-index-url https://thestage.jfrog.io/artifactory/api/pypi/pypi-thestage-ai-production/simple
+# or for blackwell support
+pip install 'thestage-elastic-models[blackwell]' --extra-index-url https://thestage.jfrog.io/artifactory/api/pypi/pypi-thestage-ai-production/simple
+pip install -U --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128
+pip install -U --pre torchvision --index-url https://download.pytorch.org/whl/nightly/cu128
+pip install flash_attn==2.7.3 --no-build-isolation
+pip uninstall apex
+```
+Then go to [app.thestage.ai](https://app.thestage.ai), log in and generate an API token from your profile page. Set the API token as follows:
+```shell
+thestage config set --api-token <YOUR_API_TOKEN>
+```
+Congrats, now you can use accelerated models!
+----
+## Benchmarks
+Benchmarking is one of the most important procedures during model acceleration. We aim to provide clear performance metrics for models using our algorithms.
+### Quality benchmarks
+For quality evaluation we use PSNR, SSIM and CLIP score. PSNR and SSIM are computed against the outputs of the original model.
+| Metric/Model | S   | M   | L   | XL  | Original |
+|--------------|-----|-----|-----|-----|----------|
+| PSNR         | TBD | TBD | TBD | inf | inf      |
+| SSIM         | TBD | TBD | TBD | 1.0 | 1.0      |
+| CLIP         | TBD | TBD | TBD | TBD | TBD      |
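The `inf` entries for XL and Original follow from the standard PSNR definition: when a candidate image matches the reference exactly, the mean squared error is zero and PSNR is infinite. The sketch below uses plain Python on flat sequences of 8-bit pixel values to show this. It illustrates the standard formula and is not the evaluation code behind the table; a real evaluation would typically operate on image arrays.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], candidate: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) == 0 or len(reference) != len(candidate):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        # Identical outputs: this is the "inf" case in the table.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [10, 20, 30, 40]
print(psnr(original, original))                 # inf
print(round(psnr(original, [11, 21, 31, 41]), 2))
```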
+### Latency benchmarks
+Time in seconds to generate one 1024x1024 image:
+| GPU/Model | S   | M | L | XL | Original |
+|-----------|-----|---|---|----|----------|
+| H100      | TBD | TBD | TBD | 3.80  | 6.55 |
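For readers who want to reproduce or extend these numbers, the sketch below shows a simple way to time a callable and derives the speedup implied by the two published H100 figures. `mean_latency` is a hypothetical helper, not the harness used for the table. It also assumes the timed callable blocks until the result is ready; for GPU work that does not, call `torch.cuda.synchronize()` inside it.

```python
import time
from statistics import mean
from typing import Callable


def mean_latency(fn: Callable[[], object], warmup: int = 1, runs: int = 3) -> float:
    """Average wall-clock seconds per call of fn, after some warm-up calls.

    Hypothetical helper for illustration. fn must block until its work is
    finished, otherwise asynchronous GPU kernels would be under-counted.
    """
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return mean(timings)


# Speedup implied by the H100 figures in the table above.
original_s, xl_s = 6.55, 3.80
speedup = original_s / xl_s
print(f"XL speedup on H100: {speedup:.2f}x")  # XL speedup on H100: 1.72x
```

With a loaded pipeline, `fn` could be something like `lambda: pipeline(prompt=prompts)`.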
+## Links
+* __Platform__: [app.thestage.ai](https://app.thestage.ai)
+<!-- * __Elastic models Github__: [app.thestage.ai](app.thestage.ai) -->
+* __Subscribe for updates__: [TheStageAI X](https://x.com/TheStageAI)
+* __Contact email__: contact@thestage.ai