ImageStudio

ImageStudio Maintainer Claude Opus 4.8 (1M context) committed 7 days ago

Commit 3cc90b8

1 parent: c5fc943

swap assistant VLM -> ccharnkij/Qwen3.5-9B-Uncensored (test explicit-rating)

Gemma 4 E4B under-rated explicit images (capped ~2_STRONG_SUGGESTIVE). Try a
9B uncensored Qwen3.5 VL instead. Vision-capable; thinking model (reasoning
toggle meaningful again). Its generation_config eos omits <|im_end|>, but the
model-agnostic _resolve_vlm_eos_ids unions tokenizer eos + turn terminators so
stopping holds. ~18.8GB bf16 — heaviest assistant; watch GPU memory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
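
The commit message describes `_resolve_vlm_eos_ids` only in prose: it unions the tokenizer eos with the chat turn terminators so generation still stops when `generation_config` omits `<|im_end|>`. The helper's body is not part of this diff, so what follows is a minimal sketch of that idea under stated assumptions, not the code in app.py. The function name `resolve_eos_ids`, the terminator list, the stub tokenizer, and its token ids are all hypothetical; the only tokenizer members assumed are `eos_token_id`, `unk_token_id`, and `convert_tokens_to_ids`.

```python
# Hypothetical sketch: union every plausible stop id so generation halts
# even when generation_config lists only part of them.
TURN_TERMINATORS = ("<|im_end|>", "<end_of_turn>", "<|eot_id|>")  # assumed list


def resolve_eos_ids(tokenizer, gen_config_eos=None, terminators=TURN_TERMINATORS):
    """Return a sorted list of stop-token ids (union of all known sources)."""
    ids = set()
    # generation_config eos may be missing, a single int, or a list of ints.
    if isinstance(gen_config_eos, int):
        ids.add(gen_config_eos)
    elif gen_config_eos is not None:
        ids.update(gen_config_eos)
    # The tokenizer's own eos, if it defines one.
    if tokenizer.eos_token_id is not None:
        ids.add(tokenizer.eos_token_id)
    # Chat turn terminators, keeping only tokens this vocabulary really has.
    unk = getattr(tokenizer, "unk_token_id", None)
    for token in terminators:
        token_id = tokenizer.convert_tokens_to_ids(token)
        if token_id is not None and token_id != unk:
            ids.add(token_id)
    return sorted(ids)


class StubTokenizer:
    """Stand-in tokenizer with made-up ids, used only for this illustration."""
    eos_token_id = 0
    unk_token_id = None
    _vocab = {"<|endoftext|>": 0, "<|im_end|>": 1}

    def convert_tokens_to_ids(self, token):
        return self._vocab.get(token)


# generation_config names only <|endoftext|> (id 0 in the stub vocabulary).
stop_ids = resolve_eos_ids(StubTokenizer(), gen_config_eos=0)
print(stop_ids)
```

With the stub vocabulary the result contains both the `<|endoftext|>` id and the `<|im_end|>` id, while terminators belonging to other model families are skipped because they are not in the vocabulary. That skipping is what lets one helper serve different models.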

Files changed (1)

app.py +10 -6

@@ -145,10 +145,14 @@ print("Pipelines loaded!")
 # describing a reference image. Loaded once, bf16 — fits alongside the two
 # diffusion pipelines on the shared GPU.
 # -----------------------------------------------------------------------------
-# Previous model (Qwen3.5-4B) — kept for easy revert. It shipped no
-# generation_config.json (needed manual eos pinning, see _resolve_vlm_eos_ids).
-# VLM_MODEL_ID = "rodrigomt/Qwen3.5-4B-Uncensored-Aggressive"
-VLM_MODEL_ID = "prithivMLmods/gemma-4-E4B-it-Uncensored-MAX"
+# Prior models, kept for easy revert:
+#   "rodrigomt/Qwen3.5-4B-Uncensored-Aggressive"   # no generation_config; needed eos pinning
+#   "prithivMLmods/gemma-4-E4B-it-Uncensored-MAX"  # under-rated explicit content (rating capped ~2)
+# Current: 9B uncensored Qwen3.5 VL. Vision-capable (image-text-to-text). NOTE its
+# generation_config eos is just <|endoftext|> and omits the chat terminator
+# <|im_end|>; _resolve_vlm_eos_ids() unions both so stopping still works. ~18.8 GB
+# bf16 — heaviest assistant so far; watch GPU memory alongside the diffusers.
+VLM_MODEL_ID = "ccharnkij/Qwen3.5-9B-Uncensored"
 print(f"Loading assistant: {VLM_MODEL_ID} ...")
 vlm_processor = AutoProcessor.from_pretrained(
@@ -1329,7 +1333,7 @@ with gr.Blocks(fill_height=True) as demo:
         with gr.Tab("💬 Prompt Assistant"):
             gr.Markdown(
-                "Ask **Gemma 4 E4B** a single question — with or without an image. "
+                "Ask **Qwen3.5-9B** a single question — with or without an image. "
                 "Turn a rough idea into a rich prompt, or upload a reference image and "
                 "ask the model to describe it as a prompt."
             )
@@ -1350,7 +1354,7 @@ with gr.Blocks(fill_height=True) as demo:
                         choices=["Off", "On"],
                         value="Off",
                         label="🧠 Reasoning",
-                        info="Direct answers are best for prompts. (Reasoning mode only applies to thinking-capable models; Gemma 4 answers directly.)",
+                        info="Off: direct answer, best for prompts • On: think step-by-step first (slower, raise max tokens)",
                     )
                     with gr.Accordion("⚙️ Settings", open=False):
                         vlm_max_tokens = gr.Slider(
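
The "~18.8 GB bf16" figure in the commit can be sanity-checked with simple arithmetic, since bf16 stores each parameter in 2 bytes. The sketch below is only that check. The parameter count of 9.4 billion is an assumption back-solved from the quoted size (the commit does not state it), and "GB" is taken to mean 10^9 bytes. It covers weights only, so activations, the KV cache, and the two diffusion pipelines sharing the GPU come on top.

```python
# Rough weight-memory estimate for a model stored in bf16.
BYTES_PER_BF16_PARAM = 2  # bf16 is a 16-bit format


def bf16_weight_gb(n_params: int) -> float:
    """Weight footprint in GB (10**9 bytes), excluding activations and KV cache."""
    return n_params * BYTES_PER_BF16_PARAM / 1e9


# ASSUMPTION: about 9.4e9 parameters, inferred from the commit's ~18.8 GB figure.
assumed_params = 9_400_000_000
print(f"{bf16_weight_gb(assumed_params):.1f} GB")
```

On a loaded PyTorch model the same quantity can be measured directly with `sum(p.numel() * p.element_size() for p in model.parameters())`, which avoids guessing the parameter count.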