ImageStudio Maintainer Claude Opus 4.8 (1M context) committed on
Commit 3cc90b8 · 1 Parent(s): c5fc943

swap assistant VLM -> ccharnkij/Qwen3.5-9B-Uncensored (test explicit-rating)

Gemma 4 E4B under-rated explicit images (capped ~2_STRONG_SUGGESTIVE). Try a
9B uncensored Qwen3.5 VL instead. Vision-capable; thinking model (reasoning
toggle meaningful again). Its generation_config eos omits <|im_end|>, but the
model-agnostic _resolve_vlm_eos_ids unions tokenizer eos + turn terminators so
stopping holds. ~18.8GB bf16 — heaviest assistant; watch GPU memory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
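
The eos handling described in the commit message can be sketched as follows. This is not the `_resolve_vlm_eos_ids` from app.py, whose body is not shown in this commit; it is a minimal, hypothetical version of the same idea (union the generation_config eos, the tokenizer eos, and any chat-turn terminators the vocabulary knows). The terminator list, the function name `resolve_eos_ids`, and the `FakeTokenizer` with its token ids are all assumptions made for illustration.

```python
from typing import Iterable, List, Optional, Union

# Hypothetical list of chat-turn terminators; the real helper may use another set.
TURN_TERMINATORS = ("<|im_end|>", "<end_of_turn>", "<|eot_id|>")


def resolve_eos_ids(
    tokenizer,
    generation_eos: Optional[Union[int, Iterable[int]]] = None,
    terminators: Iterable[str] = TURN_TERMINATORS,
) -> List[int]:
    """Union every plausible stop-token id into one sorted list."""
    ids = set()

    # 1. eos from generation_config: may be an int, a list of ints, or missing.
    if generation_eos is not None:
        if isinstance(generation_eos, int):
            ids.add(generation_eos)
        else:
            ids.update(int(i) for i in generation_eos)

    # 2. The tokenizer's own eos token, if it defines one.
    tok_eos = getattr(tokenizer, "eos_token_id", None)
    if tok_eos is not None:
        ids.add(int(tok_eos))

    # 3. Chat-turn terminators, kept only if the vocabulary really has them
    #    (unknown tokens map to None or to the unk id, depending on the tokenizer).
    unk = getattr(tokenizer, "unk_token_id", None)
    for token in terminators:
        tid = tokenizer.convert_tokens_to_ids(token)
        if isinstance(tid, int) and tid >= 0 and tid != unk:
            ids.add(tid)

    return sorted(ids)


class FakeTokenizer:
    """Stand-in tokenizer with made-up ids, so the sketch runs without a model."""

    eos_token_id = 0
    unk_token_id = None
    _vocab = {"<|endoftext|>": 0, "<|im_end|>": 1}

    def convert_tokens_to_ids(self, token):
        return self._vocab.get(token)


# generation_config lists only <|endoftext|> (id 0); the union also adds <|im_end|> (id 1).
print(resolve_eos_ids(FakeTokenizer(), generation_eos=0))  # [0, 1]
```

With a real model, the resulting list would typically be passed as `eos_token_id` to `generate`, so that generation stops on whichever terminator appears first.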

Files changed (1)
  1. app.py +10 -6
app.py CHANGED
@@ -145,10 +145,14 @@ print("Pipelines loaded!")
  # describing a reference image. Loaded once, bf16 — fits alongside the two
  # diffusion pipelines on the shared GPU.
  # -----------------------------------------------------------------------------
- # Previous model (Qwen3.5-4B) — kept for easy revert. It shipped no
- # generation_config.json (needed manual eos pinning, see _resolve_vlm_eos_ids).
- # VLM_MODEL_ID = "rodrigomt/Qwen3.5-4B-Uncensored-Aggressive"
- VLM_MODEL_ID = "prithivMLmods/gemma-4-E4B-it-Uncensored-MAX"
+ # Prior models, kept for easy revert:
+ # "rodrigomt/Qwen3.5-4B-Uncensored-Aggressive" # no generation_config; needed eos pinning
+ # "prithivMLmods/gemma-4-E4B-it-Uncensored-MAX" # under-rated explicit content (rating capped ~2)
+ # Current: 9B uncensored Qwen3.5 VL. Vision-capable (image-text-to-text). NOTE its
+ # generation_config eos is just <|endoftext|> and omits the chat terminator
+ # <|im_end|>; _resolve_vlm_eos_ids() unions both so stopping still works. ~18.8 GB
+ # bf16 — heaviest assistant so far; watch GPU memory alongside the diffusers.
+ VLM_MODEL_ID = "ccharnkij/Qwen3.5-9B-Uncensored"

  print(f"Loading assistant: {VLM_MODEL_ID} ...")
  vlm_processor = AutoProcessor.from_pretrained(
@@ -1329,7 +1333,7 @@ with gr.Blocks(fill_height=True) as demo:

  with gr.Tab("💬 Prompt Assistant"):
  gr.Markdown(
- "Ask **Gemma 4 E4B** a single question — with or without an image. "
+ "Ask **Qwen3.5-9B** a single question — with or without an image. "
  "Turn a rough idea into a rich prompt, or upload a reference image and "
  "ask the model to describe it as a prompt."
  )
@@ -1350,7 +1354,7 @@ with gr.Blocks(fill_height=True) as demo:
  choices=["Off", "On"],
  value="Off",
  label="🧠 Reasoning",
- info="Direct answers are best for prompts. (Reasoning mode only applies to thinking-capable models; Gemma 4 answers directly.)",
+ info="Off: direct answer, best for prompts • On: think step-by-step first (slower, raise max tokens)",
  )
  with gr.Accordion("⚙️ Settings", open=False):
  vlm_max_tokens = gr.Slider(
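
The "~18.8 GB bf16" figure in the new comment is consistent with simple arithmetic: bf16 stores each weight in 2 bytes, so weight memory is roughly the parameter count times two. The sketch below shows that estimate. The 9.4e9 parameter count is an assumption back-computed from the quoted size, not a number stated in the commit, and the helper names and the 2 GB headroom default are hypothetical. Activations, KV cache, and the two diffusion pipelines come on top of this.

```python
def weights_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes); bf16 uses 2 bytes per weight."""
    return num_params * bytes_per_param / 1e9


def fits(free_gb: float, needed_gb: float, headroom_gb: float = 2.0) -> bool:
    """True if the weights fit while leaving headroom for activations and KV cache."""
    return free_gb - needed_gb >= headroom_gb


# Assumed parameter count (about 9.4e9), chosen to match the ~18.8 GB in the comment.
needed = weights_gb(9.4e9)
print(f"{needed:.1f} GB")  # 18.8 GB
print(fits(free_gb=24.0, needed_gb=needed))  # True
```

On a live system the free figure could be read with `torch.cuda.mem_get_info()`, which returns free and total device memory in bytes, after the diffusion pipelines have loaded.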