Why are the outputs different?

#5
by AAsuka - opened

Thanks for the great work

I've tried the demo code here (https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct#using-%F0%9F%A4%97-transformers-to-chat) and found that the output is changing every time.
The outputs look similar but not 100% the same:
1.
['This is a heartwarming photograph capturing a tender moment between a woman and her dog on a sandy beach during sunset.\n\nThe scene is set on a wide, sandy beach, with the ocean stretching out to the horizon. The sun is low in the sky, casting a warm, golden glow across the scene and creating a soft, hazy atmosphere. Gentle waves can be seen breaking in the distance.\n\nIn the foreground, a woman with long, dark hair is sitting on the sand. She is wearing a black and white plaid shirt and dark pants. She is smiling and looking at her dog with a joyful expression. Her hands are gently']
2.
['This is a heartwarming, sunlit photograph of a woman and her dog on a sandy beach during what appears to be the golden hour, either sunrise or sunset.\n\nThe scene is set on a wide, sandy beach with gentle waves breaking in the background. The sky is a soft, bright white, suggesting the sun is low on the horizon, casting a warm, golden glow across the entire scene. The ocean is visible in the distance, with a small wave rolling in.\n\nIn the foreground, a woman with long, dark hair is sitting on the sand. She is wearing a black and white plaid shirt and dark pants. She']
3.
['This is a heartwarming photograph capturing a tender moment between a woman and her dog on a serene beach at sunset.\n\nThe scene is set on a wide, sandy beach, with the ocean stretching out into the horizon. The sun is low on the horizon, casting a warm, golden glow across the sky and the water, creating a soft, peaceful atmosphere. Gentle waves can be seen breaking in the distance.\n\nA woman with long, dark hair is sitting on the sand, her body angled towards the dog. She is wearing a black and white plaid shirt and dark pants. She is smiling warmly and looking at the dog with affection.\n\n']

I've tried setting the model to evaluation mode with .eval(), but that doesn't help...
Does anyone have a clue? Thanks in advance

The variation you are seeing is simply because generate() uses stochastic sampling by default. Calling model.eval() only disables dropout/batch norm; it has no effect on how tokens are selected during generation. To make your outputs identical every time, you need to turn off sampling (forcing greedy decoding) and fix all RNG seeds. For a fully deterministic setup, try something like this:

import torch, random, numpy as np

torch.manual_seed(0)
random.seed(0)
np.random.seed(0)
torch.cuda.manual_seed_all(0)

and make sure do_sample is set to False:

generated_ids = model.generate(
    **inputs,
    temperature=1.0,  # ignored when do_sample=False
    top_p=1.0,        # ignored when do_sample=False
    top_k=0,          # ignored when do_sample=False
    do_sample=False,
    max_new_tokens=128,
)


'do_sample=False' works, thanks!
