Nonovogo
/

gemma-3_Python_Trial_2R

text-generation-inference

Model card Files Files and versions

Nonovogo commited on Dec 1, 2025

Commit

2d01520

·

verified ·

1 Parent(s): 32fd935

Update README.md

Files changed (1) hide show

README.md +27 -0

README.md CHANGED Viewed

@@ -17,6 +17,33 @@ language:
 - **License:** apache-2.0
 - **Finetuned from model :** unsloth/gemma-3-270m-it
 This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 - **License:** apache-2.0
 - **Finetuned from model :** unsloth/gemma-3-270m-it
+Use
+```
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize = False,
+    add_generation_prompt = True
+).removeprefix('<bos>')
+# This forces the model to enter "thinking mode" immediately.
+text += "<think>\n"
+# 3. Generate
+_ = model.generate(
+    **tokenizer(text, return_tensors="pt").to("cuda"),
+    max_new_tokens=2048,     # Don't let it ramble forever
+    # --- STABILITY SETTINGS ---
+    do_sample=True,         # Enable sampling to break deterministic loops
+    temperature=0.1,        # Very low temp (focused) but not zero
+    top_p=0.95,             # Standard filtering
+    repetition_penalty=1.0, # CRITICAL: Disable penalty (1.0 = no penalty)
+    streamer=TextStreamer(tokenizer, skip_prompt=True),
+    eos_token_id=tokenizer.eos_token_id # Ensure it knows when to stop
+)
+```
+For better output
 This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)