---
base_model: unsloth/gemma-3-270m-it
tags:
- text-generation-inference
- transformers
- unsloth
- gemma3_text
- trl
license: apache-2.0
language:
- en
---

# Uploaded model

- **Developed by:** Nonovogo
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-3-270m-it

## Usage

```python
from transformers import TextStreamer

# 1. Build the prompt from the chat template. The template already prepends
#    the <bos> token as text, and tokenizer() will add it again when encoding,
#    so strip it here to avoid a doubled BOS.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
).removeprefix('<bos>')

# 2. This forces the model to enter "thinking mode" immediately.
text += "<think>\n"

# 3. Generate
_ = model.generate(
    **tokenizer(text, return_tensors="pt").to("cuda"),
    max_new_tokens=2048,        # Don't let it ramble forever
    # --- STABILITY SETTINGS ---
    do_sample=True,             # Enable sampling to break deterministic loops
    temperature=0.1,            # Very low temp (focused) but not zero
    top_p=0.95,                 # Standard filtering
    repetition_penalty=1.0,     # CRITICAL: disable penalty (1.0 = no penalty)
    streamer=TextStreamer(tokenizer, skip_prompt=True),
    eos_token_id=tokenizer.eos_token_id,  # Ensure it knows when to stop
)
```

These low-temperature sampling settings, with the repetition penalty disabled, produce better output than greedy decoding, which can fall into deterministic loops.

This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
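The `removeprefix` call in the snippet above strips a leading token string from the rendered chat template, since tokenizing the prompt afterwards would add the BOS token a second time. A minimal, model-free sketch of that string behavior (the `<bos>` literal is an assumption based on the Gemma chat template; adjust to whatever prefix your template emits):

```python
# str.removeprefix (Python 3.9+) removes a leading substring only if present.
# "<bos>" is assumed here: Gemma chat templates prepend it as literal text,
# and tokenizer(text) would otherwise add the BOS token again on encoding.
prompt = "<bos><start_of_turn>user\nHello<end_of_turn>\n<start_of_turn>model\n"
stripped = prompt.removeprefix("<bos>")
print(stripped.startswith("<start_of_turn>"))  # → True

# A string that lacks the prefix is returned unchanged, so the call is safe
# even if the template does not emit the token:
print("no prefix here".removeprefix("<bos>"))  # → no prefix here
```

Note that `removeprefix` never raises; it is a no-op when the prefix is absent, which makes it safer here than slicing by a fixed length.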