Model repeating information and "spitting out" random characters
First of all, congratulations on the launch. Gemma 2 9B is, at least in my tests, the best model for PT-BR, and much better than far larger models.
However, I keep running into problems, such as:
- Repeating information;
- "Spitting out" text endlessly;
- Placing tags like "</start_of" at the end of the answer.
I am eagerly awaiting a solution.
Once again, I thank the entire Google Gemma team.
I would recommend using the eager attention implementation (attn_implementation="eager" when loading the model).
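In case it helps, here is a minimal sketch of what selecting eager attention can look like with transformers. The model id "google/gemma-2-9b-it" and the helper names eager_attention_kwargs and load_model are assumptions for illustration; swap in whichever checkpoint you are actually using.

```python
def eager_attention_kwargs():
    """Keyword arguments that ask transformers for the eager attention path."""
    return {"attn_implementation": "eager"}


def load_model(model_id="google/gemma-2-9b-it"):
    """Load tokenizer and model with eager attention.

    model_id is an assumed checkpoint name, not taken from the thread.
    """
    # Imported lazily so the kwargs helper can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        **eager_attention_kwargs(),
    )
    return tokenizer, model
```

Calling load_model() downloads the weights, so it is best run once and reused. This only switches the attention backend; it may or may not resolve the repetition issue described above.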
I get the same error even with eager attention and bf16.
Hi @brazilianslib, could you please try again after updating to the latest transformers version (!pip install -U transformers) and let us know if the issue still persists? Thank you.
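To confirm the upgrade actually took effect in the running environment, a small check like the one below can help. It is only a sketch: the helper names are hypothetical, and the "4.42.0" threshold in the usage note is an assumption based on my understanding of when Gemma 2 support first shipped, so adjust it as needed.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a dotted version like '4.42.0.dev0' into a (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1' or 'dev0'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # pad short versions like '4.42'
    return tuple(parts)


def installed_transformers_version():
    """Return the installed transformers version string, or None if it is missing."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


def is_at_least(installed, minimum):
    """True when the installed version is greater than or equal to the minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

For example, after restarting the runtime you could run installed_transformers_version() and pass the result to is_at_least(..., "4.42.0"). In a notebook, remember that a package upgraded with pip is only picked up after the kernel restarts.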