---
license: other
---

# Gemma-7B in 8-bit with bitsandbytes

This is the repository for Gemma-7B quantized to 8-bit using bitsandbytes. The original model card and license for Gemma-7B can be found [here](https://huggingface.co/google/gemma-7b#gemma-model-card). This is the base model; it is not instruction fine-tuned.

## Usage

Please visit the original Gemma-7B [model card](https://huggingface.co/google/gemma-7b#usage-and-limitations) for intended uses and limitations.

You can use this model as follows:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained(
    "merve/gemma-7b-8bit",
    device_map="auto",
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
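For reference, a repository like this is typically produced by loading the original checkpoint with an 8-bit quantization config and pushing the result to the Hub. The following is a minimal sketch, assuming `transformers` with `bitsandbytes` installed and a CUDA-capable GPU; the exact steps used to create this repository may differ.

```python
# Hedged sketch: one common way to produce an 8-bit checkpoint like this one.
# Assumes transformers + bitsandbytes are installed and a GPU is available.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Configure 8-bit loading via bitsandbytes
quantization_config = BitsAndBytesConfig(load_in_8bit=True)

# Load the original weights and quantize them on the fly
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b",
    quantization_config=quantization_config,
    device_map="auto",
)

# The quantized weights can then be pushed to the Hub, e.g.:
# model.push_to_hub("merve/gemma-7b-8bit")
```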