---
license: other
---

# Gemma-7B in 8-bit with bitsandbytes

This is the repository for Gemma-7B quantized to 8-bit using bitsandbytes. The original model card and license for Gemma-7B can be found [here](https://huggingface.co/google/gemma-7b#gemma-model-card). This is the base model; it is not instruction fine-tuned.

## Usage

Please visit the original Gemma-7B [model card](https://huggingface.co/google/gemma-7b#usage-and-limitations) for intended uses and limitations.

You can use this model as follows:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained(
    "merve/gemma-7b-8bit",
    device_map="auto",
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```
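For reference, a repository like this is typically produced by loading the original checkpoint with an 8-bit quantization config and pushing the result to the Hub. The following is a minimal sketch, assuming `transformers` with `bitsandbytes` installed and a CUDA-capable GPU; the exact steps used to create this repository may differ.

```python
# Hedged sketch: one common way to produce an 8-bit checkpoint like this one.
# Assumes transformers + bitsandbytes are installed and a GPU is available.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Configure 8-bit loading via bitsandbytes
quantization_config = BitsAndBytesConfig(load_in_8bit=True)

# Load the original weights and quantize them on the fly
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b",
    quantization_config=quantization_config,
    device_map="auto",
)

# The quantized weights can then be pushed to the Hub, e.g.:
# model.push_to_hub("merve/gemma-7b-8bit")
```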