# GreenBit LLaMA

This is GreenBitAI's pretrained 4-bit LLaMA-2 7B model, built with an advanced compression design that achieves lossless performance relative to FP16 models.

Please refer to our Github page for the code to run the model and more information.

```python
# Load the model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("GreenBitAI/LLaMA-2-7B-4bit-groupsize32")
model = AutoModelForCausalLM.from_pretrained("GreenBitAI/LLaMA-2-7B-4bit-groupsize32")
```
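As a quick usage sketch, the loaded tokenizer and model can be used like any `transformers` causal LM. This assumes the checkpoint generates through the standard `generate` API; the prompt and generation settings below are illustrative, not taken from the original card:

```python
import torch

# Illustrative prompt and generation settings (assumptions, not part of the original card).
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```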
## Model Description

- Developed by: GreenBitAI
- Model type: Causal language model (LLaMA-2 architecture)
- Language(s) (NLP): English
- License: Apache 2.0, Llama 2 license agreement
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="GreenBitAI/LLaMA-2-7B-4bit-groupsize32")
```
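For a quick smoke test, the pipeline object can then be called directly on a prompt. A minimal sketch, assuming the checkpoint works with the standard text-generation pipeline; the prompt and generation parameters are illustrative:

```python
# Illustrative prompt and generation parameters (assumptions, not from the original card).
result = pipe("The meaning of life is", max_new_tokens=32, do_sample=False)
print(result[0]["generated_text"])
```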