About continuing training
Hello guys, incredible work btw 🔥
I'd like to know whether you managed to evaluate the model's performance on languages other than English. I'm interested in continuing to train this model on an Arabic corpus! Do you think it will maintain its performance on the embedding task as well? Would love to hear your thoughts on this.
best 🤗
cc : @Muennighoff
Thanks!
We evaluated it on TyDi QA - you can find the per-language metrics of this model here: https://huggingface.co/datasets/GritLM/results/blob/main/GritLM-7B/tydiqa_metrics.json
(the average is also reported in the paper)
Here are the same metrics for the GritLM-8x7B model: https://huggingface.co/datasets/GritLM/results/blob/main/GritLM-8x7B/tydiqa_metrics.json
We didn't test them on Arabic embedding tasks, but there are a bunch of Arabic datasets available in MTEB - it would be great to get their performance on those!
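Before running a full MTEB evaluation, a quick sanity check on a handful of Arabic sentences can show whether the embeddings still separate related from unrelated text after continued training. The sketch below is only an illustration: `rank_documents` and `toy_encode` are hypothetical names, and `toy_encode` is a character-count stand-in so the snippet runs without any model. In practice you would pass your own function that returns one embedding per input text from the model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Encoder = Callable[[List[str]], List[Vector]]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query: str, documents: List[str], encode: Encoder) -> List[Tuple[float, str]]:
    """Return (score, document) pairs sorted from most to least similar to the query."""
    vectors = encode([query] + list(documents))
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(doc_vecs, documents)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_encode(texts: List[str], dim: int = 2048) -> List[Vector]:
    """Hypothetical stand-in encoder: counts characters by code point modulo `dim`.

    This is NOT a real embedding model; swap in the model's embedding call here.
    """
    vectors = []
    for text in texts:
        vec = [0.0] * dim
        for ch in text:
            vec[ord(ch) % dim] += 1.0
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    docs = ["قطة صغيرة", "car engine"]
    for score, doc in rank_documents("قطة", docs, toy_encode):
        print(f"{score:.3f}  {doc}")
```

Running the same check with the base model and with the continued-training checkpoint gives a rough first signal. The Arabic tasks in MTEB are still the proper benchmark.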
What languages does it support?
You can try any language, but it will probably be best for English and related languages