The score of this model on Spider and WikiSQL
I would like to know the score of this model on Spider and WikiSQL. I'm not sure whether you have submitted any results.
Also, do you have any scores on the test set?
Looking forward to a reply!
Here are the benchmark scores we have: https://www.numbersstation.ai/post/nsql-llama-2-7b. All numbers are on the dev set.
Would you mind sharing the evaluation script or the prompt template so that we can reproduce the numbers? Many thanks.
On the page https://www.numbersstation.ai/post/nsql-llama-2-7b, I found "Open Source Ours - Pretrain + Instruct". Could anyone explain what exactly this "Instruct" is? Only after this "Instruct" step does the accuracy come close to ChatGPT.
By pre-training, they mean training the model for causal language modeling on SQL queries only. "Instruct" refers to fine-tuning the pretrained Llama model on text-to-SQL datasets, also known as instruction tuning.
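To make the distinction concrete, here is a rough sketch of how the two kinds of training samples might differ. This is only an illustration, not the NSQL training code: the `TextToSqlExample` class, the helper names, and the exact text layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TextToSqlExample:
    """One labeled text-to-SQL sample (hypothetical container)."""
    schema: str    # CREATE TABLE statements
    question: str  # natural-language question
    sql: str       # gold SQL query


def pretraining_text(sql: str) -> str:
    # Pre-training stage: the sample is SQL only, and the model
    # learns to predict every next token of it (causal LM).
    return sql.strip()


def instruction_pair(ex: TextToSqlExample) -> tuple[str, str]:
    # Instruct stage: the model sees schema + question as the prompt
    # and is trained to produce the SQL query as the completion.
    prompt = f"{ex.schema.strip()}\n\n-- {ex.question.strip()}\n\n"
    return prompt, ex.sql.strip()


example = TextToSqlExample(
    schema="CREATE TABLE stadium (stadium_id number, capacity number)",
    question="What is the maximum capacity of stadiums?",
    sql="SELECT max(capacity) FROM stadium",
)

print(pretraining_text(example.sql))
prompt, completion = instruction_pair(example)
print(prompt + completion)
```

In the first case there is no question at all, so the model only learns what SQL looks like. In the second case it learns to map a question about a given schema to a query, which is the task the benchmarks measure.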
Would you mind sharing the evaluation script or the prompt template so that we can reproduce the numbers? Many thanks.
We followed https://github.com/taoyds/spider for the Spider and GeoQuery evaluation and used the Rajkumar prompt format. You can find examples here: https://github.com/NumbersStationAI/NSQL/tree/main/examples
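For anyone trying to reproduce this, a minimal sketch of building a prompt in that style could look like the following. It assumes the layout is the table schema as CREATE TABLE statements, an instruction comment, the question as a SQL comment, and a trailing `SELECT` for the model to continue. The `build_prompt` helper and the `stadium` schema are made up for illustration, so please check the linked examples for the exact wording.

```python
def build_prompt(schema_ddl: list[str], question: str) -> str:
    """Assemble a schema-plus-question prompt that ends with SELECT."""
    # Join the CREATE TABLE statements, one blank line between tables.
    schema = "\n\n".join(ddl.strip() for ddl in schema_ddl)
    return (
        f"{schema}\n\n"
        "-- Using valid SQLite, answer the following questions "
        "for the tables provided above.\n\n"
        f"-- {question.strip()}\n\n"
        "SELECT"
    )


# Hypothetical schema, used only to show the shape of the prompt.
stadium_ddl = """CREATE TABLE stadium (
    stadium_id number,
    name text,
    capacity number
)"""

prompt = build_prompt([stadium_ddl], "What is the maximum capacity of stadiums?")
print(prompt)
```

The model's output is then the continuation after `SELECT`, so the full predicted query is `"SELECT" + generated_text`, which is what you would pass to the Spider evaluation script.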