questions on pretrain and sql formats
Hi, thanks for the contribution. Does the training data include samples from BigQuery SQL and other SQL variants? Also, can you elaborate on your two-step approach of pretraining followed by instruct fine-tuning? What is the dataset for pretraining: is it just SQL statements without questions, trained with next-token prediction?
Thanks for your interest in our work!
For the pretraining step, we use the SQL subset from The Stack (https://huggingface.co/datasets/bigcode/the-stack), containing around 1M training samples. We use the raw SQL data with a next-token prediction objective for continued pretraining.
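As a rough illustration of next-token prediction on raw SQL, the sketch below shifts a token sequence by one position to obtain inputs and labels. The whitespace tokenizer and the sample query are stand-ins chosen for illustration, not the tokenizer or data actually used for NSQL.

```python
from typing import List, Tuple


def toy_tokenize(sql: str) -> List[str]:
    # Assumption: whitespace splitting stands in for a real subword tokenizer.
    return sql.split()


def shift_for_next_token(tokens: List[str]) -> Tuple[List[str], List[str]]:
    # The label at position i is the token that follows input position i.
    return tokens[:-1], tokens[1:]


# Hypothetical raw SQL sample; no natural-language question is attached.
raw_sql = "SELECT name FROM users WHERE age > 30"
inputs, labels = shift_for_next_token(toy_tokenize(raw_sql))

# Show each input token next to the token the model should predict after it.
for current, target in zip(inputs, labels):
    print(f"{current!r} -> {target!r}")
```

In a real causal language model, the prediction at each position is conditioned on all preceding tokens, not only the current one; the pairing above only shows how the labels line up with the inputs.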
For the instruct fine-tuning step, we collect text-to-SQL pairs from more than 20 different public sources across the web, ranging from standard datasets such as WikiSQL to medical datasets such as MIMIC_III, for a total of around 300,000 text-to-SQL pairs.
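To make the idea of a text-to-SQL pair concrete, here is a minimal sketch of turning one pair into a prompt and a completion for fine-tuning. The `build_example` helper, the prompt layout, and the sample schema and question are all hypothetical; the actual template used for NSQL may differ.

```python
from typing import Dict


def build_example(schema: str, question: str, sql: str) -> Dict[str, str]:
    # Assumption: the prompt is the table schema followed by the question
    # written as a SQL comment; the target SQL is the completion.
    prompt = f"{schema}\n\n-- {question}\n\n"
    return {"prompt": prompt, "completion": sql}


# Hypothetical pair, not taken from any of the datasets mentioned above.
example = build_example(
    schema="CREATE TABLE patients (id INT, age INT)",
    question="How many patients are older than 60?",
    sql="SELECT COUNT(*) FROM patients WHERE age > 60",
)

# The model would be trained to continue the prompt with the completion.
print(example["prompt"] + example["completion"])
```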
You can find more information in our blog post (https://www.numbersstation.ai/post/introducing-nsql-open-source-sql-copilot-foundation-models).
Have you seen any catastrophic interference from the pretraining step? Was the model pretrained only on the SQL data, or did you mix in other datasets that were used for Salesforce CodeGen? Do you plan to open-source the training code?
With the pretraining step, we intend to let the model learn more about SQL, and it did improve the text-to-SQL capability (you can find the analysis in our blog). We pretrained only on the SQL data, without mixing in any of the other data used in Salesforce CodeGen pretraining. We'll release the instruct fine-tuning data soon. Stay tuned!