Instructions to use juierror/flan-t5-text2sql-with-schema-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use juierror/flan-t5-text2sql-with-schema-v2 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

# flan-t5 is a seq2seq model, so the task is "text2text-generation"
pipe = pipeline("text2text-generation", model="juierror/flan-t5-text2sql-with-schema-v2")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("juierror/flan-t5-text2sql-with-schema-v2")
model = AutoModelForSeq2SeqLM.from_pretrained("juierror/flan-t5-text2sql-with-schema-v2")
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use juierror/flan-t5-text2sql-with-schema-v2 with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "juierror/flan-t5-text2sql-with-schema-v2"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "juierror/flan-t5-text2sql-with-schema-v2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker
```shell
docker model run hf.co/juierror/flan-t5-text2sql-with-schema-v2
```
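The curl call above can also be made from Python. Below is a minimal sketch using only the standard library, assuming the vLLM server started above is running on localhost:8000 with the OpenAI-compatible API; the helper names (`build_completion_payload`, `complete`) are illustrative, not part of any library.

```python
# Sketch of a Python client for the vLLM OpenAI-compatible server.
import json
import urllib.request

def build_completion_payload(model: str, prompt: str,
                             max_tokens: int = 512,
                             temperature: float = 0.5) -> dict:
    """Build the JSON body for a /v1/completions request."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def complete(prompt: str,
             url: str = "http://localhost:8000/v1/completions") -> str:
    """POST the prompt to the server and return the first completion."""
    payload = build_completion_payload(
        "juierror/flan-t5-text2sql-with-schema-v2", prompt)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]

# Example (requires a running server):
# print(complete("Once upon a time,"))
```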
- SGLang
How to use juierror/flan-t5-text2sql-with-schema-v2 with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "juierror/flan-t5-text2sql-with-schema-v2" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "juierror/flan-t5-text2sql-with-schema-v2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "juierror/flan-t5-text2sql-with-schema-v2" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "juierror/flan-t5-text2sql-with-schema-v2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use juierror/flan-t5-text2sql-with-schema-v2 with Docker Model Runner:
```shell
docker model run hf.co/juierror/flan-t5-text2sql-with-schema-v2
```
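Since this is a text-to-SQL model, a generic prompt like "Once upon a time," (used in the server examples above) won't exercise it meaningfully. The sketch below shows end-to-end inference with Transformers; the `build_prompt` layout is an assumption for illustration, so check the model card for the exact prompt template the model was trained with.

```python
# Sketch of text-to-SQL inference with Transformers.
# The prompt format below is an assumption; verify it against the
# juierror/flan-t5-text2sql-with-schema-v2 model card.
from typing import List

MODEL_NAME = "juierror/flan-t5-text2sql-with-schema-v2"

def build_prompt(question: str, tables: List[str]) -> str:
    """Assumed layout: table schemas, one per line, then the question."""
    schemas = "\n".join(tables)
    return f"tables:\n{schemas}\nquery for: {question}"

def text_to_sql(question: str, tables: List[str]) -> str:
    # Imports are local so build_prompt works without torch installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
    inputs = tokenizer(build_prompt(question, tables), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example (downloads the model on first run):
# print(text_to_sql("how many people live in each city?",
#                   ["people(id, name, city)"]))
```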
Integrating it with SageMaker
When running the provided code to deploy the model to Amazon SageMaker, the output was poor for the same input as in the example. (I ran the code in an Amazon SageMaker notebook, by the way.) Is it possible to use the code as-is (with the three functions) directly in a Lambda function instead of invoking a SageMaker endpoint to run inference? Thanks
Hi, @Youssef99
To be frank, I'm not sure why the result doesn't match the example, but this model's results are not that good right now compared to other LLMs.
As for deployment on Lambda, I have never used Lambda before, but if it is similar to a cloud function, I believe you can deploy it there; I'm just not sure about the model download latency on Lambda.
OK, thank you! Which model could I use, then, to perform text-to-SQL across many tables?