Fine tuning defog/llama-3-sqlcoder-8b
Hi defog Team,
I am using the defog/llama-3-sqlcoder-8b model for one of our use cases. We run the generated queries against a Microsoft SQL Server database through its management tool, and we found that for most natural-language questions the model does not generate a correct SQL query.
So we decided to fine-tune the model on our DB schema so that it generates correct SQL queries.
Could you please share how to fine-tune the model? Could you also share a sample of the dataset format for the natural-language questions you used to train llama-3-sqlcoder-8b? A sample would give us an idea of which kinds of questions to use for training.
@kchakkarwar I'm sorry, we don't provide explicit instructions or data on how to finetune the model.
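For anyone landing here with the same question: since the training data format is not published in this thread, the sketch below shows only one generic way to lay out text-to-SQL fine-tuning examples as JSON Lines. The field names (`question`, `schema`, `sql`), the helper functions, and the sample table are all hypothetical and are not the format defog used for llama-3-sqlcoder-8b.

```python
import json

# Hypothetical record layout: these field names are illustrative assumptions,
# not the format defog used to train llama-3-sqlcoder-8b.
REQUIRED_FIELDS = ("question", "schema", "sql")


def make_example(question: str, schema: str, sql: str) -> dict:
    """Build one record pairing a question and its schema with the target SQL."""
    record = {
        "question": question.strip(),
        "schema": schema.strip(),
        "sql": sql.strip(),
    }
    # Reject incomplete examples early so they never reach a training run.
    for field in REQUIRED_FIELDS:
        if not record[field]:
            raise ValueError(f"empty field: {field}")
    return record


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines, one example per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Made-up sample table and query, written in SQL Server (T-SQL) syntax.
examples = [
    make_example(
        question="How many orders were placed in 2023?",
        schema="CREATE TABLE orders (id INT PRIMARY KEY, placed_at DATETIME, total DECIMAL(10, 2));",
        sql="SELECT COUNT(*) FROM orders WHERE YEAR(placed_at) = 2023;",
    ),
]

jsonl_text = to_jsonl(examples)
print(jsonl_text)
```

Writing the target SQL in the dialect you actually run (T-SQL here) and including the real `CREATE TABLE` statements from your own database keeps the examples close to what the model will see at inference time. How these records are turned into prompts depends on the fine-tuning tooling you choose.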