How to use from vLLM
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "whoami02/defog-sqlcoder-2-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "whoami02/defog-sqlcoder-2-GGUF",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
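
The same request can be sent from Python. The block below is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:8000; the helper names `build_request` and `complete` are hypothetical and not part of vLLM.

```python
import json
import urllib.request

# Endpoint and model name are taken from the curl example above.
BASE_URL = "http://localhost:8000/v1/completions"
MODEL = "whoami02/defog-sqlcoder-2-GGUF"


def build_request(prompt, max_tokens=512, temperature=0.5):
    """Build the POST request for the OpenAI-compatible completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    with urllib.request.urlopen(build_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

Calling `complete("...")` requires the server to be running. `build_request` alone can be used to inspect the payload without any network access.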
Use Docker
docker model run hf.co/whoami02/defog-sqlcoder-2-GGUF:Q8_0

Model Details

I do not claim ownership of this model.
It was converted to 8-bit (Q8_0) GGUF format from the original repository, huggingface.co/defog/sqlcoder-7b-2.

Model Description

Developed by: Defog AI

Model Sources

Repository: https://huggingface.co/defog/sqlcoder-7b-2

Example usage

With LlamaCpp (via LangChain):

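from huggingface_hub import hf_hub_download
from langchain_community.llms.llamacpp import LlamaCpp

# Placeholders: set each of these before running.
YOUR_MODEL_DIRECTORY = None  # local cache directory for the downloaded file
CONTEXT_LENGTH = None        # context window in tokens (n_ctx)
MAX_TOKENS = None            # maximum number of tokens to generate
BATCH_SIZE = None            # prompt tokens processed per batch (n_batch)
TEMPERATURE = None           # sampling temperature
GPU_OFFLOAD = None           # number of layers to offload to the GPU (n_gpu_layers)

def load_model(model_id, model_basename):
  # Download the GGUF file from the Hub, or reuse the cached copy.
  # Recent huggingface_hub versions resume interrupted downloads automatically.
  model_path = hf_hub_download(
    repo_id=model_id,
    filename=model_basename,
    cache_dir=YOUR_MODEL_DIRECTORY,
  )
  kwargs = {
    'model_path': model_path,
    'n_ctx': CONTEXT_LENGTH,
    'max_tokens': MAX_TOKENS,
    'n_batch': BATCH_SIZE,
    'n_gpu_layers': GPU_OFFLOAD,
    'temperature': TEMPERATURE,
    'verbose': True,
  }
  return LlamaCpp(**kwargs)

llm = load_model(
  model_id="whoami02/defog-sqlcoder-2-GGUF",
  model_basename="sqlcoder-7b-2.q8_0.gguf",
)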
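
Once the model is loaded, it needs a prompt that pairs a question with a database schema. The sketch below builds such a prompt. The template layout is an assumption based on the upstream defog/sqlcoder-7b-2 model card and should be checked against that card. The `build_prompt` helper and the example schema are hypothetical and exist only for illustration.

```python
# Prompt layout assumed from the upstream defog/sqlcoder-7b-2 model card.
PROMPT_TEMPLATE = """### Task
Generate a SQL query to answer [QUESTION]{question}[/QUESTION]

### Database Schema
The query will run on a database with the following schema:
{schema}

### Answer
Given the database schema, here is the SQL query that [QUESTION]{question}[/QUESTION]
[SQL]
"""


def build_prompt(question: str, schema: str) -> str:
    """Fill the template with a natural-language question and a DDL schema."""
    return PROMPT_TEMPLATE.format(question=question.strip(), schema=schema.strip())


# Hypothetical schema and question, for illustration only.
EXAMPLE_SCHEMA = """CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    total NUMERIC
);"""

prompt = build_prompt("What is the total revenue per customer?", EXAMPLE_SCHEMA)

# With a loaded LangChain LlamaCpp object named `llm`, generation would be:
# sql = llm.invoke(prompt)
```

For SQL generation a low temperature is usually a reasonable starting point, since the output should be deterministic rather than creative.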

Format: GGUF
Model size: 7B params
Architecture: llama
Quantization: 8-bit
