Devstral SQLCoder SFT

This model is a full-parameter SFT checkpoint for SQL generation, trained from mistralai/Devstral-Small-2505 and exported to Hugging Face safetensors format.

Model Details

  • Base model: mistralai/Devstral-Small-2505
  • Architecture: MistralForCausalLM
  • Parameters: ~24B
  • Training precision: bf16
  • Max sequence length (training config): 4096
  • Export format: sharded safetensors with model.safetensors.index.json

Training Data (Merged)

The SFT run merged the following datasets:

  • spider
  • bird
  • bird23-train-filtered
  • synsql-2.5m
  • wikisql
  • gretelai-synthetic
  • sql-create-context
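The card does not specify how these datasets were unified, but since they use different field names, a merge typically normalizes each example to a common (schema, question, SQL) record first. The sketch below is illustrative only; the field names (question, query, sql, context) are assumptions based on common conventions in these datasets, not the actual preprocessing used for this checkpoint.

```python
def normalize(example: dict) -> dict:
    """Map dataset-specific keys onto a shared text-to-SQL record.

    Field names here are illustrative guesses at common conventions
    (e.g. Spider uses "query", WikiSQL-style exports often use "sql").
    """
    question = example.get("question") or example.get("instruction")
    sql = example.get("query") or example.get("sql") or example.get("answer")
    schema = example.get("context") or example.get("schema", "")
    return {"schema": schema, "question": question, "sql": sql}


def merge(*datasets) -> list:
    """Concatenate normalized examples, dropping rows missing question or SQL."""
    merged = []
    for ds in datasets:
        for ex in ds:
            rec = normalize(ex)
            if rec["question"] and rec["sql"]:
                merged.append(rec)
    return merged


spider_like = [{"question": "How many heads?", "query": "SELECT count(*) FROM head"}]
wikisql_like = [{"question": "List names", "sql": "SELECT name FROM t"}]
print(len(merge(spider_like, wikisql_like)))  # 2
```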

Intended Use

  • Text-to-SQL research and experimentation
  • SQL generation benchmarks and evaluation pipelines

Limitations

  • This model may generate incorrect SQL and should be validated before production use.
  • Performance depends on prompt format, schema context quality, and decoding settings.
  • Evaluate safety and compliance requirements before deployment.
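Since generated SQL may be invalid, one cheap guard before executing model output is compiling the candidate query against the target schema. A minimal sketch using Python's built-in sqlite3 module (this assumes a SQLite-compatible dialect; it catches syntax errors and missing tables/columns, not semantic mistakes such as wrong joins):

```python
import sqlite3


def is_parseable(sql: str, schema_ddl: str = "") -> bool:
    """Return True if SQLite can compile the statement against the schema."""
    conn = sqlite3.connect(":memory:")
    try:
        if schema_ddl:
            conn.executescript(schema_ddl)
        conn.execute(f"EXPLAIN {sql}")  # compiles the query without running it
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


ddl = "CREATE TABLE head (id INTEGER, name TEXT, age INTEGER);"
print(is_parseable("SELECT count(*) FROM head WHERE age > 56", ddl))  # True
print(is_parseable("SELEC * FROM head", ddl))                         # False
```

A check like this is a pre-filter, not a correctness guarantee: a query can compile cleanly and still answer the wrong question.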

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_or_path = "<hf-username-or-org>/<model-repo>"

tokenizer = AutoTokenizer.from_pretrained(repo_or_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_or_path,
    torch_dtype="bfloat16",  # matches the bf16 training precision
    device_map="auto",       # spread the sharded weights across available devices
)
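The card does not specify the prompt template used during SFT. A typical text-to-SQL prompt supplies the schema context and the question; the format below is an illustrative placeholder, not the trained template, so check your own evaluation results before settling on one.

```python
def build_prompt(schema: str, question: str) -> str:
    """Illustrative text-to-SQL prompt; the trained template may differ."""
    return (
        "### Schema:\n" + schema.strip() + "\n\n"
        "### Question:\n" + question.strip() + "\n\n"
        "### SQL:\n"
    )


prompt = build_prompt(
    "CREATE TABLE head (id INTEGER, name TEXT, age INTEGER);",
    "How many heads of departments are older than 56?",
)

# The prompt is then tokenized and passed to model.generate, e.g.:
# inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# sql = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:],
#                        skip_special_tokens=True)
```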

Local Files Included

  • config.json
  • generation_config.json
  • tekken.json
  • model-00001-of-00021.safetensors ... model-00021-of-00021.safetensors
  • model.safetensors.index.json
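With 21 shards, a partially downloaded checkpoint is easy to end up with. The index file's weight_map maps each tensor name to the shard that stores it (the standard layout for Hugging Face sharded safetensors), so a quick completeness check is possible; the directory path below is a placeholder.

```python
import json
import os


def missing_shards(checkpoint_dir: str) -> set:
    """Return shard filenames listed in the index but absent on disk."""
    index_path = os.path.join(checkpoint_dir, "model.safetensors.index.json")
    with open(index_path) as f:
        index = json.load(f)
    # weight_map: {tensor name -> shard filename that stores it}
    expected = set(index["weight_map"].values())
    present = set(os.listdir(checkpoint_dir))
    return expected - present


# missing = missing_shards("/path/to/checkpoint")  # placeholder path
# if missing:
#     raise RuntimeError(f"Incomplete checkpoint, missing: {sorted(missing)}")
```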

Citation

If you use this model, please cite this repository.
