Model Card: Mistral-Nemo-Instruct-2407_ORPO- Fine-Tuned for Text-to-SQL

Model Overview

Model Name

Mistral-Nemo-Instruct-2407_ORPO

Base Model

Purpose

This model was fine-tuned to improve accuracy for translating natural language queries into SQL statements, specifically targeting non-technical users. The fine-tuning process compared two methodologies: Direct Preference Optimization (DPO) and Odds Ratio Preference Optimization (ORPO).

Fine-Tuning Methods

Direct Preference Optimization (DPO)

A dynamic weight-scaling approach to balance preference alignment and output diversity.
Uses preference pairs ("selected" vs. "rejected" outputs) to refine model behavior.

Odds Ratio Preference Optimization (ORPO)

Leverages binary preference data and an odds ratio-based penalty method.
Eliminates the need for reward models, offering higher efficiency and scalability.

Dataset

Training Dataset

Source: Synthetic Text-to-SQL dataset from Gretel AI
Size: 89,495 entries
Focus: Data Query Language (DQL) instructions, complex SQL queries including joins, window functions, and set operations.

Evaluation Dataset

Source: Mini-Dev dataset from the BIRD benchmark
Size: 500 Text-to-SQL pairs
Complexity Levels: Simple, Medium, Challenging

Evaluation

Metrics

Execution Accuracy (EX): Percentage of SQL queries executed correctly.

Results

Model	Execution Accuracy (%)
Mistral-NeMo-Instruct (Base)	Baseline
DPO Fine-Tuned Model	+0.86%
ORPO Fine-Tuned Model	+41.38%
ORPO vs. Codestral-22B	+35.54%

Model Use

Requirements

Python 3.10+
PyTorch 2.4+
CUDA 12.1

Inference Example

from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig,PeftModel

# Load the fine-tuned peft model
peft_config = PeftConfig.from_pretrained("JHuel/Mistral-Nemo-Instruct-2407_DPO_qlora") 
model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)
model = PeftModel.from_pretrained(model, "JHuel/Mistral-Nemo-Instruct-2407_DPO_qlora")


# Load the fine-tuned model
tokenizer = AutoTokenizer.from_pretrained("your-model-name")
model = AutoModelForCausalLM.from_pretrained("your-model-name")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")

# Input a natural language query
response = chatbot(messages)[0]['generated_text']

print(response)

Limitations

The model may not handle queries involving highly specialized or domain-specific SQL operations.
Training data was limited to synthetic datasets; real-world performance may vary.

Ethical Considerations

Bias: The training dataset was synthetic and may not fully represent real-world linguistic diversity.
Misuse: The model is intended for assisting in SQL generation and should not be used for tasks requiring high levels of security or privacy without additional safeguards.

Citation

If you use this model in your research or applications, please cite:

@article{JHuelsEKeuchel,
  title={Evaluation of Fine-Tuning Methods: DPO and ORPO for Text-to-SQL},
  author={Jonathan Hüls and Elina Keuchel.},
  year={2025}
}

License

The model is released under the apache-2.0 LICENSE.

Downloads last month: -; Downloads are not tracked for this model. How to track

Video Preview

Reinforcement Learning

Model tree for JHuel/Mistral-Nemo-Instruct-2407_DPO_qlora

Base model

mistralai/Mistral-Nemo-Base-2407

Finetuned

mistralai/Mistral-Nemo-Instruct-2407