|
|
--- |
|
|
language: |
|
|
- en |
|
|
base_model: mistralai/Mistral-Nemo-Base-2407 |
|
|
tags: |
|
|
- text-to-sql |
|
|
- mistral-nemo |
|
|
- spider |
|
|
- peft |
|
|
- qlora |
|
|
metrics: |
|
|
- execution_accuracy |
|
|
- exact_match |
|
|
model_creator: NBAmine |
|
|
pipeline_tag: text-generation |
|
|
datasets: |
|
|
- gretelai/synthetic_text_to_sql |
|
|
- xlangai/spider |
|
|
- NBAmine/xlangai-spider-with-context |
|
|
library_name: transformers |
|
|
--- |
|
|
|
|
|
# Mistral-Nemo-12B-Text-to-SQL |
|
|
|
|
|
[GitHub Repository](https://github.com/NBAmine/Nemo-text-to-sql)
|
|
|
|
|
|
|
|
## Model Overview |
|
|
This is the full-precision (BF16), merged version of a **Mistral-Nemo-12B** model fine-tuned with parameter-efficient methods for high-performance **Text-to-SQL** generation. It is the result of merging LoRA adapters, trained via a two-phase curriculum-learning strategy, back into the base weights.
|
|
|
|
|
It is designed to serve as the "Source of Truth" for further optimizations (like AWQ or GGUF) and represents the peak predictive performance of the training pipeline before any quantization-related drift. |
|
|
|
|
|
- **Base Model:** `mistralai/Mistral-Nemo-Base-2407` |
|
|
- **Primary Task:** Natural Language to SQL generation with DDL context. |
|
|
- **Output Format:** Standalone SQL queries compatible with standard SQL engines. |
|
|
|
|
|
## Training Methodology |
|
|
The model was developed using an MLOps pipeline on dual NVIDIA T4 GPUs on Kaggle.
|
|
|
|
|
### 1. Curriculum Learning Strategy |
|
|
The model underwent a two-stage training process: |
|
|
- **Phase 1 (Syntactic Alignment):** Focused on SQL syntax, basic keywords, and simple schema mapping. |
|
|
- **Phase 2 (Logical Alignment):** Introduced complex reasoning tasks including multiple `JOIN` operations, nested subqueries, and set operations (`UNION`, `INTERSECT`). |
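
The jump in difficulty between the two phases can be illustrated with made-up queries against a toy Spider-style schema (these examples are hypothetical and not drawn from the actual training data):

```python
import sqlite3

# Hypothetical Phase 1 vs. Phase 2 targets (illustrative only).
phase_1 = "SELECT name FROM singer WHERE age > 30"
phase_2 = """
SELECT s.name FROM singer s
JOIN concert_singer cs ON s.singer_id = cs.singer_id
GROUP BY s.singer_id
HAVING COUNT(*) > 2
UNION
SELECT name FROM singer WHERE country = 'France'
"""

# Sanity-check both against a toy schema; execute() raises on malformed SQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE singer (singer_id INTEGER, name TEXT, age INTEGER, country TEXT)")
db.execute("CREATE TABLE concert_singer (concert_id INTEGER, singer_id INTEGER)")
for q in (phase_1, phase_2):
    db.execute(q)
```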
|
|
|
|
|
### 2. Fine-Tuning Details |
|
|
- **Technique:** QLoRA (Rank 16, Alpha 32) |
|
|
- **Quantization (during training):** 4-bit NF4 |
|
|
- **Optimizer:** Paged AdamW 8-bit |
|
|
- **Hardware:** 2x NVIDIA T4 (Kaggle). |
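
The hyperparameters above can be collected into a configuration sketch. The values are the ones reported in this card; the field names merely mirror common `peft`/`bitsandbytes` argument names and are assumptions, not the exact training script:

```python
# QLoRA fine-tuning configuration as described above.
# Values are from this card; key names follow common peft/bitsandbytes
# conventions (assumed).
qlora_config = {
    "r": 16,                       # LoRA rank
    "lora_alpha": 32,              # LoRA scaling factor
    "load_in_4bit": True,          # base weights quantized during training
    "bnb_4bit_quant_type": "nf4",  # 4-bit NormalFloat quantization
    "optim": "paged_adamw_8bit",   # Paged AdamW 8-bit optimizer
}

# Effective LoRA scaling applied to adapter outputs is alpha / r.
scaling = qlora_config["lora_alpha"] / qlora_config["r"]  # 2.0
```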
|
|
|
|
|
## Evaluation Results |
|
|
Evaluated on the **Spider** validation set: |
|
|
- **Execution Accuracy (EX):** **69.5%** |
|
|
- **Exact Match (EM):** 61.2% |
|
|
- **Max Context Length:** 2048 tokens |
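
The gap between the two metrics comes from their definitions: exact match compares query text, while execution accuracy compares result sets, so syntactically different but semantically equivalent queries still count as correct. A minimal sketch of that distinction (using `sqlite3`; this is not the official Spider evaluation harness):

```python
import sqlite3

def execution_match(db, pred_sql, gold_sql):
    """Execution accuracy: predicted and gold queries match if they
    return the same multiset of rows (order-insensitive)."""
    pred = db.execute(pred_sql).fetchall()
    gold = db.execute(gold_sql).fetchall()
    return sorted(map(str, pred)) == sorted(map(str, gold))

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
db.executemany("INSERT INTO singer VALUES (?, ?, ?)",
               [(1, "A", 30), (2, "B", 25)])

# Different syntax, same result set: correct under EX, wrong under EM.
print(execution_match(db,
    "SELECT name FROM singer WHERE age < 28",
    "SELECT name FROM singer WHERE age BETWEEN 0 AND 27"))  # → True
```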
|
|
|
|
|
## Architecture Specs |
|
|
The merged weights utilize the standard Mistral-Nemo 12B architecture: |
|
|
- **Parameters:** 12.2B |
|
|
- **Layers:** 40 |
|
|
- **Attention:** Grouped Query Attention (GQA) with 8 KV heads. |
|
|
- **Vocabulary Size:** 128k (Tekken Tokenizer) |
|
|
- **VRAM Requirements:** ~24GB for inference in BF16/FP16. |
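
The ~24GB figure follows directly from the parameter count: BF16/FP16 stores each weight in 2 bytes, so the weights alone occupy about 24.4 GB (activations and the KV cache add more on top):

```python
# Back-of-the-envelope check of the BF16 VRAM figure.
params = 12.2e9          # parameter count from this card
bytes_per_param = 2      # BF16/FP16 = 16 bits per weight
weight_gb = params * bytes_per_param / 1e9
print(f"{weight_gb:.1f} GB")  # → 24.4 GB (weights only)
```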
|
|
|
|
|
|
|
|
## Template used during training |
|
|
|
|
|
```python
prompt = "Context: {DDL}\nQuestion: {NL_QUERY}\nAnswer:"
```
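
At inference time the template can be filled programmatically. The sketch below uses a made-up schema and question; the resulting string would then be passed to the model (e.g. via a transformers text-generation pipeline, not shown here so the snippet runs without the 12B weights):

```python
# Training-time prompt template from this card.
TEMPLATE = "Context: {DDL}\nQuestion: {NL_QUERY}\nAnswer:"

def build_prompt(ddl: str, question: str) -> str:
    """Fill the template with a DDL schema and a natural-language question."""
    return TEMPLATE.format(DDL=ddl, NL_QUERY=question)

# Hypothetical example schema and question.
prompt = build_prompt(
    "CREATE TABLE employees (id INTEGER, name TEXT, salary REAL);",
    "Who are the three highest-paid employees?",
)
print(prompt)
```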