NBAmine committed · verified
Commit 4dc019d · 1 Parent(s): 52a9240

Update README.md
Files changed (1): README.md (+40 -38)
README.md CHANGED
@@ -1,66 +1,68 @@
  ---
- datasets:
- - gretelai/synthetic_text_to_sql
- - xlangai/spider
- - NBAmine/xlangai-spider-with-context
  language:
  - en
- base_model:
- - mistralai/Mistral-Nemo-Instruct-2407
- pipeline_tag: text-generation
- library_name: transformers
  tags:
  - text-to-sql
- - mistral
  - mistral-nemo
  - peft
  - qlora
  ---

- # Mistral-Nemo-Text-to-SQL-AWQ

  [![GitHub](https://img.shields.io/badge/GitHub-Repo-181717?logo=github&logoColor=white)](https://github.com/NBAmine/Nemo-text-to-sql)

  ## Model Overview
- This is a production-optimized, **4-bit AWQ quantized** version of a Parameter-Efficient Fine-Tuned **Mistral-Nemo-12B**. The model is specialized for **Text-to-SQL** tasks, specifically designed to handle complex relational database queries with DDL context.

- The original model was trained using a **two-phase curriculum learning strategy**:
- 1. **Phase 1 (Syntactic Alignment):** Learning basic SQL structure and schema-linking.
- 2. **Phase 2 (Logical Reasoning With Context):** Advanced reasoning for complex `JOIN`, `UNION`, and nested sub-queries.

- ## Engineering & Optimization
- This model demonstrates advanced optimization techniques to balance performance and hardware accessibility.

- ### 1. Stratified AWQ Calibration
- Unlike standard quantization, this model utilized **Stratified Sampling** based on SQL query length during the calibration phase. This ensured that the **"Salient Weights"** responsible for long-context reasoning and complex SQL clauses were protected via activation-aware scaling factors.

- ### 2. Quantization Specs
- - **Method:** AWQ (Activation-aware Weight Quantization)
- - **Bits:** 4-bit
- - **Group Size:** 128
- - **Zero Point:** True
- - **Format:** Safetensors (Zero-copy loading)

- ### 3. Hardware Requirements
- | Version | VRAM Usage | Precision | Suggested GPU |
- | :--- | :--- | :--- | :--- |
- | **BF16 (Original)** | ~24 GB | 16-bit | A100 / RTX 6000 |
- | **AWQ (This Model)** | **~7.5 GB** | **4-bit** | **RTX 3060 / T4** |

  ## Evaluation Results
  Evaluated on the **Spider** validation set:
- - **Execution Accuracy (EX):** ~69.0%
  - **Exact Match (EM):** 61.2%
- - **Max Context:** 2048 tokens

- ## Deployment with vLLM
- This model is optimized for **vLLM**

- ## Prompt Template
- The model follows a structured prompt format to ensure logical alignment:

- Context: {DDL_STATEMENTS}<br>
- Question: {USER_QUESTION}<br>
- Answer:
  ---
  language:
  - en
+ base_model: mistralai/Mistral-Nemo-Base-2407
  tags:
  - text-to-sql
  - mistral-nemo
+ - spider
  - peft
  - qlora
+ metrics:
+ - execution_accuracy
+ - exact_match
+ model_creator: NBAmine
+ pipeline_tag: text-generation
+ datasets:
+ - gretelai/synthetic_text_to_sql
+ - xlangai/spider
+ - NBAmine/xlangai-spider-with-context
+ library_name: transformers
  ---

+ # Mistral-Nemo-12B-Text-to-SQL

  [![GitHub](https://img.shields.io/badge/GitHub-Repo-181717?logo=github&logoColor=white)](https://github.com/NBAmine/Nemo-text-to-sql)

  ## Model Overview
+ This is the full-precision (BF16), merged version of a **Mistral-Nemo-12B** model fine-tuned with parameter-efficient methods for high-performance **Text-to-SQL** generation. It is the result of merging LoRA adapters, trained via a two-phase curriculum learning strategy, back into the base weights.

+ It is designed to serve as the "source of truth" for further optimizations (such as AWQ or GGUF) and represents the peak predictive performance of the training pipeline, before any quantization-related drift.

+ - **Base Model:** `mistralai/Mistral-Nemo-Base-2407`
+ - **Primary Task:** Natural language to SQL generation with DDL context.
+ - **Output Format:** Standalone SQL queries compatible with standard SQL engines.
 
+ ## Training Methodology
+ The model was developed using an MLOps pipeline on dual NVIDIA T4 GPUs on Kaggle.

+ ### 1. Curriculum Learning Strategy
+ The model underwent a two-stage training process:
+ - **Phase 1 (Syntactic Alignment):** Focused on SQL syntax, basic keywords, and simple schema mapping.
+ - **Phase 2 (Logical Alignment):** Introduced complex reasoning tasks, including multiple `JOIN` operations, nested subqueries, and set operations (`UNION`, `INTERSECT`).
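The two-phase split above implies sorting training examples by structural complexity. A minimal sketch of such a heuristic classifier (the function name and regex rules are illustrative, not taken from the actual training pipeline):

```python
import re

def curriculum_phase(sql: str) -> int:
    """Assign a SQL query to a curriculum phase by structural complexity.

    Phase 2 covers multi-table reasoning (JOINs, set operations,
    nested subqueries); everything simpler falls into Phase 1.
    """
    s = sql.upper()
    has_join = " JOIN " in s
    has_set_op = any(op in s for op in (" UNION ", " INTERSECT ", " EXCEPT "))
    # A nested subquery shows up as a second SELECT inside the statement.
    has_subquery = len(re.findall(r"\bSELECT\b", s)) > 1
    return 2 if (has_join or has_set_op or has_subquery) else 1

print(curriculum_phase("SELECT name FROM singer WHERE age > 30"))  # 1
print(curriculum_phase(
    "SELECT T1.name FROM stadium AS T1 JOIN concert AS T2 ON T1.id = T2.sid"
))  # 2
```

In practice, a complexity score could also count the number of tables referenced or the nesting depth; the binary split here just mirrors the two phases described above.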
 
 
+ ### 2. Fine-Tuning Details
+ - **Technique:** QLoRA (Rank 16, Alpha 32)
+ - **Quantization (during training):** 4-bit NF4
+ - **Optimizer:** Paged AdamW 8-bit
+ - **Hardware:** 2x NVIDIA T4 (Kaggle)

  ## Evaluation Results
  Evaluated on the **Spider** validation set:
+ - **Execution Accuracy (EX):** **69.5%**
  - **Exact Match (EM):** 61.2%
+ - **Max Context Length:** 2048 tokens
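Execution accuracy (EX) counts a prediction as correct when it returns the same result set as the gold query when executed against the database, even if the SQL text differs. A minimal sketch of that check using an in-memory SQLite database (the schema, data, and helper name are illustrative, not the Spider evaluation harness):

```python
import sqlite3

def execution_match(db: sqlite3.Connection, gold_sql: str, pred_sql: str) -> bool:
    """True when the predicted query returns the same rows as the gold query.

    Rows are compared order-insensitively; un-executable predictions
    simply count as wrong.
    """
    gold = db.execute(gold_sql).fetchall()
    try:
        pred = db.execute(pred_sql).fetchall()
    except sqlite3.Error:
        return False
    return sorted(map(tuple, gold)) == sorted(map(tuple, pred))

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE singer (name TEXT, age INTEGER);
INSERT INTO singer VALUES ('Ana', 25), ('Bo', 41), ('Cy', 33);
""")
print(execution_match(db,
    "SELECT name FROM singer WHERE age > 30",
    "SELECT name FROM singer WHERE age >= 31"))  # True
```

This is why EX and EM diverge (69.5% vs 61.2%): the two queries above execute identically but would fail an exact-match comparison.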

+ ## Architecture Specs
+ The merged weights utilize the standard Mistral-Nemo 12B architecture:
+ - **Parameters:** 12.2B
+ - **Layers:** 40
+ - **Attention:** Grouped Query Attention (GQA) with 8 KV heads
+ - **Vocabulary Size:** ~128k (Tekken tokenizer)
+ - **VRAM Requirements:** ~24 GB for inference in BF16/FP16
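GQA with 8 KV heads keeps the KV cache small relative to the weights. A back-of-the-envelope estimate for the 2048-token context used here, assuming Mistral-Nemo's published head dimension of 128 and a BF16 cache (2 bytes per value; both are assumptions, not stated in this card):

```python
# KV-cache size estimate for Mistral-Nemo 12B with GQA.
# Assumptions: head_dim = 128 (published config), BF16 cache entries.
layers, kv_heads, head_dim, bytes_per_val = 40, 8, 128, 2

per_token = 2 * layers * kv_heads * head_dim * bytes_per_val  # K and V planes
total = per_token * 2048  # max context length used in training/eval

print(per_token)            # 163840 bytes, i.e. ~160 KiB per token
print(round(total / 2**20)) # ~320 MiB for a full 2048-token context
```

So the cache adds only a few hundred MiB on top of the ~24 GB of BF16 weights; with 32 query heads sharing 8 KV heads, it is roughly a quarter of what full multi-head attention would need.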

+ ## Template used during training

+ ```
+ Context: {DDL}
+ Question: {NL_QUERY}
+ Answer:
+ ```
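At inference time the training template must be reproduced exactly, since the model completes the text after `Answer:` with the SQL query. A small helper (the function name is illustrative, and it assumes the template's line breaks are literal newlines):

```python
def build_prompt(ddl: str, question: str) -> str:
    """Assemble the training-time prompt; the model's completion
    after 'Answer:' is the SQL query."""
    return f"Context: {ddl}\nQuestion: {question}\nAnswer:"

ddl = "CREATE TABLE singer (name TEXT, age INTEGER);"
print(build_prompt(ddl, "How many singers are older than 30?"))
```

The string this produces would then be passed to the tokenizer as a plain completion prompt (no chat template), matching how the DDL context and question were concatenated during fine-tuning.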