---
license: apache-2.0
base_model:
- meta-llama/Llama-3.2-3B
library_name: peft
---

# Model Card for llama3-sql2plan

This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) using LoRA (Low-Rank Adaptation) for the task of generating PostgreSQL execution plans from SQL queries. The model takes SQL queries as input and outputs the corresponding PostgreSQL execution plan in JSON format.

## Model Details

### Model Description

This model is specifically designed to convert SQL queries into PostgreSQL execution plans. It was fine-tuned using Parameter-Efficient Fine-Tuning (PEFT) with LoRA adapters, allowing efficient training while maintaining the base model's capabilities.

- **Developed by:** Anirudh Bharadwaj
- **Model type:** Causal Language Model (Decoder-only)
- **Language(s) (NLP):** English (SQL and JSON)
- **License:** Apache 2.0
- **Finetuned from model:** [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)

### Model Sources

- **Repository:** [abharadwaj123/llama3-sql2plan](https://huggingface.co/abharadwaj123/llama3-sql2plan)
- **Base Model:** [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)

## Uses

### Direct Use

This model can be used directly to generate PostgreSQL execution plans from SQL queries. It is intended for:

- Database query optimization analysis
- Understanding query execution strategies
- Educational purposes for learning PostgreSQL query planning
- Database performance analysis tools

### Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "abharadwaj123/llama3-sql2plan"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

sql_query = "SELECT * FROM users WHERE age > 25;"
prompt = (
    "Generate the PostgreSQL execution plan in JSON format for the SQL query.\n\n"
    "[QUERY]\n" + sql_query + "\n\n[PLAN]\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
    )

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
plan = generated_text.split("[PLAN]\n")[-1].strip()
print(plan)
```

### Out-of-Scope Use

This model should not be used for:

- Generating actual executable SQL queries (it generates execution plans, not queries)
- Real-time database query execution
- Production database systems without proper validation
- Any use case requiring guaranteed accuracy of execution plans

## Bias, Risks, and Limitations

### Limitations

- **Accuracy:** The model generates execution plans based on training data patterns and may not always produce accurate or optimal plans for all SQL queries.
- **PostgreSQL-specific:** The model is trained specifically for PostgreSQL execution plans and may not be suitable for other database systems.
- **Training Data Scope:** The model was trained on a subset of Stack Overflow data (10,000 samples from ~16,332 available), which may not cover all SQL query patterns.
- **No Database Context:** The model does not have access to the actual database schema, indexes, or statistics, which are crucial for accurate execution plan generation.
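Because the model sees only the query text, a generated plan is best treated as a hypothesis and compared against what the target database's planner actually produces. The sketch below is illustrative only: it assumes a reachable PostgreSQL instance and the `psycopg2` driver, and the connection string and helper name are placeholders, not part of this repository.

```python
import json
import psycopg2  # assumed to be installed; any PostgreSQL driver would do

def reference_plan(dsn: str, sql_query: str):
    """Ask PostgreSQL itself for the JSON execution plan of a query."""
    conn = psycopg2.connect(dsn)  # placeholder DSN, e.g. "dbname=test user=postgres"
    try:
        with conn.cursor() as cur:
            cur.execute("EXPLAIN (FORMAT JSON) " + sql_query)
            row = cur.fetchone()[0]
    finally:
        conn.close()
    # Depending on the driver version, the JSON column may arrive already parsed.
    return row if isinstance(row, (list, dict)) else json.loads(row)

# Compare the model's generated `plan` string (from the usage example above)
# against the planner's output for the same query on a real schema.
actual = reference_plan("dbname=test user=postgres", "SELECT * FROM users WHERE age > 25;")
print(json.dumps(actual, indent=2))
```

Differences between the two outputs are expected, since the real planner uses schema, index, and statistics information that the model never sees.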
### Recommendations

Users should:

- Validate generated execution plans against actual PostgreSQL EXPLAIN output
- Not rely solely on this model for critical database optimization decisions
- Use this model as a tool for understanding and learning, not as a replacement for actual database query planning
- Be aware that execution plans may vary based on database configuration, schema, and data distribution

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "abharadwaj123/llama3-sql2plan"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare input
sql_query = "SELECT * FROM users WHERE age > 25;"
prompt = (
    "Generate the PostgreSQL execution plan in JSON format for the SQL query.\n\n"
    "[QUERY]\n" + sql_query + "\n\n[PLAN]\n"
)

# Generate plan
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
)

# Extract plan
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
plan = generated_text.split("[PLAN]\n")[-1].strip()
```

## Training Details

### Training Data

The model was trained on a dataset derived from Stack Overflow data (`stackoverflow_n18147.csv`), containing SQL queries and their corresponding PostgreSQL execution plans in JSON format.

- **Total samples in dataset:** 18,147
- **Training samples used:** 10,000 (sampled from the first 90% of the dataset, ~16,332 samples)
- **Sampling method:** Random sampling with `random_state=42`
- **Data format:** SQL query text paired with PostgreSQL execution plan JSON

The training data was filtered to remove rows with missing `sql_text` or `plan_json` values.

### Training Procedure

#### Preprocessing

1. **Data Loading:** Loaded the CSV file and filtered out rows with missing SQL text or plan JSON
2. **Data Splitting:** Used the first 90% of the dataset as the training pool, then randomly sampled 10,000 examples
3. **Formatting:** Each example was formatted with a prompt template:

   ```
   Generate the PostgreSQL execution plan in JSON format for the SQL query.

   [QUERY]
   {sql_text}

   [PLAN]
   {plan_json}
   ```

4. **Tokenization:**
   - Maximum sequence length: 2048 tokens
   - Input prompt tokens were masked in the labels (set to -100) to only train on plan generation
   - Padding to `max_length`

#### Training Hyperparameters

- **Training regime:** FP16 mixed precision training
- **LoRA Configuration:**
  - `r`: 64
  - `lora_alpha`: 32
  - `lora_dropout`: 0.05
  - `target_modules`: ["q_proj", "k_proj", "v_proj", "o_proj"]
- **Training Arguments:**
  - `per_device_train_batch_size`: 2
  - `gradient_accumulation_steps`: 8
  - `effective_batch_size`: 16
  - `learning_rate`: 1e-5
  - `warmup_ratio`: 0.03
  - `num_train_epochs`: 2
  - `gradient_checkpointing`: True
  - `logging_steps`: 20
  - `save_strategy`: "epoch"

#### Testing Data

The remaining 10% of the original dataset (~1,815 samples) was held out and could be used for evaluation.

## Model Examination

The model uses LoRA (Low-Rank Adaptation) fine-tuning, which allows efficient training by only updating a small number of parameters (low-rank matrices) while keeping the base model weights frozen.
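As a concrete illustration of that adapter relationship, the sketch below loads the base model, attaches this repository's LoRA adapter with PEFT, counts the adapter parameters, and optionally merges the adapter into the base weights. It assumes the repository is published as a standard PEFT adapter; the `peft` and `transformers` calls shown are the generic library APIs, not code taken from this project.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the frozen base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "abharadwaj123/llama3-sql2plan")

# Only the low-rank adapter matrices differ from the base checkpoint.
adapter_params = sum(p.numel() for n, p in model.named_parameters() if "lora_" in n)
print(f"LoRA adapter parameters: {adapter_params:,}")

# Optionally fold the adapter into the base weights for adapter-free deployment.
merged = model.merge_and_unload()
```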
The LoRA approach:

- Reduces memory requirements during training
- Enables faster training compared to full fine-tuning
- Maintains the base model's general capabilities
- Allows easy merging of the adapters with the base model

## Technical Specifications

### Model Architecture and Objective

- **Architecture:** Transformer-based decoder-only language model (Llama-3.2-3B)
- **Objective:** Causal language modeling, with prompt tokens masked out of the loss
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation) via PEFT
- **Base Model Parameters:** ~3 billion
- **Trainable Parameters:** Significantly reduced via LoRA (the exact count depends on the LoRA rank and target modules)

## Model Card Authors

Anirudh Bharadwaj

## Model Card Contact

For questions or issues, please reach out through the Hugging Face model repository: [abharadwaj123/llama3-sql2plan](https://huggingface.co/abharadwaj123/llama3-sql2plan)