---
license: apache-2.0
base_model: codellama/CodeLlama-7b-Instruct-hf
tags:
- text-to-sql
- spider-dataset
- sqlifyai
- code-generation
library_name: transformers
pipeline_tag: text-generation
---

# SQLifyAI - Text-to-SQL Model

This model was fine-tuned with SQLifyAI on the Spider dataset to convert natural language questions into SQL queries.

## Model Details

- **Base Model**: codellama/CodeLlama-7b-Instruct-hf
- **Dataset**: Spider
- **Training**: Multi-stage curriculum learning with advanced schema linking
- **Commit**: 30-minute rapid test training run

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the fine-tuned model from the Hub
tokenizer = AutoTokenizer.from_pretrained("dattheshshenoy/sqlifyai-30min-test")
model = AutoModelForCausalLM.from_pretrained("dattheshshenoy/sqlifyai-30min-test")

# Generate SQL
question = "What are the names of all students?"
schema = "CREATE TABLE students (id INT, name VARCHAR(50));"

# The prompt lists the question, then the schema, then an open "### SQL:" slot
prompt = f"### Question: {question}\n### Schema: {schema}\n### SQL:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)

# The decoded text echoes the prompt, so keep only what follows "### SQL:"
sql = tokenizer.decode(outputs[0], skip_special_tokens=True).split("### SQL:")[-1].strip()
print(sql)
```

## Performance

- Trained with advanced schema linking and curriculum learning
- Optimized for Spider dataset evaluation metrics
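
The prompt layout in the Usage snippet can be wrapped in two small helpers, which keeps prompt construction and output parsing in one place. This is a minimal sketch and not part of the model's published code: `build_prompt` and `extract_sql` are hypothetical names, and cutting the output at the next `###` marker and at the first semicolon is an assumption about how the model may continue generating after the query.

```python
def build_prompt(question: str, schema: str) -> str:
    """Format a question and schema in the layout shown in the Usage example."""
    return f"### Question: {question}\n### Schema: {schema}\n### SQL:"


def extract_sql(generated_text: str) -> str:
    """Pull a single SQL statement out of decoded model output.

    Assumption: the model may keep generating after the query, so the text
    is cut at the next "###" marker and at the first semicolon.
    Returns an empty string if the "### SQL:" marker is absent.
    """
    # Keep what follows the first "### SQL:" marker (the prompt is echoed back).
    sql = generated_text.partition("### SQL:")[2]
    # Drop any further "### ..." section the model may have started.
    sql = sql.split("###")[0]
    # Keep one statement, retaining its terminating semicolon if there is one.
    head, sep, _ = sql.partition(";")
    return (head + sep).strip()
```

With these helpers, the Usage snippet would pass `build_prompt(question, schema)` to the tokenizer and call `extract_sql` on the decoded output. The semicolon cut is deliberately naive: a query containing `;` inside a string literal would be truncated, so a real SQL parser is the safer choice when that matters.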