from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = 'unSQLv1-7b-generic-lora'
device = 'cuda'

# Load the fine-tuned model and its tokenizer, moving the model to the GPU.
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

example_prompt = '''You are a highly skilled SQL query generator that generates queries for 24 different databases. Your task is to convert natural language instructions into accurate and executable SQL queries.

To ensure precise translation, please follow these guidelines:

1. Identify the database type: Determine if the request specifies a particular database system (e.g., MySQL, PostgreSQL, SQLite, etc.). If not specified, assume a generic SQL syntax compatible with most relational databases.
2. Extract key information: Carefully read the instructions and identify the table names, column names, conditions, order requirements, and any other relevant details.
3. Handle ambiguity: If the instructions are unclear or incomplete, ask clarifying questions to the user to ensure you have all the necessary information.
4. Validate syntax: Double-check that your generated SQL query follows the correct syntax for the specified database type, including proper handling of quotes, aliases, and data types.
5. Test the query: If possible, try executing the generated SQL query against a sample dataset to verify its accuracy and functionality.
6. Provide explanations: Along with the SQL query, provide a brief explanation of how you interpreted the instructions and any assumptions you made.
7. Handle multiple requests: If the instructions include multiple related queries, generate separate SQL statements for each request.
8. Error handling: If you encounter any issues or limitations in translating the instructions to SQL, provide a clear explanation of the problem and any potential workarounds.

Remember, the goal is to produce SQL queries that are accurate, executable, and aligned with the user's intent. Follow best practices for writing efficient and secure SQL code.

### Schema and the Natural Language Query:

CREATE TABLE stadium (
    stadium_id number,
    location text,
    name text,
    capacity number,
    highest number,
    lowest number,
    average number
)

CREATE TABLE singer (
    singer_id number,
    name text,
    country text,
    song_name text,
    song_release_year text,
    age number,
    is_male others
)

CREATE TABLE concert (
    concert_id number,
    concert_name text,
    theme text,
    stadium_id text,
    year text
)

CREATE TABLE singer_in_concert (
    concert_id number,
    singer_id text
)

-- Using valid SQLite, answer the following questions for the tables provided above.
-- What is the maximum, the average, and the minimum capacity of stadiums?
'''

# Tokenize the prompt and generate the SQL completion. Use max_new_tokens
# rather than max_length: the prompt alone is longer than 512 tokens, so
# max_length=512 would leave no room for the generated query.
inputs = tokenizer.encode(example_prompt, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
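Since the prompt asks for valid SQLite, one cheap sanity check is to execute the generated query against a throwaway in-memory database built from the same schema. The sketch below assumes the model's SQL has already been extracted from the decoded output; the `generated_sql` string here is the expected answer for the example question, not actual model output, and the inserted rows are made-up sample data.

```python
import sqlite3

# Hypothetical extracted query -- in practice, strip the echoed prompt from
# the decoded output and keep only the SQL that follows it.
generated_sql = "SELECT max(capacity), avg(capacity), min(capacity) FROM stadium"

# Build an in-memory database matching the stadium table from the prompt.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stadium (stadium_id number, location text, name text, "
    "capacity number, highest number, lowest number, average number)"
)
conn.executemany(
    "INSERT INTO stadium VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Glasgow", "Hampden Park", 52500, 0, 0, 0),
        (2, "London", "Wembley", 90000, 0, 0, 0),
    ],
)

# Running the query verifies it is syntactically valid SQLite and lets you
# eyeball the result against the sample rows.
row = conn.execute(generated_sql).fetchone()
print(row)  # (90000, 71250.0, 52500)
conn.close()
```

A query that raises `sqlite3.OperationalError` here can be rejected or regenerated before it ever reaches a real database.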