deepinfo
/

genbi-phi3-sql-agent

Model card Files Files and versions

genbi-phi3-sql-agent / readme.md

deepinfo's picture

Upload readme.md

1a607f3 verified 15 days ago

|

history blame contribute delete

2.14 kB

	---
	language:
	- en
	tags:
	- text-to-sql
	- phi-3
	- genbi
	- dbt
	- azure-sql
	- qlora
	base_model: microsoft/Phi-3-mini-4k-instruct
	license: mit
	---

	# GenBI Phi-3 SQL Agent

	## Model Description
	This model is a specialized Small Language Model (SLM) designed to act as the reasoning engine for an Agentic GenBI Enterprise Platform. It translates natural language business questions into highly accurate, executable T-SQL queries.

	It has been instruction-tuned to understand modern data stack semantics, specifically bridging the gap between a dbt Semantic Layer and an Azure SQL data warehouse.

	- Developed by: deepinfo
	- Model type: Causal Language Model (Fine-tuned via QLoRA)
	- Base model: Microsoft Phi-3-mini-4k-instruct
	- Primary Use Case: Enterprise Business Intelligence, Autonomous SQL Generation, and LangGraph Agentic Workflows.

	## Training Details
	This model was fine-tuned using a synthetic dataset generated from a strictly typed `dbt` semantic layer (`manifest.json`). The training ensures the model adheres strictly to predefined business logic (e.g., Margin, Revenue, Customer Lifetime Value) rather than hallucinating column names or relationships.
	- Hardware: NVIDIA T4/A100 (Google Colab)
	- Technique: QLoRA (Quantized Low-Rank Adaptation)
	- Format: Merged fp16/bf16 weights.

	## Usage Example (Python)
	```python
	from transformers import pipeline

	pipe = pipeline("text-generation", model="deepinfo/genbi-phi3-sql-agent")

	prompt = """You are the reasoning engine for an Agentic GenBI Enterprise Platform. Your role is to translate business questions into accurate T-SQL queries for an Azure SQL database.

	Context:
	Target Table: fct_sales_performance
	Columns:
	- gross_revenue: Standard definition for total sales: Qty * Price.
	- gross_profit: Margin definition: Revenue - Cost.
	- category_name: The top-level classification of the product sold.

	User: What was the total gross profit generated by the Bikes category?"""

	output = pipe(prompt, max_new_tokens=100, temperature=0.1)
	print(output[0]['generated_text'])