---
language:
- en
tags:
- sql
- text-to-sql
- daraz
- llama3
- unsloth
- ecommerce
license: apache-2.0
datasets:
- custom
base_model: unsloth/llama-3-8b-bnb-4bit
---
# drz-sql-llama3
This model is a fine-tuned version of Llama 3 (8B) for generating SQL queries specific to the Daraz e-commerce platform.
## Model Description
- **Base Model:** Llama 3 8B (4-bit quantized)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Data:** 20 Daraz-specific SQL query examples
- **Use Case:** Converting natural language questions to SQL queries for Daraz analytics
## Training Details
- **Framework:** Unsloth
- **LoRA Rank:** 16
- **Training Steps:** 100
- **Batch Size:** 2
- **Gradient Accumulation:** 4
- **Learning Rate:** 0.0002
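For readers who want to reproduce a comparable run, the hyperparameters above can be written as plain keyword arguments (a sketch only; the names follow `trl`'s `TrainingArguments`, and the exact argument set of the original run is not published):

```python
# Training hyperparameters from the table above, as TrainingArguments kwargs
# (hypothetical reconstruction -- the original training script is not released).
training_kwargs = dict(
    per_device_train_batch_size=2,   # "Batch Size: 2"
    gradient_accumulation_steps=4,   # "Gradient Accumulation: 4"
    max_steps=100,                   # "Training Steps: 100"
    learning_rate=2e-4,              # "Learning Rate: 0.0002"
)

# Effective batch size per optimizer step:
effective_batch = (training_kwargs["per_device_train_batch_size"]
                   * training_kwargs["gradient_accumulation_steps"])
print(effective_batch)  # 8
```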
## Key Features
This model understands Daraz-specific:
- Table schemas (e.g., `daraz_cdm.dwd_drz_trd_core_df`, `daraz_cdm.dwd_drz_prd_sku_extension`)
- Business logic (Choice classification, KAM assignments, industry mapping)
- Query patterns (MAX_PT for partitions, DATEADD for date filtering)
- Metrics (GMV, L7/L30 calculations, order types)
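To illustrate the query patterns listed above, the kind of SQL the model targets looks roughly like the following. This is a hypothetical sketch: the table name comes from this card, but column names such as `gmv` and `order_date` are assumptions, not the actual Daraz schema.

```python
# Hypothetical example of the output shape the model is trained toward,
# combining the MAX_PT partition idiom with DATEADD date filtering.
# Column names (gmv, order_date) are illustrative assumptions.
example_sql = """
SELECT SUM(gmv) AS total_gmv
FROM daraz_cdm.dwd_drz_trd_core_df
WHERE pt = MAX_PT('daraz_cdm.dwd_drz_trd_core_df')
  AND order_date >= DATEADD(GETDATE(), -30, 'dd')
"""
print("MAX_PT" in example_sql and "DATEADD" in example_sql)  # True
```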
## Usage
```python
from unsloth import FastLanguageModel
# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Bilal326/drz-sql-llama3",
    max_seq_length=2048,
    dtype=None,          # auto-detect (float16 or bfloat16)
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
# Generate SQL
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""
prompt = alpaca_prompt.format(
    "Generate SQL for the following request:",
    "Get total GMV for last 30 days in Pakistan",
    "",  # response left empty for the model to fill in
)
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
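Because the decoded generation echoes the full Alpaca prompt, a small helper can strip everything up to the response section. This is a sketch; the special-token names are assumptions based on the Llama 3 tokenizer.

```python
def extract_sql(generated: str) -> str:
    """Return only the model's answer from the decoded generation,
    dropping the echoed prompt and any trailing special tokens."""
    response = generated.split("### Response:")[-1]
    # Token names assume the Llama 3 tokenizer; adjust if yours differs.
    for token in ("<|end_of_text|>", "<|eot_id|>"):
        response = response.replace(token, "")
    return response.strip()

sample = "### Instruction:\n...\n### Response:\nSELECT 1;<|end_of_text|>"
print(extract_sql(sample))  # SELECT 1;
```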
## Example Queries
The model can handle:
- Simple aggregations: "Get total GMV and orders for last 30 days"
- Complex joins: "Get seller performance with KAM assignments"
- Time-based analysis: "Show monthly GMV trend by industry"
- Advanced logic: "Compare Choice vs Non-Choice GMV in Crossborder"
## Limitations
- Trained specifically for Daraz schema and business logic
- May not generalize to other SQL dialects or schemas
- Requires Daraz-specific tables to be available
## Training Dataset
Custom dataset of 20 SQL query examples covering:
- Revenue and GMV analysis
- Product performance metrics
- Seller segmentation
- Category and brand analysis
- Time-based trends
## Citation
If you use this model, please cite:
```
@misc{drz-sql-llama3,
  author    = {Bilal326},
  title     = {drz-sql-llama3: Daraz SQL Generation Model},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Bilal326/drz-sql-llama3}
}
```
## Acknowledgments
- Built with [Unsloth](https://github.com/unslothai/unsloth)
- Based on Meta's Llama 3
- Fine-tuned for Daraz e-commerce analytics