Revops_Data_Wizard / GAMLOG_ABGROUP_Instructions_file.txt
github-actions[bot]
sync: automatic content update from github
0555d94
## A/B Normalization SQL Prompt Guide (For Model Inference)
This guide outlines how to instruct a model to write Snowflake SQL for normalized A/B test comparisons using the ANALYTICS.ADCODE_DA.AB_GRANULAR table.
---
## Goal
ALWAYS Generate a SQL query that:
- Filters the table by a given ab_key and a specific date
- Aggregates total impressions and revenue per ab_value
- Normalizes impressions and revenue by dividing each by its allocation percentage
- Selects the control group per segment as the ab_value with the highest number of impressions
- Computes each variant’s performance ratio to control for both normalized revenue and impressions
- Filters out low-volume rows (e.g., where total_imps < 10000)
---
Dimensions
## The model must support dynamically adding or removing dimensions based on user input.
Supported dimensions include:
- device
- browser_clean
- format
- topadunitname
- refresh_num
- lazy
## When dimensions are included in the user’s request, they must also be:
- Included in the SELECT and GROUP BY clauses
- Added to the PARTITION BY clause when determining the control
- Used in the JOIN condition when aligning experimental rows with their corresponding control group
---
## Logic Flow
1. Filter the data for the specific AB test key and date.
2. Group by the selected dimensions and ab_value.
3. Aggregate total impressions and total revenue.
4. Normalize:
- normalized_imps = total_imps / percentage
- normalized_rev = total_rev / percentage
5. Determine control per segment:
- Use ROW_NUMBER() partitioned by the selected dimensions (e.g., device, browser_clean)
- Order by total_imps DESC
- Select the first row as the control
6. Join the normalized data back to the control data using the same dimensions
7. Compute ratios to control:
- (normalized_rev / control_rev) - 1
- (normalized_imps / control_imps) - 1
8. Sort by revenue ratio to control, descending
---
## SQL Syntax Considerations
- Use CAST(PERCENTAGE AS FLOAT) before division
- Use HAVING SUM(IMPS) >= 10000 to filter out low-volume variants
- Ensure consistent aliasing (e.g., use n for normalized data, c for control)
- Always match segment dimensions exactly in the JOIN (e.g., n.device = c.device)
## Output Expectation
A single Snowflake SQL query that:
- Adheres to the logic above
- Dynamically includes any dimensions mentioned in the request
- Is clean, readable, and production-ready
## Important
- Please ensure that the 'View Data Analysis & Insights' highlights the revenue and impression lift for the experiment values and flags outliers, if any.
- **Default** If a user does not request specific dimensions, please portray topline or aggregate data, please only include the 'AB_KEY' and 'AB_VALUE' and aggregate the data over that timeframe
- Always filter out low-volume rows HAVING SUM(IMPS) >= 10000
- Always ensure you normalize the data so that there are columns portraying the lift % for the metrics (i.e imps, rev, etc.).
## Example query
WITH base_data AS (
SELECT
AB_VALUE,
DEVICE,
BROWSER_CLEAN,
CAST(PERCENTAGE AS FLOAT) AS percentage,
SUM(IMPS) AS total_imps,
SUM(REV) AS total_rev
FROM ANALYTICS.ADCODE_DA.AB_GRANULAR
WHERE AB_KEY = 'cnftp'
AND DATE = '2025-04-01'
GROUP BY AB_VALUE, DEVICE, BROWSER_CLEAN, PERCENTAGE
HAVING SUM(IMPS) >= 10000
),
normalized AS (
SELECT
AB_VALUE,
DEVICE,
BROWSER_CLEAN,
total_imps,
total_rev,
percentage,
(total_imps / percentage) AS norm_imps,
(total_rev / percentage) AS norm_rev
FROM base_data
),
ranked_controls AS (
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY DEVICE, BROWSER_CLEAN
ORDER BY total_imps DESC
) AS rn
FROM normalized
),
control AS (
SELECT
DEVICE,
BROWSER_CLEAN,
AB_VALUE AS control_ab_value,
norm_imps AS control_norm_imps,
norm_rev AS control_norm_rev
FROM ranked_controls
WHERE rn = 1
)
SELECT
n.AB_VALUE,
n.DEVICE,
n.BROWSER_CLEAN,
n.total_imps,
n.total_rev,
n.percentage,
ROUND(n.norm_imps, 2) AS norm_imps,
ROUND(n.norm_rev, 2) AS norm_rev,
ROUND(n.norm_imps / c.control_norm_imps, 4) - 1 AS norm_imps_ratio_to_control,
ROUND(n.norm_rev / c.control_norm_rev, 4) - 1 AS norm_rev_ratio_to_control
FROM normalized n
JOIN control c
ON n.DEVICE = c.DEVICE
AND n.BROWSER_CLEAN = c.BROWSER_CLEAN
ORDER BY n.DEVICE, n.BROWSER_CLEAN, norm_rev_ratio_to_control DESC;