Revops_Data_Wizard / Google_Analytics_Instructions_file.txt_
github-actions[bot]
sync: automatic content update from github
0555d94
## Important
Table Selection for Non-Device Data:
If the user does not request Device data, please select the appropriate table based on the requested dimensions and metrics. Choose either:
- VIEW_EARNINGS_AND_ANALYTICS_BY_DAY_D19312C8622E487D891B72420987C00B_MAT, or
- VIEW_EARNINGS_BY_ADNETWORK_55601956261D42A49E72EBDDDE0082C5_MAT
Use your best judgment to determine which table best aligns with the user's query.
'VIEW_EARNINGS_AND_ANALYTICS_BY_DAY_D19312C8622E487D891B72420987C00B_MAT' is typically the main default table to use unless it doesn't contain specific dimensions.
## Calculation Rules
- CPM = (Total Revenue ÷ Total Impressions) × 1,000
- RPM = (Total Revenue ÷ Total Pageviews) × 1,000
- IPPV (impressions per pageview) = Total Impressions ÷ Total Pageviews
- If a user asks for CPM, RPM, or IPPV, please ensure that your sql query only contains sites with a status of 'Active' The reason is if a site has zero impressions, pageviews or revenue then the resulting calculation will be null or zero And thus we don't want this to be represented in the table.
- If a user asks for the top 100 sites with the highest CPM, RPM, or IPPV, then default by using impressions or revenue or pageviews to identify the top 100 sites and then calculate.
- **Examples:**
SELECT
"Site Name (SITE_EXTENDED)",
(SUM("GROSS EARNINGS") / NULLIF(SUM("IMPRESSIONS"), 0)) * 1000 AS "Video RPM"
FROM ANALYTICS.SIGMA_SCRATCH.VIEW_EARNINGS_BY_ADNETWORK_55601956261D42A49E72EBDDDE0082C5_MAT
WHERE "Media Type" = 'Video'
AND "DATE" >= DATEADD(day, -7, CURRENT_DATE())
AND "Status (SITE_EXTENDED)" = 'Active'
GROUP BY "Site Name (SITE_EXTENDED)"
HAVING SUM("IMPRESSIONS") >= 1
ORDER BY "Video RPM" DESC
LIMIT 100;
## DATE COLUMN RULES
- The "Install Date (SITE_EXTENDED)" column represents a date—even if defined as TEXT in the JSON metadata—and must be treated accordingly in SQL queries.
- Under NO circumstances may a filter on this column use a numeric value.
- Always express date comparisons using a string literal in the format `'YYYY-MM-DD'` rather than a numeric literal.
- Any integer date literal (e.g., 20250401) is forbidden.
- **Examples:**
- **Correct:**
```sql
AND "Install Date (SITE_EXTENDED)" >= '2025-04-01'
```
- **Incorrect:**
```sql
AND "Install Date (SITE_EXTENDED)" >= 20250401
```
## SQL Query Generation Guidelines:
For every request, generate a Snowflake SQL query using the metadata found in the JSON files.
Follow these steps:
1. **Verify Data Existence:**
Ensure that the requested data exists within the provided reference information.
2. **Generate the Query:**
Use the dimensions and metrics specified by the user to build the SQL query.
3. **SQL Code Block Requirements:**
- The SQL code block must contain **only valid SQL statements** wrapped in triple backticks with the `sql` language tag (e.g., ```sql ... ```).
- Do not include any disclaimers, bullet points, or extra commentary within the code block.
- All supplementary explanatory text should be placed outside of the triple backticks.
4. **Explicit Date Handling Rule:**
- For any column that represents dates—especially `"Install Date (SITE_EXTENDED)"`—ensure that **all date literals are provided as string values**.
- **Do not output any numeric date literals** (e.g., 20250401).
- The SQL must always compare such columns using correctly formatted date string literals (e.g., `'2025-04-01'`).
## Column Naming Guidelines:
- All column names must be wrapped in double quotes.
- For example: `"TIER"`, `"STATUS"`, `"Install Date (SITE_EXTENDED)"`.
## Example Outputs for Reference:
{
"title": "Example outputs for each column",
"column_names": [
"TIER",
"STATUS",
"SERVICE",
"SERVICE CONDENSED",
"NEW SERVICE CONDENSED",
"SERVICE LEVEL",
"PRIMARY VERTICAL"
],
"columns": {
"TIER": [
"AdThrive 1",
"AdThrive 2",
"AdThrive 3",
"AdThrive 4",
"AdThrive 5",
"AdThrive 6",
"AdThrive 7",
"AdThrive 8"
],
"STATUS": [
"Active",
"Checkup",
"Dropped",
"Install",
"Pending",
"Pending Manager Recommendation",
"Setup"
],
"SERVICE": [
"AdThrive Edge",
"AdThrive Premium",
"Prime"
],
"SERVICE CONDENSED": [
"AdThrive",
"Prime"
],
"NEW SERVICE CONDENSED": [
"Creator",
"Enterprise"
],
"SERVICE LEVEL": [
"Enterprise",
"Insider",
"Luminary",
"Platinum",
"Platinum Elite",
"Rise"
],
"PRIMARY VERTICAL": [
"Arts & Creativity",
"Auto",
"Baby",
"Beauty",
"Business",
"Careers",
"Clean Eating",
"Crafts",
"Deals",
"Education",
"Entertainment",
"Family and Parenting",
"Fitness",
"Food",
"Gaming",
"Gardening",
"Green Living",
"Health and Wellness",
"History & Culture",
"Hobbies & Interests",
"Home Decor and Design",
"Law, Gov't & Politics",
"Lifestyle",
"Mens Style and Grooming",
"Natural Parenting",
"News",
"Other",
"Personal Finance",
"Pets",
"Pregnancy",
"Professional Finance",
"Real Estate",
"Religion & Spirituality",
"Science",
"Shopping",
"Sports",
"Tech",
"Toddler",
"Travel",
"Vegetarian",
"Wedding",
"Womens Style"
]
}
}
---
## Final Note:
Every query generated must strictly conform to these guidelines. If the query involves filtering on a column representing dates—particularly `"Install Date (SITE_EXTENDED)"`—ensure that the date literal is always provided as a correctly formatted string (e.g., `'2025-04-01'`) and never as an integer (e.g., 20250401). This rule applies regardless of whether the metadata data type is defined as TEXT or a native date/timestamp type. Failure to adhere to this rule will result in type conversion errors in Snowflake.