```csv
x,y,xg,shot_type,situation,player_received_ball,last_5_years,goal
72,34,0.12,Open Play,Through Ball,Pass,True,0
25,45,0.08,Open Play,Cross,Header,True,0
90,20,0.25,Open Play,Ground Pass,Dribble,True,1
5,10,0.02,Open Play,Set Piece,Pass,True,0
15,60,0.05,Open Play,Through Ball,Pass,True,0
85,15,0.30,Open Play,Cross,Volley,True,1
30,30,0.15,Open Play,Ground Pass,Pass,True,0
95,5,0.40,Open Play,Through Ball,Dribble,True,1
10,50,0.03,Open Play,Set Piece,Pass,True,0
60,40,0.18,Open Play,Ground Pass,Pass,True,0
78,28,0.20,Open Play,Through Ball,Pass,True,0
22,55,0.07,Open Play,Cross,Header,True,0
88,18,0.35,Open Play,Ground Pass,Dribble,True,1
8,12,0.01,Open Play,Set Piece,Pass,True,0
40,48,0.10,Open Play,Through Ball,Pass,True,0
75,32,0.22,Open Play,Cross,Volley,True,1
35,38,0.17,Open Play,Ground Pass,Pass,True,0
92,8,0.38,Open Play,Through Ball,Dribble,True,1
18,58,0.06,Open Play,Set Piece,Pass,True,0
55,45,0.16,Open Play,Ground Pass,Pass,True,0
70,36,0.19,Open Play,Through Ball,Pass,True,0
28,52,0.09,Open Play,Cross,Header,True,0
86,22,0.32,Open Play,Ground Pass,Dribble,True,1
12,42,0.04,Open Play,Set Piece,Pass,True,0
45,42,0.13,Open Play,Through Ball,Pass,True,0
68,30,0.21,Open Play,Cross,Volley,True,0
32,40,0.14,Open Play,Ground Pass,Pass,True,0
98,6,0.42,Open Play,Through Ball,Dribble,True,1
2,20,0.01,Open Play,Set Piece,Pass,True,0
50,50,0.15,Open Play,Ground Pass,Pass,True,0
77,26,0.24,Open Play,Through Ball,Pass,True,1
26,54,0.08,Open Play,Cross,Header,True,0
84,24,0.31,Open Play,Ground Pass,Dribble,True,1
16,46,0.05,Open Play,Set Piece,Pass,True,0
48,46,0.11,Open Play,Through Ball,Pass,True,0
65,35,0.18,Open Play,Cross,Volley,True,0
38,36,0.16,Open Play,Ground Pass,Pass,True,0
96,10,0.39,Open Play,Through Ball,Dribble,True,1
6,18,0.02,Open Play,Set Piece,Pass,True,0
58,48,0.17,Open Play,Ground Pass,Pass,True,0
74,29,0.23,Open Play,Through Ball,Pass,True,1
24,56,0.07,Open Play,Cross,Header,True,0
82,24,0.29,Open Play,Ground Pass,Dribble,True,1
14,44,0.04,Open Play,Set Piece,Pass,True,0
42,44,0.12,Open Play,Through Ball,Pass,True,0
62,38,0.19,Open Play,Cross,Volley,True,0
36,34,0.15,Open Play,Ground Pass,Pass,True,0
94,12,0.37,Open Play,Through Ball,Dribble,True,1
4,16,0.01,Open Play,Set Piece,Pass,True,0
54,52,0.14,Open Play,Ground Pass,Pass,True,0
76,27,0.25,Open Play,Through Ball,Pass,True,1
20,60,0.06,Open Play,Cross,Header,True,0
80,26,0.28,Open Play,Ground Pass,Dribble,True,1
10,40,0.03,Open Play,Set Piece,Pass,True,0
46,48,0.11,Open Play,Through Ball,Pass,True,0
66,33,0.20,Open Play,Cross,Volley,True,0
34,32,0.14,Open Play,Ground Pass,Pass,True,0
90,14,0.36,Open Play,Through Ball,Dribble,True,1
2,14,0.01,Open Play,Set Piece,Pass,True,0
52,54,0.13,Open Play,Ground Pass,Pass,True,0
78,25,0.26,Open Play,Through Ball,Pass,True,1
22,58,0.07,Open Play,Cross,Header,True,0
86,20,0.30,Open Play,Ground Pass,Dribble,True,1
18,48,0.05,Open Play,Set Piece,Pass,True,0
44,46,0.10,Open Play,Through Ball,Pass,True,0
64,36,0.19,Open Play,Cross,Volley,True,0
30,38,0.13,Open Play,Ground Pass,Pass,True,0
88,16,0.34,Open Play,Through Ball,Dribble,True,1
6,10,0.02,Open Play,Set Piece,Pass,True,0
56,50,0.15,Open Play,Ground Pass,Pass,True,0
72,32,0.22,Open Play,Through Ball,Pass,True,1
28,52,0.08,Open Play,Cross,Header,True,0
84,22,0.31,Open Play,Ground Pass,Dribble,True,1
16,42,0.04,Open Play,Set Piece,Pass,True,0
40,44,0.11,Open Play,Through Ball,Pass,True,0
60,34,0.17,Open Play,Cross,Volley,True,0
38,30,0.14,Open Play,Ground Pass,Pass,True,0
92,8,0.38,Open Play,Through Ball,Dribble,True,1
10,20,0.03,Open Play,Set Piece,Pass,True,0
50,50,0.13,Open Play,Ground Pass,Pass,True,0
76,28,0.24,Open Play,Through Ball,Pass,True,1
26,56,0.07,Open Play,Cross,Header,True,0
82,20,0.28,Open Play,Ground Pass,Dribble,True,1
14,46,0.04,Open Play,Set Piece,Pass,True,0
46,42,0.12,Open Play,Through Ball,Pass,True,0
68,32,0.21,Open Play,Cross,Volley,True,0
34,36,0.15,Open Play,Ground Pass,Pass,True,0
98,6,0.40,Open Play,Through Ball,Dribble,True,1
4,18,0.01,Open Play,Set Piece,Pass,True,0
58,46,0.16,Open Play,Ground Pass,Pass,True,0
74,30,0.23,Open Play,Through Ball,Pass,True,1
24,54,0.08,Open Play,Cross,Header,True,0
80,28,0.29,Open Play,Ground Pass,Dribble,True,1
12,44,0.04,Open Play,Set Piece,Pass,True,0
42,48,0.10,Open Play,Through Ball,Pass,True,0
66,30,0.20,Open Play,Cross,Volley,True,0
36,38,0.14,Open Play,Ground Pass,Pass,True,0
96,10,0.39,Open Play,Through Ball,Dribble,True,1
8,14,0.02,Open Play,Set Piece,Pass,True,0
54,54,0.14,Open Play,Ground Pass,Pass,True,0
```

**Feature Explanation and Distribution:**

The synthetic dataset includes the following features, inspired by the provided text:

* **x:** X coordinate of the shot on the pitch (numeric, 0-100, likely uniformly distributed across the width of the pitch).
* **y:** Y coordinate of the shot on the pitch (numeric, 0-100, likely uniformly distributed across the length of the pitch).
* **xg:** Expected goals, the probability of a goal given the shot's circumstances (numeric, likely right-skewed, with most values between 0 and 0.2 and a minority higher, up to 0.42 in this sample).
* **shot_type:** Type of shot (categorical, with Open Play likely the most frequent; it is the only value in this sample).
* **situation:** The type of scenario leading to the shot (categorical: Through Ball, Cross, Ground Pass, Set Piece).
* **player_received_ball:** How the player received the ball before the shot (categorical: Pass, Header, Dribble, Volley).
* **last_5_years:** Whether the shot occurred in the last 5 years (Boolean, mostly True; True in every row of this sample).
* **goal:** Outcome of the shot, 1 if it resulted in a goal, 0 otherwise (binary, imbalanced with more 0s than 1s).

**Distribution Notes:**

* Numerical features (x, y, xg) are generated with a mix of uniform and slightly skewed distributions to reflect the potential spatial distribution of shots and the probability of a goal.
* Categorical features have varying probabilities assigned to different categories to simulate the imbalance and frequency of different shot types and situations as described in the paper. The goal feature is explicitly imbalanced, reflecting the difficulty of scoring in soccer.

This synthetic dataset provides a reasonable approximation of the features and their distributions based on the limited information in the provided text. The actual distributions in the original dataset might be slightly different.
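
The distribution claims above can be checked directly against the CSV. The following is a minimal sketch, assuming the data is available as CSV text with the header shown at the top of the dataset. The names `SAMPLE_CSV` and `summarize_shots` are hypothetical helpers introduced here for illustration, and only the first six rows of the dataset are embedded so that the example runs on its own.

```python
import csv
import io
from collections import Counter

# First six rows of the dataset above, embedded so the sketch is self-contained.
# In practice the full CSV text would be read from a file instead.
SAMPLE_CSV = """x,y,xg,shot_type,situation,player_received_ball,last_5_years,goal
72,34,0.12,Open Play,Through Ball,Pass,True,0
25,45,0.08,Open Play,Cross,Header,True,0
90,20,0.25,Open Play,Ground Pass,Dribble,True,1
5,10,0.02,Open Play,Set Piece,Pass,True,0
15,60,0.05,Open Play,Through Ball,Pass,True,0
85,15,0.30,Open Play,Cross,Volley,True,1
"""


def summarize_shots(csv_text):
    """Return simple distribution statistics for a shots CSV (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    n_shots = len(rows)
    xg_values = [float(row["xg"]) for row in rows]
    goals = sum(int(row["goal"]) for row in rows)
    return {
        "n_shots": n_shots,
        # Share of shots that were goals; shows the class imbalance.
        "goal_rate": goals / n_shots,
        "mean_xg": sum(xg_values) / n_shots,
        "max_xg": max(xg_values),
        # Frequency of each categorical value.
        "situation_counts": Counter(row["situation"] for row in rows),
        "shot_type_counts": Counter(row["shot_type"] for row in rows),
        # True only if every row has last_5_years set to "True".
        "all_recent": all(row["last_5_years"] == "True" for row in rows),
    }


summary = summarize_shots(SAMPLE_CSV)
for key, value in summary.items():
    print(key, value)
```

Running the same function on the full CSV text gives the goal rate and category frequencies for all rows, which is a quick way to confirm that the generated sample matches the intended imbalance.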