DeanGumas committed on
Commit
e92176b
·
1 Parent(s): ae5db4e

full training run completed, created combined dataset tsv, validation code must be updated still

Files changed (40)
  1. finetune_model.ipynb +0 -0
  2. finetuned-model-16-full/checkpoint-448/README.md +202 -0
  3. finetuned-model-16-full/checkpoint-448/adapter_config.json +39 -0
  4. finetuned-model-16-full/checkpoint-448/adapter_model.safetensors +3 -0
  5. finetuned-model-16-full/checkpoint-448/optimizer.pt +3 -0
  6. finetuned-model-16-full/checkpoint-448/rng_state.pth +3 -0
  7. finetuned-model-16-full/checkpoint-448/scheduler.pt +3 -0
  8. finetuned-model-16-full/checkpoint-448/special_tokens_map.json +26 -0
  9. finetuned-model-16-full/checkpoint-448/tokenizer.json +0 -0
  10. finetuned-model-16-full/checkpoint-448/tokenizer_config.json +206 -0
  11. finetuned-model-16-full/checkpoint-448/trainer_state.json +155 -0
  12. finetuned-model-16-full/checkpoint-448/training_args.bin +3 -0
  13. finetuned-model-16-full/checkpoint-576/README.md +202 -0
  14. finetuned-model-16-full/checkpoint-576/adapter_config.json +39 -0
  15. finetuned-model-16-full/checkpoint-576/adapter_model.safetensors +3 -0
  16. finetuned-model-16-full/checkpoint-576/optimizer.pt +3 -0
  17. finetuned-model-16-full/checkpoint-576/rng_state.pth +3 -0
  18. finetuned-model-16-full/checkpoint-576/scheduler.pt +3 -0
  19. finetuned-model-16-full/checkpoint-576/special_tokens_map.json +26 -0
  20. finetuned-model-16-full/checkpoint-576/tokenizer.json +0 -0
  21. finetuned-model-16-full/checkpoint-576/tokenizer_config.json +206 -0
  22. finetuned-model-16-full/checkpoint-576/trainer_state.json +192 -0
  23. finetuned-model-16-full/checkpoint-576/training_args.bin +3 -0
  24. finetuned-model-16-full/config.json +48 -0
  25. finetuned-model-16-full/generation_config.json +6 -0
  26. finetuned-model-16-full/model.safetensors +3 -0
  27. finetuned-model-16-full/runs/Nov23_23-46-54_DESKTOP-SMJC97K/events.out.tfevents.1763970415.DESKTOP-SMJC97K.13948.0 +3 -0
  28. finetuned-model-16-full/special_tokens_map.json +26 -0
  29. finetuned-model-16-full/tokenizer.json +0 -0
  30. finetuned-model-16-full/tokenizer_config.json +206 -0
  31. training-data/combined_full_dataset.tsv +0 -0
  32. training-data/nba_test_set.tsv +151 -0
  33. training-data/tennis_train_set_connor.tsv +208 -0
  34. training-data/{tennis_train_set.tsv → tennis_train_set_dean.tsv} +0 -0
  35. training-data/tennis_train_set_mehul.tsv +204 -0
  36. training-data/test_set.tsv +251 -151
  37. utils/processing/combine_datasets.ipynb +82 -19
  38. val-16-full.hf/data-00000-of-00001.arrow +3 -0
  39. val-16-full.hf/dataset_info.json +33 -0
  40. val-16-full.hf/state.json +13 -0
finetune_model.ipynb CHANGED
The diff for this file is too large to render. See raw diff
 
finetuned-model-16-full/checkpoint-448/README.md ADDED
@@ -0,0 +1,202 @@
1
+ ---
2
+ base_model: ./deepseek-coder-1.3b-instruct
3
+ library_name: peft
4
+ ---
5
+
6
+ # Model Card for Model ID
7
+
8
+ <!-- Provide a quick summary of what the model is/does. -->
9
+
10
+
11
+
12
+ ## Model Details
13
+
14
+ ### Model Description
15
+
16
+ <!-- Provide a longer summary of what this model is. -->
17
+
18
+
19
+
20
+ - **Developed by:** [More Information Needed]
21
+ - **Funded by [optional]:** [More Information Needed]
22
+ - **Shared by [optional]:** [More Information Needed]
23
+ - **Model type:** [More Information Needed]
24
+ - **Language(s) (NLP):** [More Information Needed]
25
+ - **License:** [More Information Needed]
26
+ - **Finetuned from model [optional]:** [More Information Needed]
27
+
28
+ ### Model Sources [optional]
29
+
30
+ <!-- Provide the basic links for the model. -->
31
+
32
+ - **Repository:** [More Information Needed]
33
+ - **Paper [optional]:** [More Information Needed]
34
+ - **Demo [optional]:** [More Information Needed]
35
+
36
+ ## Uses
37
+
38
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
+
40
+ ### Direct Use
41
+
42
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
+
44
+ [More Information Needed]
45
+
46
+ ### Downstream Use [optional]
47
+
48
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
+
50
+ [More Information Needed]
51
+
52
+ ### Out-of-Scope Use
53
+
54
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
+
56
+ [More Information Needed]
57
+
58
+ ## Bias, Risks, and Limitations
59
+
60
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
+
62
+ [More Information Needed]
63
+
64
+ ### Recommendations
65
+
66
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
+
68
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
+
70
+ ## How to Get Started with the Model
71
+
72
+ Use the code below to get started with the model.
73
+
74
+ [More Information Needed]
75
+
76
+ ## Training Details
77
+
78
+ ### Training Data
79
+
80
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
+
82
+ [More Information Needed]
83
+
84
+ ### Training Procedure
85
+
86
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
+
88
+ #### Preprocessing [optional]
89
+
90
+ [More Information Needed]
91
+
92
+
93
+ #### Training Hyperparameters
94
+
95
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
+
97
+ #### Speeds, Sizes, Times [optional]
98
+
99
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
+
101
+ [More Information Needed]
102
+
103
+ ## Evaluation
104
+
105
+ <!-- This section describes the evaluation protocols and provides the results. -->
106
+
107
+ ### Testing Data, Factors & Metrics
108
+
109
+ #### Testing Data
110
+
111
+ <!-- This should link to a Dataset Card if possible. -->
112
+
113
+ [More Information Needed]
114
+
115
+ #### Factors
116
+
117
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
+
119
+ [More Information Needed]
120
+
121
+ #### Metrics
122
+
123
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
+
125
+ [More Information Needed]
126
+
127
+ ### Results
128
+
129
+ [More Information Needed]
130
+
131
+ #### Summary
132
+
133
+
134
+
135
+ ## Model Examination [optional]
136
+
137
+ <!-- Relevant interpretability work for the model goes here -->
138
+
139
+ [More Information Needed]
140
+
141
+ ## Environmental Impact
142
+
143
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
+
145
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
+
147
+ - **Hardware Type:** [More Information Needed]
148
+ - **Hours used:** [More Information Needed]
149
+ - **Cloud Provider:** [More Information Needed]
150
+ - **Compute Region:** [More Information Needed]
151
+ - **Carbon Emitted:** [More Information Needed]
152
+
153
+ ## Technical Specifications [optional]
154
+
155
+ ### Model Architecture and Objective
156
+
157
+ [More Information Needed]
158
+
159
+ ### Compute Infrastructure
160
+
161
+ [More Information Needed]
162
+
163
+ #### Hardware
164
+
165
+ [More Information Needed]
166
+
167
+ #### Software
168
+
169
+ [More Information Needed]
170
+
171
+ ## Citation [optional]
172
+
173
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
+
175
+ **BibTeX:**
176
+
177
+ [More Information Needed]
178
+
179
+ **APA:**
180
+
181
+ [More Information Needed]
182
+
183
+ ## Glossary [optional]
184
+
185
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
+
187
+ [More Information Needed]
188
+
189
+ ## More Information [optional]
190
+
191
+ [More Information Needed]
192
+
193
+ ## Model Card Authors [optional]
194
+
195
+ [More Information Needed]
196
+
197
+ ## Model Card Contact
198
+
199
+ [More Information Needed]
200
+ ### Framework versions
201
+
202
+ - PEFT 0.15.1
finetuned-model-16-full/checkpoint-448/adapter_config.json ADDED
@@ -0,0 +1,39 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "./deepseek-coder-1.3b-instruct",
+   "bias": "none",
+   "corda_config": null,
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 32,
+   "lora_bias": false,
+   "lora_dropout": 0.0,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "o_proj",
+     "q_proj",
+     "down_proj",
+     "gate_proj",
+     "k_proj",
+     "up_proj",
+     "v_proj"
+   ],
+   "task_type": "CAUSAL_LM",
+   "trainable_token_indices": null,
+   "use_dora": false,
+   "use_rslora": false
+ }
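For readers unfamiliar with the LoRA fields above, the following is a minimal sketch of how `lora_alpha`, `r`, and `use_rslora` combine into the multiplier applied to the low-rank update. It assumes the usual LoRA convention (`lora_alpha / r`, or `lora_alpha / sqrt(r)` for the rank-stabilized variant); the helper name `lora_scaling` is hypothetical and not part of this commit or of any library.

```python
import json
import math

# Subset of the adapter_config.json above, values copied from the diff.
ADAPTER_CONFIG = json.loads("""
{
  "lora_alpha": 32,
  "r": 16,
  "use_rslora": false
}
""")


def lora_scaling(config: dict) -> float:
    """Return the factor that scales the low-rank update B @ A.

    Hypothetical helper: assumes the standard LoRA convention of
    alpha / r, and alpha / sqrt(r) when rank-stabilized LoRA is enabled.
    """
    alpha = config["lora_alpha"]
    rank = config["r"]
    if config.get("use_rslora", False):
        return alpha / math.sqrt(rank)
    return alpha / rank


if __name__ == "__main__":
    print("LoRA scaling factor:", lora_scaling(ADAPTER_CONFIG))
```

Under these assumptions the adapter in this checkpoint scales its update by `32 / 16`, and the same update is attached to all seven projection modules listed in `target_modules`.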
finetuned-model-16-full/checkpoint-448/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:439a95c95d1f0e74304a3b3112fe443d2221f5b819ed730c39c61c8170a84730
+ size 322342688
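The three lines above are a Git LFS pointer rather than the weights themselves. As a rough sketch, a pointer like this can be split into its fields with plain string handling; `parse_lfs_pointer` is a hypothetical helper written for illustration, and it only handles the three keys that appear in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into version, hash algorithm, digest and size.

    Hypothetical helper: handles only the `version`, `oid` and `size` keys
    seen in this commit.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer for checkpoint-448/adapter_model.safetensors, copied from the diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:439a95c95d1f0e74304a3b3112fe443d2221f5b819ed730c39c61c8170a84730
size 322342688
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["hash_algo"], info["size"], "bytes")
```

The `size` field is the byte count of the tracked file, so the pointer alone is enough to see how large each checkpoint artifact is without downloading it.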
finetuned-model-16-full/checkpoint-448/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c58ed4b9c5b8c4eed5ee0b1157cfc8d66ef0e91914fe2b285f8b92fed9a67fad
+ size 120213058
finetuned-model-16-full/checkpoint-448/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e082090983dd756d25706b1c9d33775d066a1b8317856af058e5917ecc5037b
+ size 14244
finetuned-model-16-full/checkpoint-448/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:22a9d6952543d6b2835e0be6cf01eaac2ea0da968793df18598491c75943d39b
+ size 1064
finetuned-model-16-full/checkpoint-448/special_tokens_map.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "additional_special_tokens": [
+     {
+       "content": "<|endofsql|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false
+     }
+   ],
+   "bos_token": {
+     "content": "<|begin▁of▁sentence|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": "<|endofsql|>",
+   "pad_token": {
+     "content": "<|end▁of▁sentence|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
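The map above registers `<|endofsql|>` as both an additional special token and the EOS token, which suggests generated SQL is meant to end at that marker. A small sketch of trimming decoded output at the marker follows; the helper name `truncate_at_eos` and the sample string are hypothetical, and the notebook in this commit may handle stopping differently.

```python
EOS_TOKEN = "<|endofsql|>"  # eos_token from special_tokens_map.json above


def truncate_at_eos(decoded: str, eos: str = EOS_TOKEN) -> str:
    """Keep only the text before the first EOS marker.

    Hypothetical post-processing helper: if the marker is absent,
    the whole (stripped) string is returned.
    """
    head, _, _ = decoded.partition(eos)
    return head.strip()


if __name__ == "__main__":
    # Illustrative decoded output, not taken from the model in this commit.
    sample = "SELECT name FROM players;<|endofsql|> trailing tokens"
    print(truncate_at_eos(sample))
```

Trimming on the string marker is only needed when special tokens are kept during decoding; when generation already stops at the EOS id, the helper simply returns the text unchanged.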
finetuned-model-16-full/checkpoint-448/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
finetuned-model-16-full/checkpoint-448/tokenizer_config.json ADDED
@@ -0,0 +1,206 @@
1
+ {
2
+ "add_bos_token": true,
3
+ "add_eos_token": false,
4
+ "add_prefix_space": null,
5
+ "added_tokens_decoder": {
6
+ "32000": {
7
+ "content": "õ",
8
+ "lstrip": false,
9
+ "normalized": true,
10
+ "rstrip": false,
11
+ "single_word": false,
12
+ "special": false
13
+ },
14
+ "32001": {
15
+ "content": "÷",
16
+ "lstrip": false,
17
+ "normalized": true,
18
+ "rstrip": false,
19
+ "single_word": false,
20
+ "special": false
21
+ },
22
+ "32002": {
23
+ "content": "Á",
24
+ "lstrip": false,
25
+ "normalized": true,
26
+ "rstrip": false,
27
+ "single_word": false,
28
+ "special": false
29
+ },
30
+ "32003": {
31
+ "content": "ý",
32
+ "lstrip": false,
33
+ "normalized": true,
34
+ "rstrip": false,
35
+ "single_word": false,
36
+ "special": false
37
+ },
38
+ "32004": {
39
+ "content": "À",
40
+ "lstrip": false,
41
+ "normalized": true,
42
+ "rstrip": false,
43
+ "single_word": false,
44
+ "special": false
45
+ },
46
+ "32005": {
47
+ "content": "ÿ",
48
+ "lstrip": false,
49
+ "normalized": true,
50
+ "rstrip": false,
51
+ "single_word": false,
52
+ "special": false
53
+ },
54
+ "32006": {
55
+ "content": "ø",
56
+ "lstrip": false,
57
+ "normalized": true,
58
+ "rstrip": false,
59
+ "single_word": false,
60
+ "special": false
61
+ },
62
+ "32007": {
63
+ "content": "ú",
64
+ "lstrip": false,
65
+ "normalized": true,
66
+ "rstrip": false,
67
+ "single_word": false,
68
+ "special": false
69
+ },
70
+ "32008": {
71
+ "content": "þ",
72
+ "lstrip": false,
73
+ "normalized": true,
74
+ "rstrip": false,
75
+ "single_word": false,
76
+ "special": false
77
+ },
78
+ "32009": {
79
+ "content": "ü",
80
+ "lstrip": false,
81
+ "normalized": true,
82
+ "rstrip": false,
83
+ "single_word": false,
84
+ "special": false
85
+ },
86
+ "32010": {
87
+ "content": "ù",
88
+ "lstrip": false,
89
+ "normalized": true,
90
+ "rstrip": false,
91
+ "single_word": false,
92
+ "special": false
93
+ },
94
+ "32011": {
95
+ "content": "ö",
96
+ "lstrip": false,
97
+ "normalized": true,
98
+ "rstrip": false,
99
+ "single_word": false,
100
+ "special": false
101
+ },
102
+ "32012": {
103
+ "content": "û",
104
+ "lstrip": false,
105
+ "normalized": true,
106
+ "rstrip": false,
107
+ "single_word": false,
108
+ "special": false
109
+ },
110
+ "32013": {
111
+ "content": "<|begin▁of▁sentence|>",
112
+ "lstrip": false,
113
+ "normalized": true,
114
+ "rstrip": false,
115
+ "single_word": false,
116
+ "special": true
117
+ },
118
+ "32014": {
119
+ "content": "<|end▁of▁sentence|>",
120
+ "lstrip": false,
121
+ "normalized": true,
122
+ "rstrip": false,
123
+ "single_word": false,
124
+ "special": true
125
+ },
126
+ "32015": {
127
+ "content": "<|fim▁hole|>",
128
+ "lstrip": false,
129
+ "normalized": true,
130
+ "rstrip": false,
131
+ "single_word": false,
132
+ "special": false
133
+ },
134
+ "32016": {
135
+ "content": "<|fim▁begin|>",
136
+ "lstrip": false,
137
+ "normalized": true,
138
+ "rstrip": false,
139
+ "single_word": false,
140
+ "special": false
141
+ },
142
+ "32017": {
143
+ "content": "<|fim▁end|>",
144
+ "lstrip": false,
145
+ "normalized": true,
146
+ "rstrip": false,
147
+ "single_word": false,
148
+ "special": false
149
+ },
150
+ "32018": {
151
+ "content": "<pad>",
152
+ "lstrip": false,
153
+ "normalized": true,
154
+ "rstrip": false,
155
+ "single_word": false,
156
+ "special": false
157
+ },
158
+ "32019": {
159
+ "content": "<|User|>",
160
+ "lstrip": false,
161
+ "normalized": true,
162
+ "rstrip": false,
163
+ "single_word": false,
164
+ "special": false
165
+ },
166
+ "32020": {
167
+ "content": "<|Assistant|>",
168
+ "lstrip": false,
169
+ "normalized": true,
170
+ "rstrip": false,
171
+ "single_word": false,
172
+ "special": false
173
+ },
174
+ "32021": {
175
+ "content": "<|EOT|>",
176
+ "lstrip": false,
177
+ "normalized": true,
178
+ "rstrip": false,
179
+ "single_word": false,
180
+ "special": true
181
+ },
182
+ "32022": {
183
+ "content": "<|endofsql|>",
184
+ "lstrip": false,
185
+ "normalized": false,
186
+ "rstrip": false,
187
+ "single_word": false,
188
+ "special": true
189
+ }
190
+ },
191
+ "additional_special_tokens": [
192
+ "<|endofsql|>"
193
+ ],
194
+ "bos_token": "<|begin▁of▁sentence|>",
195
+ "chat_template": "{% if not add_generation_prompt is defined %}\n{% set add_generation_prompt = false %}\n{% endif %}\n{%- set ns = namespace(found=false) -%}\n{%- for message in messages -%}\n {%- if message['role'] == 'system' -%}\n {%- set ns.found = true -%}\n {%- endif -%}\n{%- endfor -%}\n{{bos_token}}{%- if not ns.found -%}\n{{'You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer\\n'}}\n{%- endif %}\n{%- for message in messages %}\n {%- if message['role'] == 'system' %}\n{{ message['content'] }}\n {%- else %}\n {%- if message['role'] == 'user' %}\n{{'### Instruction:\\n' + message['content'] + '\\n'}}\n {%- else %}\n{{'### Response:\\n' + message['content'] + '\\n<|EOT|>\\n'}}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{% if add_generation_prompt %}\n{{'### Response:'}}\n{% endif %}",
196
+ "clean_up_tokenization_spaces": false,
197
+ "eos_token": "<|endofsql|>",
198
+ "extra_special_tokens": {},
199
+ "legacy": true,
200
+ "model_max_length": 16384,
201
+ "pad_token": "<|end▁of▁sentence|>",
202
+ "sp_model_kwargs": {},
203
+ "tokenizer_class": "LlamaTokenizerFast",
204
+ "unk_token": null,
205
+ "use_default_system_prompt": false
206
+ }
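To make the token table above easier to scan, here is a short sketch that filters `added_tokens_decoder` down to the entries flagged `"special": true` and looks up the id of the EOS token. The dictionary is a hand-copied subset of the file above, and both helper names are hypothetical.

```python
# Hand-copied subset of added_tokens_decoder from tokenizer_config.json above.
ADDED_TOKENS = {
    "32013": {"content": "<|begin▁of▁sentence|>", "special": True},
    "32014": {"content": "<|end▁of▁sentence|>", "special": True},
    "32018": {"content": "<pad>", "special": False},
    "32019": {"content": "<|User|>", "special": False},
    "32020": {"content": "<|Assistant|>", "special": False},
    "32021": {"content": "<|EOT|>", "special": True},
    "32022": {"content": "<|endofsql|>", "special": True},
}


def special_token_ids(decoder: dict) -> list:
    """Return the sorted ids of entries marked as special."""
    return sorted(int(tok_id) for tok_id, entry in decoder.items() if entry["special"])


def token_id_for(decoder: dict, content: str) -> int:
    """Return the id whose content matches exactly; raises StopIteration if absent."""
    return next(int(tok_id) for tok_id, entry in decoder.items() if entry["content"] == content)


if __name__ == "__main__":
    print("special ids:", special_token_ids(ADDED_TOKENS))
    print("eos id:", token_id_for(ADDED_TOKENS, "<|endofsql|>"))
```

In this config the newly added `<|endofsql|>` token takes the next free id after `<|EOT|>`, while `<|end▁of▁sentence|>` is reused as the pad token.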
finetuned-model-16-full/checkpoint-448/trainer_state.json ADDED
@@ -0,0 +1,155 @@
+ {
+   "best_global_step": 448,
+   "best_metric": 0.10987533628940582,
+   "best_model_checkpoint": "./finetuned-model-16-full\\checkpoint-448",
+   "epoch": 7.0,
+   "eval_steps": 500,
+   "global_step": 448,
+   "is_hyper_param_search": false,
+   "is_local_process_zero": true,
+   "is_world_process_zero": true,
+   "log_history": [
+     {
+       "epoch": 0.7889546351084813,
+       "grad_norm": 0.21714453399181366,
+       "learning_rate": 4.603174603174603e-05,
+       "loss": 0.5977,
+       "step": 50
+     },
+     {
+       "epoch": 1.0,
+       "eval_loss": 0.1679154932498932,
+       "eval_runtime": 449.444,
+       "eval_samples_per_second": 0.556,
+       "eval_steps_per_second": 0.556,
+       "step": 64
+     },
+     {
+       "epoch": 1.5680473372781065,
+       "grad_norm": 0.13297569751739502,
+       "learning_rate": 4.2063492063492065e-05,
+       "loss": 0.1177,
+       "step": 100
+     },
+     {
+       "epoch": 2.0,
+       "eval_loss": 0.13711020350456238,
+       "eval_runtime": 448.5722,
+       "eval_samples_per_second": 0.557,
+       "eval_steps_per_second": 0.557,
+       "step": 128
+     },
+     {
+       "epoch": 2.3471400394477318,
+       "grad_norm": 0.11708807200193405,
+       "learning_rate": 3.809523809523809e-05,
+       "loss": 0.0936,
+       "step": 150
+     },
+     {
+       "epoch": 3.0,
+       "eval_loss": 0.12583088874816895,
+       "eval_runtime": 449.2291,
+       "eval_samples_per_second": 0.557,
+       "eval_steps_per_second": 0.557,
+       "step": 192
+     },
+     {
+       "epoch": 3.126232741617357,
+       "grad_norm": 0.13195854425430298,
+       "learning_rate": 3.412698412698413e-05,
+       "loss": 0.0876,
+       "step": 200
+     },
+     {
+       "epoch": 3.9151873767258385,
+       "grad_norm": 0.1334521323442459,
+       "learning_rate": 3.0158730158730158e-05,
+       "loss": 0.0773,
+       "step": 250
+     },
+     {
+       "epoch": 4.0,
+       "eval_loss": 0.11675416678190231,
+       "eval_runtime": 447.9794,
+       "eval_samples_per_second": 0.558,
+       "eval_steps_per_second": 0.558,
+       "step": 256
+     },
+     {
+       "epoch": 4.6942800788954635,
+       "grad_norm": 0.1745067685842514,
+       "learning_rate": 2.6190476190476192e-05,
+       "loss": 0.0715,
+       "step": 300
+     },
+     {
+       "epoch": 5.0,
+       "eval_loss": 0.11324296146631241,
+       "eval_runtime": 449.3754,
+       "eval_samples_per_second": 0.556,
+       "eval_steps_per_second": 0.556,
+       "step": 320
+     },
+     {
+       "epoch": 5.4733727810650885,
+       "grad_norm": 0.14304669201374054,
+       "learning_rate": 2.2222222222222223e-05,
+       "loss": 0.0678,
+       "step": 350
+     },
+     {
+       "epoch": 6.0,
+       "eval_loss": 0.11094386130571365,
+       "eval_runtime": 448.7599,
+       "eval_samples_per_second": 0.557,
+       "eval_steps_per_second": 0.557,
+       "step": 384
+     },
+     {
+       "epoch": 6.252465483234714,
+       "grad_norm": 0.20627053081989288,
+       "learning_rate": 1.8253968253968254e-05,
+       "loss": 0.0628,
+       "step": 400
+     },
+     {
+       "epoch": 7.0,
+       "eval_loss": 0.10987533628940582,
+       "eval_runtime": 446.2745,
+       "eval_samples_per_second": 0.56,
+       "eval_steps_per_second": 0.56,
+       "step": 448
+     }
+   ],
+   "logging_steps": 50,
+   "max_steps": 630,
+   "num_input_tokens_seen": 0,
+   "num_train_epochs": 10,
+   "save_steps": 500,
+   "stateful_callbacks": {
+     "EarlyStoppingCallback": {
+       "args": {
+         "early_stopping_patience": 2,
+         "early_stopping_threshold": 0.0
+       },
+       "attributes": {
+         "early_stopping_patience_counter": 0
+       }
+     },
+     "TrainerControl": {
+       "args": {
+         "should_epoch_stop": false,
+         "should_evaluate": false,
+         "should_log": false,
+         "should_save": true,
+         "should_training_stop": false
+       },
+       "attributes": {}
+     }
+   },
+   "total_flos": 1.7404803793236787e+17,
+   "train_batch_size": 1,
+   "trial_name": null,
+   "trial_params": null
+ }
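The log above can be checked with a few lines of Python. The sketch below picks the evaluation entry with the lowest `eval_loss` and compares the logged learning rates against a linear decay to zero over `max_steps`. The base learning rate of `5e-5` and the absence of warmup are assumptions inferred from the logged values, not settings shown in this diff, and both helper names are hypothetical.

```python
# Entries copied from log_history in trainer_state.json above (fields trimmed).
LOG_HISTORY = [
    {"step": 50, "learning_rate": 4.603174603174603e-05, "loss": 0.5977},
    {"step": 64, "eval_loss": 0.1679154932498932},
    {"step": 100, "learning_rate": 4.2063492063492065e-05, "loss": 0.1177},
    {"step": 128, "eval_loss": 0.13711020350456238},
    {"step": 192, "eval_loss": 0.12583088874816895},
    {"step": 256, "eval_loss": 0.11675416678190231},
    {"step": 320, "eval_loss": 0.11324296146631241},
    {"step": 384, "eval_loss": 0.11094386130571365},
    {"step": 400, "learning_rate": 1.8253968253968254e-05, "loss": 0.0628},
    {"step": 448, "eval_loss": 0.10987533628940582},
]

MAX_STEPS = 630  # "max_steps" from trainer_state.json
BASE_LR = 5e-5   # assumption: inferred from the logged values, not shown in the diff


def best_eval(log_history: list) -> dict:
    """Return the evaluation entry with the lowest eval_loss."""
    evals = [entry for entry in log_history if "eval_loss" in entry]
    return min(evals, key=lambda entry: entry["eval_loss"])


def linear_lr(step: int, base_lr: float = BASE_LR, max_steps: int = MAX_STEPS) -> float:
    """Linear decay from base_lr to zero with no warmup (assumed schedule)."""
    return base_lr * (max_steps - step) / max_steps


if __name__ == "__main__":
    best = best_eval(LOG_HISTORY)
    print("best eval step:", best["step"], "loss:", best["eval_loss"])
    for entry in LOG_HISTORY:
        if "learning_rate" in entry:
            print(entry["step"], entry["learning_rate"], linear_lr(entry["step"]))
```

The lowest evaluation loss in this checkpoint's log falls on step 448, which matches `best_global_step`, and `early_stopping_patience_counter` is 0 because that step improved on the previous epoch.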
finetuned-model-16-full/checkpoint-448/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b8ae54bf6183491c37908c8caa39df99368338c0538c4bd6a8180c8d4391e698
+ size 5368
finetuned-model-16-full/checkpoint-576/README.md ADDED
@@ -0,0 +1,202 @@
1
+ ---
2
+ base_model: ./deepseek-coder-1.3b-instruct
3
+ library_name: peft
4
+ ---
5
+
6
+ # Model Card for Model ID
7
+
8
+ <!-- Provide a quick summary of what the model is/does. -->
9
+
10
+
11
+
12
+ ## Model Details
13
+
14
+ ### Model Description
15
+
16
+ <!-- Provide a longer summary of what this model is. -->
17
+
18
+
19
+
20
+ - **Developed by:** [More Information Needed]
21
+ - **Funded by [optional]:** [More Information Needed]
22
+ - **Shared by [optional]:** [More Information Needed]
23
+ - **Model type:** [More Information Needed]
24
+ - **Language(s) (NLP):** [More Information Needed]
25
+ - **License:** [More Information Needed]
26
+ - **Finetuned from model [optional]:** [More Information Needed]
27
+
28
+ ### Model Sources [optional]
29
+
30
+ <!-- Provide the basic links for the model. -->
31
+
32
+ - **Repository:** [More Information Needed]
33
+ - **Paper [optional]:** [More Information Needed]
34
+ - **Demo [optional]:** [More Information Needed]
35
+
36
+ ## Uses
37
+
38
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
+
40
+ ### Direct Use
41
+
42
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
+
44
+ [More Information Needed]
45
+
46
+ ### Downstream Use [optional]
47
+
48
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
+
50
+ [More Information Needed]
51
+
52
+ ### Out-of-Scope Use
53
+
54
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
+
56
+ [More Information Needed]
57
+
58
+ ## Bias, Risks, and Limitations
59
+
60
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
+
62
+ [More Information Needed]
63
+
64
+ ### Recommendations
65
+
66
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
+
68
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
+
70
+ ## How to Get Started with the Model
71
+
72
+ Use the code below to get started with the model.
73
+
74
+ [More Information Needed]
75
+
76
+ ## Training Details
77
+
78
+ ### Training Data
79
+
80
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
+
82
+ [More Information Needed]
83
+
84
+ ### Training Procedure
85
+
86
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
+
88
+ #### Preprocessing [optional]
89
+
90
+ [More Information Needed]
91
+
92
+
93
+ #### Training Hyperparameters
94
+
95
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
+
97
+ #### Speeds, Sizes, Times [optional]
98
+
99
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
+
101
+ [More Information Needed]
102
+
103
+ ## Evaluation
104
+
105
+ <!-- This section describes the evaluation protocols and provides the results. -->
106
+
107
+ ### Testing Data, Factors & Metrics
108
+
109
+ #### Testing Data
110
+
111
+ <!-- This should link to a Dataset Card if possible. -->
112
+
113
+ [More Information Needed]
114
+
115
+ #### Factors
116
+
117
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
+
119
+ [More Information Needed]
120
+
121
+ #### Metrics
122
+
123
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
+
125
+ [More Information Needed]
126
+
127
+ ### Results
128
+
129
+ [More Information Needed]
130
+
131
+ #### Summary
132
+
133
+
134
+
135
+ ## Model Examination [optional]
136
+
137
+ <!-- Relevant interpretability work for the model goes here -->
138
+
139
+ [More Information Needed]
140
+
141
+ ## Environmental Impact
142
+
143
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
+
145
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
+
147
+ - **Hardware Type:** [More Information Needed]
148
+ - **Hours used:** [More Information Needed]
149
+ - **Cloud Provider:** [More Information Needed]
150
+ - **Compute Region:** [More Information Needed]
151
+ - **Carbon Emitted:** [More Information Needed]
152
+
153
+ ## Technical Specifications [optional]
154
+
155
+ ### Model Architecture and Objective
156
+
157
+ [More Information Needed]
158
+
159
+ ### Compute Infrastructure
160
+
161
+ [More Information Needed]
162
+
163
+ #### Hardware
164
+
165
+ [More Information Needed]
166
+
167
+ #### Software
168
+
169
+ [More Information Needed]
170
+
171
+ ## Citation [optional]
172
+
173
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
+
175
+ **BibTeX:**
176
+
177
+ [More Information Needed]
178
+
179
+ **APA:**
180
+
181
+ [More Information Needed]
182
+
183
+ ## Glossary [optional]
184
+
185
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
+
187
+ [More Information Needed]
188
+
189
+ ## More Information [optional]
190
+
191
+ [More Information Needed]
192
+
193
+ ## Model Card Authors [optional]
194
+
195
+ [More Information Needed]
196
+
197
+ ## Model Card Contact
198
+
199
+ [More Information Needed]
200
+ ### Framework versions
201
+
202
+ - PEFT 0.15.1
finetuned-model-16-full/checkpoint-576/adapter_config.json ADDED
@@ -0,0 +1,39 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "./deepseek-coder-1.3b-instruct",
+   "bias": "none",
+   "corda_config": null,
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 32,
+   "lora_bias": false,
+   "lora_dropout": 0.0,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "o_proj",
+     "q_proj",
+     "down_proj",
+     "gate_proj",
+     "k_proj",
+     "up_proj",
+     "v_proj"
+   ],
+   "task_type": "CAUSAL_LM",
+   "trainable_token_indices": null,
+   "use_dora": false,
+   "use_rslora": false
+ }
finetuned-model-16-full/checkpoint-576/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ca124c6d98210e0ccc2eec5355119c139a89e7320d2c4d2c925fb92c8b520d4e
+ size 322342688
finetuned-model-16-full/checkpoint-576/optimizer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ee4c1833b01e730c3074ace15c91106b893498b3421b9761f6a3af3dee2861bc
+ size 120213058
finetuned-model-16-full/checkpoint-576/rng_state.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dcee61872be8ce4b146937b29f4c887e8389a189ad402764ed693c6c6611ac20
+ size 14244
finetuned-model-16-full/checkpoint-576/scheduler.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b41a0518285ef2a38b25e05a4c65a5d28964fb248c6dcff0d6a98c908d92a873
+ size 1064
finetuned-model-16-full/checkpoint-576/special_tokens_map.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "additional_special_tokens": [
+     {
+       "content": "<|endofsql|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false
+     }
+   ],
+   "bos_token": {
+     "content": "<|begin▁of▁sentence|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": "<|endofsql|>",
+   "pad_token": {
+     "content": "<|end▁of▁sentence|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
finetuned-model-16-full/checkpoint-576/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
finetuned-model-16-full/checkpoint-576/tokenizer_config.json ADDED
@@ -0,0 +1,206 @@
+ {
+ "add_bos_token": true,
+ "add_eos_token": false,
+ "add_prefix_space": null,
+ "added_tokens_decoder": {
+ "32000": {
+ "content": "õ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32001": {
+ "content": "÷",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32002": {
+ "content": "Á",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32003": {
+ "content": "ý",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32004": {
+ "content": "À",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32005": {
+ "content": "ÿ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32006": {
+ "content": "ø",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32007": {
+ "content": "ú",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32008": {
+ "content": "þ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32009": {
+ "content": "ü",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32010": {
+ "content": "ù",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32011": {
+ "content": "ö",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32012": {
+ "content": "û",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32013": {
+ "content": "<|begin▁of▁sentence|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "32014": {
+ "content": "<|end▁of▁sentence|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "32015": {
+ "content": "<|fim▁hole|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32016": {
+ "content": "<|fim▁begin|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32017": {
+ "content": "<|fim▁end|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32018": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32019": {
+ "content": "<|User|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32020": {
+ "content": "<|Assistant|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32021": {
+ "content": "<|EOT|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "32022": {
+ "content": "<|endofsql|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [
+ "<|endofsql|>"
+ ],
+ "bos_token": "<|begin▁of▁sentence|>",
+ "chat_template": "{% if not add_generation_prompt is defined %}\n{% set add_generation_prompt = false %}\n{% endif %}\n{%- set ns = namespace(found=false) -%}\n{%- for message in messages -%}\n {%- if message['role'] == 'system' -%}\n {%- set ns.found = true -%}\n {%- endif -%}\n{%- endfor -%}\n{{bos_token}}{%- if not ns.found -%}\n{{'You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer\\n'}}\n{%- endif %}\n{%- for message in messages %}\n {%- if message['role'] == 'system' %}\n{{ message['content'] }}\n {%- else %}\n {%- if message['role'] == 'user' %}\n{{'### Instruction:\\n' + message['content'] + '\\n'}}\n {%- else %}\n{{'### Response:\\n' + message['content'] + '\\n<|EOT|>\\n'}}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{% if add_generation_prompt %}\n{{'### Response:'}}\n{% endif %}",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|endofsql|>",
+ "extra_special_tokens": {},
+ "legacy": true,
+ "model_max_length": 16384,
+ "pad_token": "<|end▁of▁sentence|>",
+ "sp_model_kwargs": {},
+ "tokenizer_class": "LlamaTokenizerFast",
+ "unk_token": null,
+ "use_default_system_prompt": false
+ }
finetuned-model-16-full/checkpoint-576/trainer_state.json ADDED
@@ -0,0 +1,192 @@
+ {
+ "best_global_step": 448,
+ "best_metric": 0.10987533628940582,
+ "best_model_checkpoint": "./finetuned-model-16-full\\checkpoint-448",
+ "epoch": 9.0,
+ "eval_steps": 500,
+ "global_step": 576,
+ "is_hyper_param_search": false,
+ "is_local_process_zero": true,
+ "is_world_process_zero": true,
+ "log_history": [
+ {
+ "epoch": 0.7889546351084813,
+ "grad_norm": 0.21714453399181366,
+ "learning_rate": 4.603174603174603e-05,
+ "loss": 0.5977,
+ "step": 50
+ },
+ {
+ "epoch": 1.0,
+ "eval_loss": 0.1679154932498932,
+ "eval_runtime": 449.444,
+ "eval_samples_per_second": 0.556,
+ "eval_steps_per_second": 0.556,
+ "step": 64
+ },
+ {
+ "epoch": 1.5680473372781065,
+ "grad_norm": 0.13297569751739502,
+ "learning_rate": 4.2063492063492065e-05,
+ "loss": 0.1177,
+ "step": 100
+ },
+ {
+ "epoch": 2.0,
+ "eval_loss": 0.13711020350456238,
+ "eval_runtime": 448.5722,
+ "eval_samples_per_second": 0.557,
+ "eval_steps_per_second": 0.557,
+ "step": 128
+ },
+ {
+ "epoch": 2.3471400394477318,
+ "grad_norm": 0.11708807200193405,
+ "learning_rate": 3.809523809523809e-05,
+ "loss": 0.0936,
+ "step": 150
+ },
+ {
+ "epoch": 3.0,
+ "eval_loss": 0.12583088874816895,
+ "eval_runtime": 449.2291,
+ "eval_samples_per_second": 0.557,
+ "eval_steps_per_second": 0.557,
+ "step": 192
+ },
+ {
+ "epoch": 3.126232741617357,
+ "grad_norm": 0.13195854425430298,
+ "learning_rate": 3.412698412698413e-05,
+ "loss": 0.0876,
+ "step": 200
+ },
+ {
+ "epoch": 3.9151873767258385,
+ "grad_norm": 0.1334521323442459,
+ "learning_rate": 3.0158730158730158e-05,
+ "loss": 0.0773,
+ "step": 250
+ },
+ {
+ "epoch": 4.0,
+ "eval_loss": 0.11675416678190231,
+ "eval_runtime": 447.9794,
+ "eval_samples_per_second": 0.558,
+ "eval_steps_per_second": 0.558,
+ "step": 256
+ },
+ {
+ "epoch": 4.6942800788954635,
+ "grad_norm": 0.1745067685842514,
+ "learning_rate": 2.6190476190476192e-05,
+ "loss": 0.0715,
+ "step": 300
+ },
+ {
+ "epoch": 5.0,
+ "eval_loss": 0.11324296146631241,
+ "eval_runtime": 449.3754,
+ "eval_samples_per_second": 0.556,
+ "eval_steps_per_second": 0.556,
+ "step": 320
+ },
+ {
+ "epoch": 5.4733727810650885,
+ "grad_norm": 0.14304669201374054,
+ "learning_rate": 2.2222222222222223e-05,
+ "loss": 0.0678,
+ "step": 350
+ },
+ {
+ "epoch": 6.0,
+ "eval_loss": 0.11094386130571365,
+ "eval_runtime": 448.7599,
+ "eval_samples_per_second": 0.557,
+ "eval_steps_per_second": 0.557,
+ "step": 384
+ },
+ {
+ "epoch": 6.252465483234714,
+ "grad_norm": 0.20627053081989288,
+ "learning_rate": 1.8253968253968254e-05,
+ "loss": 0.0628,
+ "step": 400
+ },
+ {
+ "epoch": 7.0,
+ "eval_loss": 0.10987533628940582,
+ "eval_runtime": 446.2745,
+ "eval_samples_per_second": 0.56,
+ "eval_steps_per_second": 0.56,
+ "step": 448
+ },
+ {
+ "epoch": 7.031558185404339,
+ "grad_norm": 0.18030017614364624,
+ "learning_rate": 1.4285714285714285e-05,
+ "loss": 0.0578,
+ "step": 450
+ },
+ {
+ "epoch": 7.82051282051282,
+ "grad_norm": 0.20803600549697876,
+ "learning_rate": 1.0317460317460318e-05,
+ "loss": 0.0563,
+ "step": 500
+ },
+ {
+ "epoch": 8.0,
+ "eval_loss": 0.11132891476154327,
+ "eval_runtime": 449.0806,
+ "eval_samples_per_second": 0.557,
+ "eval_steps_per_second": 0.557,
+ "step": 512
+ },
+ {
+ "epoch": 8.599605522682445,
+ "grad_norm": 0.2382732778787613,
+ "learning_rate": 6.349206349206349e-06,
+ "loss": 0.0518,
+ "step": 550
+ },
+ {
+ "epoch": 9.0,
+ "eval_loss": 0.11033967137336731,
+ "eval_runtime": 447.1892,
+ "eval_samples_per_second": 0.559,
+ "eval_steps_per_second": 0.559,
+ "step": 576
+ }
+ ],
+ "logging_steps": 50,
+ "max_steps": 630,
+ "num_input_tokens_seen": 0,
+ "num_train_epochs": 10,
+ "save_steps": 500,
+ "stateful_callbacks": {
+ "EarlyStoppingCallback": {
+ "args": {
+ "early_stopping_patience": 2,
+ "early_stopping_threshold": 0.0
+ },
+ "attributes": {
+ "early_stopping_patience_counter": 2
+ }
+ },
+ "TrainerControl": {
+ "args": {
+ "should_epoch_stop": false,
+ "should_evaluate": false,
+ "should_log": false,
+ "should_save": true,
+ "should_training_stop": true
+ },
+ "attributes": {}
+ }
+ },
+ "total_flos": 2.2377604877018726e+17,
+ "train_batch_size": 1,
+ "trial_name": null,
+ "trial_params": null
+ }
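The trainer state above records why the run stopped at step 576 of a planned 630: the best `eval_loss` was at step 448, and the two evaluations after it did not improve on it, which exhausts a patience of 2. The sketch below replays that logic on the logged evaluation losses. It is a simplified re-implementation for illustration, not the `EarlyStoppingCallback` source, and the function names are hypothetical.

```python
# (step, eval_loss) pairs copied from log_history in trainer_state.json above.
EVAL_HISTORY = [
    (64, 0.1679154932498932),
    (128, 0.13711020350456238),
    (192, 0.12583088874816895),
    (256, 0.11675416678190231),
    (320, 0.11324296146631241),
    (384, 0.11094386130571365),
    (448, 0.10987533628940582),
    (512, 0.11132891476154327),
    (576, 0.11033967137336731),
]


def best_eval(history):
    """Return the (step, eval_loss) pair with the lowest evaluation loss."""
    return min(history, key=lambda pair: pair[1])


def replay_early_stopping(history, patience, threshold=0.0):
    """Replay the evaluations in order.

    Returns (stop_step, counter): the step at which the patience budget is
    used up (None if it never is) and the final non-improvement counter.
    """
    best = float("inf")
    counter = 0
    for step, loss in history:
        if loss < best - threshold:
            best = loss      # new best: reset the counter
            counter = 0
        else:
            counter += 1     # no improvement at this evaluation
        if counter >= patience:
            return step, counter
    return None, counter


print(best_eval(EVAL_HISTORY))
print(replay_early_stopping(EVAL_HISTORY, patience=2))
```

Under these assumptions the replay agrees with the recorded `best_global_step` and `early_stopping_patience_counter`.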
finetuned-model-16-full/checkpoint-576/training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b8ae54bf6183491c37908c8caa39df99368338c0538c4bd6a8180c8d4391e698
+ size 5368
finetuned-model-16-full/config.json ADDED
@@ -0,0 +1,48 @@
+ {
+ "architectures": [
+ "LlamaForCausalLM"
+ ],
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "bos_token_id": 32013,
+ "eos_token_id": 32021,
+ "head_dim": 128,
+ "hidden_act": "silu",
+ "hidden_size": 2048,
+ "initializer_range": 0.02,
+ "intermediate_size": 5504,
+ "max_position_embeddings": 16384,
+ "mlp_bias": false,
+ "model_type": "llama",
+ "num_attention_heads": 16,
+ "num_hidden_layers": 24,
+ "num_key_value_heads": 16,
+ "pretraining_tp": 1,
+ "quantization_config": {
+ "_load_in_4bit": false,
+ "_load_in_8bit": true,
+ "bnb_4bit_compute_dtype": "float32",
+ "bnb_4bit_quant_storage": "uint8",
+ "bnb_4bit_quant_type": "fp4",
+ "bnb_4bit_use_double_quant": false,
+ "llm_int8_enable_fp32_cpu_offload": false,
+ "llm_int8_has_fp16_weight": false,
+ "llm_int8_skip_modules": null,
+ "llm_int8_threshold": 6.0,
+ "load_in_4bit": false,
+ "load_in_8bit": true,
+ "quant_method": "bitsandbytes"
+ },
+ "rms_norm_eps": 1e-06,
+ "rope_scaling": {
+ "factor": 4.0,
+ "rope_type": "linear",
+ "type": "linear"
+ },
+ "rope_theta": 100000,
+ "tie_word_embeddings": false,
+ "torch_dtype": "float16",
+ "transformers_version": "4.50.3",
+ "use_cache": true,
+ "vocab_size": 32023
+ }
finetuned-model-16-full/generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+ "_from_model_config": true,
+ "bos_token_id": 32013,
+ "eos_token_id": 32021,
+ "transformers_version": "4.50.3"
+ }
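One detail worth checking when loading these files: `config.json` and `generation_config.json` both set `eos_token_id` to 32021, which `tokenizer_config.json` maps to `<|EOT|>`, while the tokenizer's `eos_token` is `<|endofsql|>` (ID 32022). The sketch below makes that comparison explicit using values copied from this commit. Whether the difference matters depends on how generation is invoked, for example whether an `eos_token_id` is passed explicitly to `generate`. The helper names are hypothetical.

```python
# Values copied from the JSON files in this commit (only the relevant entries).
ADDED_TOKENS_DECODER = {
    "32014": {"content": "<|end▁of▁sentence|>"},
    "32021": {"content": "<|EOT|>"},
    "32022": {"content": "<|endofsql|>"},
}
TOKENIZER_EOS_TOKEN = "<|endofsql|>"   # tokenizer_config.json "eos_token"
MODEL_EOS_TOKEN_ID = 32021             # config.json / generation_config.json


def token_id_for(added_tokens_decoder, content):
    """Look up the integer ID whose entry has the given token content."""
    for token_id, entry in added_tokens_decoder.items():
        if entry["content"] == content:
            return int(token_id)
    raise KeyError(content)


def compare_eos(added_tokens_decoder, tokenizer_eos_token, model_eos_token_id):
    """Return (tokenizer_eos_id, model_eos_id, ids_match)."""
    tokenizer_eos_id = token_id_for(added_tokens_decoder, tokenizer_eos_token)
    return tokenizer_eos_id, model_eos_token_id, tokenizer_eos_id == model_eos_token_id


print(compare_eos(ADDED_TOKENS_DECODER, TOKENIZER_EOS_TOKEN, MODEL_EOS_TOKEN_ID))
```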
finetuned-model-16-full/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:19d97ee7585fc40d36dd22fc38fc79c2392a591485b3674fa7a77b5ecd6562b0
+ size 1478884408
finetuned-model-16-full/runs/Nov23_23-46-54_DESKTOP-SMJC97K/events.out.tfevents.1763970415.DESKTOP-SMJC97K.13948.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b8116b222877dc2b6038f4c604a7856836f93dacbedd0e0aa940b1b30802b908
+ size 10786
finetuned-model-16-full/special_tokens_map.json ADDED
@@ -0,0 +1,26 @@
+ {
+ "additional_special_tokens": [
+ {
+ "content": "<|endofsql|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ ],
+ "bos_token": {
+ "content": "<|begin▁of▁sentence|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": "<|endofsql|>",
+ "pad_token": {
+ "content": "<|end▁of▁sentence|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
finetuned-model-16-full/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
finetuned-model-16-full/tokenizer_config.json ADDED
@@ -0,0 +1,206 @@
+ {
+ "add_bos_token": true,
+ "add_eos_token": false,
+ "add_prefix_space": null,
+ "added_tokens_decoder": {
+ "32000": {
+ "content": "õ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32001": {
+ "content": "÷",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32002": {
+ "content": "Á",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32003": {
+ "content": "ý",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32004": {
+ "content": "À",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32005": {
+ "content": "ÿ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32006": {
+ "content": "ø",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32007": {
+ "content": "ú",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32008": {
+ "content": "þ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32009": {
+ "content": "ü",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32010": {
+ "content": "ù",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32011": {
+ "content": "ö",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32012": {
+ "content": "û",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32013": {
+ "content": "<|begin▁of▁sentence|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "32014": {
+ "content": "<|end▁of▁sentence|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "32015": {
+ "content": "<|fim▁hole|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32016": {
+ "content": "<|fim▁begin|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32017": {
+ "content": "<|fim▁end|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32018": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32019": {
+ "content": "<|User|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32020": {
+ "content": "<|Assistant|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "32021": {
+ "content": "<|EOT|>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "32022": {
+ "content": "<|endofsql|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [
+ "<|endofsql|>"
+ ],
+ "bos_token": "<|begin▁of▁sentence|>",
+ "chat_template": "{% if not add_generation_prompt is defined %}\n{% set add_generation_prompt = false %}\n{% endif %}\n{%- set ns = namespace(found=false) -%}\n{%- for message in messages -%}\n {%- if message['role'] == 'system' -%}\n {%- set ns.found = true -%}\n {%- endif -%}\n{%- endfor -%}\n{{bos_token}}{%- if not ns.found -%}\n{{'You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer\\n'}}\n{%- endif %}\n{%- for message in messages %}\n {%- if message['role'] == 'system' %}\n{{ message['content'] }}\n {%- else %}\n {%- if message['role'] == 'user' %}\n{{'### Instruction:\\n' + message['content'] + '\\n'}}\n {%- else %}\n{{'### Response:\\n' + message['content'] + '\\n<|EOT|>\\n'}}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{% if add_generation_prompt %}\n{{'### Response:'}}\n{% endif %}",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|endofsql|>",
+ "extra_special_tokens": {},
+ "legacy": true,
+ "model_max_length": 16384,
+ "pad_token": "<|end▁of▁sentence|>",
+ "sp_model_kwargs": {},
+ "tokenizer_class": "LlamaTokenizerFast",
+ "unk_token": null,
+ "use_default_system_prompt": false
+ }
training-data/combined_full_dataset.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training-data/nba_test_set.tsv ADDED
@@ -0,0 +1,151 @@
1
+ natural_query sql_query result is_nba
2
+ What is the average number of fg_pct in home games by the Chicago Bulls? SELECT AVG(fg_pct_home) FROM game WHERE team_name_home = 'Chicago Bulls'; 0.4636694306246544 True
3
+ How many lead changes occurred in games where the Denver Nuggets played away? SELECT SUM(lead_changes) as total_lead_changes FROM other_stats WHERE team_abbreviation_away = 'DEN'; 5828.0 True
4
+ Which team had the most away games where they had more offensive than defensive rebounds? SELECT team_abbreviation_away FROM game WHERE oreb_away > dreb_away GROUP BY team_abbreviation_away ORDER BY COUNT(*) DESC LIMIT 1; ATL True
5
+ What is the maximum number of team rebounds recorded by the Dallas Mavericks in away games where they committed more than 20 fouls? SELECT MAX(o.team_rebounds_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_away = 'DAL' AND g.pf_away > 20 AND g.season_id = '22021'; 16 True
6
+ What was the average margin of victory for the Miami Heat during the 2013 NBA season? SELECT AVG(victory_margin) AS avg_victory_margin FROM ( SELECT plus_minus_home AS victory_margin FROM game WHERE team_name_home = 'Miami Heat' AND wl_home = 'W' AND season_id = '22013' UNION ALL SELECT plus_minus_away AS victory_margin FROM game WHERE team_name_away = 'Miami Heat' AND wl_away = 'W' AND season_id = '22013' ) AS victories 11.48148148 True
7
+ What is the average fast break points scored by the Philadelphia 76ers at home during the 2018 season? SELECT AVG(os.pts_fb_home) AS avg_fast_break FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_abbreviation_home = 'PHI' AND g.season_id = '22018'; 16.32352941 True
8
+ Which team has the nickname 'Celtics'? SELECT full_name FROM team WHERE nickname = 'Celtics'; Boston Celtics True
9
+ How many games did the Milwaukee Bucks play at home during the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Milwaukee Bucks' AND season_id = '22020'; 36 True
10
+ What is the average second-chance points for Toronto Raptors home games between 2015-2020? SELECT AVG(os.pts_2nd_chance_home) AS avg_second_chance FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_abbreviation_home = 'TOR' AND g.season_id BETWEEN '22015' AND '22020'; 13.07653061 True
11
+ Which team had the most fast break points in a single home game during the 2020 season? SELECT team_name_home, MAX(pts_fb_home) FROM other_stats JOIN game ON other_stats.game_id = game.game_id WHERE game.season_id = '22020'; Houston Rockets|35 True
12
+ What's the average points in the paint for the Boston Celtics in home games where they won by at least 10 points? SELECT AVG(os.pts_paint_home) FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Boston Celtics' AND g.plus_minus_home >= 10; 41.85 True
13
+ What is the highest combined total score (home + away) in a single game in the dataset? SELECT game_date, (pts_home + pts_away) AS total_points FROM game ORDER BY total_points DESC LIMIT 1; 2017-02-19 00:00:00|374.0 True
14
+ Which team had the best three-point shooting percentage in home games during the 2020 season? SELECT team_name_home, AVG(fg3_pct_home) AS avg_3pt_pct FROM game WHERE season_id = '22020' GROUP BY team_name_home ORDER BY avg_3pt_pct DESC LIMIT 1; LA Clippers | 0.423777777777778 True
15
+ Which team is located in the state of Indiana? SELECT full_name FROM team WHERE state = 'Indiana'; Indiana Pacers True
16
+ What was the most blocks recorded by the Orlando Magic in a single home game in the 1999 season? SELECT MAX(blk_home) AS max_blocks FROM game WHERE team_abbreviation_home = 'ORL' AND season_id = '21999'; 10.0 True
17
+ What was the average number of fastbreak points scored by the Houston Rockets in games they won by more than 15 points at home? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Houston Rockets' AND g.wl_home = 'W' AND (g.pts_home - g.pts_away) > 15; 13.39790576 True
18
+ How many times did the Los Angeles Clippers lose at home in the 2002 season despite recording more steals and blocks than their opponent? SELECT COUNT(*) FROM game g WHERE g.team_abbreviation_home = 'LAC' AND g.wl_home = 'L' AND g.stl_home > g.stl_away AND g.blk_home > g.blk_away AND g.season_id = '22002'; 4 True
19
+ What is the full name of the team based in Dallas? SELECT full_name FROM team WHERE city = 'Dallas'; Dallas Mavericks True
20
+ Which team played the most total games (home + away) between 1995 and 2005? SELECT team FROM (SELECT team_abbreviation_home AS team FROM game WHERE season_id BETWEEN '21995' AND '22005' UNION ALL SELECT team_abbreviation_away FROM game WHERE season_id BETWEEN '21995' AND '22005') GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; WAS True
21
+ How many games did the Miami Heat lose away in the 1996 season? SELECT COUNT(*) as losses FROM game WHERE team_name_away = 'Miami Heat' AND wl_away = 'L' AND season_id = '21996'; 9.0 True
22
+ What is the average number of tov in away games by the Miami Heat? SELECT AVG(tov_away) FROM game WHERE team_name_away = 'Miami Heat'; 15.235255570117957 True
23
+ "What is the total second chance points by the Miami Heat at home?""" SELECT SUM(pts_2nd_chance_home) as total_2nd_chance FROM other_stats WHERE team_abbreviation_home = 'MIA'; 11670.0 True
24
+ How many home games did the Orlando Magic play in the 2013 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Orlando Magic' AND season_id = '22013'; 41.0 True
25
+ In which season did the Boston Celtics have the highest average tov at home? SELECT season_id, AVG(tov_home) as avg_stat FROM game WHERE team_name_home = 'Boston Celtics' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2005.0 True
26
+ In which season did the Chicago Bulls have the highest average ft_pct at home? SELECT season_id, AVG(ft_pct_home) as avg_stat FROM game WHERE team_name_home = 'Chicago Bulls' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2016.0 True
27
+ How many games did the Cleveland Cavaliers play at home with more than 8 times tied in 1996? SELECT COUNT(*) as games FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Cleveland Cavaliers' AND os.times_tied > 8 AND g.season_id = '21996'; 5.0 True
28
+ What was the average number of offensive rebounds per game for the Chicago Bulls in the 2019 season? SELECT AVG(oreb) AS avg_offensive_rebounds FROM ( SELECT game_id, oreb_home AS oreb FROM game WHERE team_name_home = 'Chicago Bulls' AND season_id = '22019' UNION ALL SELECT game_id, oreb_away AS oreb FROM game WHERE team_name_away = 'Chicago Bulls' AND season_id = '22019' ); 10.46153846 True
29
+ What was the highest combined steals and blocks total for the Toronto Raptors in any home game during their championship season? SELECT MAX(stl_home + blk_home) AS combined_steals_blocks FROM game WHERE team_name_home = 'Toronto Raptors' AND season_id = '22019'; 24 True
30
+ How many times have the Boston Celtics won an away game by at least 20 points? SELECT COUNT(*) FROM game WHERE team_abbreviation_away = 'BOS' AND wl_away = 'W' AND (pts_away - pts_home) >= 20; 179 True
31
+ How many total turnovers did the Sacramento Kings commit in the 2001 season? SELECT SUM(tov) AS total_turnovers FROM ( SELECT tov_home AS tov FROM game WHERE team_abbreviation_home = 'SAC' AND season_id = '22001' UNION ALL SELECT tov_away AS tov FROM game WHERE team_abbreviation_away = 'SAC' AND season_id = '22001' ); 1128.0 True
32
+ What is the largest margin of victory the Miami Heat have ever had in an away game? SELECT MAX(ABS(pts_away - pts_home)) AS largest_margin FROM game WHERE team_abbreviation_away = 'MIA' AND pts_away > pts_home; 34.0 True
33
+ What was the average margin of victory for the Boston Celtics in home games during the 2000 season? SELECT AVG(pts_home - pts_away) AS avg_victory_margin FROM game WHERE team_name_home = 'Boston Celtics' AND wl_home = 'W' AND season_id = '22000'; 9.75 True
34
+ What are the nicknames of teams based in Florida? SELECT nickname FROM team WHERE state = 'Florida'; Heat, Magic True
35
+ What was the highest total rebound count by an away team in a game? SELECT team_abbreviation_away, reb_away, game_date FROM game ORDER BY reb_away DESC LIMIT 1; BOS|90.0|1957-10-22 00:00:00 True
36
+ What is the total number of rebounds by the San Antonio Spurs in home games during the 2015 season? SELECT SUM(reb_home) FROM game WHERE team_abbreviation_home = 'SAS' AND season_id = '22015'; 1845.0 True
37
+ Which away team scored the most points off turnovers in a single game? SELECT team_abbreviation_away FROM other_stats ORDER BY pts_off_to_away DESC LIMIT 1; ATL True
38
+ What is the highest fast break points by the Houston Rockets at home? SELECT MAX(pts_fb_home) as max_fb_points FROM other_stats WHERE team_abbreviation_home = 'HOU'; 37.0 True
39
+ What is the average number of tov in home games by the Miami Heat? SELECT AVG(tov_home) FROM game WHERE team_name_home = 'Miami Heat'; 14.627184466019418 True
40
+ What is the total number of points scored by the Los Angeles Clippers in the 2014 season in games where they had more team turnovers but fewer total turnovers than their opponent? SELECT SUM(g.pts_home) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'LAC' AND o.team_turnovers_home > o.team_turnovers_away AND o.total_turnovers_home < o.total_turnovers_away AND g.season_id = '22014'; 295.0 True
41
+ Which home team had the most games with a positive plus-minus but still lost? SELECT team_name_home FROM game WHERE wl_home = 'L' AND plus_minus_home > 0 GROUP BY team_name_home ORDER BY COUNT(*) DESC LIMIT 1; West NBA All Stars West True
42
+ In which season did the Miami Heat have the highest average ast at home? SELECT season_id, AVG(ast_home) as avg_stat FROM game WHERE team_name_home = 'Miami Heat' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2019.0 True
43
+ How many games did the Chicago Bulls win at home in the 2010 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'CHI' AND wl_home = 'W' AND season_id = '22010'; 36 True
44
+ What was the average points scored by the Denver Nuggets in home games during the 2019 season? SELECT AVG(pts_home) AS avg_home_points FROM game WHERE team_name_home = 'Denver Nuggets' AND season_id = '22019'; 111.8378378 True
45
+ When was the Los Angeles Clippers team founded according to the team database? SELECT year_founded FROM team WHERE full_name = 'Los Angeles Clippers'; 1970 True
46
+ What is the average number of ast in home games by the Boston Celtics? SELECT AVG(ast_home) FROM game WHERE team_name_home = 'Boston Celtics'; 24.886892177589857 True
47
+ What is the average number of ast in away games by the Los Angeles Lakers? SELECT AVG(ast_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 22.594638949671772 True
48
+ What team had the most turnovers in a single game during the 2019 season? SELECT CASE WHEN tov_home > tov_away THEN team_name_home ELSE team_name_away END AS team_with_most_turnovers FROM game WHERE season_id = '22019' ORDER BY CASE WHEN tov_home > tov_away THEN tov_home ELSE tov_away END DESC LIMIT 1 Sacramento Kings True
49
+ What is the highest points scored by the Miami Heat at home when they had more than 10 second chance points? SELECT MAX(g.pts_home) as max_points FROM game g JOIN other_stats os ON g.game_id = os.game_id WHERE g.team_name_home = 'Miami Heat' AND os.pts_2nd_chance_home > 10; 149.0 True
50
+ What is the total points in the paint by the Chicago Bulls at home in games they lost in 1996? SELECT SUM(os.pts_paint_home) as total_pts_paint FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Chicago Bulls' AND g.wl_home = 'L' AND g.season_id = '21996'; 56.0 True
51
+ How many games did the Oklahoma City Thunder score more than 30 points in the first quarter during the 2017 season? SELECT COUNT(*) AS high_scoring_first_quarters FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE (g.team_name_home = 'Oklahoma City Thunder' AND g.pts_home / 4 > 30) OR (g.team_name_away = 'Oklahoma City Thunder' AND g.pts_away / 4 > 30) AND g.season_id = '22017'; 83 True
52
+ What is the total number of points scored by the Milwaukee Bucks away when they had more than 5 lead changes? SELECT SUM(g.pts_away) as total_points FROM game g JOIN other_stats os ON g.game_id = os.game_id WHERE g.team_name_away = 'Milwaukee Bucks' AND os.lead_changes > 5; 44835.0 True
53
+ List all games where the Houston Rockets and Dallas Mavericks played each other in the 2015 season. SELECT * FROM game WHERE season_id = '22015' AND ((team_abbreviation_home = 'HOU' AND team_abbreviation_away = 'DAL') OR (team_abbreviation_home = 'DAL' AND team_abbreviation_away = 'HOU')); 22015|1610612745|HOU|Houston Rockets|0021500140|2015-11-14 00:00:00|HOU vs. DAL|L|240|32.0|84.0|0.381|9.0|34.0|0.265|25.0|32.0|0.781|12.0|31.0|43.0|22.0|9.0|5.0|14.0|23.0|98.0|-12|1|1610612742|DAL|Dallas Mavericks|DAL @ HOU|W|43.0|89.0|0.483|8.0|28.0|0.286|16.0|21.0|0.762|8.0|37.0|45.0|24.0|6.0|7.0|11.0|21.0|110.0|12|1|Regular Season 22015|1610612742|DAL|Dallas Mavericks|0021500287|2015-12-04 00:00:00|DAL vs. HOU|L|240|37.0|81.0|0.457|8.0|29.0|0.276|14.0|20.0|0.7|11.0|31.0|42.0|23.0|8.0|5.0|18.0|17.0|96.0|-4|1|1610612745|HOU|Houston Rockets|HOU @ DAL|W|39.0|84.0|0.464|12.0|26.0|0.462|10.0|18.0|0.556|15.0|30.0|45.0|20.0|12.0|5.0|18.0|22.0|100.0|4|1|Regular Season 22015|1610612745|HOU|Houston Rockets|0021500665|2016-01-24 00:00:00|HOU vs. DAL|W|240|43.0|89.0|0.483|15.0|44.0|0.341|14.0|21.0|0.667|9.0|31.0|40.0|27.0|9.0|7.0|9.0|21.0|115.0|11|1|1610612742|DAL|Dallas Mavericks|DAL @ HOU|L|36.0|79.0|0.456|15.0|30.0|0.5|17.0|22.0|0.773|8.0|28.0|36.0|17.0|4.0|4.0|16.0|20.0|104.0|-11|1|Regular Season 22015|1610612742|DAL|Dallas Mavericks|0021501170|2016-04-06 00:00:00|DAL vs. HOU|W|240|33.0|80.0|0.413|10.0|33.0|0.303|12.0|14.0|0.857|13.0|27.0|40.0|19.0|9.0|4.0|14.0|20.0|88.0|2|1|1610612745|HOU|Houston Rockets|HOU @ DAL|L|34.0|78.0|0.436|6.0|20.0|0.3|12.0|18.0|0.667|12.0|29.0|41.0|19.0|6.0|4.0|16.0|17.0|86.0|-2|1|Regular Season True
54
+ What is the highest combined reb in any game involving the San Antonio Spurs? SELECT MAX(reb_home + reb_away) FROM game WHERE team_name_home = 'San Antonio Spurs' OR team_name_away = 'San Antonio Spurs'; 134.0 True
55
+ In which season did the Chicago Bulls have the highest average ast at home? SELECT season_id, AVG(ast_home) as avg_stat FROM game WHERE team_name_home = 'Chicago Bulls' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2021.0 True
56
+ What is the lowest plus-minus score for the New York Knicks away? SELECT MIN(plus_minus_away) as min_plus_minus FROM game WHERE team_name_away = 'New York Knicks'; -47.0 True
57
+ How many total points did the Chicago Bulls score across all games in the 1988 season? SELECT SUM(pts) AS total_points FROM ( SELECT pts_home AS pts FROM game WHERE team_abbreviation_home = 'CHI' AND season_id = '21988' UNION ALL SELECT pts_away AS pts FROM game WHERE team_abbreviation_away = 'CHI' AND season_id = '21988' ); 8726.0 True
58
+ What is the total number of fast break points scored by the Memphis Grizzlies at home during the 2005 season? SELECT SUM(pts_fb_home) FROM other_stats WHERE game_id IN ( SELECT game_id FROM game WHERE team_name_home = 'Memphis Grizzlies' AND season_id = '22005' ); 368 True
59
+ What was the average points difference in home games won by the Denver Nuggets? SELECT AVG(pts_home - pts_away) FROM game WHERE team_abbreviation_home = 'DEN' AND wl_home = 'W'; 11.96471532 True
60
+ How many times did the Memphis Grizzlies lose at home in the 2008 season despite recording more steals and blocks than their opponent? SELECT COUNT(*) FROM game g WHERE g.team_abbreviation_home = 'MEM' AND g.wl_home = 'L' AND g.stl_home > g.stl_away AND g.blk_home > g.blk_away AND g.season_id = '22008'; 3 True
61
+ In which season did the Boston Celtics have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Boston Celtics' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 1958.0 True
62
+ In the 2020 season, what was the average number of second chance points allowed by the New Orleans Pelicans in games they won by less than 5 points? SELECT AVG(o.pts_2nd_chance_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE ((g.team_abbreviation_home = 'NOP' AND g.wl_home = 'W' AND ABS(g.pts_home - g.pts_away) < 5) OR (g.team_abbreviation_away = 'NOP' AND g.wl_away = 'W' AND ABS(g.pts_home - g.pts_away) < 5)) AND g.season_id = '22020'; 16.6 True
63
+ How many games did the Golden State Warriors lose away in 1996? SELECT COUNT(*) as away_losses FROM game WHERE team_name_away = 'Golden State Warriors' AND wl_away = 'L' AND season_id = '21996'; 29.0 True
64
+ Which team was most often held under 60 points in a game? SELECT team FROM (SELECT team_abbreviation_home AS team, pts_home AS pts FROM game UNION ALL SELECT team_abbreviation_away, pts_away FROM game) WHERE pts < 60 GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; BOS True
65
+ What is the average number of three-pointers made by the Golden State Warriors at home in the 2018 season? SELECT AVG(fg3m_home) FROM game WHERE team_abbreviation_home = 'GSW' AND season_id = '22018'; 13.1951219512195 True
66
+ What is the Los Angeles Lakers' largest lead in a home game during the 2016 season? SELECT MAX(plus_minus_home) FROM game WHERE team_abbreviation_home = 'LAL' AND season_id = '22016'; 27 True
67
+ What is the average number of points in the paint allowed by the Philadelphia 76ers when playing at home in the 2020 season in games with more than 15 lead changes? SELECT AVG(o.pts_paint_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'PHI' AND g.season_id = '22020' AND o.lead_changes > 15; 50.0 True
68
+ How many points did the home team score in the game with the most lead changes and the fewest total fouls? SELECT pts_home FROM game WHERE game_id = (SELECT game_id FROM other_stats JOIN game USING(game_id) ORDER BY lead_changes DESC, (pf_home + pf_away) ASC LIMIT 1); 122.0 True
69
+ How many games did the Cleveland Cavaliers lose away with more than 10 fast break points in 1996? SELECT COUNT(*) as losses FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Cleveland Cavaliers' AND g.wl_away = 'L' AND os.pts_fb_away > 10 AND g.season_id = '21996'; 4.0 True
70
+ What is the highest combined ast in any game involving the Orlando Magic? SELECT MAX(ast_home + ast_away) FROM game WHERE team_name_home = 'Orlando Magic' OR team_name_away = 'Orlando Magic'; 74.0 True
71
+ What is the average points in the paint by the Utah Jazz away when they won? SELECT AVG(os.pts_paint_away) as avg_pts_paint FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Utah Jazz' AND g.wl_away = 'W'; 42.48 True
72
+ How many games did the Los Angeles Lakers play away in 1996? SELECT COUNT(*) as away_games FROM game WHERE team_name_away = 'Los Angeles Lakers' AND season_id = '21996'; 41.0 True
73
+ How many games had at least one team with 30+ assists? SELECT COUNT(*) FROM game WHERE ast_home >= 30 OR ast_away >= 30; 11305 True
74
+ What is the highest three-point percentage the Phoenix Suns achieved in an away game? SELECT MAX(fg3_pct_away) FROM game WHERE team_abbreviation_away = 'PHX'; 1 True
75
+ How many away games did the Miami Heat play in the 2021 season? SELECT COUNT(*) FROM game WHERE team_name_away = 'Miami Heat' AND season_id = '22021'; 41.0 True
76
+ How many times did the Boston Celtics win at home during the 2015 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'BOS' AND season_id = '22015' AND wl_home = 'W'; 28 True
77
+ How many free throws did the Houston Rockets attempt in away games they won during the 2020 season? SELECT SUM(fta_away) FROM game WHERE team_name_away = 'Houston Rockets' AND wl_away = 'W' AND season_id = '22020'; 149.0 True
78
+ Which away team has scored the most points against the Miami Heat in a single game? SELECT team_name_away, pts_away FROM game WHERE team_abbreviation_home = 'MIA' ORDER BY pts_away DESC LIMIT 1; Milwaukee Bucks|144.0 True
79
+ How many points were scored in the earliest recorded game in the database? SELECT (pts_home + pts_away) FROM game ORDER BY game_date ASC LIMIT 1; 134.0 True
80
+ What is the average number of tov in away games by the Los Angeles Lakers? SELECT AVG(tov_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 14.554896142433234 True
81
+ What is the total number of rebounds by the Milwaukee Bucks at home? SELECT SUM(reb_home) as total_rebounds FROM game WHERE team_name_home = 'Milwaukee Bucks'; 76050.0 True
82
+ What is the highest number of assists recorded by the Indiana Pacers in a single home game? SELECT MAX(ast_home) FROM game WHERE team_name_home = 'Indiana Pacers'; 44.0 True
83
+ How many times did the Miami Heat score more than 120 points at home in the 2015 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'MIA' AND season_id = '22015' AND pts_home > 120; 3 True
84
+ What was the lowest number of combined turnovers in any game involving the San Antonio Spurs during the 2019 season? SELECT MIN(o.total_turnovers_home + o.total_turnovers_away) AS min_combined_turnovers FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE (g.team_name_home = 'San Antonio Spurs' OR g.team_name_away = 'San Antonio Spurs') AND g.season_id = '22019'; 13 True
85
+ What was the average number of fastbreak points scored by the Los Angeles Lakers in home wins during the 2020 season? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Los Angeles Lakers' AND g.wl_home = 'W' AND g.season_id = '22020'; 13.64705882 True
86
+ What was the highest number of steals by the Detroit Pistons in a single game during the 2004 season? SELECT MAX(stl) AS max_steals FROM ( SELECT stl_home AS stl FROM game WHERE team_abbreviation_home = 'DET' AND season_id = '22004' UNION ALL SELECT stl_away AS stl FROM game WHERE team_abbreviation_away = 'DET' AND season_id = '22004' ); 13 True
87
+ In 2018, which team has the most home wins and how many home wins did they have? SELECT team_abbreviation_home, COUNT(*) FROM game WHERE wl_home = 'W' AND season_id = '22018' GROUP BY team_abbreviation_home ORDER BY COUNT(*) DESC LIMIT 1; (DEN, 34) True
88
+ How many three-pointers did the Golden State Warriors attempt in total during the 2017 season? SELECT SUM(fg3a) AS total_three_attempts FROM ( SELECT fg3a_home AS fg3a FROM game WHERE team_abbreviation_home = 'GSW' AND season_id = '22017' UNION ALL SELECT fg3a_away AS fg3a FROM game WHERE team_abbreviation_away = 'GSW' AND season_id = '22017' ); 2369.0 True
89
+ What is the highest number of three-pointers made in a single game by the Houston Rockets at home? SELECT MAX(fg3m_home) FROM game WHERE team_name_home = 'Houston Rockets'; 27.0 True
90
+ How many games did the Boston Celtics win on the road during the 2018 season? SELECT COUNT(*) AS away_wins FROM game WHERE team_name_away = 'Boston Celtics' AND wl_away = 'W' AND season_id = '22018'; 21 True
91
+ What is the most three-pointers the Brooklyn Nets have ever made in a home game? SELECT MAX(fg3m_home) FROM game WHERE team_name_home = 'Brooklyn Nets'; 22.0 True
92
+ How many total offensive rebounds did the Houston Rockets have in away games during the 2018 season? SELECT SUM(oreb_away) FROM game WHERE team_name_away = 'Houston Rockets' AND season_id = '22018'; 419.0 True
93
+ What is the average number of pts in away games by the Miami Heat? SELECT AVG(pts_away) FROM game WHERE team_name_away = 'Miami Heat'; 96.7824377457405 True
94
+ What is the state of the team nicknamed 'Jazz'? SELECT state FROM team WHERE nickname = 'Jazz'; Utah True
95
+ How many points did the Phoenix Suns score in the highest scoring away game they played? SELECT MAX(pts_away) FROM game WHERE team_abbreviation_away = 'PHX'; 161.0 True
96
+ In which season did the Charlotte Hornets have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Charlotte Hornets' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2017.0 True
97
+ Which team had the worst average point differential in the 2007 season? SELECT team_abbreviation, AVG(point_diff) AS avg_point_differential FROM ( SELECT team_abbreviation_home AS team_abbreviation, (pts_home - pts_away) AS point_diff FROM game WHERE season_id = '22007' UNION ALL SELECT team_abbreviation_away, (pts_away - pts_home) FROM game WHERE season_id = '22007' ) GROUP BY team_abbreviation ORDER BY avg_point_differential ASC LIMIT 1; SEA|-8.75609756097561 True
98
+ In which season did the Milwaukee Bucks have the highest average fg_pct at home? SELECT season_id, AVG(fg_pct_home) as avg_stat FROM game WHERE team_name_home = 'Milwaukee Bucks' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 42017.0 True
99
+ In games where the Brooklyn Nets scored more than 50 points in the paint at home, what was their assist-to-field goal made ratio? SELECT SUM(g.ast_home) * 1.0 / SUM(g.fgm_home) AS assist_to_fgm_ratio FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Brooklyn Nets' AND o.pts_paint_home > 50; 0.588761175 True
100
+ How many away games did the Chicago Bulls play in the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_away = 'Chicago Bulls' AND season_id = '22020'; 36.0 True
101
+ What is the average scoring output for home teams? Round to 2 decimal places. SELECT ROUND(AVG(pts_home),2) AS avg_home_points FROM game WHERE season_type = 'Regular Season'; 104.76 True
102
+ In which season did the Golden State Warriors have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Golden State Warriors' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 1974.0 True
103
+ Which team founded in the 70s has a nickname starting with 'C'? SELECT full_name FROM team WHERE year_founded BETWEEN 1970 AND 1979 AND nickname LIKE 'C%'; Cleveland Cavaliers, Los Angeles Clippers True
104
+ What is the highest combined ft_pct in any game involving the Los Angeles Lakers? SELECT MAX(ft_pct_home + ft_pct_away) FROM game WHERE team_name_home = 'Los Angeles Lakers' OR team_name_away = 'Los Angeles Lakers'; 1.957 True
105
+ How many fastbreak points did the Los Angeles Clippers average in home games during the 2020 season? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'LA Clippers' AND g.season_id = '22020'; 11.5 True
106
+ What is the average number of three-pointers made by away teams in games where they had more turnovers than assists? SELECT AVG(fg3m_away) FROM game WHERE tov_away > ast_away; 4.511052937754508 True
107
+ What was the difference in average free throw attempts between the Brooklyn Nets and their opponents in home games during the 2020 season? SELECT AVG(fta_home - fta_away) AS fta_diff FROM game WHERE team_name_home = 'Brooklyn Nets' AND season_id = '22020'; 1.083333333 True
108
+ What is the total points scored by the Philadelphia 76ers away? SELECT SUM(pts_away) as total_points FROM game WHERE team_name_away = 'Philadelphia 76ers'; 251917.0 True
109
+ When was the last time the New York Knicks won a home game? SELECT game_date FROM game WHERE team_abbreviation_home = 'NYK' AND wl_home = 'W' ORDER BY game_date DESC LIMIT 1; 2023-05-10 00:00:00 True
110
+ What was the lowest-scoring game involving the Indiana Pacers in the 1994 season? SELECT MIN(total_points) AS lowest_scoring_game FROM ( SELECT (pts_home + pts_away) AS total_points FROM game WHERE season_id = '21994' AND (team_abbreviation_home = 'IND' OR team_abbreviation_away = 'IND') ); 155.0 True
111
+ How many games did the Sacramento Kings lose at home in 1996? SELECT COUNT(*) as home_losses FROM game WHERE team_name_home = 'Sacramento Kings' AND wl_home = 'L' AND season_id = '21996'; 19.0 True
112
+ What was the total score of the only game in which the home team made exactly 33 field goals? SELECT pts_home + pts_away FROM game WHERE fgm_home = 33 LIMIT 1; 144.0 True
113
+ What was the difference in second-chance points between the Chicago Bulls and their opponents in their closest home game of the 2016 season? SELECT o.pts_2nd_chance_home - o.pts_2nd_chance_away AS second_chance_diff FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Chicago Bulls' AND g.season_id = '22016' ORDER BY ABS(g.pts_home - g.pts_away) ASC LIMIT 1; -5 True
114
+ What is the highest plus-minus score for the Indiana Pacers at home? SELECT MAX(plus_minus_home) as max_plus_minus FROM game WHERE team_name_home = 'Indiana Pacers'; 65.0 True
115
+ What is the total number of three-pointers made by the Golden State Warriors at home versus the Cleveland Cavaliers in all seasons combined? SELECT SUM(fg3m_home) AS total_threes FROM game WHERE team_name_home = 'Golden State Warriors' AND team_name_away = 'Cleveland Cavaliers'; 407 True
116
+ How many points did the away team score in the only game where the home team had exactly 69 field goal attempts? SELECT pts_away FROM game WHERE fga_home = 69 LIMIT 1; 81.0 True
117
+ What is the average number of ast in away games by the Milwaukee Bucks? SELECT AVG(ast_away) FROM game WHERE team_name_away = 'Milwaukee Bucks'; 22.16927374301676 True
118
+ What is the total number of steals recorded by the Miami Heat in games against the Boston Celtics? SELECT SUM(CASE WHEN team_name_home = 'Miami Heat' THEN stl_home ELSE stl_away END) AS total_steals FROM game WHERE (team_name_home = 'Miami Heat' AND team_name_away = 'Boston Celtics') OR (team_name_home = 'Boston Celtics' AND team_name_away = 'Miami Heat'); 1253 True
119
+ Which team had the most games where both teams scored over 110 points? SELECT team FROM (SELECT team_abbreviation_home AS team FROM game WHERE pts_home > 110 AND pts_away > 110 UNION ALL SELECT team_abbreviation_away FROM game WHERE pts_home > 110 AND pts_away > 110) GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; LAL True
120
+ What is the highest number of points the Los Angeles Lakers have scored in a single away game? SELECT MAX(pts_away) FROM game WHERE team_abbreviation_away = 'LAL'; 153.0 True
121
+ What is the total second chance points by the Washington Wizards away? SELECT SUM(pts_2nd_chance_away) as total_2nd_chance FROM other_stats WHERE team_abbreviation_away = 'WAS'; 13226.0 True
122
+ What is the average number of assists per game for the Golden State Warriors when they won during the 2018 season? SELECT AVG(assists) AS avg_assists FROM ( SELECT ast_home AS assists FROM game WHERE team_name_home = 'Golden State Warriors' AND wl_home = 'W' AND season_id = '22018' UNION ALL SELECT ast_away AS assists FROM game WHERE team_name_away = 'Golden State Warriors' AND wl_away = 'W' AND season_id = '22018' ) AS winning_games 31 True
123
+ What was the total number of points in the game where both teams had the exact same number of personal fouls? SELECT pts_home + pts_away FROM game WHERE pf_home = pf_away ORDER BY game_date DESC LIMIT 1; 258.0 True
124
+ How many games did the Boston Celtics win at home during the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Boston Celtics' AND wl_home = 'W' AND season_id = '22020'; 21 True
125
+ Which team had the highest average free throw percentage at home in the 2016 season? SELECT team_name_home, AVG(ft_pct_home) AS avg_ft_percentage FROM game WHERE season_id = '22016' GROUP BY team_name_home ORDER BY avg_ft_percentage DESC LIMIT 1; Boston Celtics | 0.820975609756098 True
126
+ In the 2001 season, what was the average number of second chance points scored by the opponents when the Atlanta Hawks played at home and lost? SELECT AVG(o.pts_2nd_chance_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'ATL' AND g.wl_home = 'L' AND g.season_id = '22001'; 13.333333333333334 True
127
+ Which team had the highest average points from second chance opportunities in home games they won during the 2016 season? SELECT g.team_name_home, AVG(o.pts_2nd_chance_home) AS avg_second_chance_pts FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.wl_home = 'W' AND g.season_id = '22016' GROUP BY g.team_name_home ORDER BY avg_second_chance_pts DESC LIMIT 1; Los Angeles Lakers | 15.6153846153846 True
128
+ What is the highest number of points the Golden State Warriors have ever scored in a single home game? SELECT MAX(pts_home) FROM game WHERE team_abbreviation_home = 'GSW'; 149.0 True
129
+ What is the average number of ft_pct in home games by the Los Angeles Lakers? SELECT AVG(ft_pct_home) FROM game WHERE team_name_home = 'Los Angeles Lakers'; 0.7450706106870195 True
130
+ How many team turnovers did the New York Knicks have at home? SELECT SUM(team_turnovers_home) as total_team_turnovers FROM other_stats WHERE team_abbreviation_home = 'NYK'; 550.0 True
131
+ How many three-pointers did the Golden State Warriors make in total during the 2016 season? SELECT SUM(fg3m_home + fg3m_away) AS total_three_pointers FROM game WHERE season_id = '22016' AND (team_name_home = 'Golden State Warriors' OR team_name_away = 'Golden State Warriors'); 1719.0 True
132
+ What is the total rebounds by the Miami Heat at home? SELECT SUM(reb_home) as total_rebounds FROM game WHERE team_name_home = 'Miami Heat'; 65199.0 True
133
+ What is the average number of fg_pct in away games by the Los Angeles Lakers? SELECT AVG(fg_pct_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 0.4678996728462382 True
134
+ How many points did the home team score in the game with the most second chance points? SELECT pts_home FROM game WHERE game_id = (SELECT game_id FROM other_stats ORDER BY (pts_2nd_chance_home + pts_2nd_chance_away) DESC LIMIT 1); 115.0 True
135
+ What was the total number of points in the only game where the sum of both teams' free throws made was exactly 42? SELECT pts_home + pts_away FROM game WHERE (ftm_home + ftm_away) = 42 LIMIT 1; 156.0 True
136
+ What is the average number of ft_pct in home games by the Charlotte Hornets? SELECT AVG(ft_pct_home) FROM game WHERE team_name_home = 'Charlotte Hornets'; 0.7601475237091683 True
137
+ Which team is based in the city of Chicago? SELECT full_name FROM team WHERE city = 'Chicago'; Chicago Bulls True
138
+ What is the Chicago Bulls' largest lead in a home game during the 2016 season? SELECT MAX(plus_minus_home) FROM game WHERE team_abbreviation_home = 'CHI' AND season_id = '22016'; 47 True
139
+ Which players scored 50 or more points in a game during the 1990s? SELECT game_id, game_date, CASE WHEN pts_home >= 50 THEN team_name_home ELSE team_name_away END AS team_name, CASE WHEN pts_home >= 50 THEN pts_home ELSE pts_away END AS points FROM game WHERE (pts_home >= 50 OR pts_away >= 50) AND CAST(SUBSTR(season_id, 2) AS INTEGER) BETWEEN 1990 AND 1999 ORDER BY points DESC True
140
+ How many home games did the Los Angeles Lakers play in the 2022 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Los Angeles Lakers' AND season_id = '22022'; 41.0 True
141
+ What is the total points in the paint by the Milwaukee Bucks away? SELECT SUM(pts_paint_away) as total_pts_paint FROM other_stats WHERE team_abbreviation_away = 'MIL'; 39056.0 True
142
+ What is the largest margin of victory in a game, whether home or away? SELECT game_date, ABS(pts_home - pts_away) AS margin FROM game ORDER BY margin DESC LIMIT 1; 2021-12-02 00:00:00|73.0 True
143
+ What is the average number of pts in away games by the Portland Trail Blazers? SELECT AVG(pts_away) FROM game WHERE team_name_away = 'Portland Trail Blazers'; 102.6668215613383 True
144
+ What is the highest number of rebounds recorded by a home team in a game during the 2005 season? SELECT MAX(reb_home) FROM game WHERE season_id = '22005'; 65.0 True
145
+ What is the highest combined ast in any game involving the Boston Celtics? SELECT MAX(ast_home + ast_away) FROM game WHERE team_name_home = 'Boston Celtics' OR team_name_away = 'Boston Celtics'; 79.0 True
146
+ How many times were games tied when the Indiana Pacers played away? SELECT SUM(times_tied) as total_times_tied FROM other_stats WHERE team_abbreviation_away = 'IND'; 4910.0 True
147
+ How many points did the away team score when the home team had more than 20 offensive rebounds? SELECT SUM(pts_away) FROM game WHERE game_id IN (SELECT game_id FROM game WHERE oreb_home > 20); 199836.0 True
148
+ What is the highest combined score in a game between the Golden State Warriors and the Cleveland Cavaliers? SELECT MAX(pts_home + pts_away) FROM game WHERE (team_name_home = 'Golden State Warriors' AND team_name_away = 'Cleveland Cavaliers') OR (team_name_home = 'Cleveland Cavaliers' AND team_name_away = 'Golden State Warriors'); 266.0 True
149
+ Which game had the highest total points scored by both teams when the Los Angeles Lakers played at home? SELECT game_id, (pts_home + pts_away) AS total_points FROM game WHERE team_abbreviation_home = 'LAL' ORDER BY total_points DESC LIMIT 1; (0028000933, 294.0) True
150
+ How many games did the Sacramento Kings lose away with more than 15 fast break points in 1996? SELECT COUNT(*) as losses FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Sacramento Kings' AND g.wl_away = 'L' AND os.pts_fb_away > 15 AND g.season_id = '21996'; 10.0 True
151
+ What is the lowest number of points the Golden State Warriors have scored in an away game? SELECT MIN(pts_away) FROM game WHERE team_abbreviation_away = 'GSW'; 65.0 True
training-data/tennis_train_set_connor.tsv ADDED
@@ -0,0 +1,208 @@
1
+ natural_query sql_query result
2
+ List all players with their name and date of birth. SELECT name, dob FROM players;
3
+ What is the height of player 'Roger Federer'? SELECT height FROM players WHERE name = 'Roger Federer';
4
+ Get the names and countries of all players taller than 190 cm. SELECT name, ioc FROM players WHERE height > 190;
5
+ Find the average height of all players from USA. SELECT AVG(height) FROM players WHERE ioc = 'USA';
6
+ List matches where winner_age is less than 21. SELECT tourney_name, winner_name FROM matches WHERE winner_age < 21;
7
+ What was the final score in the match where 'Rafael Nadal' was the winner? SELECT score FROM matches WHERE winner_name = 'Rafael Nadal';
8
+ Which matches lasted more than 180 minutes? SELECT tourney_id, tourney_name, minutes FROM matches WHERE minutes > 180;
9
+ List all players whose height is between 180 and 200 cm. SELECT name FROM players WHERE height BETWEEN 180 AND 200;
10
+ Find the winner and loser names for matches held in 'Paris'. SELECT winner_name, loser_name FROM matches WHERE tourney_name = 'Paris';
11
+ Show players whose birth date is after 20000101. SELECT name, dob FROM players WHERE dob > 20000101;
12
+ Who was the youngest match winner in 'Roland Garros'? SELECT winner_name FROM matches WHERE tourney_name = 'Roland Garros' ORDER BY winner_age ASC LIMIT 1;
13
+ Display matches where the score included a tiebreak. SELECT * FROM matches WHERE score LIKE '%7-6%';
14
+ Get matches with the 'best_of' field equal to 5. SELECT * FROM matches WHERE best_of = '5';
15
+ Count matches won by left-handed players. SELECT COUNT(*) FROM matches WHERE winner_hand = 'L';
16
+ Show all match results where 'Maria Sharapova' was the loser. SELECT * FROM matches WHERE loser_name = 'Maria Sharapova';
17
+ List tournaments and winners for matches longer than 200 minutes. SELECT tourney_name, winner_name FROM matches WHERE minutes > 200;
18
+ Show ranking history for player 'Stan Wawrinka'. SELECT ranking_date, rank, Points FROM rankings JOIN players ON player = player_id WHERE name = 'Stan Wawrinka';
19
+ List all doubles match winners in 'Wimbledon'. SELECT winner1_name, winner2_name FROM matches WHERE tourney_name = 'Wimbledon';
20
+ Show matches where winner's age is more than 34. SELECT tourney_id, winner_name FROM matches WHERE winner_age > 34;
21
+ How many matches did 'Pete Sampras' win in 1990? SELECT COUNT(*) FROM matches WHERE winner_name = 'Pete Sampras' AND tourney_date BETWEEN 19900101 AND 19901231;
22
+ Which matches were completed in under one hour? SELECT tourney_id FROM matches WHERE minutes < 60;
23
+ What are the details of the doubles winners for tournament 'Indian Wells'? SELECT winner1_name, winner2_name, winner1_ioc, winner2_ioc FROM matches WHERE tourney_name = 'Indian Wells';
24
+ What are the player names and countries for players over 210 cm tall? SELECT name, ioc FROM players WHERE height > 210;
25
+ List matches where either the winner or loser is from 'JPN'. SELECT tourney_name, winner_name, loser_name FROM matches WHERE winner_ioc = 'JPN' OR loser_ioc = 'JPN';
26
+ Give all tournaments and dates where Rafael Nadal defeated Novak Djokovic. SELECT tourney_name, tourney_date FROM matches WHERE winner_name='Rafael Nadal' AND loser_name='Novak Djokovic';
27
+ Which players lost all their matches in 2021? SELECT name FROM players WHERE player_id IN (SELECT loser_id FROM matches WHERE tourney_date BETWEEN 20210101 AND 20211231) AND player_id NOT IN (SELECT winner_id FROM matches WHERE tourney_date BETWEEN 20210101 AND 20211231);
28
+ How many tournaments were held in 2022? SELECT COUNT(DISTINCT tourney_name) FROM matches WHERE tourney_date BETWEEN 20220101 AND 20221231;
29
+ Find the 5 players with the highest average match duration as winners. SELECT winner_name, AVG(minutes) as avg_time FROM matches GROUP BY winner_name ORDER BY avg_time DESC LIMIT 5;
30
+ Show matches where both winner and loser share the same first letter of their name. SELECT tourney_name, winner_name, loser_name FROM matches WHERE SUBSTR(winner_name,1,1)=SUBSTR(loser_name,1,1);
31
+ List all matches where the loser scored a 'bagel' set (lost a set 0-6). SELECT * FROM matches WHERE score LIKE '%0-6%' OR score LIKE '%6-0%';
32
+ Find tournaments never won by players under 25. SELECT DISTINCT tourney_name FROM matches WHERE tourney_name NOT IN (SELECT tourney_name FROM matches WHERE winner_age < 25);
33
+ List matches with more than five sets played. SELECT * FROM matches WHERE LENGTH(score) - LENGTH(REPLACE(score, ' ', '')) + 1 > 5;
34
+ Which players made their debut in 2005? SELECT name FROM players WHERE dob > 19870101 AND player_id IN (SELECT winner_id FROM matches WHERE tourney_date BETWEEN 20050101 AND 20051231 UNION SELECT loser_id FROM matches WHERE tourney_date BETWEEN 20050101 AND 20051231);
35
+ How many matches ended with a tiebreak in the final set? SELECT COUNT(*) FROM matches WHERE score LIKE '%7-6' OR score LIKE '%7-6(%)';
36
+ List tournaments with the most five-set matches. SELECT tourney_name, COUNT(*) as five_set FROM matches WHERE LENGTH(score) - LENGTH(REPLACE(score,' ',''))+1 = 5 GROUP BY tourney_name ORDER BY five_set DESC LIMIT 1;
37
+ Which player had the most consecutive match wins in 2021? SELECT winner_name, MAX(streak) FROM (SELECT winner_name, COUNT(*) as streak FROM matches WHERE tourney_date BETWEEN 20210101 AND 20211231 GROUP BY winner_name, tourney_date ORDER BY tourney_date) GROUP BY winner_name ORDER BY streak DESC LIMIT 1;
38
+ List tournaments played by exactly two players from France. SELECT tourney_name FROM (SELECT tourney_name, COUNT(DISTINCT winner_name) as f_wins, COUNT(DISTINCT loser_name) as f_losses FROM matches WHERE winner_ioc='FRA' OR loser_ioc='FRA' GROUP BY tourney_name HAVING f_wins + f_losses = 2);
39
+ Show all winners whose opponent was ranked in the top 10 at the match date. SELECT matches.winner_name FROM matches JOIN rankings ON matches.tourney_date=rankings.ranking_date AND matches.loser_id=rankings.player WHERE rank <= 10;
40
+ How many distinct players won at least once in 2020? SELECT COUNT(DISTINCT winner_name) FROM matches WHERE tourney_date BETWEEN 20200101 AND 20201231;
41
+ List all players who never lost to 'Rafael Nadal'. SELECT name FROM players WHERE name NOT IN (SELECT loser_name FROM matches WHERE winner_name='Rafael Nadal');
42
+ Which player had the highest number of different doubles partners? SELECT name, COUNT(DISTINCT partner) as partners FROM (SELECT winner1_name as name, winner2_name as partner FROM matches WHERE winner1_id IS NOT NULL UNION ALL SELECT winner2_name as name, winner1_name as partner FROM matches WHERE winner2_id IS NOT NULL) GROUP BY name ORDER BY partners DESC LIMIT 1;
43
+ List tournaments with more than 10 different nationalities among winners. SELECT tourney_name FROM (SELECT tourney_name, COUNT(DISTINCT winner_ioc) as countries FROM matches GROUP BY tourney_name) WHERE countries > 10;
44
+ List all matches where the loser had a higher average match time than the winner in 2021. SELECT m1.tourney_name, m1.winner_name, m1.loser_name FROM matches m1 JOIN (SELECT loser_name, AVG(minutes) as avg_time FROM matches WHERE tourney_date BETWEEN 20210101 AND 20211231 GROUP BY loser_name) l ON m1.loser_name=l.loser_name JOIN (SELECT winner_name, AVG(minutes) as avg_time FROM matches WHERE tourney_date BETWEEN 20210101 AND 20211231 GROUP BY winner_name) w ON m1.winner_name=w.winner_name WHERE l.avg_time > w.avg_time;
45
+ Which player won matches in every month of 2022? SELECT winner_name FROM (SELECT winner_name, COUNT(DISTINCT CAST(tourney_date AS INTEGER) / 100 % 100) as months FROM matches WHERE tourney_date BETWEEN 20220101 AND 20221231 GROUP BY winner_name) WHERE months=12;
46
+ List players who have won doubles and singles matches in the same tournament. SELECT DISTINCT winner_name FROM matches WHERE tourney_name IN (SELECT tourney_name FROM matches WHERE winner1_id IS NOT NULL) AND tourney_name IN (SELECT tourney_name FROM matches WHERE winner1_id IS NULL);
47
+ How many left-handed players ranked in the top 30 in 2024? SELECT COUNT(DISTINCT players.player_id) FROM rankings JOIN players ON rankings.player=players.player_id WHERE ranking_date BETWEEN 20240101 AND 20241231 AND rank<=30 AND hand='L';
48
+ List all player names. SELECT name FROM players;
49
+ Show all tournaments. SELECT DISTINCT tourney_name FROM matches;
50
+ List all countries represented. SELECT DISTINCT ioc FROM players;
51
+ Show all right-handed players. SELECT name FROM players WHERE hand = 'R';
52
+ How many matches did Rafael Nadal win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Rafael Nadal';
53
+ List all losses by Novak Djokovic. SELECT tourney_name, winner_name FROM matches WHERE loser_name = 'Novak Djokovic';
54
+ What is the tallest player height? SELECT MAX(height) FROM players;
55
+ Show all players from USA. SELECT name FROM players WHERE ioc = 'USA';
56
+ List all players from France. SELECT name FROM players WHERE ioc = 'FRA';
57
+ How many matches were played in 2022? SELECT COUNT(*) FROM matches WHERE tourney_date BETWEEN 20220101 AND 20221231;
58
+ Show all matches where Roger Federer won. SELECT tourney_name FROM matches WHERE winner_name = 'Roger Federer';
59
+ How many singles matches are there? SELECT COUNT(*) FROM matches WHERE winner1_id IS NULL;
60
+ List all tournaments in 2023. SELECT DISTINCT tourney_name FROM matches WHERE tourney_date BETWEEN 20230101 AND 20231231;
61
+ Show the tallest 5 players. SELECT name, height FROM players ORDER BY height DESC LIMIT 5;
62
+ Show the shortest 5 players. SELECT name, height FROM players ORDER BY height ASC LIMIT 5;
63
+ How many players are from Germany? SELECT COUNT(*) FROM players WHERE ioc = 'GER';
64
+ List all player names and their countries. SELECT name, ioc FROM players;
65
+ Show players taller than 190 cm. SELECT name FROM players WHERE height > 190;
66
+ How many unique losers are there? SELECT COUNT(DISTINCT loser_name) FROM matches;
67
+ What is the shortest match ever? SELECT MIN(minutes) FROM matches;
68
+ What is the average match duration? SELECT AVG(minutes) FROM matches;
69
+ List all matches longer than 3 hours. SELECT tourney_name, minutes FROM matches WHERE minutes > 180;
70
+ List all matches shorter than 30 minutes. SELECT tourney_name, minutes FROM matches WHERE minutes < 30;
71
+ Show all tournaments with exactly 10 matches. SELECT tourney_name FROM matches GROUP BY tourney_name HAVING COUNT(*) = 10;
72
+ What country has the most players? SELECT ioc, COUNT(*) FROM players GROUP BY ioc ORDER BY COUNT(*) DESC LIMIT 1;
73
+ How many players are there from each country? SELECT ioc, COUNT(*) FROM players GROUP BY ioc;
74
+ List all matches in the Australian Open. SELECT * FROM matches WHERE tourney_name LIKE '%Australian%';
75
+ List all matches in Wimbledon. SELECT * FROM matches WHERE tourney_name LIKE '%Wimbledon%';
76
+ How many matches were won by left-handed players? SELECT COUNT(*) FROM matches WHERE winner_hand = 'L';
77
+ How many matches were won by right-handed players? SELECT COUNT(*) FROM matches WHERE winner_hand = 'R';
78
+ Show all matches where the winner was older than 30. SELECT tourney_name, winner_name, winner_age FROM matches WHERE winner_age > 30;
79
+ Show all matches where the winner was younger than 20. SELECT tourney_name, winner_name, winner_age FROM matches WHERE winner_age < 20;
80
+ How many players are over 190 cm tall? SELECT COUNT(*) FROM players WHERE height > 190;
81
+ How many players are under 170 cm tall? SELECT COUNT(*) FROM players WHERE height < 170;
82
+ What is the score of the longest match? SELECT score FROM matches ORDER BY minutes DESC LIMIT 1;
83
+ List 10 random matches. SELECT * FROM matches ORDER BY RANDOM() LIMIT 10;
84
+ Show the first 5 players alphabetically. SELECT name FROM players ORDER BY name ASC LIMIT 5;
85
+ Show all matches in 2021. SELECT * FROM matches WHERE tourney_date BETWEEN 20210101 AND 20211231;
86
+ List all Italian players. SELECT name FROM players WHERE ioc = 'ITA';
87
+ Show all matches from 2020. SELECT * FROM matches WHERE tourney_date BETWEEN 20200101 AND 20201231;
88
+ What is the average age of all winners? SELECT AVG(winner_age) FROM matches;
89
+ Show all players who are 180 cm tall. SELECT name FROM players WHERE height = 180;
90
+ List all British players. SELECT name FROM players WHERE ioc = 'GBR';
91
+ List all Canadian players. SELECT name FROM players WHERE ioc = 'CAN';
92
+ How many matches were won by players aged 25? SELECT COUNT(*) FROM matches WHERE winner_age = 25;
93
+ Show all matches from January 2023. SELECT * FROM matches WHERE tourney_date BETWEEN 20230101 AND 20230131;
94
+ What is the most common winner score? SELECT score, COUNT(*) FROM matches GROUP BY score ORDER BY COUNT(*) DESC LIMIT 1;
95
+ How many singles matches were in 2022? SELECT COUNT(*) FROM matches WHERE winner1_id IS NULL AND tourney_date BETWEEN 20220101 AND 20221231;
96
+ List all matches with a score of '6-4 6-3'. SELECT * FROM matches WHERE score = '6-4 6-3';
97
+ How many players are from Switzerland? SELECT COUNT(*) FROM players WHERE ioc = 'SUI';
98
+ How many matches lasted exactly 100 minutes? SELECT COUNT(*) FROM matches WHERE minutes = 100;
99
+ List all matches over 200 minutes. SELECT tourney_name FROM matches WHERE minutes > 200;
100
+ Show all Australian players. SELECT name FROM players WHERE ioc = 'AUS';
101
+ What is the oldest player in the database? SELECT name FROM players ORDER BY dob ASC LIMIT 1;
102
+ What is the youngest player in the database? SELECT name FROM players ORDER BY dob DESC LIMIT 1;
103
+ List all Czech players. SELECT name FROM players WHERE ioc = 'CZE';
104
+ How many matches were in the year 2019? SELECT COUNT(*) FROM matches WHERE tourney_date BETWEEN 20190101 AND 20191231;
105
+ Show all matches won by players from Sweden. SELECT tourney_name FROM matches WHERE winner_ioc = 'SWE';
106
+ How many matches were won by players over 200 cm? SELECT COUNT(*) FROM matches WHERE winner_ht > 200;
107
+ List all matches lost by players under 170 cm. SELECT tourney_name FROM matches WHERE loser_ht < 170;
108
+ How many players have a height of 180 cm or more? SELECT COUNT(*) FROM players WHERE height >= 180;
109
+ How many players have a height less than 180 cm? SELECT COUNT(*) FROM players WHERE height < 180;
110
+ Show all matches between 100 and 150 minutes. SELECT tourney_name FROM matches WHERE minutes BETWEEN 100 AND 150;
111
+ List all rankings from 2023. SELECT * FROM rankings WHERE ranking_date BETWEEN 20230101 AND 20231231;
112
+ What is the highest ranking ever achieved? SELECT MIN(rank) FROM rankings;
113
+ What is the lowest ranking in the database? SELECT MAX(rank) FROM rankings;
114
+ How many players are ranked number 1? SELECT COUNT(DISTINCT player) FROM rankings WHERE rank = 1;
115
+ Show all top 5 ranked players. SELECT DISTINCT player FROM rankings WHERE rank <= 5;
116
+ List all Belgian players. SELECT name FROM players WHERE ioc = 'BEL';
117
+ How many matches involved a player from Japan? SELECT COUNT(*) FROM matches WHERE winner_ioc = 'JPN' OR loser_ioc = 'JPN';
118
+ Show all players who are 175 cm tall. SELECT name FROM players WHERE height = 175;
119
+ List all matches with a best_of value of 3. SELECT * FROM matches WHERE best_of = '3';
120
+ How many matches did Novak Djokovic win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Novak Djokovic';
121
+ How many matches did Roger Federer lose? SELECT COUNT(*) FROM matches WHERE loser_name = 'Roger Federer';
122
+ How many matches did Stan Wawrinka win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Stan Wawrinka';
123
+ List all matches where Rafael Nadal lost. SELECT tourney_name FROM matches WHERE loser_name = 'Rafael Nadal';
124
+ How many matches did Dominic Thiem win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Dominic Thiem';
125
+ Show all matches lost by Matteo Berrettini. SELECT tourney_name FROM matches WHERE loser_name = 'Matteo Berrettini';
126
+ How many matches did Daniil Medvedev win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Daniil Medvedev';
127
+ List all matches won by Alexander Zverev. SELECT tourney_name FROM matches WHERE winner_name = 'Alexander Zverev';
128
+ How many matches did Stefanos Tsitsipas lose? SELECT COUNT(*) FROM matches WHERE loser_name = 'Stefanos Tsitsipas';
129
+ Show all matches won by Gael Monfils. SELECT tourney_name FROM matches WHERE winner_name = 'Gael Monfils';
130
+ How many matches did David Ferrer win? SELECT COUNT(*) FROM matches WHERE winner_name = 'David Ferrer';
131
+ List all matches where Juan Martin del Potro lost. SELECT tourney_name FROM matches WHERE loser_name = 'Juan Martin del Potro';
132
+ How many matches did Tommy Paul win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Tommy Paul';
133
+ Show all matches lost by Taylor Fritz. SELECT tourney_name FROM matches WHERE loser_name = 'Taylor Fritz';
134
+ How many matches did Felix Auger-Aliassime win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Felix Auger-Aliassime';
135
+ List all matches won by Cameron Norrie. SELECT tourney_name FROM matches WHERE winner_name = 'Cameron Norrie';
136
+ How many matches did Jannik Sinner lose? SELECT COUNT(*) FROM matches WHERE loser_name = 'Jannik Sinner';
137
+ Show all matches won by Carlos Alcaraz. SELECT tourney_name FROM matches WHERE winner_name = 'Carlos Alcaraz';
138
+ How many matches did Matteo Berrettini win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Matteo Berrettini';
139
+ List all matches where Andy Murray lost. SELECT tourney_name FROM matches WHERE loser_name = 'Andy Murray';
140
+ How many matches did Grigor Dimitrov win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Grigor Dimitrov';
141
+ Show all matches lost by Milos Raonic. SELECT tourney_name FROM matches WHERE loser_name = 'Milos Raonic';
142
+ How many matches did Nick Kyrgios win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Nick Kyrgios';
143
+ List all matches won by Andrey Rublev. SELECT tourney_name FROM matches WHERE winner_name = 'Andrey Rublev';
144
+ How many matches did Diego Schwartzman lose? SELECT COUNT(*) FROM matches WHERE loser_name = 'Diego Schwartzman';
145
+ Show all matches won by Gaston Gaudio. SELECT tourney_name FROM matches WHERE winner_name = 'Gaston Gaudio';
146
+ How many matches did Tommy Haas win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Tommy Haas';
147
+ List all matches where Richard Gasquet lost. SELECT tourney_name FROM matches WHERE loser_name = 'Richard Gasquet';
148
+ How many matches did Kei Nishikori win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Kei Nishikori';
149
+ Show all matches lost by Philipp Petzschner. SELECT tourney_name FROM matches WHERE loser_name = 'Philipp Petzschner';
150
+ How many matches did Mardy Fish win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Mardy Fish';
151
+ List all matches won by Robby Ginepri. SELECT tourney_name FROM matches WHERE winner_name = 'Robby Ginepri';
152
+ How many matches did Lleyton Hewitt lose? SELECT COUNT(*) FROM matches WHERE loser_name = 'Lleyton Hewitt';
153
+ Show all matches won by Marin Cilic. SELECT tourney_name FROM matches WHERE winner_name = 'Marin Cilic';
154
+ How many matches did Igor Andreev win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Igor Andreev';
155
+ List all matches where Gilles Simon lost. SELECT tourney_name FROM matches WHERE loser_name = 'Gilles Simon';
156
+ How many matches did Fernando Verdasco win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Fernando Verdasco';
157
+ Show all matches lost by Tomas Berdych. SELECT tourney_name FROM matches WHERE loser_name = 'Tomas Berdych';
158
+ How many matches did Tommy Robredo win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Tommy Robredo';
159
+ List all matches won by Albert Montanes. SELECT tourney_name FROM matches WHERE winner_name = 'Albert Montanes';
160
+ How many matches did Adrian Mannarino lose? SELECT COUNT(*) FROM matches WHERE loser_name = 'Adrian Mannarino';
161
+ Show all matches won by John McEnroe. SELECT tourney_name FROM matches WHERE winner_name = 'John McEnroe';
162
+ How many matches did Pete Sampras win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Pete Sampras';
163
+ List all matches where Andre Agassi lost. SELECT tourney_name FROM matches WHERE loser_name = 'Andre Agassi';
164
+ How many matches did Bjorn Borg win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Bjorn Borg';
165
+ Show all matches lost by Jimmy Connors. SELECT tourney_name FROM matches WHERE loser_name = 'Jimmy Connors';
166
+ How many matches did Arthur Ashe win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Arthur Ashe';
167
+ How many matches did Novak Djokovic and Rafael Nadal play against each other? SELECT COUNT(*) FROM matches WHERE (winner_name = 'Novak Djokovic' AND loser_name = 'Rafael Nadal') OR (winner_name = 'Rafael Nadal' AND loser_name = 'Novak Djokovic');
168
+ Show all matches won by players from France in 2022. SELECT tourney_name FROM matches WHERE winner_ioc = 'FRA' AND tourney_date BETWEEN 20220101 AND 20221231;
169
+ How many matches ended in exactly 120 minutes? SELECT COUNT(*) FROM matches WHERE minutes = 120;
170
+ List all tournaments where Roger Federer played. SELECT DISTINCT tourney_name FROM matches WHERE winner_name = 'Roger Federer' OR loser_name = 'Roger Federer';
171
+ How many matches were played between left-handed and right-handed players? SELECT COUNT(*) FROM matches WHERE winner_hand <> loser_hand;
172
+ List all tournaments in alphabetical order. SELECT DISTINCT tourney_name FROM matches ORDER BY tourney_name ASC;
173
+ List all left-handed players. SELECT name FROM players WHERE hand = 'L';
174
+ Show all matches where a left-handed player won. SELECT tourney_name, winner_name FROM matches WHERE winner_hand = 'L';
175
+ Show all matches where a right-handed player won. SELECT tourney_name, winner_name FROM matches WHERE winner_hand = 'R';
176
+ How many matches were lost by left-handed players? SELECT COUNT(*) FROM matches WHERE loser_hand = 'L';
177
+ How many matches were lost by right-handed players? SELECT COUNT(*) FROM matches WHERE loser_hand = 'R';
178
+ List all matches where a left-handed player lost. SELECT tourney_name, loser_name FROM matches WHERE loser_hand = 'L';
179
+ List all matches where a right-handed player lost. SELECT tourney_name, loser_name FROM matches WHERE loser_hand = 'R';
180
+ How many matches were between two left-handed players? SELECT COUNT(*) FROM matches WHERE winner_hand = 'L' AND loser_hand = 'L';
181
+ How many matches were between two right-handed players? SELECT COUNT(*) FROM matches WHERE winner_hand = 'R' AND loser_hand = 'R';
182
+ How many matches were between a left-handed and right-handed player? SELECT COUNT(*) FROM matches WHERE (winner_hand = 'L' AND loser_hand = 'R') OR (winner_hand = 'R' AND loser_hand = 'L');
183
+ Show all left-handed players from USA. SELECT name FROM players WHERE hand = 'L' AND ioc = 'USA';
184
+ Show all right-handed players from Spain. SELECT name FROM players WHERE hand = 'R' AND ioc = 'ESP';
185
+ How many left-handed players are from France? SELECT COUNT(*) FROM players WHERE hand = 'L' AND ioc = 'FRA';
186
+ How many right-handed players are from Germany? SELECT COUNT(*) FROM players WHERE hand = 'R' AND ioc = 'GER';
187
+ List matches won by left-handed players in 2023. SELECT tourney_name FROM matches WHERE winner_hand = 'L' AND tourney_date BETWEEN 20230101 AND 20231231;
188
+ List matches won by right-handed players in 2022. SELECT tourney_name FROM matches WHERE winner_hand = 'R' AND tourney_date BETWEEN 20220101 AND 20221231;
189
+ Show all left-handed players taller than 190 cm. SELECT name FROM players WHERE hand = 'L' AND height > 190;
190
+ Show all right-handed players shorter than 170 cm. SELECT name FROM players WHERE hand = 'R' AND height < 170;
191
+ How many left-handed players are taller than 185 cm? SELECT COUNT(*) FROM players WHERE hand = 'L' AND height > 185;
192
+ How many right-handed players are shorter than 175 cm? SELECT COUNT(*) FROM players WHERE hand = 'R' AND height < 175;
193
+ What is the average height of left-handed players? SELECT AVG(height) FROM players WHERE hand = 'L';
194
+ What is the average height of right-handed players? SELECT AVG(height) FROM players WHERE hand = 'R';
195
+ List all left-handed players sorted by height. SELECT name, height FROM players WHERE hand = 'L' ORDER BY height DESC;
196
+ List all right-handed players sorted by height. SELECT name, height FROM players WHERE hand = 'R' ORDER BY height ASC;
197
+ Show matches where both players were left-handed. SELECT tourney_name FROM matches WHERE winner_hand = 'L' AND loser_hand = 'L';
198
+ Show matches where both players were right-handed. SELECT tourney_name FROM matches WHERE winner_hand = 'R' AND loser_hand = 'R';
199
+ How many matches in 2021 were won by left-handed players? SELECT COUNT(*) FROM matches WHERE winner_hand = 'L' AND tourney_date BETWEEN 20210101 AND 20211231;
200
+ How many matches in 2021 were won by right-handed players? SELECT COUNT(*) FROM matches WHERE winner_hand = 'R' AND tourney_date BETWEEN 20210101 AND 20211231;
201
+ List left-handed players from Australia. SELECT name FROM players WHERE hand = 'L' AND ioc = 'AUS';
202
+ List right-handed players from Switzerland. SELECT name FROM players WHERE hand = 'R' AND ioc = 'SUI';
203
+ How many left-handed players from Canada are in the database? SELECT COUNT(*) FROM players WHERE hand = 'L' AND ioc = 'CAN';
204
+ How many right-handed players from Italy are in the database? SELECT COUNT(*) FROM players WHERE hand = 'R' AND ioc = 'ITA';
205
+ Show the tallest left-handed player. SELECT name, height FROM players WHERE hand = 'L' ORDER BY height DESC LIMIT 1;
206
+ Show the shortest right-handed player. SELECT name, height FROM players WHERE hand = 'R' ORDER BY height ASC LIMIT 1;
207
+ How many left-handed players won more than 10 matches? SELECT COUNT(*) FROM (SELECT winner_name FROM matches WHERE winner_hand = 'L' GROUP BY winner_name HAVING COUNT(*) > 10);
208
+ How many right-handed players lost more than 10 matches? SELECT COUNT(*) FROM (SELECT loser_name FROM matches WHERE loser_hand = 'R' GROUP BY loser_name HAVING COUNT(*) > 10);
training-data/{tennis_train_set.tsv → tennis_train_set_dean.tsv} RENAMED
File without changes
training-data/tennis_train_set_mehul.tsv ADDED
@@ -0,0 +1,204 @@
1
+ natural_query sql_query result tennis
2
+ How many players are in the database? SELECT COUNT(*) FROM players; 65019 tennis
3
+ What is the average height of all players? SELECT AVG(height) FROM players WHERE height IS NOT NULL; 183.74813763746 tennis
4
+ Who is the tallest player? SELECT name, height FROM players ORDER BY height DESC LIMIT 1; Reilly Opelka|211.0 tennis
5
+ How many right-handed players are there? SELECT COUNT(*) FROM players WHERE hand = 'R'; 15666 tennis
6
+ What is the shortest player's height? SELECT MIN(height) FROM players WHERE height IS NOT NULL; 145.0 tennis
7
+ How many different countries are represented? SELECT COUNT(DISTINCT ioc) FROM players; 226 tennis
8
+ Show all players taller than 200cm SELECT name, height FROM players WHERE height > 200; Victor Amaya|201.0 Michiel Schapers|201.0 Greg Neuhart|206.0 Milan Srejber|203.0 Marc Rosset|201.0 Dick Norman|203.0 Andrew Richardson|201.0 Brian Dunn|201.0 Alexander Popp|201.0 Dario Pizzato|201.0 Ivo Karlovic|208.0 Nikola Ciric|201.0 Marcelo Melo|203.0 John Isner|206.0 Chris Guccione|201.0 Philipp Oswald|201.0 Kevin Anderson|203.0 Kenny De Schepper|203.0 Alexander Bury|203.0 Thomas Schoorel|203.0 Jerzy Janowicz|203.0 Albano Olivetti|203.0 Danilo Petrovic|203.0 Christopher Eubanks|201.0 Reilly Opelka|211.0 Louis Wessels|201.0 tennis
9
+ What is the average height of left-handed players? SELECT AVG(height) FROM players WHERE hand = 'L' AND height IS NOT NULL; 183.606060606061 tennis
10
+ How many matches are recorded in total? SELECT COUNT(*) FROM matches; 947720 tennis
11
+ What was the longest match by duration? SELECT tourney_name, winner_name, loser_name, minutes FROM matches WHERE minutes IS NOT NULL ORDER BY minutes DESC LIMIT 1; Guayaquil CH|Federico Coria|Tomas Lipovsek Puches|4756.0 tennis
12
+ How many matches were played in best of 5 format? SELECT COUNT(*) FROM matches WHERE best_of = '5'; 67502 tennis
13
+ Show me matches where the winner was under 15 years old SELECT tourney_name, winner_name, winner_age FROM matches WHERE winner_age < 15; Davis Cup G2 R2: EGY vs DEN|Holger Rune|14.9 Fergana CH|Aziz Dadabaev|14.7 Fergana CH|Olimjon Nabiev|14.9 Iran F2|Nicolas Merzetti|14.3 Guam F1|Kody Pearson|14.8 Hilton Head CH|Ali Madani|14.5 Hilton Head CH|Ali Madani|14.5 Spain 8 Masters 3|Feliciano Lopez|14.9 Brisbane|David Carter|14.5 Pakistan 1 1|Cheong Eui Kim|14.3 Pakistan 1 2|Cheong Eui Kim|14.3 Pakistan 1 4|Cheong Eui Kim|14.3 Pakistan 1 4|Cheong Eui Kim|14.3 Pakistan 1 4|Cheong Eui Kim|14.3 Croatia 2 Masters 1|Roko Karanusic|14.6 Croatia 2 Masters 4|Roko Karanusic|14.6 Israel 2 Masters 4|Mosche Levy|14.6 Pakistan F1|Shahzeb Niazi|14.7 Davis Cup G2 PO: JOR vs SIN|Laith Azzouni|14.3 Itu-Sao Paulo CH|Marcelo Saliola|14.3 Brazil F9|Nelio Mattos|14.0 Brazil F13|Nelio Mattos|14.1 USA F21|Stefan Kozlov|14.4 USA F21|Stefan Kozlov|14.4 France F14|Rayane Roumane|14.8 France F14|Rayane Roumane|14.8 France F14|Rayane Roumane|14.8 France F15|Rayane Roumane|14.8 Switzerland F4|Rayane Roumane|14.9 France F17|Rayane Roumane|14.9 Spain F28|Nicolas Alvarez Varona|14.2 Croatia F2|Mario Ancic|14.8 Croatia F2|Mario Ancic|14.8 M15 Villena|Darwin Blanch|14.3 Latvia F1|Ivan Puchkarov|14.0 Russia F2|Ivan Puchkarov|14.0 Ukraine F2|Ivan Puchkarov|14.1 Ukraine F2|Ivan Puchkarov|14.1 Spain 6 Masters 1|Alberto Martin|14.9 Spain 6 Masters 2|Alberto Martin|14.9 Spain 6 Masters 2|Alberto Martin|14.9 Spain 6 Masters 4|Alberto Martin|14.9 USA F4|Stefan Kozlov|14.9 Japan F4|Duck Hee Lee|14.8 Guatemala F1|Ryan Alexander Mueller|14.7 Korea F2|Duck Hee Lee|14.9 China F5|Duck Hee Lee|14.9 China F5|Duck Hee Lee|14.9 China F5|Duck Hee Lee|14.9 Turkey F21|Mert Naci Turker|14.7 Malaysia 1 Masters 1|Mitsuru Takada|14.5 Brazil F2|Silas Araujo De Cerqueira|14.9 Colombia F1|Mateo Andres Ruiz Naranjo|14.5 Australia F4|Thanasi Kokkinakis|14.9 Argentina F21|Francisco Bahamonde|14.9 Argentina F21|Francisco 
Bahamonde|14.9 Bangalore CH|Amirben Barua|14.7 Granby CH|Felix Auger Aliassime|14.9 Granby CH|Felix Auger Aliassime|14.9 Czechoslovakia Masters 1|David Skoch|14.8 Ecuador Masters 3|Luis Horna|14.8 Peru Bolivia Masters 1|Luis Horna|14.9 Peru Bolivia Masters 3|Luis Horna|14.9 Peru Bolivia Masters 4|Luis Horna|14.9 Spain F5|Carlos Alcaraz|14.7 Spain F5|Carlos Alcaraz|14.7 Italy F15|Luca Nardi|14.8 Italy F15|Luca Nardi|14.8 Nigeria F5|Mukhtar Andu|14.1 Wimbledon|Clement Haughton Langston Cazalet|14.98151951 Wimbledon|Curt Bergmann|14.13552361 tennis
14
+ What is the average age of match winners? SELECT AVG(winner_age) FROM matches WHERE winner_age IS NOT NULL; 24.0506641635802 tennis
15
+ How many matches did players from the USA win? SELECT COUNT(*) FROM matches WHERE winner_ioc = 'USA'; 99510 tennis
16
+ What is the average height of match winners? SELECT AVG(winner_ht) FROM matches WHERE winner_ht IS NOT NULL; 184.204154531561 tennis
17
+ How many matches lasted more than 180 minutes? SELECT COUNT(*) FROM matches WHERE minutes > 180; 5425 tennis
18
+ What is the oldest match date recorded? SELECT MIN(tourney_date) FROM matches WHERE tourney_date IS NOT NULL AND matches.tourney_date > 18000000; 18770709.0 tennis
19
+ How many different tournaments are in the database? SELECT COUNT(DISTINCT tourney_name) FROM matches; 9082 tennis
20
+ What is the average age difference between winners and losers? SELECT AVG(winner_age - loser_age) FROM matches WHERE winner_age IS NOT NULL AND loser_age IS NOT NULL; 0.35414132842825 tennis
21
+ How many ranking records are there? SELECT COUNT(*) FROM rankings; 3235639 tennis
22
+ What is the highest number of points any player has achieved? SELECT MAX(Points) FROM rankings; 16950.0 tennis
23
+ List all players ranked in the top 10 SELECT r.rank, p.name, r.Points FROM rankings r JOIN players p ON r.player = p.player_id WHERE r.rank <= 10 AND r.ranking_date = (SELECT MAX(ranking_date) FROM rankings); 1|Novak Djokovic|9960.0 2|Jannik Sinner|8770.0 3|Carlos Alcaraz|7300.0 4|Alexander Zverev|6305.0 5|Daniil Medvedev|6295.0 6|Andrey Rublev|4700.0 7|Casper Ruud|4425.0 8|Hubert Hurkacz|3885.0 9|Stefanos Tsitsipas|3700.0 10|Grigor Dimitrov|3555.0 tennis
24
+ What is the average points for players ranked 1-10? SELECT AVG(Points) FROM rankings WHERE rank <= 10; 4127.61760391198 tennis
25
+ What is the minimum points needed to be in top 100? SELECT MIN(Points) FROM rankings WHERE rank <= 100 AND ranking_date = (SELECT MAX(ranking_date) FROM rankings); 623.0 tennis
26
+ List the top 5 countries by number of match wins SELECT winner_ioc, COUNT(*) as wins FROM matches GROUP BY winner_ioc ORDER BY wins DESC LIMIT 5; USA|99510 FRA|66714 ESP|63701 ITA|53514 ARG|50769 tennis
27
+ List tournaments with more than 3000 matches SELECT tourney_name, COUNT(*) as match_count FROM matches GROUP BY tourney_name HAVING COUNT(*) > 3000; Australian Open|10990 Barcelona|3741 Indian Wells Masters|3579 M15 Antalya|4318 M15 Monastir|6477 Miami Masters|4200 Queen's Club|3154 Roland Garros|13772 US Open|17246 Washington|3258 Wimbledon|17457 tennis
28
+ What is the win percentage of left-handed vs right-handed players? SELECT winner_hand, COUNT(*) as wins, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) as win_percentage FROM matches WHERE winner_hand IN ('L', 'R') GROUP BY winner_hand; L|100862|12.85 R|683813|87.15 tennis
29
+ List the top 10 players by number of match wins SELECT winner_name, COUNT(*) as wins FROM matches GROUP BY winner_name ORDER BY wins DESC LIMIT 10; |26399 Roger Federer|1305 Jimmy Connors|1279 Novak Djokovic|1179 Rafael Nadal|1167 Ivan Lendl|1075 Guillermo Vilas|953 Ilie Nastase|950 Andre Agassi|887 John McEnroe|886 tennis
30
+ How many matches were played in each best-of format? SELECT best_of, COUNT(*) as match_count FROM matches WHERE LENGTH(best_of) = 1 GROUP BY best_of; 1|36 3|853783 5|67502 F|653 tennis
31
+ Show countries with more than 1000 players SELECT ioc, COUNT(*) as player_count FROM players GROUP BY ioc HAVING COUNT(*) > 1000 ORDER BY player_count DESC; USA|13102 AUS|3266 GBR|3200 ESP|3026 GER|2675 ITA|2656 FRA|2582 BRA|2092 ARG|1759 MEX|1323 JPN|1305 RUS|1093 IND|1078 RSA|1040 tennis
32
+ List the distribution of player heights in 10cm ranges SELECT CAST(height/10 AS INTEGER)*10 as height_range, COUNT(*) as count FROM players WHERE height IS NOT NULL GROUP BY height_range ORDER BY height_range; 140|1 160|23 170|609 180|1624 190|536 200|25 210|1 tennis
33
+ Get player details for Rafael Nadal SELECT * FROM players WHERE name = 'Rafael Nadal'; 104745|L|19860603.0|ESP|185.0|Rafael Nadal tennis
34
+ How many matches did Roger Federer win? SELECT COUNT(*) FROM matches WHERE winner_name = 'Roger Federer'; 1305 tennis
35
+ List all players whose name contains 'Williams' SELECT name, ioc FROM players WHERE name LIKE '%Williams%'; James Williams|USA Ian Williams|USA Gareth Williams|RSA Jeff Williams|USA Rhyne Williams|USA Carl Williams|USA Phillip Williamson|USA R Williams|AUS Rald Williams|AUS Ted Williams|USA Duane Williams|BAR Seanon Williams|BAR John Williams|GBR Tom Williams|GBR Gavaskar Williams|ECA Jerry Williams|ECA Mark Williams|AUS Lachlan Williams|AUS Jamie Williams|GBR G Williams|AUS Owen Williams|RSA K Williams|AUS Dick Williams|USA E Ulysses Williams|GBR Eliot Crawshay Williams|GBR Lucien E Williams|USA David H Williams|GBR Llewelyn Williams|ARG Jorge Williams|ARG Richard O Williams|GBR Ec Williams|GBR Teddy Williams|GBR Sc Williams|USA J A Williams|USA Fp Williams|USA William Williams|USA Tp Williams|USA Hl Williamson|USA Joe Williams|USA X Williams|USA Richard Norris Williams|USA Dougal Williams|USA Jorge Williams Lopez|MEX Thomas Williams|GBR Rohan Williams|AUS Stefan Williams|NZL Wkwesi Williams|BAR Timmy Williams|AUS Yohansey Williams|TRI Dylan Williams|GBR Chris Williams|RSA Michael Williams|USA Nick Williams|USA Glenn Williams|USA Mark Williams W303|USA Francis Williams|USA Phillip Williamson|USA Paul Williams|USA John Paul Williams|USA Austin Williams|USA Carlos Williams|MEX Runeld Williams|USA Mark Gino Williams|RSA Steven Williams|USA Jack Williams|USA Lee Williams|USA Adam Williams|USA Earl Williams|USA Ashton Williams|USA Shane Williams|TRI Phillip Williams|JAM Narada Williams|JAM Justin Williams|USA Ellis Williamson|USA Gus Williams|USA Dick Williams|USA Blackwell Williams|USA Don Williams|USA Peter Williams|USA Charles Williams|USA Richard Williams|USA Scott Mcwilliams|USA Walter Williams|USA R N Williams|GBR Jeff Williams|USA Guy Williams|AUS A Williams|AUS T Williams|USA L K Williams|USA George Williams|USA J H A Williams|USA G M Williams|USA E T Williams|USA R G Williams|USA B C Williams|USA G Williamson|USA Adrian Williams|GBR David Williams|USA Brian 
Williams|AUS Logan Williams|USA Scott Williams|USA Brad Williams|AUS Bradwin Williams|RSA Nicholas Williams|AUS M Williams| P D R Williams|GBR J Williamson|IRL R E Williams|GBR Cooper Williams|USA Bill Williams|USA Aaron James Williams|GER tennis
36
+ Show latest rankings for player ID 104925 SELECT ranking_date, rank, Points FROM rankings WHERE player = 104925 ORDER BY ranking_date DESC LIMIT 1; 20240527|1|9960.0 tennis
37
+ Find the 10 latest matches between Nadal and Federer SELECT tourney_name, tourney_date, winner_name, loser_name, score FROM matches WHERE (winner_name = 'Rafael Nadal' AND loser_name = 'Roger Federer') OR (winner_name = 'Roger Federer' AND loser_name = 'Rafael Nadal') ORDER BY tourney_date DESC LIMIT 10; Wimbledon|20190701.0|Roger Federer|Rafael Nadal|7-6(3) 1-6 6-3 6-4 Roland Garros|20190527.0|Rafael Nadal|Roger Federer|6-3 6-4 6-2 Indian Wells Masters|20190304.0|Roger Federer|Rafael Nadal|W/O Shanghai Masters|20171009.0|Roger Federer|Rafael Nadal|6-4 6-3 Miami Masters|20170320.0|Roger Federer|Rafael Nadal|6-3 6-4 Indian Wells Masters|20170306.0|Roger Federer|Rafael Nadal|6-2 6-3 Australian Open|20170116.0|Roger Federer|Rafael Nadal|6-4 3-6 6-1 3-6 6-3 Basel|20151026.0|Roger Federer|Rafael Nadal|6-3 5-7 6-3 Australian Open|20140113.0|Rafael Nadal|Roger Federer|7-6(4) 6-3 6-3 Tour Finals|20131104.0|Rafael Nadal|Roger Federer|7-5 6-3 tennis
38
+ What is the highest rank achieved by player id 104925? SELECT MIN(rank) FROM rankings WHERE player = 104925; 1 tennis
39
+ Get the win-loss record for Carlos Alcaraz SELECT 'Wins' as result, COUNT(*) as count FROM matches WHERE winner_name = 'Carlos Alcaraz' UNION ALL SELECT 'Losses' as result, COUNT(*) as count FROM matches WHERE loser_name = 'Carlos Alcaraz'; Wins|247 Losses|67 tennis
40
+ Find the most common score in matches SELECT score, COUNT(*) as frequency FROM matches WHERE score IS NOT NULL GROUP BY score ORDER BY frequency DESC LIMIT 1; 6-4 6-4|30324 tennis
41
+ Show players born in the year 2008 SELECT name, dob FROM players WHERE dob >= 20080000 AND dob < 20090000; Vito Antonio Darderi|20080113.0 tennis
42
+ How many doubles matches are in the database? SELECT COUNT(*) FROM matches WHERE winner1_id IS NOT NULL; 26399 tennis
43
+ Find all players from Monaco SELECT name, ioc FROM players WHERE ioc IN ('MON'); Emmanuel Heussner|MON Sebastien Graeff|MON Guillaume Couillard|MON Jean Rene Lisnard|MON Thomas Oger|MON Benjamin Balleret|MON Clement Morel|MON Hugo Nys|MON Bernard Balleret|MON Gilles Ganancia|MON Luis Borfiga|MON Christophe Boggetti|MON Thomas Drouet|MON Jerome Seguin|MON Albert Viviani|MON Jacques Vincileoni|MON Eric Carlier|MON Andres Vatrican|MON Francisco Truchi|MON Alain Manigley|MON Michel Borfiga|MON Christian Collange|MON Patrick Landau|MON Rene Ruzic|MON Adrien Viviani|MON Emile Petit|MON Georges Pasquier|MON Roland Borghini|MON Vladimir M Landau|MON Rene Galeppe|MON Gaston Medecin|MON Aleco Noghes|MON Gaston Medecin|MON Aleco Noghes|MON Rene Gallepe|MON Roland Borghini|MON Arnaud Dalbergue|MON Marc Stillitano|MON Nicolas Klingelschmitt|MON Yvan Medecin|MON Emmanuel Van Der Pol|MON Edmund Michel Gastaud|MON Maurice Landau|MON Lucas Catarina|MON Olivier Peyret|MON tennis
44
+ Show matches where the height difference was more than 50cm SELECT tourney_name, winner_name, winner_ht, loser_name, loser_ht FROM matches WHERE ABS(winner_ht - loser_ht) > 50 AND winner_ht IS NOT NULL AND loser_ht IS NOT NULL; Egypt F14|Danilo Petrovic|203.0|Ilija Vucic|145.0 Nigeria F3|Ilija Vucic|145.0|Tucker Vorster|196.0 Nigeria F4|Tucker Vorster|196.0|Ilija Vucic|145.0 Serbia F5|Ilija Vucic|145.0|Nikola Ciric|201.0 Bosnia & Herzegovina F1|Danilo Petrovic|203.0|Ilija Vucic|145.0 Serbia F3|Danilo Petrovic|203.0|Ilija Vucic|145.0 tennis
45
+ How many matches had no duration recorded? SELECT COUNT(*) FROM matches WHERE minutes IS NULL; 748394 tennis
46
+ Show the youngest winner in any match SELECT winner_name, winner_age, tourney_name FROM matches WHERE winner_age IS NOT NULL ORDER BY winner_age ASC LIMIT 1; Nelio Mattos|14.0|Brazil F9 tennis
47
+ Find the oldest player to win a match SELECT winner_name, winner_age, tourney_name FROM matches WHERE winner_age IS NOT NULL ORDER BY winner_age DESC LIMIT 1; Tom Brown|101.5|M15 Santiago tennis
48
+ List all players shorter than 165cm SELECT name, height FROM players WHERE height < 165 AND height IS NOT NULL; Angel Gimenez|163.0 Eduardo Osta|160.0 Choon Ho Kim|163.0 Ramayah Ramachandran|160.0 Ilija Vucic|145.0 Yuta Shimizu|163.0 tennis
49
+ How many matches were won by players under 25? SELECT COUNT(*) FROM matches WHERE winner_age < 25; 573136 tennis
50
+ Show the average winner age by year between 1990 and 1995 SELECT CAST(tourney_date/10000 AS INTEGER) as year, AVG(winner_age) as avg_age FROM matches WHERE winner_age IS NOT NULL AND tourney_date >= 19900000 AND tourney_date <= 19951231 GROUP BY year ORDER BY year; 1990|23.740600162206 1991|22.8081504917807 1992|22.8424242424242 1993|22.852455520351 1994|22.9313226277372 1995|22.9975207977522 tennis
51
+ What percentage of matches are won by the taller player? SELECT (CAST(SUM(CASE WHEN winner_ht > loser_ht THEN 1 ELSE 0 END) AS FLOAT) / COUNT(*)) * 100 as percentage FROM matches WHERE winner_ht IS NOT NULL AND loser_ht IS NOT NULL; 45.2674228807518 tennis
52
+ Show the most successful player by win count SELECT winner_name, COUNT(*) as total_wins FROM matches WHERE winner_name IS NOT NULL GROUP BY winner_name ORDER BY total_wins DESC LIMIT 1; Roger Federer|1305 tennis
53
+ Show the average points difference between consecutive ranks for the top 10 ranks SELECT rank, AVG(Points - next_rank_points) as avg_diff FROM ( SELECT rank, Points, LEAD(Points) OVER (PARTITION BY ranking_date ORDER BY rank) as next_rank_points FROM rankings ) subquery WHERE next_rank_points IS NOT NULL GROUP BY rank ORDER BY rank LIMIT 10; 1|1636.20721271394 2|1143.42665036675 3|708.344743276284 4|536.484107579462 5|383.051344743276 6|307.098410757946 7|286.148533007335 8|240.486552567237 9|159.284841075795 10|128.124694376528 tennis
54
+ Find players who have improved their ranking by more than 2000 spots SELECT p.name, MAX(r.rank) as old_rank, MIN(r.rank) as new_rank, (MAX(r.rank) - MIN(r.rank)) as improvement FROM rankings r JOIN players p ON r.player = p.player_id GROUP BY p.player_id, p.name HAVING (MAX(r.rank) - MIN(r.rank)) > 2000 ORDER BY improvement DESC; Stefanos Tsitsipas|2208|3|2205 Denis Shapovalov|2151|10|2141 Ryan Peniston|2239|123|2116 Carlos Taberner|2197|85|2112 Bernabe Zapata Miralles|2139|37|2102 Alejandro Tabilo|2095|24|2071 Tomas Barrios Vera|2163|93|2070 Gian Marco Moroni|2221|159|2062 Tallon Griekspoor|2077|21|2056 Miguel Angel Lopez Jaen|2221|171|2050 Tomas Martin Etcheverry|2075|27|2048 Gijs Brouwer|2154|114|2040 Adam Chadaj|2237|202|2035 Soon Woo Kwon|2082|52|2030 Benjamin Bonzi|2070|42|2028 Andrea Pellegrino|2163|136|2027 Martin Verkerk|2035|14|2021 Marc Andrea Huesler|2060|47|2013 Thai Son Kwiatkowski|2189|181|2008 Lukas Klein|2122|116|2006 Altug Celikbilek|2159|154|2005 Ivan Gakhov|2143|142|2001 Jay Clarke|2154|153|2001 tennis
55
+ List countries with players in top 5 SELECT DISTINCT p.ioc FROM players p JOIN rankings r ON p.player_id = r.player WHERE r.rank <= 5 AND r.ranking_date = (SELECT MAX(ranking_date) FROM rankings); SRB ITA ESP GER RUS tennis
56
+ Show match count by month for 2023 SELECT CAST((tourney_date % 10000) / 100 AS INTEGER) as month, COUNT(*) as matches FROM matches WHERE tourney_date >= 20230000 AND tourney_date < 20240000 GROUP BY month ORDER BY month; 1|2303 2|2179 3|2347 4|2581 5|2891 6|3061 7|3761 8|2904 9|2841 10|3323 11|2499 12|585 tennis
57
+ Find the player with the best (lowest) average ranking among players with more than 10 ranking appearances SELECT p.name, AVG(r.rank) as avg_rank, COUNT(*) as appearances FROM rankings r JOIN players p ON r.player = p.player_id GROUP BY r.player, p.name HAVING COUNT(*) > 10 ORDER BY AVG(r.rank) ASC LIMIT 1; Stefan Edberg|16.3657817109145|678 tennis
58
+ Show tournaments where players from more than 70 countries have won SELECT tourney_name, COUNT(DISTINCT winner_ioc) as country_count FROM matches GROUP BY tourney_name HAVING COUNT(DISTINCT winner_ioc) > 70; Australian Open|85 M15 Monastir|79 Miami Masters|71 Roland Garros|85 US Open|90 Washington|74 Wimbledon|91 tennis
59
+ Get the total points for all ranked players by country (only the top 10) SELECT p.ioc, SUM(r.Points) as total_points FROM rankings r JOIN players p ON r.player = p.player_id WHERE r.ranking_date = (SELECT MAX(ranking_date) FROM rankings) GROUP BY p.ioc ORDER BY total_points DESC LIMIT 10; USA|26319.0 ITA|24835.0 FRA|24732.0 RUS|17876.0 ESP|17474.0 ARG|16757.0 AUS|14319.0 GER|14093.0 SRB|13632.0 GBR|8007.0 tennis
60
+ Who are the 10 tallest players? Show their name and height. SELECT name, height FROM players ORDER BY height DESC LIMIT 10; Reilly Opelka|211.0 Ivo Karlovic|208.0 Greg Neuhart|206.0 John Isner|206.0 Milan Srejber|203.0 Dick Norman|203.0 Marcelo Melo|203.0 Kevin Anderson|203.0 Kenny De Schepper|203.0 Alexander Bury|203.0 tennis
61
+ Find all players born after January 1, 2008. SELECT name, dob FROM players WHERE dob > 20080101; Vito Antonio Darderi|20080113.0 tennis
62
+ What is the name of the player with ID 104925? SELECT name FROM players WHERE player_id = 104925; Novak Djokovic tennis
63
+ How many matches were played in the 'Wimbledon' tournament? SELECT count(*) FROM matches WHERE tourney_name = 'Wimbledon'; 17457 tennis
64
+ List the winner and loser of tourney_id '2018-M020'. SELECT winner_name, loser_name FROM matches WHERE tourney_id = '2018-M020' AND round='F'; Nick Kyrgios|Ryan Harrison tennis
65
+ What was the score of the 'US Open' final in 2018? SELECT score FROM matches WHERE tourney_name = 'US Open' AND tourney_date LIKE '2018%' AND round = 'F'; 6-3 7-6(4) 6-3 tennis
66
+ Find all matches that lasted longer than 1000 minutes. SELECT tourney_name, winner_name, loser_name, minutes FROM matches WHERE minutes > 1000; Sydney|Gilles Muller|Jeremy Chardy|1146.0 Piracicaba CH|Federico Agustin Gomez|Igor Gimenez|1274.0 Vicenza CH|Federico Gaio|Joris De Loore|1241.0 Guayaquil CH|Federico Coria|Tomas Lipovsek Puches|4756.0 Samarkand CH|Dmitry Popko|Aleksandre Metreveli|1237.0 Santiago CH|Juan Pablo Paz|Cristopher Kohl|1392.0 Luedenscheid CH|Camilo Ugo Carabelli|Timofey Skatov|1531.0 tennis
67
+ What was the name and country of the winner of the 'Roland Garros' 2023 tournament? SELECT winner_name, winner_ioc FROM matches WHERE tourney_name = 'Roland Garros' AND round = 'F' AND tourney_date >= 20230000 AND tourney_date < 20240000; Novak Djokovic|SRB tennis
68
+ Who was the number 1 ranked player on March 20, 2023? SELECT p.name FROM players AS p JOIN rankings AS r ON p.player_id = r.player WHERE r.rank = 1 AND r.ranking_date = 20230320; Carlos Alcaraz tennis
69
+ What were the top 5 players and their points on 2024-01-01? SELECT p.name, r.points FROM players AS p JOIN rankings as r ON p.player_id = r.player WHERE ranking_date = 20240101 ORDER BY rank ASC LIMIT 5; Novak Djokovic|11245.0 Carlos Alcaraz|8855.0 Daniil Medvedev|7600.0 Jannik Sinner|6490.0 Andrey Rublev|4805.0 tennis
70
+ What is the most recent ranking for player 126207? SELECT rank, points FROM rankings WHERE player = 126207 ORDER BY ranking_date DESC LIMIT 1; 26|1630.0 tennis
71
+ How many ranking entries exist for player 104925? SELECT count(*) FROM rankings WHERE player = 104925; 988 tennis
72
+ What is the name of the player who was ranked #1 on 2023-06-12? SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T2.rank = 1 AND T2.ranking_date = 20230612; Novak Djokovic tennis
73
+ What is the date of birth of the player ranked #3 on 2023-11-20? SELECT T1.dob FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T2.rank = 3 AND T2.ranking_date = 20231120; 19960211.0 tennis
74
+ List the names and countries of all players who have a rank in the top 1. SELECT DISTINCT T1.name, T1.ioc FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T2.rank <= 1; Andre Agassi|USA Pete Sampras|USA Marat Safin|RUS Gustavo Kuerten|BRA Lleyton Hewitt|AUS Juan Carlos Ferrero|ESP Andy Roddick|USA Roger Federer|SUI Rafael Nadal|ESP Bjorn Borg|SWE John McEnroe|USA Ivan Lendl|USA Mats Wilander|SWE Jimmy Connors|USA Novak Djokovic|SRB Daniil Medvedev|RUS Carlos Alcaraz|ESP Stefan Edberg|SWE Boris Becker|GER Jim Courier|USA Thomas Muster|AUT Marcelo Rios|CHI Carlos Moya|ESP Yevgeny Kafelnikov|RUS Patrick Rafter|AUS Andy Murray|GBR Ilie Nastase|ROU John Newcombe|AUS tennis
75
+ What is the date of birth of 'Roger Federer'? SELECT dob FROM players WHERE name = 'Roger Federer'; 19810808.0 tennis
76
+ Find the name and height of the player with ID 104745. SELECT name, height FROM players WHERE player_id = 104745; Rafael Nadal|185.0 tennis
77
+ How many matches has 'Novak Djokovic' won? SELECT count(*) FROM matches WHERE winner_name = 'Novak Djokovic'; 1179 tennis
78
+ How many matches has 'Rafael Nadal' lost? SELECT count(*) FROM matches WHERE loser_name = 'Rafael Nadal'; 255 tennis
79
+ What is the average age of winners of the 'Australian Open'? SELECT avg(winner_age) FROM matches WHERE tourney_name = 'Australian Open'; 25.6905314807382 tennis
80
+ What is the average age of losers? SELECT avg(loser_age) FROM matches; 23.6776674381365 tennis
81
+ Find the name of the player who won the longest match (by minutes). SELECT winner_name FROM matches ORDER BY minutes DESC LIMIT 1; Federico Coria tennis
82
+ Find the name of the player who won the shortest match. SELECT winner_name FROM matches WHERE minutes IS NOT NULL ORDER BY minutes ASC LIMIT 1; Tim Smyczek tennis
83
+ List the countries with more than 2000 players. SELECT ioc, count(*) FROM players GROUP BY ioc HAVING count(*) > 2000; AUS|3266 BRA|2092 ESP|3026 FRA|2582 GBR|3200 GER|2675 ITA|2656 USA|13102 tennis
84
+ What is the average height of players grouped by their playing hand? SELECT hand, avg(height) FROM players GROUP BY hand; A|180.25 L|183.606060606061 R|183.806518151815 U|181.035714285714 tennis
85
+ How many matches has each player won? Show the top 10. SELECT winner_name, count(*) FROM matches GROUP BY winner_name ORDER BY count(*) DESC LIMIT 10; Roger Federer|1305 Jimmy Connors|1279 Novak Djokovic|1179 Rafael Nadal|1167 Ivan Lendl|1075 Guillermo Vilas|953 Ilie Nastase|950 Andre Agassi|887 John McEnroe|886 tennis
86
+ How many matches has each player lost? Show the top 10. SELECT loser_name, count(*) FROM matches GROUP BY loser_name ORDER BY count(*) DESC LIMIT 10; Feliciano Lopez|627 Paolo Lorenzi|622 Andreas Seppi|577 Teymuraz Gabashvili|575 Andrea Arnaboldi|561 Fernando Verdasco|550 Ruben Ramirez Hidalgo|543 Sergiy Stakhovsky|534 Lukas Rosol|530 tennis
87
+ Find the date of birth of the player who won the 'US Open' in 2012. SELECT T1.dob FROM players AS T1 JOIN matches AS T2 ON T1.player_id = T2.winner_id WHERE T2.tourney_name = 'US Open' AND T2.round = 'F' AND T2.tourney_date >= 20120000 AND T2.tourney_date < 20130000; 19870515.0 tennis
88
+ What is the highest rank 'Andy Murray' has ever achieved? SELECT min(T1.rank) FROM rankings AS T1 JOIN players AS T2 ON T1.player = T2.player_id WHERE T2.name = 'Andy Murray'; 1 tennis
89
+ What are the names and points of all players from 'ITA' ranked in the top 10? WITH RankedData AS ( SELECT T1.name, T2.rank, T2.points, T2.ranking_date, ROW_NUMBER() OVER( PARTITION BY T1.player_id ORDER BY T2.rank ASC ) AS rn FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T1.ioc = 'ITA' AND T2.rank <= 10 ) SELECT name, rank AS highest_rank, points, ranking_date AS date_of_rank FROM RankedData WHERE rn = 1; Adriano Panatta|6||19760614 Corrado Barazzutti|7||19780821 Fabio Fognini|9|2785.0|20190715 Matteo Berrettini|6|5278.0|20220131 Jannik Sinner|2|8710.0|20240401 tennis
90
+ How many tournaments were played in 2023? SELECT count(DISTINCT tourney_id) FROM matches WHERE tourney_date >= 20230000 AND tourney_date < 20240000; 909 tennis
91
+ List all opponents 'Carlos Alcaraz' lost to in 2023. SELECT DISTINCT winner_name FROM matches WHERE loser_name = 'Carlos Alcaraz' AND tourney_date >= 20230000 AND tourney_date < 20240000; Cameron Norrie Jannik Sinner Fabian Marozsan Novak Djokovic Tommy Paul Daniil Medvedev Grigor Dimitrov Roman Safiullin Alexander Zverev tennis
92
+ How many matches were won by a player older than their opponent? SELECT count(*) FROM matches WHERE winner_age > loser_age; 468479 tennis
93
+ Find the average age of all doubles winners. SELECT avg(T1.age) FROM (SELECT winner1_age AS age FROM matches WHERE winner1_id IS NOT NULL UNION ALL SELECT winner2_age AS age FROM matches WHERE winner2_id IS NOT NULL) AS T1; 345.696638675076 tennis
94
+ What is the name of the doubles pair Simon Aspelin and Julian Knowle? SELECT DISTINCT winner1_name, winner2_name FROM matches WHERE (winner1_name = 'Simon Aspelin' AND winner2_name = 'Julian Knowle') OR (winner1_name = 'Julian Knowle' AND winner2_name = 'Simon Aspelin'); Simon Aspelin|Julian Knowle tennis
95
+ What is the average, min, and max number of points for players? SELECT avg(points), min(points), max(points) FROM rankings; 117.12910941862|1.0|16950.0 tennis
96
+ How many matches were won by a player who was ranked number 1 at the time of the match? SELECT count(*) FROM matches AS T1 JOIN rankings AS T2 ON T1.winner_id = T2.player AND T1.tourney_date = T2.ranking_date WHERE T2.rank = 1; 2369 tennis
97
+ What is the average height of players who won a tournament on a hard court (e.g., 'US Open' or 'Australian Open')? SELECT avg(T1.height) FROM players AS T1 JOIN matches AS T2 ON T1.player_id = T2.winner_id WHERE T2.tourney_name IN ('US Open', 'Australian Open'); 184.760615653313 tennis
98
+ List the names of all players who have beaten 'Carlos Alcaraz' and are shorter than him. SELECT DISTINCT T1.name FROM players AS T1 JOIN matches AS T2 ON T1.player_id = T2.winner_id WHERE T2.loser_name = 'Carlos Alcaraz' AND T1.height < T2.loser_ht; Federico Coria Marco Trungelliti Hugo Gaston Mikael Ymer Thiago Monteiro Jaume Munar Zsombor Piros Filip Horansky Lorenzo Giustino Inigo Cervantes Huegun Pedro Sousa Frederico Ferreira Silva David Goffin tennis
99
+ Find the player with the most points who is left-handed. SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T1.hand = 'L' ORDER BY T2.points DESC LIMIT 1; Rafael Nadal tennis
100
+ What is the name of the oldest player to win a match in 2023? SELECT winner_name FROM matches WHERE tourney_date >= 20230000 AND tourney_date < 20240000 ORDER BY winner_age DESC LIMIT 1; Tom Brown tennis
101
+ How many matches were played where both the winner and loser were from the same country? SELECT count(*) FROM matches WHERE winner_ioc = loser_ioc; 185838 tennis
102
+ List all players from 'FRA' who were ranked in the top 50 on '2023-01-02'. SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T1.ioc = 'FRA' AND T2.rank <= 50 AND T2.ranking_date = 20230102; Adrian Mannarino Arthur Rinderknech tennis
103
+ What is the average number of points for players from 'USA' vs 'CAN'? SELECT T1.ioc, avg(T2.points) FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T1.ioc IN ('USA', 'CAN') GROUP BY T1.ioc; CAN|137.325744104014 USA|132.369032395567 tennis
104
+ Find all players who have a higher rank than 'Rafael Nadal' on '2023-05-29'. SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T2.ranking_date = 20230529 AND T2.rank < (SELECT rank FROM rankings AS T3 JOIN players AS T4 ON T3.player = T4.player_id WHERE T4.name = 'Rafael Nadal' AND T3.ranking_date = 20230529); Carlos Alcaraz Daniil Medvedev Novak Djokovic Casper Ruud Stefanos Tsitsipas Holger Rune Andrey Rublev Taylor Fritz Jannik Sinner Felix Auger Aliassime Karen Khachanov Frances Tiafoe Cameron Norrie Hubert Hurkacz tennis
105
+ How many players in the 'players' table have no matches recorded? SELECT count(*) FROM players WHERE player_id NOT IN (SELECT winner_id FROM matches UNION SELECT loser_id FROM matches); 0 tennis
106
+ What is the name and country of the player who lost the longest match (by minutes)? SELECT T1.name, T1.ioc FROM players AS T1 JOIN matches AS T2 ON T1.player_id = T2.loser_id ORDER BY T2.minutes DESC LIMIT 1; Tomas Lipovsek Puches|ARG tennis
107
+ List all doubles match winner pairs where both winners were left-handed. SELECT DISTINCT winner1_name, winner2_name FROM matches WHERE winner1_hand = 'L' AND winner2_hand = 'L'; Julian Knowle|Jurgen Melzer Rafael Nadal|Fernando Verdasco Feliciano Lopez|Rafael Nadal Feliciano Lopez|Fernando Verdasco Mariano Hood|Julian Knowle Rick Leach|Brian Macphie Ellis Ferreira|Brian Macphie Stefan Koubek|Jurgen Melzer Chris Haggard|Brian Macphie Donald Johnson|Rick Leach Chris Haggard|Donald Johnson Ellis Ferreira|David Rikl Ellis Ferreira|Rick Leach Juan Ignacio Carrasco|Mariano Hood Rick Leach|David Macpherson Ellis Ferreira|Jeff Tarango Stefan Koubek|Jarkko Nieminen Karsten Braasch|Jeff Tarango Eric Butorac|Jamie Murray Rafael Nadal|Bartolome Salva Vidal Johan Brunstrom|Michael Ryderstedt Wayne Arthurs|Fernando Verdasco Karsten Braasch|Mariusz Fyrstenberg Karsten Braasch|Jurgen Melzer Juan Ignacio Carrasco|Jason Weir Smith Barry Cowan|Irakli Labadze Chris Haggard|Jan Siemerink Ellis Ferreira|Donald Johnson Mose Navarra|Vincenzo Santopadre Sander Groen|Jan Siemerink Wayne Arthurs|Scott Draper Barry Cowan|Mose Navarra Wayne Arthurs|Goran Ivanisevic Patrick Galbraith|Brian Macphie Chris Haggard|Daniel Orsanic Patrick Galbraith|David Macpherson Mariano Puerta|Marcelo Rios Michael Llodra|Diego Nargiso Michael Berrer|Mischa Zverev Chris Haggard|Jurgen Melzer Eric Butorac|James Cerretani Johan Brunstrom|James Cerretani Jurgen Melzer|Andreas Vinciguerra Fernando Verdasco|Mischa Zverev Carsten Ball|Chris Guccione Eric Butorac|Thomaz Bellucci James Cerretani|Dick Norman Jurgen Melzer|Fernando Verdasco Carsten Ball|Andrew Coelho Michael Berrer|Kenneth Carlsen tennis
108
+ What is the average age of winners in matches that lasted less than 90 minutes? SELECT avg(winner_age) FROM matches WHERE minutes < 90; 25.7141327552797 tennis
109
+ How many players born in the 1990s are in the database? SELECT count(*) FROM players WHERE dob >= 19900101 AND dob <= 19991231; 14658 tennis
110
+ What is the most common country ('ioc') for players? SELECT ioc FROM players GROUP BY ioc ORDER BY count(*) DESC LIMIT 1; USA tennis
111
+ How many matches has 'Carlos Alcaraz' won against left-handed players? SELECT count(*) FROM matches WHERE winner_name = 'Carlos Alcaraz' AND loser_hand = 'L'; 31 tennis
112
+ Find the name and date of birth of the youngest player to be ranked number 1. SELECT T1.name, T1.dob FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T2.rank = 1 ORDER BY T1.dob DESC LIMIT 1; Carlos Alcaraz|20030505.0 tennis
113
+ What is the average number of points for players who are right-handed vs left-handed? SELECT T1.hand, avg(T2.points) FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T1.hand IS NOT NULL GROUP BY T1.hand; A|30.056 L|181.794778139621 R|146.788578611675 U|12.6757269988693 tennis
114
+ List all tournaments where the winner's age was under 16. SELECT DISTINCT tourney_name FROM matches WHERE winner_age < 16 and round='F'; Spain 7 Masters 4 Spain 3 2 Great Britain F3 Germany F4 Las Vegas CH Croatia F2 tennis
115
+ What is the win-loss ratio for Jannik Sinner? SELECT SUM(CASE WHEN winner_name = 'Jannik Sinner' THEN 1 ELSE 0 END) * 1.0 / NULLIF(SUM(CASE WHEN loser_name = 'Jannik Sinner' THEN 1 ELSE 0 END), 0) AS win_loss_ratio FROM matches WHERE winner_name = 'Jannik Sinner' OR loser_name = 'Jannik Sinner'; 2.51304347826087 tennis
116
+ Find all players who have won at least 500 matches and are over 200cm tall. SELECT T1.name FROM players AS T1 JOIN matches AS T2 ON T1.player_id = T2.winner_id WHERE T1.height > 200 GROUP BY T1.name HAVING count(*) >= 500; Ivo Karlovic John Isner Kevin Anderson Marc Rosset tennis
117
+ What is the name of the player who has the most ranking entries? SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player GROUP BY T1.name ORDER BY count(*) DESC LIMIT 1; Feliciano Lopez tennis
118
+ What is the average rank of all players from 'SRB'? SELECT avg(T1.rank) FROM rankings AS T1 JOIN players AS T2 ON T1.player = T2.player_id WHERE T2.ioc = 'SRB'; 871.267699994322 tennis
119
+ How many matches were decided in 5 sets? SELECT count(*) FROM matches WHERE score LIKE '%-% %-% %-% %-%'; 30494 tennis
120
+ What is the average winner age in matches won by players who have been ranked in the top 10? SELECT avg(T1.winner_age) FROM matches AS T1 WHERE T1.winner_id IN ( SELECT DISTINCT T2.player FROM rankings AS T2 WHERE T2.rank <= 10 ); 24.943456266323 tennis
121
+ List all players whose name starts with 'Z' and are from 'SUI'. SELECT name FROM players WHERE name LIKE 'Z%' AND ioc = 'SUI'; Zigmund Zorny tennis
122
+ Find the player who has the biggest difference between their highest and lowest rank. SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player GROUP BY T1.name ORDER BY (max(T2.rank) - min(T2.rank)) DESC LIMIT 1; Stefanos Tsitsipas tennis
123
+ How many matches were won by a player whose name is the same as the loser's name (e.g., in doubles)? SELECT count(*) FROM matches WHERE winner1_name = loser1_name OR winner1_name = loser2_name OR winner2_name = loser1_name OR winner2_name = loser2_name; 68 tennis
124
+ What is the average number of minutes for matches in the Roland Garros tournament? SELECT avg(minutes) FROM matches WHERE tourney_name = 'Roland Garros'; 138.685630004214 tennis
125
+ Find all players who are taller than 185cm and are ambidextrous. SELECT name FROM players WHERE height > 185 AND hand = 'A'; Luke Jensen tennis
126
+ List all matches in 2023 where the winner was from 'ECU' and the match lasted more than 3 hours (180 mins). SELECT tourney_name, winner_name, loser_name FROM matches WHERE winner_ioc = 'ECU' AND minutes > 180 AND tourney_date >= 20230000 AND tourney_date < 20240000; Lima CH|Alvaro Guillen Meza|Ignacio Buse Montevideo CH|Alvaro Guillen Meza|Luciano Darderi Montevideo CH|Alvaro Guillen Meza|Renzo Olivo tennis
127
+ What is the name of the player who won the most matches as a 'loser' (i.e. was a doubles partner)? SELECT 'N/A'; N/A tennis
128
+ Find the name and height of the player who won the 'US Open' in 2016. SELECT T1.name, T1.height FROM players AS T1 JOIN matches AS T2 ON T1.player_id = T2.winner_id WHERE T2.tourney_name = 'US Open' AND T2.tourney_date >= 20160000 AND T2.tourney_date < 20170000 AND round='F'; Stan Wawrinka|183.0 tennis
129
+ What is the total number of points held by all players from Spain on 2024-01-01? SELECT sum(T1.points) FROM rankings AS T1 JOIN players AS T2 ON T1.player = T2.player_id WHERE T2.ioc = 'ESP' AND T1.ranking_date = 20240101; 18971.0 tennis
130
+ Find all players who have won a match against an opponent more than 50 years older than them. SELECT DISTINCT winner_name FROM matches WHERE winner_age < (loser_age - 50); Jaimee Floyd Angele Ezekiel Clark Tyler Stice Taha Baadi Toby D Boyer Zachary Svajda Sekou Bangoura Dan Martin Govind Nanda Emile Hudd Damien Salvestre Ignacio Monzon Samir Banerjee Karlis Ozolins Ricardo Rodriguez Guy Den Ouden Amador Salazar Andres Andrade Facundo Bermejo tennis
131
+ How many matches did Novak Djokovic win in 2015? SELECT COUNT(*) FROM matches WHERE winner_name = 'Novak Djokovic' AND tourney_date BETWEEN 20150000 AND 20151231; 83 tennis
132
+ What is the average height of players from Argentina? SELECT AVG(height) FROM players WHERE ioc = 'ARG'; 181.592920353982 tennis
133
+ Who won the final of the 2022 Australian Open? SELECT winner_name FROM matches WHERE tourney_name = 'Australian Open' AND tourney_date BETWEEN 20220000 AND 20221231 AND round = 'F'; Rafael Nadal tennis
134
+ How many left-handed players are ranked in the top 50 on 2016-07-25? SELECT COUNT(DISTINCT p.player_id) FROM players p JOIN rankings r ON p.player_id = r.player WHERE p.hand = 'L' AND r.rank <= 50 AND r.ranking_date = 20160725; 9 tennis
135
+ What is the highest rank achieved by Jannik Sinner in 2023? SELECT MIN(rank) FROM rankings r JOIN players p ON r.player = p.player_id WHERE p.name = 'Jannik Sinner' AND r.ranking_date BETWEEN 20230000 AND 20231231; 4 tennis
136
+ List the total points for Carlos Alcaraz on the last recorded ranking date. SELECT Points FROM rankings WHERE player = (SELECT player_id FROM players WHERE name = 'Carlos Alcaraz') ORDER BY ranking_date DESC LIMIT 1; 7300.0 tennis
137
+ How many matches went to a 5th set in the 2019 Wimbledon tournament? SELECT COUNT(*) FROM matches WHERE tourney_name = 'Wimbledon' AND tourney_date LIKE '2019%' AND best_of = '5' AND score LIKE '%-% %-% %-% %-% %-%'; 21 tennis
138
+ Which country has the most players born after the year 2000? SELECT ioc, COUNT(*) as cnt FROM players WHERE dob >= 20000101 GROUP BY ioc ORDER BY cnt DESC LIMIT 1; USA|202 tennis
139
+ What is the average duration (in minutes) of matches won by Rafael Nadal? SELECT AVG(minutes) FROM matches WHERE winner_name = 'Rafael Nadal'; 116.695992179863 tennis
140
+ Find the name of the player ranked #1 on the date 2010-01-01. SELECT p.name FROM players p JOIN rankings r ON p.player_id = r.player WHERE r.rank = 1 AND r.ranking_date > 20100101 AND r.ranking_date < 20100108; Roger Federer tennis
141
+ How many finals has Roger Federer lost? SELECT COUNT(*) FROM matches WHERE loser_name = 'Roger Federer' AND round = 'F'; 55 tennis
142
+ What is the average age of players who reached the final of the US Open in 2006? SELECT AVG(age) FROM (SELECT winner_age as age FROM matches WHERE tourney_name = 'US Open' AND tourney_date LIKE '2006%' AND round = 'F' UNION ALL SELECT loser_age FROM matches WHERE tourney_name = 'US Open' AND tourney_date LIKE '2006%' AND round = 'F') t; 24.45 tennis
143
+ List the players who have beaten Novak Djokovic more than twice. SELECT winner_name FROM matches WHERE loser_name = 'Novak Djokovic' GROUP BY winner_name HAVING COUNT(*) > 2; Alexander Zverev Andy Murray Andy Roddick Daniil Medvedev David Ferrer Dominic Thiem Fernando Verdasco Jannik Sinner Jo-Wilfried Tsonga Juan Martin del Potro Mikhail Youzhny Olivier Rochus Rafael Nadal Roberto Bautista Agut Roger Federer Stan Wawrinka Tomas Berdych Tommy Haas tennis
144
+ What is the win rate of right-handed players against left-handed players in 2022? SELECT CAST(SUM(CASE WHEN winner_hand = 'R' AND loser_hand = 'L' THEN 1 ELSE 0 END) AS FLOAT) / SUM(CASE WHEN (winner_hand = 'R' AND loser_hand = 'L') OR (winner_hand = 'L' AND loser_hand = 'R') THEN 1 ELSE 0 END) FROM matches WHERE tourney_date BETWEEN 20220000 AND 20221231; 0.500155811779371 tennis
145
+ How many different partners did Bob Bryan have in doubles matches? SELECT COUNT(DISTINCT partner) FROM (SELECT winner2_name as partner FROM matches WHERE winner1_name = 'Bob Bryan' UNION SELECT winner1_name FROM matches WHERE winner2_name = 'Bob Bryan' UNION SELECT loser2_name FROM matches WHERE loser1_name = 'Bob Bryan' UNION SELECT loser1_name FROM matches WHERE loser2_name = 'Bob Bryan') t; 1 tennis
146
+ Who is the shortest player to ever be ranked in the top 10? SELECT p.name, p.height FROM players p JOIN rankings r ON p.player_id = r.player WHERE r.rank <= 10 AND p.height IS NOT NULL ORDER BY p.height ASC LIMIT 1; Harold Solomon|168.0 tennis
147
+ What is the total number of ranking points for all USA players combined on 2022-01-01? SELECT SUM(r.Points) FROM rankings r JOIN players p ON r.player = p.player_id WHERE p.ioc = 'USA' AND r.ranking_date > 20220101 and r.ranking_date < 20220108; 25480.0 tennis
148
+ How many matches in the database were won by a player from Japan? SELECT COUNT(*) FROM matches WHERE winner_ioc = 'JPN'; 18724 tennis
149
+ Find the date of the match between Federer and Nadal where the score was '6-4 6-4'. SELECT tourney_date FROM matches WHERE (winner_name = 'Roger Federer' AND loser_name = 'Rafael Nadal' OR winner_name = 'Rafael Nadal' AND loser_name = 'Roger Federer') AND score = '6-4 6-4'; 20090510.0 20070415.0 tennis
150
+ What is the average ranking of the opponent defeated by Carlos Alcaraz in 2023? SELECT AVG(r.rank) FROM matches m JOIN rankings r ON m.loser_id = r.player AND m.tourney_date = r.ranking_date WHERE m.winner_name = 'Carlos Alcaraz' AND m.tourney_date BETWEEN 20230000 AND 20231231; 62.1290322580645 tennis
151
+ Who has the most wins in the 'Miami Masters' tournament? SELECT winner_name, COUNT(*) as wins FROM matches WHERE tourney_name = 'Miami Masters' AND winner_name IS NOT NULL GROUP BY winner_name ORDER BY wins DESC LIMIT 1; Andre Agassi|60 tennis
152
+ How many players have a recorded height over 210cm? SELECT COUNT(*) FROM players WHERE height > 210; 1 tennis
153
+ List the tourney_name of all tournaments won by Andy Murray in 2016. SELECT DISTINCT tourney_name FROM matches WHERE winner_name = 'Andy Murray' AND round = 'F' AND tourney_date BETWEEN 20160000 AND 20161231; Rome Masters Queen's Club Wimbledon Rio Olympics Beijing Shanghai Masters Vienna Paris Masters Tour Finals tennis
154
+ What is the average age difference in matches won by the younger player? SELECT AVG(loser_age - winner_age) FROM matches WHERE winner_age < loser_age; 4.00363596060453 tennis
155
+ How many matches did Daniil Medvedev win on hard courts (implied by US Open/Australian Open)? SELECT COUNT(*) FROM matches WHERE winner_name = 'Daniil Medvedev' AND tourney_name IN ('US Open', 'Australian Open'); 35 tennis
156
+ Which player had the most ranking points on 2021-12-27? SELECT p.name FROM rankings r JOIN players p ON r.player = p.player_id WHERE r.ranking_date = 20211227 ORDER BY r.Points DESC LIMIT 1; Novak Djokovic tennis
157
+ How many tie-breaks (7-6 or 6-7) occurred in the 2018 US Open final? SELECT (LENGTH(score) - LENGTH(REPLACE(score, '7-6', ''))) / 3 + (LENGTH(score) - LENGTH(REPLACE(score, '6-7', ''))) / 3 FROM matches WHERE tourney_name = 'US Open' AND tourney_date LIKE '2018%' AND round = 'F'; 1 tennis
158
+ What is the lowest rank a player has held while winning a match in a Grand Slam (US/Aus/French/Wimbledon)? SELECT MAX(r.rank) FROM matches m JOIN rankings r ON m.winner_id = r.player AND m.tourney_date = r.ranking_date WHERE m.tourney_name IN ('US Open', 'Australian Open', 'Roland Garros', 'Wimbledon'); 1917 tennis
159
+ How many matches were played between two players from the same country in 2023? SELECT COUNT(*) FROM matches WHERE winner_ioc = loser_ioc AND tourney_date BETWEEN 20230000 AND 20231231; 5374 tennis
160
+ Find the name of the player with the most losses in 2022. SELECT loser_name, COUNT(*) as losses FROM matches WHERE tourney_date BETWEEN 20220000 AND 20221231 GROUP BY loser_name ORDER BY losses DESC LIMIT 1; Matthew Dellavedova|39 tennis
161
+ What is the average height of winners in the Wimbledon final over all years? SELECT AVG(winner_ht) FROM matches WHERE tourney_name = 'Wimbledon' AND round = 'F'; 184.229508196721 tennis
162
+ How many players from France are in the database? SELECT COUNT(*) FROM players WHERE ioc = 'FRA'; 2582 tennis
163
+ Who was the opponent in the match where Novak Djokovic won in the shortest amount of time? SELECT loser_name FROM matches WHERE winner_name = 'Novak Djokovic' AND minutes IS NOT NULL ORDER BY minutes ASC LIMIT 1; Gael Monfils tennis
164
+ How many matches did players under 18 win in 2020? SELECT COUNT(*) FROM matches WHERE winner_age < 18 AND tourney_date BETWEEN 20200000 AND 20201231; 210 tennis
165
+ What is the most common Best of format for matches in the database? SELECT best_of FROM matches GROUP BY best_of ORDER BY COUNT(*) DESC LIMIT 1; 3 tennis
166
+ Retrieve the DOB of the player 'Holger Rune'. SELECT dob FROM players WHERE name = 'Holger Rune'; 20030429.0 tennis
167
+ How many doubles matches were won by the team of Mike Bryan and Bob Bryan? SELECT COUNT(*) FROM matches WHERE (winner1_name = 'Mike Bryan' AND winner2_name = 'Bob Bryan') OR (winner1_name = 'Bob Bryan' AND winner2_name = 'Mike Bryan'); 553 tennis
168
+ List the names of players who have been ranked #1 for at least one week in 2023. SELECT DISTINCT p.name FROM rankings r JOIN players p ON r.player = p.player_id WHERE r.rank = 1 AND r.ranking_date BETWEEN 20230000 AND 20231231; Carlos Alcaraz Novak Djokovic tennis
169
+ What is the sum of ranking points for the top 10 players on 2024-01-01? SELECT SUM(Points) FROM rankings WHERE rank <= 10 AND ranking_date = 20240101; 57220.0 tennis
170
+ How many players have 'Tennis' in their name? SELECT COUNT(*) FROM players WHERE name LIKE '%Tennis%'; 0 tennis
171
+ What was the score of the final match at Indian Wells in 2018? SELECT score FROM matches WHERE tourney_name = 'Indian Wells Masters' AND tourney_date LIKE '2018%' AND round = 'F'; 6-4 6-7(8) 7-6(2) tennis
172
+ How many matches has Alexander Zverev won against top 10 opponents? SELECT COUNT(*) FROM matches m JOIN rankings r ON m.loser_id = r.player AND m.tourney_date = r.ranking_date WHERE m.winner_name = 'Alexander Zverev' AND r.rank <= 10; 47 tennis
173
+ Find the country with the highest average player height. SELECT ioc FROM players GROUP BY ioc HAVING COUNT(*) > 5 ORDER BY AVG(height) DESC LIMIT 1; YUG tennis
174
+ Who won the longest match recorded in the 2021 season? SELECT winner_name FROM matches WHERE tourney_date BETWEEN 20210000 AND 20211231 ORDER BY minutes DESC LIMIT 1; Jozef Kovalik tennis
175
+ How many times did matches result in a retirement? SELECT COUNT(*) FROM matches WHERE score LIKE '%RET%'; 27840 tennis
176
+ What is the average age of players named 'David'? SELECT AVG(2024 - (dob/10000)) FROM players WHERE name LIKE 'David %'; 42.7836746478873 tennis
177
+ How many unique tournaments were held in the year 2019? SELECT COUNT(DISTINCT tourney_name) FROM matches WHERE tourney_date BETWEEN 20190000 AND 20191231; 591 tennis
178
+ Who is the youngest player to win a match in the US Open? SELECT winner_name FROM matches WHERE tourney_name = 'US Open' AND winner_age IS NOT NULL ORDER BY winner_age ASC LIMIT 1; Wallace Ford Johnson tennis
179
+ List the players who have never had a rank better (lower) than 100. SELECT p.name FROM players p JOIN rankings r ON p.player_id = r.player GROUP BY p.player_id HAVING MIN(r.rank) > 100 LIMIT 10; Gardnar Mulloy Pancho Segura Frank Sedgman Giuseppe Merlo Richard Gonzalez Grant Golden Abe Segal Kurt Nielsen Istvan Gulyas Luis Ayala tennis
180
+ How many matches did Italy (ITA) win against Spain (ESP) in 2023? SELECT COUNT(*) FROM matches WHERE winner_ioc = 'ITA' AND loser_ioc = 'ESP' AND tourney_date BETWEEN 20230000 AND 20231231; 117 tennis
181
+ What is the rank of the player 'Casper Ruud' on 2023-09-11? SELECT rank FROM rankings r JOIN players p ON r.player = p.player_id WHERE p.name = 'Casper Ruud' AND r.ranking_date = 20230911; 9 tennis
182
+ How many players have a height recorded as exactly 185 cm? SELECT COUNT(*) FROM players WHERE height = 185; 424 tennis
183
+ Which tournament had the most matches played in 2022? SELECT tourney_name FROM matches WHERE tourney_date BETWEEN 20220000 AND 20221231 GROUP BY tourney_name ORDER BY COUNT(*) DESC LIMIT 1; M15 Monastir tennis
184
+ What is the aggregate height of the doubles team 'Bryan/Bryan' (Bob and Mike)? SELECT SUM(height) FROM players WHERE name IN ('Bob Bryan', 'Mike Bryan'); 383.0 tennis
185
+ How many matches were won by a player who lost the first set? SELECT COUNT(*) FROM matches WHERE score LIKE '0-6%' OR score LIKE '1-6%' OR score LIKE '2-6%' OR score LIKE '3-6%' OR score LIKE '4-6%' OR score LIKE '5-6%' OR score LIKE '6-7%'; 138823 tennis
186
+ Find the name of the player who has played the most matches in the database (win or lose). SELECT name FROM (SELECT winner_name as name FROM matches UNION ALL SELECT loser_name FROM matches) t WHERE name IS NOT NULL GROUP BY name ORDER BY COUNT(*) DESC LIMIT 1; Roger Federer tennis
187
+ What is the average rank of winners in the Roland Garros tournament? SELECT AVG(r.rank) FROM matches m JOIN rankings r ON m.winner_id = r.player AND m.tourney_date = r.ranking_date WHERE m.tourney_name = 'Roland Garros'; 91.8841528594335 tennis
188
+ How many players were born in 1995? SELECT COUNT(*) FROM players WHERE dob BETWEEN 19950101 AND 19951231; 1781 tennis
189
+ Who has the highest win count in matches lasting over 200 minutes? SELECT winner_name, COUNT(*) as wins FROM matches WHERE minutes > 200 GROUP BY winner_name ORDER BY wins DESC LIMIT 1; Novak Djokovic|63 tennis
190
+ What is the name of the oldest match winner in the database who is from the USA? SELECT winner_name FROM matches WHERE winner_ioc = 'USA' ORDER BY winner_age DESC LIMIT 1; Tom Brown tennis
191
+ What is the average number of games played in matches won by Roger Federer? SELECT AVG(minutes) FROM matches WHERE winner_name = 'Roger Federer'; 96.8959587274291 tennis
192
+ How many distinct countries have had a player ranked in the top 1? SELECT COUNT(DISTINCT p.ioc) FROM rankings r JOIN players p ON r.player = p.player_id WHERE r.rank = 1; 13 tennis
193
+ How many matches did winner have height > 200 and loser height < 170? SELECT COUNT(*) FROM matches WHERE winner_ht > 200 AND loser_ht < 170; 25 tennis
194
+ Find the year with the most matches played. SELECT CAST(tourney_date/10000 AS INT) as year, COUNT(*) as cnt FROM matches GROUP BY year ORDER BY cnt DESC LIMIT 1; 2016|32640 tennis
195
+ What is the average rank of players named 'John'? SELECT AVG(r.rank) FROM rankings r JOIN players p ON r.player = p.player_id WHERE p.name LIKE 'John %'; 575.942671051783 tennis
196
+ How many finals were played between Nadal and Djokovic? SELECT COUNT(*) FROM matches WHERE ((winner_name = 'Rafael Nadal' AND loser_name = 'Novak Djokovic') OR (winner_name = 'Novak Djokovic' AND loser_name = 'Rafael Nadal')) AND round = 'F'; 29 tennis
197
+ What percentage of matches in 2023 were best of 3 sets? SELECT CAST(SUM(CASE WHEN best_of = '3' THEN 1 ELSE 0 END) AS FLOAT) / COUNT(*) * 100 FROM matches WHERE tourney_date BETWEEN 20230000 AND 20231231; 98.2765787370104 tennis
198
+ Who is the youngest player currently in the top 100 (based on latest ranking date)? SELECT p.name FROM players p JOIN rankings r ON p.player_id = r.player WHERE r.ranking_date = (SELECT MAX(ranking_date) FROM rankings) AND r.rank <= 100 ORDER BY p.dob DESC LIMIT 1; Jakub Mensik tennis
199
+ How many matches did 'Nick Kyrgios' win in Australia? SELECT COUNT(*) FROM matches WHERE winner_name = 'Nick Kyrgios' AND (tourney_name LIKE 'Australian%' OR tourney_name IN ('Brisbane', 'Sydney', 'Adelaide')); 22 tennis
200
+ What is the average points of players ranked exactly 50? SELECT AVG(Points) FROM rankings WHERE rank = 50; 831.880733944954 tennis
201
+ How many players have the same birthday as Rafael Nadal? SELECT COUNT(*) FROM players WHERE dob = (SELECT dob FROM players WHERE name = 'Rafael Nadal') AND name != 'Rafael Nadal'; 4 tennis
202
+ Who won the most matches on the tour in 2011? SELECT winner_name FROM matches WHERE tourney_date BETWEEN 20110000 AND 20111231 GROUP BY winner_name ORDER BY COUNT(*) DESC LIMIT 1; Novak Djokovic tennis
203
+ What is the name of the player with ID 100001? SELECT name FROM players WHERE player_id = 100001; Gardnar Mulloy tennis
204
+ How many matches have incomplete scores? SELECT COUNT(*) FROM matches WHERE score IS NULL; 1338 tennis
training-data/test_set.tsv CHANGED
@@ -1,151 +1,251 @@
1
- natural_query sql_query result is_nba
2
- 0 What is the average number of fg_pct in home games by the Chicago Bulls? SELECT AVG(fg_pct_home) FROM game WHERE team_name_home = 'Chicago Bulls'; 0.4636694306246544 True
3
- 1 How many lead changes occurred in games where the Denver Nuggets played away? SELECT SUM(lead_changes) as total_lead_changes FROM other_stats WHERE team_abbreviation_away = 'DEN'; 5828.0 True
4
- 2 Which team had the most away games where they had more offensive than defensive rebounds? SELECT team_abbreviation_away FROM game WHERE oreb_away > dreb_away GROUP BY team_abbreviation_away ORDER BY COUNT(*) DESC LIMIT 1; ATL True
5
- 3 What is the maximum number of team rebounds recorded by the Dallas Mavericks in away games where they committed more than 20 fouls? SELECT MAX(o.team_rebounds_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_away = 'DAL' AND g.pf_away > 20 AND g.season_id = '22021'; 16 True
6
- 4 What was the average margin of victory for the Miami Heat during the 2013 NBA season? SELECT AVG(victory_margin) AS avg_victory_margin FROM ( SELECT plus_minus_home AS victory_margin FROM game WHERE team_name_home = 'Miami Heat' AND wl_home = 'W' AND season_id = '22013' UNION ALL SELECT plus_minus_away AS victory_margin FROM game WHERE team_name_away = 'Miami Heat' AND wl_away = 'W' AND season_id = '22013' ) AS victories 11.48148148 True
7
- 5 What is the average fast break points scored by the Philadelphia 76ers at home during the 2018 season? SELECT AVG(os.pts_fb_home) AS avg_fast_break FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_abbreviation_home = 'PHI' AND g.season_id = '22018'; 16.32352941 True
8
- 6 Which team has the nickname 'Celtics'? SELECT full_name FROM team WHERE nickname = 'Celtics'; Boston Celtics True
9
- 7 How many games did the Milwaukee Bucks play at home during the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Milwaukee Bucks' AND season_id = '22020'; 36 True
10
- 8 What is the average second-chance points for Toronto Raptors home games between 2015-2020? SELECT AVG(os.pts_2nd_chance_home) AS avg_second_chance FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_abbreviation_home = 'TOR' AND g.season_id BETWEEN '22015' AND '22020'; 13.07653061 True
11
- 9 Which team had the most fast break points in a single home game during the 2020 season? SELECT team_name_home, MAX(pts_fb_home) FROM other_stats JOIN game ON other_stats.game_id = game.game_id WHERE game.season_id = '22020'; Houston Rockets|35 True
12
- 10 What's the average points in the paint for the Boston Celtics in home games where they won by at least 10 points? SELECT AVG(os.pts_paint_home) FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Boston Celtics' AND g.plus_minus_home >= 10; 41.85 True
13
- 11 What is the highest combined total score (home + away) in a single game in the dataset? SELECT game_date, (pts_home + pts_away) AS total_points FROM game ORDER BY total_points DESC LIMIT 1; 2017-02-19 00:00:00|374.0 True
14
- 12 Which team had the best three-point shooting percentage in home games during the 2020 season? SELECT team_name_home, AVG(fg3_pct_home) AS avg_3pt_pct FROM game WHERE season_id = '22020' GROUP BY team_name_home ORDER BY avg_3pt_pct DESC LIMIT 1; LA Clippers | 0.423777777777778 True
15
- 13 Which team is located in the state of Indiana? SELECT full_name FROM team WHERE state = 'Indiana'; Indiana Pacers True
16
- 14 What was the most blocks recorded by the Orlando Magic in a single home game in the 1999 season? SELECT MAX(blk_home) AS max_blocks FROM game WHERE team_abbreviation_home = 'ORL' AND season_id = '21999'; 10.0 True
17
- 15 What was the average number of fastbreak points scored by the Houston Rockets in games they won by more than 15 points at home? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Houston Rockets' AND g.wl_home = 'W' AND (g.pts_home - g.pts_away) > 15; 13.39790576 True
18
- 16 How many times did the Los Angeles Clippers lose at home in the 2002 season despite recording more steals and blocks than their opponent? SELECT COUNT(*) FROM game g WHERE g.team_abbreviation_home = 'LAC' AND g.wl_home = 'L' AND g.stl_home > g.stl_away AND g.blk_home > g.blk_away AND g.season_id = '22002'; 4 True
19
- 17 What is the full name of the team based in Dallas? SELECT full_name FROM team WHERE city = 'Dallas'; Dallas Mavericks True
20
- 18 Which team played the most total games (home + away) between 1995 and 2005? SELECT team FROM (SELECT team_abbreviation_home AS team FROM game WHERE season_id BETWEEN '21995' AND '22005' UNION ALL SELECT team_abbreviation_away FROM game WHERE season_id BETWEEN '21995' AND '22005') GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; WAS True
21
- 19 How many games did the Miami Heat lose away in the 1996 season? SELECT COUNT(*) as losses FROM game WHERE team_name_away = 'Miami Heat' AND wl_away = 'L' AND season_id = '21996'; 9.0 True
22
- 20 What is the average number of tov in away games by the Miami Heat? SELECT AVG(tov_away) FROM game WHERE team_name_away = 'Miami Heat'; 15.235255570117957 True
23
- 21 "What is the total second chance points by the Miami Heat at home?""" SELECT SUM(pts_2nd_chance_home) as total_2nd_chance FROM other_stats WHERE team_abbreviation_home = 'MIA'; 11670.0 True
24
- 22 How many home games did the Orlando Magic play in the 2013 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Orlando Magic' AND season_id = '22013'; 41.0 True
25
- 23 In which season did the Boston Celtics have the highest average tov at home? SELECT season_id, AVG(tov_home) as avg_stat FROM game WHERE team_name_home = 'Boston Celtics' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2005.0 True
26
- 24 In which season did the Chicago Bulls have the highest average ft_pct at home? SELECT season_id, AVG(ft_pct_home) as avg_stat FROM game WHERE team_name_home = 'Chicago Bulls' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2016.0 True
27
- 25 How many games did the Cleveland Cavaliers play at home with more than 8 times tied in 1996? SELECT COUNT(*) as games FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Cleveland Cavaliers' AND os.times_tied > 8 AND g.season_id = '21996'; 5.0 True
28
- 26 What was the average number of offensive rebounds per game for the Chicago Bulls in the 2019 season? SELECT AVG(oreb) AS avg_offensive_rebounds FROM ( SELECT game_id, oreb_home AS oreb FROM game WHERE team_name_home = 'Chicago Bulls' AND season_id = '22019' UNION ALL SELECT game_id, oreb_away AS oreb FROM game WHERE team_name_away = 'Chicago Bulls' AND season_id = '22019' ); 10.46153846 True
29
- 27 What was the highest combined steals and blocks total for the Toronto Raptors in any home game during their championship season? SELECT MAX(stl_home + blk_home) AS combined_steals_blocks FROM game WHERE team_name_home = 'Toronto Raptors' AND season_id = '22019'; 24 True
30
- 28 How many times have the Boston Celtics won an away game by at least 20 points? SELECT COUNT(*) FROM game WHERE team_abbreviation_away = 'BOS' AND wl_away = 'W' AND (pts_away - pts_home) >= 20; 179 True
31
- 29 How many total turnovers did the Sacramento Kings commit in the 2001 season? SELECT SUM(tov) AS total_turnovers FROM ( SELECT tov_home AS tov FROM game WHERE team_abbreviation_home = 'SAC' AND season_id = '22001' UNION ALL SELECT tov_away AS tov FROM game WHERE team_abbreviation_away = 'SAC' AND season_id = '22001' ); 1128.0 True
32
- 30 What is the largest margin of victory the Miami Heat have ever had in an away game? SELECT MAX(ABS(pts_away - pts_home)) AS largest_margin FROM game WHERE team_abbreviation_away = 'MIA' AND pts_away > pts_home; 34.0 True
33
- 31 What was the average margin of victory for the Boston Celtics in home games during the 2000 season? SELECT AVG(pts_home - pts_away) AS avg_victory_margin FROM game WHERE team_name_home = 'Boston Celtics' AND wl_home = 'W' AND season_id = '22000'; 9.75 True
34
- 32 What are the nicknames of teams based in Florida? SELECT nickname FROM team WHERE state = 'Florida'; Heat, Magic True
35
- 33 What was the highest total rebound count by an away team in a game? SELECT team_abbreviation_away, reb_away, game_date FROM game ORDER BY reb_away DESC LIMIT 1; BOS|90.0|1957-10-22 00:00:00 True
36
- 34 What is the total number of rebounds by the San Antonio Spurs in home games during the 2015 season? SELECT SUM(reb_home) FROM game WHERE team_abbreviation_home = 'SAS' AND season_id = '22015'; 1845.0 True
37
- 35 Which away team scored the most points off turnovers in a single game? SELECT team_abbreviation_away FROM other_stats ORDER BY pts_off_to_away DESC LIMIT 1; ATL True
38
- 36 What is the highest fast break points by the Houston Rockets at home? SELECT MAX(pts_fb_home) as max_fb_points FROM other_stats WHERE team_abbreviation_home = 'HOU'; 37.0 True
39
- 37 What is the average number of tov in home games by the Miami Heat? SELECT AVG(tov_home) FROM game WHERE team_name_home = 'Miami Heat'; 14.627184466019418 True
40
- 38 What is the total number of points scored by the Los Angeles Clippers in the 2014 season in games where they had more team turnovers but fewer total turnovers than their opponent? SELECT SUM(g.pts_home) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'LAC' AND o.team_turnovers_home > o.team_turnovers_away AND o.total_turnovers_home < o.total_turnovers_away AND g.season_id = '22014'; 295.0 True
41
- 39 Which home team had the most games with a positive plus-minus but still lost? SELECT team_name_home FROM game WHERE wl_home = 'L' AND plus_minus_home > 0 GROUP BY team_name_home ORDER BY COUNT(*) DESC LIMIT 1; West NBA All Stars West True
42
- 40 In which season did the Miami Heat have the highest average ast at home? SELECT season_id, AVG(ast_home) as avg_stat FROM game WHERE team_name_home = 'Miami Heat' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2019.0 True
43
- 41 How many games did the Chicago Bulls win at home in the 2010 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'CHI' AND wl_home = 'W' AND season_id = '22010'; 36 True
44
- 42 What was the average points scored by the Denver Nuggets in home games during the 2019 season? SELECT AVG(pts_home) AS avg_home_points FROM game WHERE team_name_home = 'Denver Nuggets' AND season_id = '22019'; 111.8378378 True
45
- 43 When was the Los Angeles Clippers team founded according to the team database? SELECT year_founded FROM team WHERE full_name = 'Los Angeles Clippers'; 1970 True
46
- 44 What is the average number of ast in home games by the Boston Celtics? SELECT AVG(ast_home) FROM game WHERE team_name_home = 'Boston Celtics'; 24.886892177589857 True
47
- 45 What is the average number of ast in away games by the Los Angeles Lakers? SELECT AVG(ast_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 22.594638949671772 True
48
- 46 What team had the most turnovers in a single game during the 2019 season? SELECT CASE WHEN tov_home > tov_away THEN team_name_home ELSE team_name_away END AS team_with_most_turnovers FROM game WHERE season_id = '22019' ORDER BY CASE WHEN tov_home > tov_away THEN tov_home ELSE tov_away END DESC LIMIT 1 Sacramento Kings True
49
- 47 What is the highest points scored by the Miami Heat at home when they had more than 10 second chance points? SELECT MAX(g.pts_home) as max_points FROM game g JOIN other_stats os ON g.game_id = os.game_id WHERE g.team_name_home = 'Miami Heat' AND os.pts_2nd_chance_home > 10; 149.0 True
50
- 48 What is the total points in the paint by the Chicago Bulls at home in games they lost in 1996? SELECT SUM(os.pts_paint_home) as total_pts_paint FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Chicago Bulls' AND g.wl_home = 'L' AND g.season_id = '21996'; 56.0 True
51
- 49 How many games did the Oklahoma City Thunder score more than 30 points in the first quarter during the 2017 season? SELECT COUNT(*) AS high_scoring_first_quarters FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE (g.team_name_home = 'Oklahoma City Thunder' AND g.pts_home / 4 > 30) OR (g.team_name_away = 'Oklahoma City Thunder' AND g.pts_away / 4 > 30) AND g.season_id = '22017'; 83 True
52
- 50 What is the total number of points scored by the Milwaukee Bucks away when they had more than 5 lead changes? SELECT SUM(g.pts_away) as total_points FROM game g JOIN other_stats os ON g.game_id = os.game_id WHERE g.team_name_away = 'Milwaukee Bucks' AND os.lead_changes > 5; 44835.0 True
53
- 51 List all games where the Houston Rockets and Dallas Mavericks played each other in the 2015 season. SELECT * FROM game WHERE season_id = '22015' AND ((team_abbreviation_home = 'HOU' AND team_abbreviation_away = 'DAL') OR (team_abbreviation_home = 'DAL' AND team_abbreviation_away = 'HOU')); 22015|1610612745|HOU|Houston Rockets|0021500140|2015-11-14 00:00:00|HOU vs. DAL|L|240|32.0|84.0|0.381|9.0|34.0|0.265|25.0|32.0|0.781|12.0|31.0|43.0|22.0|9.0|5.0|14.0|23.0|98.0|-12|1|1610612742|DAL|Dallas Mavericks|DAL @ HOU|W|43.0|89.0|0.483|8.0|28.0|0.286|16.0|21.0|0.762|8.0|37.0|45.0|24.0|6.0|7.0|11.0|21.0|110.0|12|1|Regular Season 22015|1610612742|DAL|Dallas Mavericks|0021500287|2015-12-04 00:00:00|DAL vs. HOU|L|240|37.0|81.0|0.457|8.0|29.0|0.276|14.0|20.0|0.7|11.0|31.0|42.0|23.0|8.0|5.0|18.0|17.0|96.0|-4|1|1610612745|HOU|Houston Rockets|HOU @ DAL|W|39.0|84.0|0.464|12.0|26.0|0.462|10.0|18.0|0.556|15.0|30.0|45.0|20.0|12.0|5.0|18.0|22.0|100.0|4|1|Regular Season 22015|1610612745|HOU|Houston Rockets|0021500665|2016-01-24 00:00:00|HOU vs. DAL|W|240|43.0|89.0|0.483|15.0|44.0|0.341|14.0|21.0|0.667|9.0|31.0|40.0|27.0|9.0|7.0|9.0|21.0|115.0|11|1|1610612742|DAL|Dallas Mavericks|DAL @ HOU|L|36.0|79.0|0.456|15.0|30.0|0.5|17.0|22.0|0.773|8.0|28.0|36.0|17.0|4.0|4.0|16.0|20.0|104.0|-11|1|Regular Season 22015|1610612742|DAL|Dallas Mavericks|0021501170|2016-04-06 00:00:00|DAL vs. HOU|W|240|33.0|80.0|0.413|10.0|33.0|0.303|12.0|14.0|0.857|13.0|27.0|40.0|19.0|9.0|4.0|14.0|20.0|88.0|2|1|1610612745|HOU|Houston Rockets|HOU @ DAL|L|34.0|78.0|0.436|6.0|20.0|0.3|12.0|18.0|0.667|12.0|29.0|41.0|19.0|6.0|4.0|16.0|17.0|86.0|-2|1|Regular Season True
54
- 52 What is the highest combined reb in any game involving the San Antonio Spurs? SELECT MAX(reb_home + reb_away) FROM game WHERE team_name_home = 'San Antonio Spurs' OR team_name_away = 'San Antonio Spurs'; 134.0 True
55
- 53 In which season did the Chicago Bulls have the highest average ast at home? SELECT season_id, AVG(ast_home) as avg_stat FROM game WHERE team_name_home = 'Chicago Bulls' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2021.0 True
56
- 54 What is the lowest plus-minus score for the New York Knicks away? SELECT MIN(plus_minus_away) as min_plus_minus FROM game WHERE team_name_away = 'New York Knicks'; -47.0 True
57
- 55 How many total points did the Chicago Bulls score across all games in the 1988 season? SELECT SUM(pts) AS total_points FROM ( SELECT pts_home AS pts FROM game WHERE team_abbreviation_home = 'CHI' AND season_id = '21988' UNION ALL SELECT pts_away AS pts FROM game WHERE team_abbreviation_away = 'CHI' AND season_id = '21988' ); 8726.0 True
58
- 56 What is the total number of fast break points scored by the Memphis Grizzlies at home during the 2005 season? SELECT SUM(pts_fb_home) FROM other_stats WHERE game_id IN ( SELECT game_id FROM game WHERE team_name_home = 'Memphis Grizzlies' AND season_id = '22005' ); 368 True
59
- 57 What was the average points difference in home games won by the Denver Nuggets? SELECT AVG(pts_home - pts_away) FROM game WHERE team_abbreviation_home = 'DEN' AND wl_home = 'W'; 11.96471532 True
60
- 58 How many times did the Memphis Grizzlies lose at home in the 2008 season despite recording more steals and blocks than their opponent? SELECT COUNT(*) FROM game g WHERE g.team_abbreviation_home = 'MEM' AND g.wl_home = 'L' AND g.stl_home > g.stl_away AND g.blk_home > g.blk_away AND g.season_id = '22008'; 3 True
61
- 59 In which season did the Boston Celtics have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Boston Celtics' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 1958.0 True
62
- 60 In the 2020 season, what was the average number of second chance points allowed by the New Orleans Pelicans in games they won by less than 5 points? SELECT AVG(o.pts_2nd_chance_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE ((g.team_abbreviation_home = 'NOP' AND g.wl_home = 'W' AND ABS(g.pts_home - g.pts_away) < 5) OR (g.team_abbreviation_away = 'NOP' AND g.wl_away = 'W' AND ABS(g.pts_home - g.pts_away) < 5)) AND g.season_id = '22020'; 16.6 True
63
- 61 How many games did the Golden State Warriors lose away in 1996? SELECT COUNT(*) as away_losses FROM game WHERE team_name_away = 'Golden State Warriors' AND wl_away = 'L' AND season_id = '21996'; 29.0 True
64
- 62 Which team was most often held under 60 points in a game? SELECT team FROM (SELECT team_abbreviation_home AS team, pts_home AS pts FROM game UNION ALL SELECT team_abbreviation_away, pts_away FROM game) WHERE pts < 60 GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; BOS True
65
- 63 What is the average number of three-pointers made by the Golden State Warriors at home in the 2018 season? SELECT AVG(fg3m_home) FROM game WHERE team_abbreviation_home = 'GSW' AND season_id = '22018'; 13.1951219512195 True
66
- 64 What is the Los Angeles Lakers' largest lead in a home game during the 2016 season? SELECT MAX(plus_minus_home) FROM game WHERE team_abbreviation_home = 'LAL' AND season_id = '22016'; 27 True
67
- 65 What is the average number of points in the paint allowed by the Philadelphia 76ers when playing at home in the 2020 season in games with more than 15 lead changes? SELECT AVG(o.pts_paint_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'PHI' AND g.season_id = '22020' AND o.lead_changes > 15; 50.0 True
68
- 66 How many points did the home team score in the game with the most lead changes and the fewest total fouls? SELECT pts_home FROM game WHERE game_id = (SELECT game_id FROM other_stats JOIN game USING(game_id) ORDER BY lead_changes DESC, (pf_home + pf_away) ASC LIMIT 1); 122.0 True
69
- 67 How many games did the Cleveland Cavaliers lose away with more than 10 fast break points in 1996? SELECT COUNT(*) as losses FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Cleveland Cavaliers' AND g.wl_away = 'L' AND os.pts_fb_away > 10 AND g.season_id = '21996'; 4.0 True
70
- 68 What is the highest combined ast in any game involving the Orlando Magic? SELECT MAX(ast_home + ast_away) FROM game WHERE team_name_home = 'Orlando Magic' OR team_name_away = 'Orlando Magic'; 74.0 True
71
- 69 What is the average points in the paint by the Utah Jazz away when they won? SELECT AVG(os.pts_paint_away) as avg_pts_paint FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Utah Jazz' AND g.wl_away = 'W'; 42.48 True
72
- 70 How many games did the Los Angeles Lakers play away in 1996? SELECT COUNT(*) as away_games FROM game WHERE team_name_away = 'Los Angeles Lakers' AND season_id = '21996'; 41.0 True
73
- 71 How many games had at least one team with 30+ assists? SELECT COUNT(*) FROM game WHERE ast_home >= 30 OR ast_away >= 30; 11305 True
74
- 72 What is the highest three-point percentage the Phoenix Suns achieved in an away game? SELECT MAX(fg3_pct_away) FROM game WHERE team_abbreviation_away = 'PHX'; 1 True
75
- 73 How many away games did the Miami Heat play in the 2021 season? SELECT COUNT(*) FROM game WHERE team_name_away = 'Miami Heat' AND season_id = '22021'; 41.0 True
76
- 74 How many times did the Boston Celtics win at home during the 2015 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'BOS' AND season_id = '22015' AND wl_home = 'W'; 28 True
77
- 75 How many free throws did the Houston Rockets attempt in away games they won during the 2020 season? SELECT SUM(fta_away) FROM game WHERE team_name_away = 'Houston Rockets' AND wl_away = 'W' AND season_id = '22020'; 149.0 True
78
- 76 Which away team has scored the most points against the Miami Heat in a single game? SELECT team_name_away, pts_away FROM game WHERE team_abbreviation_home = 'MIA' ORDER BY pts_away DESC LIMIT 1; Milwaukee Bucks|144.0 True
79
- 77 How many points were scored in the earliest recorded game in the database? SELECT (pts_home + pts_away) FROM game ORDER BY game_date ASC LIMIT 1; 134.0 True
80
- 78 What is the average number of tov in away games by the Los Angeles Lakers? SELECT AVG(tov_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 14.554896142433234 True
81
- 79 What is the total number of rebounds by the Milwaukee Bucks at home? SELECT SUM(reb_home) as total_rebounds FROM game WHERE team_name_home = 'Milwaukee Bucks'; 76050.0 True
82
- 80 What is the highest number of assists recorded by the Indiana Pacers in a single home game? SELECT MAX(ast_home) FROM game WHERE team_name_home = 'Indiana Pacers'; 44.0 True
83
- 81 How many times did the Miami Heat score more than 120 points at home in the 2015 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'MIA' AND season_id = '22015' AND pts_home > 120; 3 True
84
- 82 What was the lowest number of combined turnovers in any game involving the San Antonio Spurs during the 2019 season? SELECT MIN(o.total_turnovers_home + o.total_turnovers_away) AS min_combined_turnovers FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE (g.team_name_home = 'San Antonio Spurs' OR g.team_name_away = 'San Antonio Spurs') AND g.season_id = '22019'; 13 True
85
- 83 What was the average number of fastbreak points scored by the Los Angeles Lakers in home wins during the 2020 season? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Los Angeles Lakers' AND g.wl_home = 'W' AND g.season_id = '22020'; 13.64705882 True
86
- 84 What was the highest number of steals by the Detroit Pistons in a single game during the 2004 season? SELECT MAX(stl) AS max_steals FROM ( SELECT stl_home AS stl FROM game WHERE team_abbreviation_home = 'DET' AND season_id = '22004' UNION ALL SELECT stl_away AS stl FROM game WHERE team_abbreviation_away = 'DET' AND season_id = '22004' ); 13 True
87
- 85 In 2018, which team has the most home wins and how many home wins did they have? SELECT team_abbreviation_home, COUNT(*) FROM game WHERE wl_home = 'W' AND season_id = '22018' GROUP BY team_abbreviation_home ORDER BY COUNT(*) DESC LIMIT 1; (DEN, 34) True
88
- 86 How many three-pointers did the Golden State Warriors attempt in total during the 2017 season? SELECT SUM(fg3a) AS total_three_attempts FROM ( SELECT fg3a_home AS fg3a FROM game WHERE team_abbreviation_home = 'GSW' AND season_id = '22017' UNION ALL SELECT fg3a_away AS fg3a FROM game WHERE team_abbreviation_away = 'GSW' AND season_id = '22017' ); 2369.0 True
89
- 87 What is the highest number of three-pointers made in a single game by the Houston Rockets at home? SELECT MAX(fg3m_home) FROM game WHERE team_name_home = 'Houston Rockets'; 27.0 True
90
- 88 How many games did the Boston Celtics win on the road during the 2018 season? SELECT COUNT(*) AS away_wins FROM game WHERE team_name_away = 'Boston Celtics' AND wl_away = 'W' AND season_id = '22018'; 21 True
91
- 89 What is the most three-pointers the Brooklyn Nets have ever made in a home game? SELECT MAX(fg3m_home) FROM game WHERE team_name_home = 'Brooklyn Nets'; 22.0 True
92
- 90 How many total offensive rebounds did the Houston Rockets have in away games during the 2018 season? SELECT SUM(oreb_away) FROM game WHERE team_name_away = 'Houston Rockets' AND season_id = '22018'; 419.0 True
93
- 91 What is the average number of pts in away games by the Miami Heat? SELECT AVG(pts_away) FROM game WHERE team_name_away = 'Miami Heat'; 96.7824377457405 True
94
- 92 What is the state of the team nicknamed 'Jazz'? SELECT state FROM team WHERE nickname = 'Jazz'; Utah True
95
- 93 How many points did the Phoenix Suns score in the highest scoring away game they played? SELECT MAX(pts_away) FROM game WHERE team_abbreviation_away = 'PHX'; 161.0 True
96
- 94 In which season did the Charlotte Hornets have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Charlotte Hornets' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2017.0 True
97
- 95 Which team had the worst average point differential in the 2007 season? SELECT team_abbreviation, AVG(point_diff) AS avg_point_differential FROM ( SELECT team_abbreviation_home AS team_abbreviation, (pts_home - pts_away) AS point_diff FROM game WHERE season_id = '22007' UNION ALL SELECT team_abbreviation_away, (pts_away - pts_home) FROM game WHERE season_id = '22007' ) GROUP BY team_abbreviation ORDER BY avg_point_differential ASC LIMIT 1; SEA|-8.75609756097561 True
98
- 96 In which season did the Milwaukee Bucks have the highest average fg_pct at home? SELECT season_id, AVG(fg_pct_home) as avg_stat FROM game WHERE team_name_home = 'Milwaukee Bucks' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 42017.0 True
99
- 97 In games where the Brooklyn Nets scored more than 50 points in the paint at home, what was their assist-to-field goal made ratio? SELECT SUM(g.ast_home) * 1.0 / SUM(g.fgm_home) AS assist_to_fgm_ratio FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Brooklyn Nets' AND o.pts_paint_home > 50; 0.588761175 True
100
- 98 How many away games did the Chicago Bulls play in the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_away = 'Chicago Bulls' AND season_id = '22020'; 36.0 True
101
- 99 What is the average scoring ouput for home teams. Round to 2 decimal places. SELECT ROUND(AVG(pts_home),2) AS avg_home_points FROM game WHERE season_type = 'Regular Season'; 104.76 True
102
- 100 In which season did the Golden State Warriors have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Golden State Warriors' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 1974.0 True
103
- 101 Which team founded in the 70s has a nickname starting with 'C'? SELECT full_name FROM team WHERE year_founded BETWEEN 1970 AND 1979 AND nickname LIKE 'C%'; Cleveland Cavaliers, Los Angeles Clippers True
104
- 102 What is the highest combined ft_pct in any game involving the Los Angeles Lakers? SELECT MAX(ft_pct_home + ft_pct_away) FROM game WHERE team_name_home = 'Los Angeles Lakers' OR team_name_away = 'Los Angeles Lakers'; 1.957 True
105
- 103 How many fastbreak points did the Los Angeles Clippers average in home games during the 2020 season? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'LA Clippers' AND g.season_id = '22020'; 11.5 True
106
- 104 What is the average number of three-pointers made by away teams in games where they had more turnovers than assists? SELECT AVG(fg3m_away) FROM game WHERE tov_away > ast_away; 4.511052937754508 True
107
- 105 What was the difference in average free throw attempts between the Brooklyn Nets and their opponents in home games during the 2020 season? SELECT AVG(fta_home - fta_away) AS fta_diff FROM game WHERE team_name_home = 'Brooklyn Nets' AND season_id = '22020'; 1.083333333 True
108
- 106 What is the total points scored by the Philadelphia Warriors away? SELECT SUM(pts_away) as total_points FROM game WHERE team_name_away = 'Philadelphia 76ers'; 251917.0 True
109
- 107 When was the last time the New York Knicks won a home game? SELECT game_date FROM game WHERE team_abbreviation_home = 'NYK' AND wl_home = 'W' ORDER BY game_date DESC LIMIT 1; 2023-05-10 00:00:00 True
110
- 108 What was the lowest-scoring game involving the Indiana Pacers in the 1994 season? SELECT MIN(total_points) AS lowest_scoring_game FROM ( SELECT (pts_home + pts_away) AS total_points FROM game WHERE season_id = '21994' AND (team_abbreviation_home = 'IND' OR team_abbreviation_away = 'IND') ); 155.0 True
111
- 109 How many games did the Sacramento Kings lose at home in 1996? SELECT COUNT(*) as home_losses FROM game WHERE team_name_home = 'Sacramento Kings' AND wl_home = 'L' AND season_id = '21996'; 19.0 True
112
- 110 What was the total score of the only game in which the home team made exactly 33 field goals? SELECT pts_home + pts_away FROM game WHERE fgm_home = 33 LIMIT 1; 144.0 True
113
- 111 What was the difference in second-chance points between the Chicago Bulls and their opponents in their closest home game of the 2016 season? SELECT o.pts_2nd_chance_home - o.pts_2nd_chance_away AS second_chance_diff FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Chicago Bulls' AND g.season_id = '22016' ORDER BY ABS(g.pts_home - g.pts_away) ASC LIMIT 1; -5 True
114
- 112 What is the highest plus-minus score for the Indiana Pacers at home? SELECT MAX(plus_minus_home) as max_plus_minus FROM game WHERE team_name_home = 'Indiana Pacers'; 65.0 True
115
- 113 What is the total number of three-pointers made by the Golden State Warriors at home versus the Cleveland Cavaliers in all seasons combined? SELECT SUM(fg3m_home) AS total_threes FROM game WHERE team_name_home = 'Golden State Warriors' AND team_name_away = 'Cleveland Cavaliers'; 407 True
116
- 114 How many points did the away team score in the only game where the home team had exactly 69 field goal attempts? SELECT pts_away FROM game WHERE fga_home = 69 LIMIT 1; 81.0 True
117
- 115 What is the average number of ast in away games by the Milwaukee Bucks? SELECT AVG(ast_away) FROM game WHERE team_name_away = 'Milwaukee Bucks'; 22.16927374301676 True
118
- 116 What is the total number of steals recorded by the Miami Heat in games against the Boston Celtics? SELECT SUM(CASE WHEN team_name_home = 'Miami Heat' THEN stl_home ELSE stl_away END) AS total_steals FROM game WHERE (team_name_home = 'Miami Heat' AND team_name_away = 'Boston Celtics') OR (team_name_home = 'Boston Celtics' AND team_name_away = 'Miami Heat'); 1253 True
119
- 117 Which team had the most games where both teams scored over 110 points? SELECT team FROM (SELECT team_abbreviation_home AS team FROM game WHERE pts_home > 110 AND pts_away > 110 UNION ALL SELECT team_abbreviation_away FROM game WHERE pts_home > 110 AND pts_away > 110) GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; LAL True
120
- 118 What is the highest number of points the Los Angeles Lakers have scored in a single away game? SELECT MAX(pts_away) FROM game WHERE team_abbreviation_away = 'LAL'; 153.0 True
121
- 119 What is the total second chance points by the Washington Wizards away? SELECT SUM(pts_2nd_chance_away) as total_2nd_chance FROM other_stats WHERE team_abbreviation_away = 'WAS'; 13226.0 True
122
- 120 What is the average number of assists per game for the Golden State Warriors when they won during the 2018 season? SELECT AVG(assists) AS avg_assists FROM ( SELECT ast_home AS assists FROM game WHERE team_name_home = 'Golden State Warriors' AND wl_home = 'W' AND season_id = '22018' UNION ALL SELECT ast_away AS assists FROM game WHERE team_name_away = 'Golden State Warriors' AND wl_away = 'W' AND season_id = '22018' ) AS winning_games 31 True
123
- 121 What was the total number of points in the game where both teams had the exact same number of personal fouls? SELECT pts_home + pts_away FROM game WHERE pf_home = pf_away ORDER BY game_date DESC LIMIT 1; 258.0 True
124
- 122 How many games did the Boston Celtics win at home during the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Boston Celtics' AND wl_home = 'W' AND season_id = '22020'; 21 True
125
- 123 Which team had the highest average free throw percentage at home in the 2016 season? SELECT team_name_home, AVG(ft_pct_home) AS avg_ft_percentage FROM game WHERE season_id = '22016' GROUP BY team_name_home ORDER BY avg_ft_percentage DESC LIMIT 1; Boston Celtics | 0.820975609756098 True
126
- 124 In the 2001 season, what was the average number of second chance points scored by the opponents when the Atlanta Hawks played at home and lost? SELECT AVG(o.pts_2nd_chance_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'ATL' AND g.wl_home = 'L' AND g.season_id = '22001'; 13.333333333333334 True
127
- 125 Which team had the highest average points from second chance opportunities in home games they won during the 2016 season? SELECT g.team_name_home, AVG(o.pts_2nd_chance_home) AS avg_second_chance_pts FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.wl_home = 'W' AND g.season_id = '22016' GROUP BY g.team_name_home ORDER BY avg_second_chance_pts DESC LIMIT 1; Los Angeles Lakers | 15.6153846153846 True
128
- 126 What is the highest number of points the Golden State Warriors have ever scored in a single home game? SELECT MAX(pts_home) FROM game WHERE team_abbreviation_home = 'GSW'; 149.0 True
129
- 127 What is the average number of ft_pct in home games by the Los Angeles Lakers? SELECT AVG(ft_pct_home) FROM game WHERE team_name_home = 'Los Angeles Lakers'; 0.7450706106870195 True
130
- 128 How many team turnovers did the New York Knicks have at home? SELECT SUM(team_turnovers_home) as total_team_turnovers FROM other_stats WHERE team_abbreviation_home = 'NYK'; 550.0 True
131
- 129 How many three-pointers did the Golden State Warriors make in total during the 2016 season? SELECT SUM(fg3m_home + fg3m_away) AS total_three_pointers FROM game WHERE season_id = '22016' AND (team_name_home = 'Golden State Warriors' OR team_name_away = 'Golden State Warriors'); 1719.0 True
132
- 130 What is the total rebounds by the Miami Heat at home? SELECT SUM(reb_home) as total_rebounds FROM game WHERE team_name_home = 'Miami Heat'; 65199.0 True
133
- 131 What is the average number of fg_pct in away games by the Los Angeles Lakers? SELECT AVG(fg_pct_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 0.4678996728462382 True
134
- 132 How many points did the home team score in the game with the most second chance points? SELECT pts_home FROM game WHERE game_id = (SELECT game_id FROM other_stats ORDER BY (pts_2nd_chance_home + pts_2nd_chance_away) DESC LIMIT 1); 115.0 True
135
- 133 What was the total number of points in the only game where the sum of both teams' free throws made was exactly 42? SELECT pts_home + pts_away FROM game WHERE (ftm_home + ftm_away) = 42 LIMIT 1; 156.0 True
136
- 134 What is the average number of ft_pct in home games by the Charlotte Hornets? SELECT AVG(ft_pct_home) FROM game WHERE team_name_home = 'Charlotte Hornets'; 0.7601475237091683 True
137
- 135 Which team is based in the city of Chicago? SELECT full_name FROM team WHERE city = 'Chicago'; Chicago Bulls True
138
- 136 What is the Chicago Bulls' largest lead in a home game during the 2016 season? SELECT MAX(plus_minus_home) FROM game WHERE team_abbreviation_home = 'CHI' AND season_id = '22016'; 47 True
139
- 137 Which players scored 50 or more points in a game during the 1990s? SELECT game_id, game_date, CASE WHEN pts_home >= 50 THEN team_name_home ELSE team_name_away END AS team_name, CASE WHEN pts_home >= 50 THEN pts_home ELSE pts_away END AS points FROM game WHERE (pts_home >= 50 OR pts_away >= 50) AND CAST(SUBSTR(season_id, 2) AS INTEGER) BETWEEN 1990 AND 1999 ORDER BY points DESC True
140
- 138 How many home games did the Los Angeles Lakers play in the 2022 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Los Angeles Lakers' AND season_id = '22022'; 41.0 True
141
- 139 What is the total points in the paint by the Milwaukee Bucks away? SELECT SUM(pts_paint_away) as total_pts_paint FROM other_stats WHERE team_abbreviation_away = 'MIL'; 39056.0 True
142
- 140 What is the largest margin of victory in a game, whether home or away? SELECT game_date, ABS(pts_home - pts_away) AS margin FROM game ORDER BY margin DESC LIMIT 1; 2021-12-02 00:00:00|73.0 True
143
- 141 What is the average number of pts in away games by the Portland Trail Blazers? SELECT AVG(pts_away) FROM game WHERE team_name_away = 'Portland Trail Blazers'; 102.6668215613383 True
144
- 142 What is the highest number of rebounds recorded by a home team in a game during the 2005 season? SELECT MAX(reb_home) FROM game WHERE season_id = '22005'; 65.0 True
145
- 143 What is the highest combined ast in any game involving the Boston Celtics? SELECT MAX(ast_home + ast_away) FROM game WHERE team_name_home = 'Boston Celtics' OR team_name_away = 'Boston Celtics'; 79.0 True
146
- 144 How many times were games tied when the Indiana Pacers played away? SELECT SUM(times_tied) as total_times_tied FROM other_stats WHERE team_abbreviation_away = 'IND'; 4910.0 True
147
- 145 How many points did the away team score when the home team had more than 20 offensive rebounds? SELECT SUM(pts_away) FROM game WHERE game_id IN (SELECT game_id FROM game WHERE oreb_home > 20); 199836.0 True
148
- 146 What is the highest combined score in a game between the Golden State Warriors and the Cleveland Cavaliers? SELECT MAX(pts_home + pts_away) FROM game WHERE (team_name_home = 'Golden State Warriors' AND team_name_away = 'Cleveland Cavaliers') OR (team_name_home = 'Cleveland Cavaliers' AND team_name_away = 'Golden State Warriors'); 266.0 True
149
- 147 Which game had the highest total points scored by both teams when the Los Angeles Lakers played at home? SELECT game_id, (pts_home + pts_away) AS total_points FROM game WHERE team_abbreviation_home = 'LAL' ORDER BY total_points DESC LIMIT 1; (0028000933, 294.0) True
150
- 148 How many games did the Sacramento Kings lose away with more than 15 fast break points in 1996? SELECT COUNT(*) as losses FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Sacramento Kings' AND g.wl_away = 'L' AND os.pts_fb_away > 15 AND g.season_id = '21996'; 10.0 True
151
- 149 What is the lowest number of points the Golden State Warriors have scored in an away game? SELECT MIN(pts_away) FROM game WHERE team_abbreviation_away = 'GSW'; 65.0 True
 
1
+ natural_query sql_query result is_nba
2
+ How many spanish (ESP) players are there? SELECT COUNT(*) AS spanish_players FROM players WHERE ioc = 'ESP'; 3026 False
3
+ How many distinct players appear in the rankings table? SELECT COUNT(DISTINCT player) AS distinct_players FROM rankings; 16174 False
4
+ How many times did the Los Angeles Clippers lose at home in the 2002 season despite recording more steals and blocks than their opponent? SELECT COUNT(*) FROM game g WHERE g.team_abbreviation_home = 'LAC' AND g.wl_home = 'L' AND g.stl_home > g.stl_away AND g.blk_home > g.blk_away AND g.season_id = '22002'; 4 True
5
+ How many times have the Boston Celtics won an away game by at least 20 points? SELECT COUNT(*) FROM game WHERE team_abbreviation_away = 'BOS' AND wl_away = 'W' AND (pts_away - pts_home) >= 20; 179 True
6
+ Show the most successful player by win count SELECT winner_name, COUNT(*) as total_wins FROM matches WHERE winner_name IS NOT NULL GROUP BY winner_name ORDER BY total_wins DESC LIMIT 1; Roger Federer|1305 False
7
+ In which season did the Chicago Bulls have the highest average ft_pct at home? SELECT season_id, AVG(ft_pct_home) as avg_stat FROM game WHERE team_name_home = 'Chicago Bulls' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2016.0 True
8
+ Which team was most often held under 60 points in a game? SELECT team FROM (SELECT team_abbreviation_home AS team, pts_home AS pts FROM game UNION ALL SELECT team_abbreviation_away, pts_away FROM game) WHERE pts < 60 GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; BOS True
9
+ Find the name of the player ranked #1 on the date 2010-01-01. SELECT p.name FROM players p JOIN rankings r ON p.player_id = r.player WHERE r.rank = 1 AND r.ranking_date > 20100101 AND r.ranking_date < 20100108; Roger Federer False
10
+ How many lead changes occurred in games where the Denver Nuggets played away? SELECT SUM(lead_changes) as total_lead_changes FROM other_stats WHERE team_abbreviation_away = 'DEN'; 5828.0 True
11
+ How many different countries have players who won matches at the US Open? SELECT COUNT(DISTINCT winner_ioc) FROM matches WHERE tourney_name = 'US Open'; 90 False
12
+ What is the total number of steals recorded by the Miami Heat in games against the Boston Celtics? SELECT SUM(CASE WHEN team_name_home = 'Miami Heat' THEN stl_home ELSE stl_away END) AS total_steals FROM game WHERE (team_name_home = 'Miami Heat' AND team_name_away = 'Boston Celtics') OR (team_name_home = 'Boston Celtics' AND team_name_away = 'Miami Heat'); 1253 True
13
+ What is the total number of losses Nick Kyrgios has at the Roland Garros? SELECT COUNT(*) FROM matches WHERE loser_name = 'Nick Kyrgios' AND tourney_name = 'Roland Garros'; 5 False
14
+ Which team had the highest average free throw percentage at home in the 2016 season? SELECT team_name_home, AVG(ft_pct_home) AS avg_ft_percentage FROM game WHERE season_id = '22016' GROUP BY team_name_home ORDER BY avg_ft_percentage DESC LIMIT 1; Boston Celtics | 0.820975609756098 True
15
+ What is the average fast break points scored by the Philadelphia 76ers at home during the 2018 season? SELECT AVG(os.pts_fb_home) AS avg_fast_break FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_abbreviation_home = 'PHI' AND g.season_id = '22018'; 16.32352941 True
16
+ How many total points did the Chicago Bulls score across all games in the 1988 season? SELECT SUM(pts) AS total_points FROM ( SELECT pts_home AS pts FROM game WHERE team_abbreviation_home = 'CHI' AND season_id = '21988' UNION ALL SELECT pts_away AS pts FROM game WHERE team_abbreviation_away = 'CHI' AND season_id = '21988' ); 8726.0 True
17
+ In which season did the Milwaukee Bucks have the highest average fg_pct at home? SELECT season_id, AVG(fg_pct_home) as avg_stat FROM game WHERE team_name_home = 'Milwaukee Bucks' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 42017.0 True
18
+ What is the shortest player's height? SELECT MIN(height) FROM players WHERE height IS NOT NULL; 145.0 False
19
+ How many matches has 'Rafael Nadal' lost? SELECT count(*) FROM matches WHERE loser_name = 'Rafael Nadal'; 255 False
20
+ How many matches went to “best of 5” at the US Open? SELECT COUNT(*) FROM matches WHERE tourney_name = 'US Open' AND best_of = '5'; 14144 False
21
+ Find all players born after January 1, 2008. SELECT name, dob FROM players WHERE dob > 20080000; Vito Antonio Darderi|20080113.0 False
22
+ What is the number of wins Rafael Nadal has at the Australian Open? SELECT COUNT(*) FROM matches WHERE winner_name = 'Rafael Nadal' AND tourney_name = 'Australian Open'; 77 False
23
+ What is the total number of losses Taylor Fritz has at the Roland Garros? SELECT COUNT(*) FROM matches WHERE loser_name = 'Taylor Fritz' AND tourney_name = 'Roland Garros'; 7 False
24
+ What is the state of the team nicknamed 'Jazz'? SELECT state FROM team WHERE nickname = 'Jazz'; Utah True
25
+ How many games did the Sacramento Kings lose at home in 1996? SELECT COUNT(*) as home_losses FROM game WHERE team_name_home = 'Sacramento Kings' AND wl_home = 'L' AND season_id = '21996'; 19.0 True
26
+ Who won the longest match recorded in the 2021 season? SELECT winner_name FROM matches WHERE tourney_date BETWEEN 20210000 AND 20211231 ORDER BY minutes DESC LIMIT 1; Jozef Kovalik False
27
+ How many fastbreak points did the Los Angeles Clippers average in home games during the 2020 season? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'LA Clippers' AND g.season_id = '22020'; 11.5 True
28
+ How many games did the Cleveland Cavaliers play at home with more than 8 times tied in 1996? SELECT COUNT(*) as games FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Cleveland Cavaliers' AND os.times_tied > 8 AND g.season_id = '21996'; 5.0 True
29
+ Which team had the highest average points from second chance opportunities in home games they won during the 2016 season? SELECT g.team_name_home, AVG(o.pts_2nd_chance_home) AS avg_second_chance_pts FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.wl_home = 'W' AND g.season_id = '22016' GROUP BY g.team_name_home ORDER BY avg_second_chance_pts DESC LIMIT 1; Los Angeles Lakers | 15.6153846153846 True
30
+ Which team had the most games where both teams scored over 110 points? SELECT team FROM (SELECT team_abbreviation_home AS team FROM game WHERE pts_home > 110 AND pts_away > 110 UNION ALL SELECT team_abbreviation_away FROM game WHERE pts_home > 110 AND pts_away > 110) GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; LAL True
31
+ Get the total points for all ranked players by country (only the top 10) SELECT p.ioc, SUM(r.Points) as total_points FROM rankings r JOIN players p ON r.player = p.player_id WHERE r.ranking_date = (SELECT MAX(ranking_date) FROM rankings) GROUP BY p.ioc ORDER BY total_points DESC LIMIT 10; USA|26319.0 ITA|24835.0 FRA|24732.0 RUS|17876.0 ESP|17474.0 ARG|16757.0 AUS|14319.0 GER|14093.0 SRB|13632.0 GBR|8007.0 False
32
+ How many players have 'Tennis' in their name? SELECT COUNT(*) FROM players WHERE name LIKE '%Tennis%'; 0 False
33
+ When was the last time the New York Knicks won a home game? SELECT game_date FROM game WHERE team_abbreviation_home = 'NYK' AND wl_home = 'W' ORDER BY game_date DESC LIMIT 1; 2023-05-10 00:00:00 True
34
+ What is the total number of fast break points scored by the Memphis Grizzlies at home during the 2005 season? SELECT SUM(pts_fb_home) FROM other_stats WHERE game_id IN ( SELECT game_id FROM game WHERE team_name_home = 'Memphis Grizzlies' AND season_id = '22005' ); 368 True
35
+ How many times did matches result in a retirement? SELECT COUNT(*) FROM matches WHERE score LIKE '%RET%'; 27840 False
36
+ What is the average number of assists per game for the Golden State Warriors when they won during the 2018 season? SELECT AVG(assists) AS avg_assists FROM ( SELECT ast_home AS assists FROM game WHERE team_name_home = 'Golden State Warriors' AND wl_home = 'W' AND season_id = '22018' UNION ALL SELECT ast_away AS assists FROM game WHERE team_name_away = 'Golden State Warriors' AND wl_away = 'W' AND season_id = '22018' ) AS winning_games 31 True
37
+ How many games did the Golden State Warriors lose away in 1996? SELECT COUNT(*) as away_losses FROM game WHERE team_name_away = 'Golden State Warriors' AND wl_away = 'L' AND season_id = '21996'; 29.0 True
38
+ What is the total points in the paint by the Chicago Bulls at home in games they lost in 1996? SELECT SUM(os.pts_paint_home) as total_pts_paint FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Chicago Bulls' AND g.wl_home = 'L' AND g.season_id = '21996'; 56.0 True
39
+ When was the Los Angeles Clippers team founded according to the team database? SELECT year_founded FROM team WHERE full_name = 'Los Angeles Clippers'; 1970 True
40
+ What is the average age of winners of the 'Australian Open'? SELECT avg(winner_age) FROM matches WHERE tourney_name = 'Australian Open'; 25.6905314807382 False
41
+ Which team is located in the state of Indiana? SELECT full_name FROM team WHERE state = 'Indiana'; Indiana Pacers True
42
+ In how many different tournaments did Pete Sampras play? SELECT COUNT(DISTINCT tourney_name) FROM matches WHERE winner_name = 'Pete Sampras' OR loser_name = 'Pete Sampras'; 82 False
43
+ What is the name and country of the player who lost the longest match (by minutes)? SELECT T1.name, T1.ioc FROM players AS T1 JOIN matches AS T2 ON T1.player_id = T2.loser_id ORDER BY T2.minutes DESC LIMIT 1; Tomas Lipovsek Puches|ARG False
44
+ How many home games did the Orlando Magic play in the 2013 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Orlando Magic' AND season_id = '22013'; 41.0 True
45
+ What was the average number of fastbreak points scored by the Houston Rockets in games they won by more than 15 points at home? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Houston Rockets' AND g.wl_home = 'W' AND (g.pts_home - g.pts_away) > 15; 13.39790576 True
46
+ What is the highest combined ft_pct in any game involving the Los Angeles Lakers? SELECT MAX(ft_pct_home + ft_pct_away) FROM game WHERE team_name_home = 'Los Angeles Lakers' OR team_name_away = 'Los Angeles Lakers'; 1.957 True
47
+ What is the total number of points scored by the Los Angeles Clippers in the 2014 season in games where they had more team turnovers but fewer total turnovers than their opponent? SELECT SUM(g.pts_home) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'LAC' AND o.team_turnovers_home > o.team_turnovers_away AND o.total_turnovers_home < o.total_turnovers_away AND g.season_id = '22014'; 295.0 True
48
+ Which away team scored the most points off turnovers in a single game? SELECT team_abbreviation_away FROM other_stats ORDER BY pts_off_to_away DESC LIMIT 1; ATL True
49
+ What is the total number of rebounds by the San Antonio Spurs in home games during the 2015 season? SELECT SUM(reb_home) FROM game WHERE team_abbreviation_home = 'SAS' AND season_id = '22015'; 1845.0 True
50
+ What was Roger Federer’s average age when winning matches at Wimbledon? SELECT AVG(winner_age) FROM matches WHERE tourney_name = 'Wimbledon' AND winner_name = 'Roger Federer'; 29.3160377358491 False
51
+ How many matches were won by players under 25? SELECT COUNT(*) FROM matches WHERE winner_age < 25; 573136 False
52
+ How many points did the Phoenix Suns score in the highest scoring away game they played? SELECT MAX(pts_away) FROM game WHERE team_abbreviation_away = 'PHX'; 161.0 True
53
+ What is the average loser age in Wimbledon finals (best of 5)? SELECT AVG(loser_age) FROM matches WHERE tourney_name = 'Wimbledon' AND best_of = '5'; 26.8972819437329 False
54
+ What was the total score of the only game in which the home team made exactly 33 field goals? SELECT pts_home + pts_away FROM game WHERE fgm_home = 33 LIMIT 1; 144.0 True
55
+ What is the maximum number of minutes John McEnroe played at Wimbledon? SELECT MAX(minutes) FROM matches WHERE (winner_name = 'John McEnroe' OR loser_name = 'John McEnroe') AND tourney_name = 'Wimbledon'; 249.0 False
56
+ What is the average number of pts in away games by the Miami Heat? SELECT AVG(pts_away) FROM game WHERE team_name_away = 'Miami Heat'; 96.7824377457405 True
57
+ What is the largest margin of victory in a game, whether home or away? SELECT game_date, ABS(pts_home - pts_away) AS margin FROM game ORDER BY margin DESC LIMIT 1; 2021-12-02 00:00:00|73.0 True
58
+ How many matches has Carlos Alcaraz won at Wimbledon? SELECT COUNT(*) FROM matches WHERE winner_name = 'Carlos Alcaraz' AND tourney_name = 'Wimbledon'; 11 False
59
+ How many players are taller than the average height? SELECT COUNT(*) FROM players WHERE height > (SELECT AVG(height) FROM players); 1366 False
60
+ How many matches did Italy (ITA) win against Spain (ESP) in 2023? SELECT COUNT(*) FROM matches WHERE winner_ioc = 'ITA' AND loser_ioc = 'ESP' AND tourney_date BETWEEN 20230000 AND 20231231; 117 False
61
+ What is the highest points scored by the Miami Heat at home when they had more than 10 second chance points? SELECT MAX(g.pts_home) as max_points FROM game g JOIN other_stats os ON g.game_id = os.game_id WHERE g.team_name_home = 'Miami Heat' AND os.pts_2nd_chance_home > 10; 149.0 True
62
+ What is the highest fast break points by the Houston Rockets at home? SELECT MAX(pts_fb_home) as max_fb_points FROM other_stats WHERE team_abbreviation_home = 'HOU'; 37.0 True
63
+ What is the average number of pts in away games by the Portland Trail Blazers? SELECT AVG(pts_away) FROM game WHERE team_name_away = 'Portland Trail Blazers'; 102.6668215613383 True
64
+ What is Andre Agassi’s dominant playing hand? SELECT hand FROM players WHERE name = 'Andre Agassi'; R False
65
+ What was the total number of points in the game where both teams had the exact same number of personal fouls? SELECT pts_home + pts_away FROM game WHERE pf_home = pf_away ORDER BY game_date DESC LIMIT 1; 258.0 True
66
+ What is the total points in the paint by the Milwaukee Bucks away? SELECT SUM(pts_paint_away) as total_pts_paint FROM other_stats WHERE team_abbreviation_away = 'MIL'; 39056.0 True
67
+ What is the average points in the paint by the Utah Jazz away when they won? SELECT AVG(os.pts_paint_away) as avg_pts_paint FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Utah Jazz' AND g.wl_away = 'W'; 42.48 True
68
+ What is the average number of fg_pct in away games by the Los Angeles Lakers? SELECT AVG(fg_pct_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 0.4678996728462382 True
69
+ Which player defeated Andre Agassi the most at the US Open? SELECT winner_name, COUNT(*) AS wins FROM matches WHERE loser_name = 'Andre Agassi' AND tourney_name = 'US Open' GROUP BY winner_name ORDER BY wins DESC LIMIT 1; Pete Sampras|4 False
70
+ How many three-pointers did the Golden State Warriors make in total during the 2016 season? SELECT SUM(fg3m_home + fg3m_away) AS total_three_pointers FROM game WHERE season_id = '22016' AND (team_name_home = 'Golden State Warriors' OR team_name_away = 'Golden State Warriors'); 1719.0 True
71
+ What is the full name of the team based in Dallas? SELECT full_name FROM team WHERE city = 'Dallas'; Dallas Mavericks True
72
+ Find the average points for Pete Sampras across all rankings. SELECT AVG(Points) FROM rankings r JOIN players p ON r.player = p.player_id WHERE p.name = 'Pete Sampras'; False
73
+ How many players have the same birthday as Rafael Nadal? SELECT COUNT(*) FROM players WHERE dob = (SELECT dob FROM players WHERE name = 'Rafael Nadal') AND name != 'Rafael Nadal'; 4 False
74
+ How many times has Carlos Alcaraz defeated Novak Djokovic? SELECT COUNT(*) FROM matches WHERE winner_name = 'Carlos Alcaraz' AND loser_name = 'Novak Djokovic'; 2 False
75
+ How many games did the Los Angeles Lakers play away in 1996? SELECT COUNT(*) as away_games FROM game WHERE team_name_away = 'Los Angeles Lakers' AND season_id = '21996'; 41.0 True
76
+ Which game had the highest total points scored by both teams when the Los Angeles Lakers played at home? SELECT game_id, (pts_home + pts_away) AS total_points FROM game WHERE team_abbreviation_home = 'LAL' ORDER BY total_points DESC LIMIT 1; (0028000933, 294.0) True
77
+ What was the highest number of steals by the Detroit Pistons in a single game during the 2004 season? SELECT MAX(stl) AS max_steals FROM ( SELECT stl_home AS stl FROM game WHERE team_abbreviation_home = 'DET' AND season_id = '22004' UNION ALL SELECT stl_away AS stl FROM game WHERE team_abbreviation_away = 'DET' AND season_id = '22004' ); 13 True
78
+ What is the average age of Alexander Zverev when losing matches? SELECT AVG(loser_age) FROM matches WHERE loser_name = 'Alexander Zverev'; 20.9510638297872 False
79
+ What was the lowest-scoring game involving the Indiana Pacers in the 1994 season? SELECT MIN(total_points) AS lowest_scoring_game FROM ( SELECT (pts_home + pts_away) AS total_points FROM game WHERE season_id = '21994' AND (team_abbreviation_home = 'IND' OR team_abbreviation_away = 'IND') ); 155.0 True
80
+ What is the shortest match played by Novak Djokovic? SELECT MIN(minutes) FROM matches WHERE winner_name = 'Novak Djokovic' OR loser_name = 'Novak Djokovic'; 0.0 False
81
+ What is the date of birth of Andy Murray? SELECT dob FROM players WHERE name = 'Andy Murray'; 19870515 False
82
+ What is the average number of tov in away games by the Los Angeles Lakers? SELECT AVG(tov_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 14.554896142433234 True
83
+ What is the average number of points in the paint allowed by the Philadelphia 76ers when playing at home in the 2020 season in games with more than 15 lead changes? SELECT AVG(o.pts_paint_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'PHI' AND g.season_id = '22020' AND o.lead_changes > 15; 50.0 True
84
+ What was the highest total rebound count by an away team in a game? SELECT team_abbreviation_away, reb_away, game_date FROM game ORDER BY reb_away DESC LIMIT 1; BOS|90.0|1957-10-22 00:00:00 True
85
+ What is the highest three-point percentage the Phoenix Suns achieved in an away game? SELECT MAX(fg3_pct_away) FROM game WHERE team_abbreviation_away = 'PHX'; 1 True
86
+ What is the most three-pointers the Brooklyn Nets have ever made in a home game? SELECT MAX(fg3m_home) FROM game WHERE team_name_home = 'Brooklyn Nets'; 22.0 True
87
+ How many matches has Alexander Zverev won against top 10 opponents? SELECT COUNT(*) FROM matches m JOIN rankings r ON m.loser_id = r.player AND m.tourney_date = r.ranking_date WHERE m.winner_name = 'Alexander Zverev' AND r.rank <= 10; 47 False
88
+ What's the average points in the paint for the Boston Celtics in home games where they won by at least 10 points? SELECT AVG(os.pts_paint_home) FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_home = 'Boston Celtics' AND g.plus_minus_home >= 10; 41.85 True
89
+ How many unique tournaments were held in the year 2019? SELECT COUNT(DISTINCT tourney_name) FROM matches WHERE tourney_date BETWEEN 20190000 AND 20191231; 591 False
90
+ How many games did the Chicago Bulls win at home in the 2010 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'CHI' AND wl_home = 'W' AND season_id = '22010'; 36 True
91
+ What was the average margin of victory for the Miami Heat during the 2013 NBA season? SELECT AVG(victory_margin) AS avg_victory_margin FROM ( SELECT plus_minus_home AS victory_margin FROM game WHERE team_name_home = 'Miami Heat' AND wl_home = 'W' AND season_id = '22013' UNION ALL SELECT plus_minus_away AS victory_margin FROM game WHERE team_name_away = 'Miami Heat' AND wl_away = 'W' AND season_id = '22013' ) AS victories 11.48148148 True
92
+ How many matches did Pete Sampras play at Wimbledon? SELECT COUNT(*) FROM matches WHERE tourney_name = 'Wimbledon' AND (winner_name = 'Pete Sampras' OR loser_name = 'Pete Sampras'); 70 False
93
+ What is the highest plus-minus score for the Indiana Pacers at home? SELECT MAX(plus_minus_home) as max_plus_minus FROM game WHERE team_name_home = 'Indiana Pacers'; 65.0 True
94
+ List all games where the Houston Rockets and Dallas Mavericks played each other in the 2015 season. SELECT * FROM game WHERE season_id = '22015' AND ((team_abbreviation_home = 'HOU' AND team_abbreviation_away = 'DAL') OR (team_abbreviation_home = 'DAL' AND team_abbreviation_away = 'HOU')); 22015|1610612745|HOU|Houston Rockets|0021500140|2015-11-14 00:00:00|HOU vs. DAL|L|240|32.0|84.0|0.381|9.0|34.0|0.265|25.0|32.0|0.781|12.0|31.0|43.0|22.0|9.0|5.0|14.0|23.0|98.0|-12|1|1610612742|DAL|Dallas Mavericks|DAL @ HOU|W|43.0|89.0|0.483|8.0|28.0|0.286|16.0|21.0|0.762|8.0|37.0|45.0|24.0|6.0|7.0|11.0|21.0|110.0|12|1|Regular Season 22015|1610612742|DAL|Dallas Mavericks|0021500287|2015-12-04 00:00:00|DAL vs. HOU|L|240|37.0|81.0|0.457|8.0|29.0|0.276|14.0|20.0|0.7|11.0|31.0|42.0|23.0|8.0|5.0|18.0|17.0|96.0|-4|1|1610612745|HOU|Houston Rockets|HOU @ DAL|W|39.0|84.0|0.464|12.0|26.0|0.462|10.0|18.0|0.556|15.0|30.0|45.0|20.0|12.0|5.0|18.0|22.0|100.0|4|1|Regular Season 22015|1610612745|HOU|Houston Rockets|0021500665|2016-01-24 00:00:00|HOU vs. DAL|W|240|43.0|89.0|0.483|15.0|44.0|0.341|14.0|21.0|0.667|9.0|31.0|40.0|27.0|9.0|7.0|9.0|21.0|115.0|11|1|1610612742|DAL|Dallas Mavericks|DAL @ HOU|L|36.0|79.0|0.456|15.0|30.0|0.5|17.0|22.0|0.773|8.0|28.0|36.0|17.0|4.0|4.0|16.0|20.0|104.0|-11|1|Regular Season 22015|1610612742|DAL|Dallas Mavericks|0021501170|2016-04-06 00:00:00|DAL vs. HOU|W|240|33.0|80.0|0.413|10.0|33.0|0.303|12.0|14.0|0.857|13.0|27.0|40.0|19.0|9.0|4.0|14.0|20.0|88.0|2|1|1610612745|HOU|Houston Rockets|HOU @ DAL|L|34.0|78.0|0.436|6.0|20.0|0.3|12.0|18.0|0.667|12.0|29.0|41.0|19.0|6.0|4.0|16.0|17.0|86.0|-2|1|Regular Season True
95
+ What is the average number of three-pointers made by the Golden State Warriors at home in the 2018 season? SELECT AVG(fg3m_home) FROM game WHERE team_abbreviation_home = 'GSW' AND season_id = '22018'; 13.1951219512195 True
96
+ Find players who have improved their ranking by more than 2000 spots SELECT p.name, MAX(r.rank) as old_rank, MIN(r.rank) as new_rank, (MAX(r.rank) - MIN(r.rank)) as improvement FROM rankings r JOIN players p ON r.player = p.player_id GROUP BY p.player_id, p.name HAVING (MAX(r.rank) - MIN(r.rank)) > 2000 ORDER BY improvement DESC; Stefanos Tsitsipas|2208|3|2205 Denis Shapovalov|2151|10|2141 Ryan Peniston|2239|123|2116 Carlos Taberner|2197|85|2112 Bernabe Zapata Miralles|2139|37|2102 Alejandro Tabilo|2095|24|2071 Tomas Barrios Vera|2163|93|2070 Gian Marco Moroni|2221|159|2062 Tallon Griekspoor|2077|21|2056 Miguel Angel Lopez Jaen|2221|171|2050 Tomas Martin Etcheverry|2075|27|2048 Gijs Brouwer|2154|114|2040 Adam Chadaj|2237|202|2035 Soon Woo Kwon|2082|52|2030 Benjamin Bonzi|2070|42|2028 Andrea Pellegrino|2163|136|2027 Martin Verkerk|2035|14|2021 Marc Andrea Huesler|2060|47|2013 Thai Son Kwiatkowski|2189|181|2008 Lukas Klein|2122|116|2006 Altug Celikbilek|2159|154|2005 Ivan Gakhov|2143|142|2001 Jay Clarke|2154|153|2001 False
97
+ How many games did the Oklahoma City Thunder score more than 30 points in the first quarter during the 2017 season? SELECT COUNT(*) AS high_scoring_first_quarters FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE (g.team_name_home = 'Oklahoma City Thunder' AND g.pts_home / 4 > 30) OR (g.team_name_away = 'Oklahoma City Thunder' AND g.pts_away / 4 > 30) AND g.season_id = '22017'; 83 True
98
+ Show countries with more than 1000 players SELECT ioc, COUNT(*) as player_count FROM players GROUP BY ioc HAVING COUNT(*) > 1000 ORDER BY player_count DESC; USA|13102 AUS|3266 GBR|3200 ESP|3026 GER|2675 ITA|2656 FRA|2582 BRA|2092 ARG|1759 MEX|1323 JPN|1305 RUS|1093 IND|1078 RSA|1040 False
99
+ What is the highest number of points the Golden State Warriors have ever scored in a single home game? SELECT MAX(pts_home) FROM game WHERE team_abbreviation_home = 'GSW'; 149.0 True
100
+ How many right handed players are there? SELECT COUNT(*) FROM players WHERE hand = 'R'; 15666 False
101
+ What was the difference in second-chance points between the Chicago Bulls and their opponents in their closest home game of the 2016 season? SELECT o.pts_2nd_chance_home - o.pts_2nd_chance_away AS second_chance_diff FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Chicago Bulls' AND g.season_id = '22016' ORDER BY ABS(g.pts_home - g.pts_away) ASC LIMIT 1; -5 True
102
+ Find all players who have a higher rank than 'Rafael Nadal' on '2023-05-29'. SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T2.ranking_date = 20230529 AND T2.rank < (SELECT rank FROM rankings AS T3 JOIN players AS T4 ON T3.player = T4.player_id WHERE T4.name = 'Rafael Nadal' AND T3.ranking_date = 20230529); Carlos Alcaraz Daniil Medvedev Novak Djokovic Casper Ruud Stefanos Tsitsipas Holger Rune Andrey Rublev Taylor Fritz Jannik Sinner Felix Auger Aliassime Karen Khachanov Frances Tiafoe Cameron Norrie Hubert Hurkacz False
103
+ What is the total second chance points by the Miami Heat at home? SELECT SUM(pts_2nd_chance_home) as total_2nd_chance FROM other_stats WHERE team_abbreviation_home = 'MIA'; 11670.0 True
104
+ How many points did the home team score in the game with the most second chance points? SELECT pts_home FROM game WHERE game_id = (SELECT game_id FROM other_stats ORDER BY (pts_2nd_chance_home + pts_2nd_chance_away) DESC LIMIT 1); 115.0 True
105
+ How many times has Pete Sampras been ranked in the top 5? SELECT COUNT(*) AS top5_count FROM rankings r JOIN players p ON r.player = p.player_id WHERE p.name = 'Pete Sampras' AND r.rank <= 5; 509 False
106
+ List the names of players who have been ranked #1 for at least one week in 2023. SELECT DISTINCT p.name FROM rankings r JOIN players p ON r.player = p.player_id WHERE r.rank = 1 AND r.ranking_date BETWEEN 20230000 AND 20231231; Carlos Alcaraz Novak Djokovic False
107
+ How many total turnovers did the Sacramento Kings commit in the 2001 season? SELECT SUM(tov) AS total_turnovers FROM ( SELECT tov_home AS tov FROM game WHERE team_abbreviation_home = 'SAC' AND season_id = '22001' UNION ALL SELECT tov_away AS tov FROM game WHERE team_abbreviation_away = 'SAC' AND season_id = '22001' ); 1128.0 True
108
+ Which team founded in the 70s has a nickname starting with 'C'? SELECT full_name FROM team WHERE year_founded BETWEEN 1970 AND 1979 AND nickname LIKE 'C%'; Cleveland Cavaliers, Los Angeles Clippers True
109
+ What is the average height of winners in the Wimbledon final over all years? SELECT AVG(winner_ht) FROM matches WHERE tourney_name = 'Wimbledon' AND round = 'F'; 184.229508196721 False
110
+ In which season did the Boston Celtics have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Boston Celtics' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 1958.0 True
111
+ What is the name of the player who was ranked #1 on 2023-06-12? SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T2.rank = 1 AND T2.ranking_date = 20230612; Novak Djokovic False
112
+ What is the highest combined score in a game between the Golden State Warriors and the Cleveland Cavaliers? SELECT MAX(pts_home + pts_away) FROM game WHERE (team_name_home = 'Golden State Warriors' AND team_name_away = 'Cleveland Cavaliers') OR (team_name_home = 'Cleveland Cavaliers' AND team_name_away = 'Golden State Warriors'); 266.0 True
113
+ What is the average scoring output for home teams? Round to 2 decimal places. SELECT ROUND(AVG(pts_home),2) AS avg_home_points FROM game WHERE season_type = 'Regular Season'; 104.76 True
114
+ Which team had the most fast break points in a single home game during the 2020 season? SELECT team_name_home, MAX(pts_fb_home) FROM other_stats JOIN game ON other_stats.game_id = game.game_id WHERE game.season_id = '22020'; Houston Rockets|35 True
115
+ What is the highest number of points the Los Angeles Lakers have scored in a single away game? SELECT MAX(pts_away) FROM game WHERE team_abbreviation_away = 'LAL'; 153.0 True
116
+ How many ranking entries exist for player 104925? SELECT count(*) FROM rankings WHERE player = 104925; 988 False
117
+ In which season did the Miami Heat have the highest average ast at home? SELECT season_id, AVG(ast_home) as avg_stat FROM game WHERE team_name_home = 'Miami Heat' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2019.0 True
118
+ What is the Chicago Bulls' largest lead in a home game during the 2016 season? SELECT MAX(plus_minus_home) FROM game WHERE team_abbreviation_home = 'CHI' AND season_id = '22016'; 47 True
119
+ Which country has the tallest average players? SELECT ioc, AVG(height) AS avg_height FROM players GROUP BY ioc ORDER BY avg_height DESC LIMIT 1; YUG|194.0 False
120
+ How many points did the away team score when the home team had more than 20 offensive rebounds? SELECT SUM(pts_away) FROM game WHERE game_id IN (SELECT game_id FROM game WHERE oreb_home > 20); 199836.0 True
121
+ Which team played the most total games (home + away) between 1995 and 2005? SELECT team FROM (SELECT team_abbreviation_home AS team FROM game WHERE season_id BETWEEN '21995' AND '22005' UNION ALL SELECT team_abbreviation_away FROM game WHERE season_id BETWEEN '21995' AND '22005') GROUP BY team ORDER BY COUNT(*) DESC LIMIT 1; WAS True
122
+ In the 2001 season, what was the average number of second chance points scored by the opponents when the Atlanta Hawks played at home and lost? SELECT AVG(o.pts_2nd_chance_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_home = 'ATL' AND g.wl_home = 'L' AND g.season_id = '22001'; 13.333333333333334 True
123
+ How many games did the Cleveland Cavaliers lose away with more than 10 fast break points in 1996? SELECT COUNT(*) as losses FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Cleveland Cavaliers' AND g.wl_away = 'L' AND os.pts_fb_away > 10 AND g.season_id = '21996'; 4.0 True
124
+ What is the highest combined ast in any game involving the Orlando Magic? SELECT MAX(ast_home + ast_away) FROM game WHERE team_name_home = 'Orlando Magic' OR team_name_away = 'Orlando Magic'; 74.0 True
125
+ What is the win percentage of left-handed vs right-handed players? SELECT winner_hand, COUNT(*) as wins, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (), 2) as win_percentage FROM matches WHERE winner_hand IN ('L', 'R') GROUP BY winner_hand; L|100862|12.85 R|683813|87.15 False
126
+ What is the average age of losers? SELECT avg(loser_age) FROM matches; 23.6776674381365 False
127
+ In which season did the Charlotte Hornets have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Charlotte Hornets' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2017.0 True
128
+ In the 2020 season, what was the average number of second chance points allowed by the New Orleans Pelicans in games they won by less than 5 points? SELECT AVG(o.pts_2nd_chance_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE ((g.team_abbreviation_home = 'NOP' AND g.wl_home = 'W' AND ABS(g.pts_home - g.pts_away) < 5) OR (g.team_abbreviation_away = 'NOP' AND g.wl_away = 'W' AND ABS(g.pts_home - g.pts_away) < 5)) AND g.season_id = '22020'; 16.6 True
129
+ Which home team had the most games with a positive plus-minus but still lost? SELECT team_name_home FROM game WHERE wl_home = 'L' AND plus_minus_home > 0 GROUP BY team_name_home ORDER BY COUNT(*) DESC LIMIT 1; West NBA All Stars West True
130
+ Find the country with the highest average player height. SELECT ioc FROM players GROUP BY ioc HAVING COUNT(*) > 5 ORDER BY AVG(height) DESC LIMIT 1; YUG False
131
+ What is the average number of ft_pct in home games by the Charlotte Hornets? SELECT AVG(ft_pct_home) FROM game WHERE team_name_home = 'Charlotte Hornets'; 0.7601475237091683 True
132
+ Count the number of matches where both players were from the same country at the US Open. SELECT COUNT(*) FROM matches WHERE tourney_name = 'US Open' AND winner_ioc = loser_ioc; 6604 False
133
+ List all matches in 2023 where the winner was from 'ECU' and the match lasted more than 3 hours (180 mins). SELECT tourney_name, winner_name, loser_name FROM matches WHERE winner_ioc = 'ECU' AND minutes > 180 AND tourney_date >= 20230000 AND tourney_date < 20240000; Lima CH|Alvaro Guillen Meza|Ignacio Buse Montevideo CH|Alvaro Guillen Meza|Luciano Darderi Montevideo CH|Alvaro Guillen Meza|Renzo Olivo False
134
+ How many matches were played in each best-of format? SELECT best_of, COUNT(*) as match_count FROM matches WHERE LENGTH(best_of) = 1 GROUP BY best_of; 1|36 3|853783 5|67502 F|653 False
135
+ How many times did the Miami Heat score more than 120 points at home in the 2015 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'MIA' AND season_id = '22015' AND pts_home > 120; 3 True
136
+ What is the total number of wins Pete Sampras has at the US Open? SELECT COUNT(*) FROM matches WHERE winner_name = 'Pete Sampras' AND tourney_name = 'US Open'; 71 False
137
+ What is the largest margin of victory the Miami Heat have ever had in an away game? SELECT MAX(ABS(pts_away - pts_home)) AS largest_margin FROM game WHERE team_abbreviation_away = 'MIA' AND pts_away > pts_home; 34.0 True
138
+ What is the average height of all players? SELECT AVG(height) FROM players; 183.74813763746 False
139
+ How many matches did Novak Djokovic win at Roland Garros? SELECT COUNT(*) FROM matches WHERE winner_name = 'Novak Djokovic' AND tourney_name = 'Roland Garros'; 96 False
140
+ How many total offensive rebounds did the Houston Rockets have in away games during the 2018 season? SELECT SUM(oreb_away) FROM game WHERE team_name_away = 'Houston Rockets' AND season_id = '22018'; 419.0 True
141
+ What is the minimum age Rafael Nadal lost a match? SELECT MIN(loser_age) FROM matches WHERE loser_name = 'Rafael Nadal'; 15.2 False
142
+ What is the most common country ('ioc') for players? SELECT ioc FROM players GROUP BY ioc ORDER BY count(*) DESC LIMIT 1; USA False
143
+ List the total points for Carlos Alcaraz on the last recorded ranking date. SELECT Points FROM rankings WHERE player = (SELECT player_id FROM players WHERE name = 'Carlos Alcaraz') ORDER BY ranking_date DESC LIMIT 1; 7300.0 False
144
+ What is the highest rank achieved by Jannik Sinner in 2023? SELECT MIN(rank) FROM rankings r JOIN players p ON r.player = p.player_id WHERE p.name = 'Jannik Sinner' AND r.ranking_date BETWEEN 20230000 AND 20231231; 4 False
145
+ How many free throws did the Houston Rockets attempt in away games they won during the 2020 season? SELECT SUM(fta_away) FROM game WHERE team_name_away = 'Houston Rockets' AND wl_away = 'W' AND season_id = '22020'; 149.0 True
146
+ How many left-handed players are ranked in the top 50 on 2016-07-25? SELECT COUNT(DISTINCT p.player_id) FROM players p JOIN rankings r ON p.player_id = r.player WHERE p.hand = 'L' AND r.rank <= 50 AND r.ranking_date = 20160725; 9 False
147
+ What is the maximum number of minutes played in a single match? SELECT MAX(minutes) FROM matches; 4756.0 False
148
+ What was the average number of offensive rebounds per game for the Chicago Bulls in the 2019 season? SELECT AVG(oreb) AS avg_offensive_rebounds FROM ( SELECT game_id, oreb_home AS oreb FROM game WHERE team_name_home = 'Chicago Bulls' AND season_id = '22019' UNION ALL SELECT game_id, oreb_away AS oreb FROM game WHERE team_name_away = 'Chicago Bulls' AND season_id = '22019' ); 10.46153846 True
149
+ How many games did the Boston Celtics win on the road during the 2018 season? SELECT COUNT(*) AS away_wins FROM game WHERE team_name_away = 'Boston Celtics' AND wl_away = 'W' AND season_id = '22018'; 21 True
150
+ What is the lowest number of points the Golden State Warriors have scored in an away game? SELECT MIN(pts_away) FROM game WHERE team_abbreviation_away = 'GSW'; 65.0 True
151
+ What was the difference in average free throw attempts between the Brooklyn Nets and their opponents in home games during the 2020 season? SELECT AVG(fta_home - fta_away) AS fta_diff FROM game WHERE team_name_home = 'Brooklyn Nets' AND season_id = '22020'; 1.083333333 True
152
+ In which season did the Boston Celtics have the highest average tov at home? SELECT season_id, AVG(tov_home) as avg_stat FROM game WHERE team_name_home = 'Boston Celtics' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2005.0 True
153
+ How many team turnovers did the New York Knicks have at home? SELECT SUM(team_turnovers_home) as total_team_turnovers FROM other_stats WHERE team_abbreviation_home = 'NYK'; 550.0 True
154
+ How many games did the Milwaukee Bucks play at home during the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Milwaukee Bucks' AND season_id = '22020'; 36 True
155
+ What percentage of matches are won by the taller player? SELECT (CAST(SUM(CASE WHEN winner_ht > loser_ht THEN 1 ELSE 0 END) AS FLOAT) / COUNT(*)) * 100 as percentage FROM matches WHERE winner_ht IS NOT NULL AND loser_ht IS NOT NULL; 45.2674228807518 False
156
+ Who was the number 1 ranked player on March 20, 2023? SELECT p.name FROM players AS p JOIN rankings AS r ON p.player_id = r.player WHERE r.rank = 1 AND r.ranking_date = 20230320; Carlos Alcaraz False
157
+ What is the maximum height of any opponent Jannik Sinner has beaten? SELECT MAX(loser_ht) FROM matches WHERE winner_name = 'Jannik Sinner'; 211.0 False
158
+ What is the total number of points scored by the Milwaukee Bucks away when they had more than 5 lead changes? SELECT SUM(g.pts_away) as total_points FROM game g JOIN other_stats os ON g.game_id = os.game_id WHERE g.team_name_away = 'Milwaukee Bucks' AND os.lead_changes > 5; 44835.0 True
159
+ Find the name and height of the player with ID 104745. SELECT name, height FROM players WHERE player_id = 104745; Rafael Nadal|185.0 False
160
+ How many matches were won by a player who was ranked number 1 at the time of the match? SELECT count(*) FROM matches AS T1 JOIN rankings AS T2 ON T1.winner_id = T2.player AND T1.tourney_date = T2.ranking_date WHERE T2.rank = 1; 2369 False
161
+ Find the tournament with the fewest matches overall. SELECT tourney_name, COUNT(*) AS match_count FROM matches GROUP BY tourney_name ORDER BY match_count ASC LIMIT 1; Cannes Chps|1 False
162
+ What is the highest combined ast in any game involving the Boston Celtics? SELECT MAX(ast_home + ast_away) FROM game WHERE team_name_home = 'Boston Celtics' OR team_name_away = 'Boston Celtics'; 79.0 True
163
+ How many matches has each player won? Show the top 10. SELECT winner_name, count(*) FROM matches GROUP BY winner_name ORDER BY count(*) DESC LIMIT 10; Roger Federer|1305 Jimmy Connors|1279 Novak Djokovic|1179 Rafael Nadal|1167 Ivan Lendl|1075 Guillermo Vilas|953 Ilie Nastase|950 Andre Agassi|887 John McEnroe|886 False
164
+ Which player had the most match wins? SELECT winner_name, COUNT(*) AS wins FROM matches GROUP BY winner_name ORDER BY wins DESC LIMIT 1; |26399 False
165
+ What is the average age difference between winners and losers? SELECT AVG(winner_age - loser_age) FROM matches WHERE winner_age IS NOT NULL AND loser_age IS NOT NULL; 0.35414132842825 False
166
+ What is the highest number of three-pointers made in a single game by the Houston Rockets at home? SELECT MAX(fg3m_home) FROM game WHERE team_name_home = 'Houston Rockets'; 27.0 True
167
+ Which player has defeated Rafael Nadal the most? SELECT winner_name, COUNT(*) AS wins_against FROM matches WHERE loser_name = 'Rafael Nadal' GROUP BY winner_name ORDER BY wins_against DESC LIMIT 1; Novak Djokovic|30 False
168
+ What was the average points scored by the Denver Nuggets in home games during the 2019 season? SELECT AVG(pts_home) AS avg_home_points FROM game WHERE team_name_home = 'Denver Nuggets' AND season_id = '22019'; 111.8378378 True
169
+ What is the average rank of winners in the Roland Garros tournament? SELECT AVG(r.rank) FROM matches m JOIN rankings r ON m.winner_id = r.player AND m.tourney_date = r.ranking_date WHERE m.tourney_name = 'Roland Garros'; 91.8841528594335 False
170
+ How many players from France are in the database? SELECT COUNT(*) FROM players WHERE ioc = 'FRA'; 2582 False
171
+ Which team had the best three-point shooting percentage in home games during the 2020 season? SELECT team_name_home, AVG(fg3_pct_home) AS avg_3pt_pct FROM game WHERE season_id = '22020' GROUP BY team_name_home ORDER BY avg_3pt_pct DESC LIMIT 1; LA Clippers | 0.423777777777778 True
172
+ What is the highest number of rebounds recorded by a home team in a game during the 2005 season? SELECT MAX(reb_home) FROM game WHERE season_id = '22005'; 65.0 True
173
+ What was the average margin of victory for the Boston Celtics in home games during the 2000 season? SELECT AVG(pts_home - pts_away) AS avg_victory_margin FROM game WHERE team_name_home = 'Boston Celtics' AND wl_home = 'W' AND season_id = '22000'; 9.75 True
174
+ What is the total second chance points by the Washington Wizards away? SELECT SUM(pts_2nd_chance_away) as total_2nd_chance FROM other_stats WHERE team_abbreviation_away = 'WAS'; 13226.0 True
175
+ Which team had the worst average point differential in the 2007 season? SELECT team_abbreviation, AVG(point_diff) AS avg_point_differential FROM ( SELECT team_abbreviation_home AS team_abbreviation, (pts_home - pts_away) AS point_diff FROM game WHERE season_id = '22007' UNION ALL SELECT team_abbreviation_away, (pts_away - pts_home) FROM game WHERE season_id = '22007' ) GROUP BY team_abbreviation ORDER BY avg_point_differential ASC LIMIT 1; SEA|-8.75609756097561 True
176
+ What is the average number of three-pointers made by away teams in games where they had more turnovers than assists? SELECT AVG(fg3m_away) FROM game WHERE tov_away > ast_away; 4.511052937754508 True
177
+ What was the lowest number of combined turnovers in any game involving the San Antonio Spurs during the 2019 season? SELECT MIN(o.total_turnovers_home + o.total_turnovers_away) AS min_combined_turnovers FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE (g.team_name_home = 'San Antonio Spurs' OR g.team_name_away = 'San Antonio Spurs') AND g.season_id = '22019'; 13 True
178
+ How many games had at least one team with 30+ assists? SELECT COUNT(*) FROM game WHERE ast_home >= 30 OR ast_away >= 30; 11305 True
179
+ What country is Rafael Nadal from? SELECT ioc FROM players WHERE name = 'Rafael Nadal'; ESP False
180
+ What is the average second-chance points for Toronto Raptors home games between 2015-2020? SELECT AVG(os.pts_2nd_chance_home) AS avg_second_chance FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_abbreviation_home = 'TOR' AND g.season_id BETWEEN '22015' AND '22020'; 13.07653061 True
181
+ What are the nicknames of teams based in Florida? SELECT nickname FROM team WHERE state = 'Florida'; Heat, Magic True
182
+ Get the full names of all players taller than 210 cm. SELECT name FROM players WHERE height > 210; Reilly|Opelka False
183
+ How many games did the Sacramento Kings lose away with more than 15 fast break points in 1996? SELECT COUNT(*) as losses FROM other_stats os JOIN game g ON os.game_id = g.game_id WHERE g.team_name_away = 'Sacramento Kings' AND g.wl_away = 'L' AND os.pts_fb_away > 15 AND g.season_id = '21996'; 10.0 True
184
+ Show players born in the year 2008 SELECT name, dob FROM players WHERE dob >= 20080000 AND dob < 20090000; Vito Antonio Darderi|20080113.0 False
185
+ How many times did the Memphis Grizzlies lose at home in the 2008 season despite recording more steals and blocks than their opponent? SELECT COUNT(*) FROM game g WHERE g.team_abbreviation_home = 'MEM' AND g.wl_home = 'L' AND g.stl_home > g.stl_away AND g.blk_home > g.blk_away AND g.season_id = '22008'; 3 True
186
+ How many times were games tied when the Indiana Pacers played away? SELECT SUM(times_tied) as total_times_tied FROM other_stats WHERE team_abbreviation_away = 'IND'; 4910.0 True
187
+ What was the most blocks recorded by the Orlando Magic in a single home game in the 1999 season? SELECT MAX(blk_home) AS max_blocks FROM game WHERE team_abbreviation_home = 'ORL' AND season_id = '21999'; 10.0 True
188
+ What team had the most turnovers in a single game during the 2019 season? SELECT CASE WHEN tov_home > tov_away THEN team_name_home ELSE team_name_away END AS team_with_most_turnovers FROM game WHERE season_id = '22019' ORDER BY CASE WHEN tov_home > tov_away THEN tov_home ELSE tov_away END DESC LIMIT 1 Sacramento Kings True
189
+ What was the total number of points in the only game where the sum of both teams' free throws made was exactly 42? SELECT pts_home + pts_away FROM game WHERE (ftm_home + ftm_away) = 42 LIMIT 1; 156.0 True
190
+ How many games did the Boston Celtics win at home during the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Boston Celtics' AND wl_home = 'W' AND season_id = '22020'; 21 True
191
+ How many away games did the Chicago Bulls play in the 2020 season? SELECT COUNT(*) FROM game WHERE team_name_away = 'Chicago Bulls' AND season_id = '22020'; 36.0 True
192
+ Which team had the most away games where they had more offensive than defensive rebounds? SELECT team_abbreviation_away FROM game WHERE oreb_away > dreb_away GROUP BY team_abbreviation_away ORDER BY COUNT(*) DESC LIMIT 1; ATL True
193
+ What is the Los Angeles Lakers' largest lead in a home game during the 2016 season? SELECT MAX(plus_minus_home) FROM game WHERE team_abbreviation_home = 'LAL' AND season_id = '22016'; 27 True
194
+ Who is the youngest player currently in the top 100 (based on latest ranking date)? SELECT p.name FROM players p JOIN rankings r ON p.player_id = r.player WHERE r.ranking_date = (SELECT MAX(ranking_date) FROM rankings) AND r.rank <= 100 ORDER BY p.dob DESC LIMIT 1; Jakub Mensik False
195
+ How many home games did the Los Angeles Lakers play in the 2022 season? SELECT COUNT(*) FROM game WHERE team_name_home = 'Los Angeles Lakers' AND season_id = '22022'; 41.0 True
196
+ What was the highest combined steals and blocks total for the Toronto Raptors in any home game during their championship season? SELECT MAX(stl_home + blk_home) AS combined_steals_blocks FROM game WHERE team_name_home = 'Toronto Raptors' AND season_id = '22019'; 24 True
197
+ List the players who have beaten Novak Djokovic more than twice. SELECT winner_name FROM matches WHERE loser_name = 'Novak Djokovic' GROUP BY winner_name HAVING COUNT(*) > 2; Alexander Zverev Andy Murray Andy Roddick Daniil Medvedev David Ferrer Dominic Thiem Fernando Verdasco Jannik Sinner Jo-Wilfried Tsonga Juan Martin del Potro Mikhail Youzhny Olivier Rochus Rafael Nadal Roberto Bautista Agut Roger Federer Stan Wawrinka Tomas Berdych Tommy Haas False
198
+ List the tourney_name of all tournaments won by Andy Murray in 2016. SELECT DISTINCT tourney_name FROM matches WHERE winner_name = 'Andy Murray' AND round = 'F' AND tourney_date BETWEEN 20160000 AND 20161231; Rome Masters Queen's Club Wimbledon Rio Olympics Beijing Shanghai Masters Vienna Paris Masters Tour Finals False
199
+ What is the average height of players who defeated Roger Federer? SELECT AVG(winner_ht) FROM matches WHERE loser_name = 'Roger Federer'; 186.934482758621 False
200
+ What is the total rebounds by the Miami Heat at home? SELECT SUM(reb_home) as total_rebounds FROM game WHERE team_name_home = 'Miami Heat'; 65199.0 True
201
+ How many points were scored in the earliest recorded game in the database? SELECT (pts_home + pts_away) FROM game ORDER BY game_date ASC LIMIT 1; 134.0 True
202
+ In which season did the Chicago Bulls have the highest average ast at home? SELECT season_id, AVG(ast_home) as avg_stat FROM game WHERE team_name_home = 'Chicago Bulls' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 2021.0 True
203
+ What is the average number of fg_pct in home games by the Chicago Bulls? SELECT AVG(fg_pct_home) FROM game WHERE team_name_home = 'Chicago Bulls'; 0.4636694306246544 True
204
+ How many matches lasted more than 180 minutes? SELECT COUNT(*) FROM matches WHERE minutes > 180; 5425 False
205
+ How many matches were won by a player who lost the first set? SELECT COUNT(*) FROM matches WHERE score LIKE '0-6%' OR score LIKE '1-6%' OR score LIKE '2-6%' OR score LIKE '3-6%' OR score LIKE '4-6%' OR score LIKE '5-6%' OR score LIKE '6-7%'; 138823 False
206
+ What is the highest combined reb in any game involving the San Antonio Spurs? SELECT MAX(reb_home + reb_away) FROM game WHERE team_name_home = 'San Antonio Spurs' OR team_name_away = 'San Antonio Spurs'; 134.0 True
207
+ How many different countries are represented? SELECT COUNT(DISTINCT ioc) FROM players; 226 False
208
+ What is the highest combined total score (home + away) in a single game in the dataset? SELECT game_date, (pts_home + pts_away) AS total_points FROM game ORDER BY total_points DESC LIMIT 1; 2017-02-19 00:00:00|374.0 True
209
+ In games where the Brooklyn Nets scored more than 50 points in the paint at home, what was their assist-to-field goal made ratio? SELECT SUM(g.ast_home) * 1.0 / SUM(g.fgm_home) AS assist_to_fgm_ratio FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Brooklyn Nets' AND o.pts_paint_home > 50; 0.588761175 True
210
+ What is the average number of tov in home games by the Miami Heat? SELECT AVG(tov_home) FROM game WHERE team_name_home = 'Miami Heat'; 14.627184466019418 True
211
+ What is the average height of all US Open winners? SELECT AVG(winner_ht) FROM matches WHERE tourney_name = 'US Open'; 184.635440803266 False
212
+ What was the longest match by duration? SELECT tourney_name, winner_name, loser_name, minutes FROM matches WHERE minutes IS NOT NULL ORDER BY minutes DESC LIMIT 1; Guayaquil CH|Federico Coria|Tomas Lipovsek Puches|4756.0 False
213
+ Which team is based in the city of Chicago? SELECT full_name FROM team WHERE city = 'Chicago'; Chicago Bulls True
214
+ List all players from 'FRA' who were ranked in the top 50 on '2023-01-02'. SELECT T1.name FROM players AS T1 JOIN rankings AS T2 ON T1.player_id = T2.player WHERE T1.ioc = 'FRA' AND T2.rank <= 50 AND T2.ranking_date = 20230102; Adrian Mannarino Arthur Rinderknech False
215
+ What is the average age of match winners? SELECT AVG(winner_age) FROM matches WHERE winner_age IS NOT NULL; 24.0506641635802 False
216
+ What is the win-loss ratio for Jannik Sinner? SELECT SUM(CASE WHEN winner_name = 'Jannik Sinner' THEN 1 ELSE 0 END) * 1.0 / NULLIF(SUM(CASE WHEN loser_name = 'Jannik Sinner' THEN 1 ELSE 0 END), 0) AS win_loss_ratio FROM matches WHERE winner_name = 'Jannik Sinner' OR loser_name = 'Jannik Sinner'; 2.51304347826087 False
217
+ What is the average ranking of players defeated by Novak Djokovic? SELECT AVG(r.rank) FROM matches m JOIN rankings r ON m.loser_id = r.player WHERE m.winner_name = 'Novak Djokovic'; 212.317855446654 False
218
+ How many matches did Taylor Fritz win on grass courts? SELECT COUNT(*) FROM matches WHERE winner_name = 'Taylor Fritz' AND surface = 'Grass'; 32 False
219
+ What is the maximum number of team rebounds recorded by the Dallas Mavericks in away games where they committed more than 20 fouls? SELECT MAX(o.team_rebounds_away) FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_abbreviation_away = 'DAL' AND g.pf_away > 20 AND g.season_id = '22021'; 16 True
220
+ What is the highest number of assists recorded by the Indiana Pacers in a single home game? SELECT MAX(ast_home) FROM game WHERE team_name_home = 'Indiana Pacers'; 44.0 True
221
+ In which season did the Golden State Warriors have the highest average reb at home? SELECT season_id, AVG(reb_home) as avg_stat FROM game WHERE team_name_home = 'Golden State Warriors' GROUP BY season_id ORDER BY avg_stat DESC LIMIT 1; 1974.0 True
222
+ What is the average number of ft_pct in home games by the Los Angeles Lakers? SELECT AVG(ft_pct_home) FROM game WHERE team_name_home = 'Los Angeles Lakers'; 0.7450706106870195 True
223
+ What is the average number of ast in away games by the Los Angeles Lakers? SELECT AVG(ast_away) FROM game WHERE team_name_away = 'Los Angeles Lakers'; 22.594638949671772 True
224
+ What is the total points scored by the Philadelphia 76ers away? SELECT SUM(pts_away) as total_points FROM game WHERE team_name_away = 'Philadelphia 76ers'; 251917.0 True
225
+ What is the average number of ast in home games by the Boston Celtics? SELECT AVG(ast_home) FROM game WHERE team_name_home = 'Boston Celtics'; 24.886892177589857 True
226
+ How many points did the away team score in the only game where the home team had exactly 69 field goal attempts? SELECT pts_away FROM game WHERE fga_home = 69 LIMIT 1; 81.0 True
227
+ Which players scored 50 or more points in a game during the 1990s? SELECT game_id, game_date, CASE WHEN pts_home >= 50 THEN team_name_home ELSE team_name_away END AS team_name, CASE WHEN pts_home >= 50 THEN pts_home ELSE pts_away END AS points FROM game WHERE (pts_home >= 50 OR pts_away >= 50) AND CAST(SUBSTR(season_id, 2) AS INTEGER) BETWEEN 1990 AND 1999 ORDER BY points DESC True
228
+ What is the number of matches Novak Djokovic played in 2019? SELECT COUNT(*) FROM matches WHERE (winner_name = 'Novak Djokovic' OR loser_name = 'Novak Djokovic') AND tourney_date BETWEEN 20190101 AND 20191231; 65 False
229
+ Get the total number of matches Rafael Nadal has played on or after 2021. SELECT COUNT(*) FROM matches WHERE (winner_name = 'Rafael Nadal' OR loser_name = 'Rafael Nadal') AND tourney_date >= 20210101; 93 False
230
+ What is the longest match (in minutes) ever played at the US Open? SELECT MAX(minutes) FROM matches WHERE tourney_name = 'US Open'; 326.0 False
231
+ What is the total number of losses Jannik Sinner has at the Roland Garros? SELECT COUNT(*) FROM matches WHERE loser_name = 'Jannik Sinner' AND tourney_name = 'Roland Garros'; 4 False
232
+ What percentage of matches in 2023 were best of 3 sets? SELECT CAST(SUM(CASE WHEN best_of = '3' THEN 1 ELSE 0 END) AS FLOAT) / COUNT(*) * 100 FROM matches WHERE tourney_date BETWEEN 20230000 AND 20231231; 98.2765787370104 False
233
+ How many distinct countries have had a player ranked in the top 1? SELECT COUNT(DISTINCT p.ioc) FROM rankings r JOIN players p ON r.player = p.player_id WHERE r.rank = 1; 13 False
234
+ What is the lowest plus-minus score for the New York Knicks away? SELECT MIN(plus_minus_away) as min_plus_minus FROM game WHERE team_name_away = 'New York Knicks'; -47.0 True
235
+ How many times did Novak Djokovic beat Roger Federer at Wimbledon? SELECT COUNT(*) FROM matches WHERE winner_name = 'Novak Djokovic' AND loser_name = 'Roger Federer' AND tourney_name = 'Wimbledon'; 3 False
236
+ How many three-pointers did the Golden State Warriors attempt in total during the 2017 season? SELECT SUM(fg3a) AS total_three_attempts FROM ( SELECT fg3a_home AS fg3a FROM game WHERE team_abbreviation_home = 'GSW' AND season_id = '22017' UNION ALL SELECT fg3a_away AS fg3a FROM game WHERE team_abbreviation_away = 'GSW' AND season_id = '22017' ); 2369.0 True
237
+ In 2018, which team has the most home wins and how many home wins did they have? SELECT team_abbreviation_home, COUNT(*) FROM game WHERE wl_home = 'W' AND season_id = '22018' GROUP BY team_abbreviation_home ORDER BY COUNT(*) DESC LIMIT 1; (DEN, 34) True
238
+ What was the average number of fastbreak points scored by the Los Angeles Lakers in home wins during the 2020 season? SELECT AVG(o.pts_fb_home) AS avg_fastbreak_points FROM game g JOIN other_stats o ON g.game_id = o.game_id WHERE g.team_name_home = 'Los Angeles Lakers' AND g.wl_home = 'W' AND g.season_id = '22020'; 13.64705882 True
239
+ How many games did the Miami Heat lose away in the 1996 season? SELECT COUNT(*) as losses FROM game WHERE team_name_away = 'Miami Heat' AND wl_away = 'L' AND season_id = '21996'; 9.0 True
240
+ Find the name of the player who won the longest match (by minutes). SELECT winner_name FROM matches ORDER BY minutes DESC LIMIT 1; Federico Coria False
241
+ What is the average number of ast in away games by the Milwaukee Bucks? SELECT AVG(ast_away) FROM game WHERE team_name_away = 'Milwaukee Bucks'; 22.16927374301676 True
242
+ What was the average points difference in home games won by the Denver Nuggets? SELECT AVG(pts_home - pts_away) FROM game WHERE team_abbreviation_home = 'DEN' AND wl_home = 'W'; 11.96471532 True
243
+ What is the total number of rebounds by the Milwaukee Bucks at home? SELECT SUM(reb_home) as total_rebounds FROM game WHERE team_name_home = 'Milwaukee Bucks'; 76050.0 True
244
+ Which team has the nickname 'Celtics'? SELECT full_name FROM team WHERE nickname = 'Celtics'; Boston Celtics True
245
+ Which away team has scored the most points against the Miami Heat in a single game? SELECT team_name_away, pts_away FROM game WHERE team_abbreviation_home = 'MIA' ORDER BY pts_away DESC LIMIT 1; Milwaukee Bucks|144.0 True
246
+ How many points did the home team score in the game with the most lead changes and the fewest total fouls? SELECT pts_home FROM game WHERE game_id = (SELECT game_id FROM other_stats JOIN game USING(game_id) ORDER BY lead_changes DESC, (pf_home + pf_away) ASC LIMIT 1); 122.0 True
247
+ How many away games did the Miami Heat play in the 2021 season? SELECT COUNT(*) FROM game WHERE team_name_away = 'Miami Heat' AND season_id = '22021'; 41.0 True
248
+ What is the average number of tov in away games by the Miami Heat? SELECT AVG(tov_away) FROM game WHERE team_name_away = 'Miami Heat'; 15.235255570117957 True
249
+ How many times has Jannik Sinner defeated Novak Djokovic? SELECT COUNT(*) FROM matches WHERE winner_name = 'Jannik Sinner' AND loser_name = 'Novak Djokovic'; 3 False
250
+ How many times did the Boston Celtics win at home during the 2015 season? SELECT COUNT(*) FROM game WHERE team_abbreviation_home = 'BOS' AND season_id = '22015' AND wl_home = 'W'; 28 True
251
+ What is the total number of three-pointers made by the Golden State Warriors at home versus the Cleveland Cavaliers in all seasons combined? SELECT SUM(fg3m_home) AS total_threes FROM game WHERE team_name_home = 'Golden State Warriors' AND team_name_away = 'Cleveland Cavaliers'; 407 True
utils/processing/combine_datasets.ipynb CHANGED
@@ -10,7 +10,7 @@
10
  },
11
  {
12
  "cell_type": "code",
13
- "execution_count": 7,
14
  "id": "155a7ecb",
15
  "metadata": {},
16
  "outputs": [
@@ -18,29 +18,39 @@
18
  "name": "stdout",
19
  "output_type": "stream",
20
  "text": [
21
- "Total NBA dataset examples: 600\n",
22
- " natural_query \\\n",
23
- "205 How many points did the home team score in the... \n",
24
  "\n",
25
- " sql_query result \n",
26
- "205 SELECT pts_home FROM game WHERE game_id = (SEL... 122.0 \n",
 
27
  "\n",
28
  "\n",
29
- "Total Tennis dataset examples: 204\n",
30
- " natural_query \\\n",
31
- "0 Get the full names of all players taller than ... \n",
32
  "\n",
33
- " sql_query result \n",
34
- "0 SELECT name FROM players WHERE height > 210; Reilly|Opelka \n"
35
  ]
36
  },
37
  {
38
  "name": "stderr",
39
  "output_type": "stream",
40
  "text": [
41
- "C:\\Users\\Dean\\AppData\\Local\\Temp\\ipykernel_21248\\149351044.py:11: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
42
  " nba_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n",
43
- "C:\\Users\\Dean\\AppData\\Local\\Temp\\ipykernel_21248\\149351044.py:12: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
44
  " tennis_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n"
45
  ]
46
  }
@@ -49,26 +59,44 @@
49
  "import pandas as pd\n",
50
  "import re\n",
51
  "\n",
52
- "SAMPLE_SIZE = 600\n",
53
  "\n",
54
  "# Open two datasets\n",
55
  "nba_df = pd.read_csv(\"../../training-data/nba_train_set.tsv\", sep='\\t')\n",
56
- "tennis_df = pd.read_csv(\"../../training-data/tennis_train_set.tsv\", sep='\\t')\n",
57
  "\n",
58
  "# Fix any spacing issues\n",
59
  "nba_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n",
60
  "tennis_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n",
61
  "\n",
62
  "# Downsample NBA\n",
63
  "nba_df = nba_df.sample(n=SAMPLE_SIZE)\n",
64
  "\n",
65
  "# Display dataset info\n",
66
  "print(f\"Total NBA dataset examples: {len(nba_df)}\")\n",
67
  "print(nba_df.head(1))\n",
68
  "print()\n",
69
  "print()\n",
70
  "print(f\"Total Tennis dataset examples: {len(tennis_df)}\")\n",
71
- "print(tennis_df.head(1))"
72
  ]
73
  },
74
  {
@@ -81,7 +109,7 @@
81
  },
82
  {
83
  "cell_type": "code",
84
- "execution_count": 11,
85
  "id": "b3acd217",
86
  "metadata": {},
87
  "outputs": [
@@ -89,7 +117,7 @@
89
  "name": "stdout",
90
  "output_type": "stream",
91
  "text": [
92
- "Saved combined dataset with 804 rows\n"
93
  ]
94
  }
95
  ],
@@ -104,9 +132,44 @@
104
  "\n",
105
  "\n",
106
  "# Save to combined TSV\n",
107
- "combined_df.to_csv(\"../../training-data/combined_dataset.tsv\", sep=\"\\t\", index=False)\n",
108
  "print(\"Saved combined dataset with\", len(combined_df), \"rows\")"
109
  ]
110
  }
111
  ],
112
  "metadata": {
 
10
  },
11
  {
12
  "cell_type": "code",
13
+ "execution_count": 1,
14
  "id": "155a7ecb",
15
  "metadata": {},
16
  "outputs": [
 
18
  "name": "stdout",
19
  "output_type": "stream",
20
  "text": [
21
+ "Total NBA dataset examples: 500\n",
22
+ " natural_query \\\n",
23
+ "2096 How many times have the Memphis Grizzlies won ... \n",
24
+ "\n",
25
+ " sql_query result \n",
26
+ "2096 SELECT COUNT(*) FROM game WHERE (team_abbrevia... 31 \n",
27
+ "\n",
28
  "\n",
29
+ "Total Tennis dataset examples: 514\n",
30
+ " natural_query \\\n",
31
+ "1 How many players are left-handed? \n",
32
  "\n",
33
+ " sql_query result \n",
34
+ "1 SELECT COUNT(*) FROM players WHERE hand = 'L'; 1435 \n",
35
  "\n",
36
  "\n",
37
+ "Total Tennis test examples: 100\n",
38
+ " natural_query \\\n",
39
+ "144 What is the average ranking of players defeate... \n",
40
+ "\n",
41
+ " sql_query result \n",
42
+ "144 SELECT AVG(r.rank) FROM matches m JOIN ranking... 212.317855446654 \n"
43
  ]
44
  },
45
  {
46
  "name": "stderr",
47
  "output_type": "stream",
48
  "text": [
49
+ "C:\\Users\\Dean\\AppData\\Local\\Temp\\ipykernel_22452\\2246720866.py:17: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
50
  " nba_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n",
51
+ "C:\\Users\\Dean\\AppData\\Local\\Temp\\ipykernel_22452\\2246720866.py:18: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
52
+ " tennis_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n",
53
+ "C:\\Users\\Dean\\AppData\\Local\\Temp\\ipykernel_22452\\2246720866.py:29: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
54
  " tennis_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n"
55
  ]
56
  }
 
59
  "import pandas as pd\n",
60
  "import re\n",
61
  "\n",
62
+ "SAMPLE_SIZE = 500\n",
63
  "\n",
64
  "# Open two datasets\n",
65
  "nba_df = pd.read_csv(\"../../training-data/nba_train_set.tsv\", sep='\\t')\n",
66
+ "dean_df = pd.read_csv(\"../../training-data/tennis_train_set_dean.tsv\", sep='\\t')\n",
67
+ "connor_df = pd.read_csv(\"../../training-data/tennis_train_set_connor.tsv\", sep='\\t')\n",
68
+ "mehul_df = pd.read_csv(\"../../training-data/tennis_train_set_mehul.tsv\", sep='\\t')\n",
69
+ "mehul_df = mehul_df.drop('tennis', axis=1)\n",
70
+ "\n",
71
+ "# Merge all tennis datasets into one\n",
72
+ "tennis_df = pd.concat([dean_df, mehul_df], ignore_index=True)\n",
73
  "\n",
74
  "# Fix any spacing issues\n",
75
  "nba_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n",
76
  "tennis_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n",
77
  "\n",
78
+ "# Separate testing data for tennis\n",
79
+ "test_tennis_df = tennis_df.sample(n=100)\n",
80
+ "tennis_df = pd.concat([dean_df, mehul_df, connor_df], ignore_index=True)\n",
81
+ "tennis_df = tennis_df.drop(test_tennis_df.index)\n",
82
+ "\n",
83
  "# Downsample NBA\n",
84
  "nba_df = nba_df.sample(n=SAMPLE_SIZE)\n",
85
  "\n",
86
+ "# Pull in Connor's data\n",
87
+ "tennis_df.applymap(lambda x: re.sub(r'\\s+', ' ', x) if isinstance(x, str) else x)\n",
88
+ "\n",
89
  "# Display dataset info\n",
90
  "print(f\"Total NBA dataset examples: {len(nba_df)}\")\n",
91
  "print(nba_df.head(1))\n",
92
  "print()\n",
93
  "print()\n",
94
  "print(f\"Total Tennis dataset examples: {len(tennis_df)}\")\n",
95
+ "print(tennis_df.head(1))\n",
96
+ "print()\n",
97
+ "print()\n",
98
+ "print(f\"Total Tennis test examples: {len(test_tennis_df)}\")\n",
99
+ "print(test_tennis_df.head(1))"
100
  ]
101
  },
102
  {
 
109
  },
110
  {
111
  "cell_type": "code",
112
+ "execution_count": 2,
113
  "id": "b3acd217",
114
  "metadata": {},
115
  "outputs": [
 
117
  "name": "stdout",
118
  "output_type": "stream",
119
  "text": [
120
+ "Saved combined dataset with 1014 rows\n"
121
  ]
122
  }
123
  ],
 
132
  "\n",
133
  "\n",
134
  "# Save to combined TSV\n",
135
+ "combined_df.to_csv(\"../../training-data/combined_full_dataset.tsv\", sep=\"\\t\", index=False)\n",
136
  "print(\"Saved combined dataset with\", len(combined_df), \"rows\")"
137
  ]
138
+ },
139
+ {
140
+ "cell_type": "markdown",
141
+ "id": "4ce62029",
142
+ "metadata": {},
143
+ "source": [
144
+ "# Combine tennis test data with NBA test tsv"
145
+ ]
146
+ },
147
+ {
148
+ "cell_type": "code",
149
+ "execution_count": 3,
150
+ "id": "72a934e8",
151
+ "metadata": {},
152
+ "outputs": [
153
+ {
154
+ "name": "stdout",
155
+ "output_type": "stream",
156
+ "text": [
157
+ "Saved combined test dataset with 250 rows\n"
158
+ ]
159
+ }
160
+ ],
161
+ "source": [
162
+ "nba_test_df = pd.read_csv(\"../../training-data/nba_test_set.tsv\", sep='\\t')\n",
163
+ "\n",
164
+ "nba_test_df[\"is_nba\"] = True\n",
165
+ "test_tennis_df[\"is_nba\"] = False\n",
166
+ "\n",
167
+ "combined_test_df = pd.concat([nba_test_df, test_tennis_df], ignore_index=True)\n",
168
+ "combined_test_df = combined_test_df.sample(frac=1).reset_index(drop=True)\n",
169
+ "\n",
170
+ "combined_test_df.to_csv(\"../../training-data/test_set.tsv\", sep='\\t', index=False)\n",
171
+ "print(\"Saved combined test dataset with\", len(combined_test_df), \"rows\")"
172
+ ]
173
  }
174
  ],
175
  "metadata": {
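Review note on the notebook cell above: the recorded stderr shows `DataFrame.applymap` raising a `FutureWarning`, and the `applymap(...)` calls in the cell discard their return value, so the whitespace cleanup does not appear to reach the saved TSVs. The sketch below shows one way the cleanup and the tennis hold-out split could be written. The helper names `normalize_whitespace` and `split_holdout` and the `seed` parameter are hypothetical and not part of the commit; the column-wise `Series.map` is a swapped-in alternative chosen so the code does not depend on which pandas version provides `DataFrame.map`.

```python
import re

import pandas as pd


def normalize_whitespace(df: pd.DataFrame) -> pd.DataFrame:
    """Return a new frame with whitespace runs in string cells collapsed to one space."""

    def collapse(value):
        # Non-string cells (numbers, NaN, booleans) pass through unchanged.
        return re.sub(r"\s+", " ", value) if isinstance(value, str) else value

    # Column-wise Series.map avoids the deprecated DataFrame.applymap.
    return df.apply(lambda column: column.map(collapse))


def split_holdout(df: pd.DataFrame, n_test: int, seed: int = 0):
    """Split df into (train, test) with n_test rows held out.

    `seed` is an assumption added here so the split is repeatable.
    """
    test_df = df.sample(n=n_test, random_state=seed)
    train_df = df.drop(test_df.index)
    return train_df.reset_index(drop=True), test_df.reset_index(drop=True)


# Small made-up frame using the column names that appear in the notebook output.
demo = pd.DataFrame(
    {
        "natural_query": ["How  many\tplayers?", "Who won?", "A  b", "c"],
        "sql_query": ["SELECT  1;", "SELECT 2;", "SELECT 3;", "SELECT 4;"],
        "result": [1, 2, 3, 4],
    }
)

clean = normalize_whitespace(demo)  # the result must be assigned to take effect
train_df, test_df = split_holdout(clean, n_test=1, seed=42)
print(len(train_df), len(test_df))
```

Assigning the returned frame is the important part; whether a fixed seed is wanted depends on whether the test set should stay identical across reruns of the notebook.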
val-16-full.hf/data-00000-of-00001.arrow ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ce11f8f922cdca5ebdf479c1c75b1e984f3374e0abb3a5280fe3b854835fb55e
3
+ size 10261312
val-16-full.hf/dataset_info.json ADDED
@@ -0,0 +1,33 @@
1
+ {
2
+ "citation": "",
3
+ "description": "",
4
+ "features": {
5
+ "is_nba": {
6
+ "dtype": "bool",
7
+ "_type": "Value"
8
+ },
9
+ "input_ids": {
10
+ "feature": {
11
+ "dtype": "int32",
12
+ "_type": "Value"
13
+ },
14
+ "_type": "Sequence"
15
+ },
16
+ "attention_mask": {
17
+ "feature": {
18
+ "dtype": "int8",
19
+ "_type": "Value"
20
+ },
21
+ "_type": "Sequence"
22
+ },
23
+ "labels": {
24
+ "feature": {
25
+ "dtype": "int64",
26
+ "_type": "Value"
27
+ },
28
+ "_type": "Sequence"
29
+ }
30
+ },
31
+ "homepage": "",
32
+ "license": ""
33
+ }
val-16-full.hf/state.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "_data_files": [
3
+ {
4
+ "filename": "data-00000-of-00001.arrow"
5
+ }
6
+ ],
7
+ "_fingerprint": "01e8202b5757102a",
8
+ "_format_columns": null,
9
+ "_format_kwargs": {},
10
+ "_format_type": null,
11
+ "_output_all_columns": false,
12
+ "_split": null
13
+ }