Kevin Gold committed on
Commit daebe13 · 1 Parent(s): fd80aaf

Lecture Python examples added

Files changed (2)
  1. app.py +4 -1
  2. master.py +2263 -0
app.py CHANGED
@@ -26,7 +26,10 @@ HF_TOKEN=os.environ.get("HF_TOKEN")
  PROFILES_URL = "https://huggingface.co/datasets/klgold/tutor_profiles"
  PROMPT_PREFIX="Give me a practice problem for an introductory course in python and data science that uses the following concepts: "
  prefix_length = len(PROMPT_PREFIX)
- PROMPT_SUFFIX = 'The first section of your response should say "PROBLEM" followed by the problem. The second section of your response should say "SOLUTION" before the Python code that solves the problem. The third section of your response should say "HINT" before a hint that would help with the one issue the student was most likely to get stuck on.'
+ PROMPT_SUFFIX = 'The first section of your response should say "PROBLEM" followed by the problem. The second section of your response should say "SOLUTION" before the Python code that solves the problem. The third section of your response should say "HINT" before a hint that would help with the one issue the student was most likely to get stuck on. Additionally, in your solution you must
+ not use any Python keywords, syntax, or concepts that are not included in the Python examples that follow, which are all the examples provided in lecture:'
+ with open('master.py', 'r') as pythonfile:
+     PROMPT_SUFFIX += pythonfile.read()
  suffix_length = len(PROMPT_SUFFIX)
  #Currently get error on trying to create Repository - need email?
  repo = Repository(
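Review note on the app.py hunk: as displayed, the new PROMPT_SUFFIX literal is a single-quoted string that continues onto a second line. If that line break is really in the file (and not a display artifact or a lost backslash), Python would reject it as an unterminated string literal. The sketch below shows one possible way to build the same suffix; it is not the author's code. The names FORMAT_INSTRUCTIONS and build_prompt_suffix are hypothetical, and the newline placed between the instructions and the examples is an assumption.

```python
from pathlib import Path

# Hypothetical name: the diff only defines PROMPT_SUFFIX.
# Adjacent string literals inside parentheses are joined by Python,
# so the text can span several source lines without a raw newline.
FORMAT_INSTRUCTIONS = (
    'The first section of your response should say "PROBLEM" followed by the problem. '
    'The second section of your response should say "SOLUTION" before the Python code '
    'that solves the problem. The third section of your response should say "HINT" '
    'before a hint that would help with the one issue the student was most likely to '
    'get stuck on. Additionally, in your solution you must not use any Python keywords, '
    'syntax, or concepts that are not included in the Python examples that follow, '
    'which are all the examples provided in lecture:'
)


def build_prompt_suffix(examples_path="master.py"):
    """Return the instructions followed by the contents of the examples file.

    Hypothetical helper; the separating newline is an assumption.
    """
    examples = Path(examples_path).read_text(encoding="utf-8")
    return FORMAT_INSTRUCTIONS + "\n" + examples
```

With a helper like this, app.py could keep its existing shape by assigning PROMPT_SUFFIX = build_prompt_suffix() before computing suffix_length.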
master.py ADDED
@@ -0,0 +1,2263 @@
+
+ # Lecture2HelloWorldAndExpressions.py
+ print('Hello, world!')
+ print('Hello 1')
+ print('Hello 2')
+ print('Hello 3')
+ print('Hello, world!')
+ print(Hello, world!)
+ print(Hello, world!) # Intentionally creates an error!
+ print(1) # Technically an expression
+ print(1+2) # Two operands and an operator make an expression
+ print(10*(10+1)) # The expression (10+1) acting as an operand
+ print(3 + 8 / 2) # What do you predict?
+ print(4 * 2 + 3 + 5 * 2) # And this one?
+ print('Hello', 'world', '!')
+ print(max(2,5,7))
+ print(max(2,7) + max(3,9)) # Using function calls as operands
+ print(max(2,7) + max(3,9)) # Calc 7, calc 9, then add
+ 1
+ 2
+ 3
+ max(2,7)
+ None
+ print(2) + 2
+ print('Hello, world!')
+ max(2 ** 8, 3 ** 6, 5 ** 3)
+ 1.0000000000000001 - 1
+ print(type(-100)) # int
+ print(type(10.1)) # float
+ print(type('A')) # str
+ print(type(True)) # bool
+ print(type('10')) # str
+ print(type(10)) # int
+ print(type(10.0)) # float
+ print(type(True)) # bool
+ 0.1 + 0.1 + 0.1
+ 'Hello ' + 1111
+ 'Hello ' + 'world' + '!'
+ 'Hello ' + str(1111)
+ 20 * 9/5 + 32
+ print('Temp: 68.0 F')
+ print('Temp: ' + 20 * 9/5 + 32 + ' F')
+ print('Temp: ' + str(20 * 9/5 + 32) + ' F')
+ # Lecture3VariablesAndConditions.py
+ two_to_the_eighth = 2 ** 8
+ print(two_to_the_eighth)
+ two_to_the_eighth * 2
+ pay_per_hour = 18
+ pay_per_hour = 20 # Pay raise!
+ print(pay_per_hour)
+ counter = 0
+ counter = counter + 1 # It's an instruction, not an equality!
+ print(counter)
+ counter = counter + 1
+ print(counter)
+ pay_per_hour = 20
+ hours = 40
+ total_pay = pay_per_hour * hours
+ print(total_pay)
+ Pay_Per_Hour = 15 # please avoid this capitalization style!
+ print(pay_per_hour) # remembers the lowercase value
+ silent_assignment = 0
+ 20 = pay_per_hour
+ print(undefined_var + 7)
+ color = input('What is your favorite color? ')
+ print('Yeah, ' + color + ' is pretty great!')
+ to_square_str = input('What should I square? ')
+ print(int(to_square_str) ** 2)
+ city = input('What city are we in? ')
+ print(city == 'Boston')
+
+ answer = input('What is 2+2? ')
+ print(answer == 4) # not going to work
+ answer == '4' # but this works
+ int(answer) == 4 # or this
+ float(answer) == 4 # or even this
+ print(1 < 1)
+ print(1 > 1)
+ print(1 != 1)
+ print(1 <= 1)
+ print(1 >= 1)
+ print('aardvark' < 'zebra')
+ print('capitalized' == 'Capitalized')
+ 2 + 5 > 7 - 4 # 5 > 7 would be false, but (2+5) > (7-4) is True
+ total = 0
+ value_str = input('Enter a value: ')
+ value_int = int(value_str)
+ if value_int < 0:
+     print('Sorry, that was a negative value.')
+ else:
+     total = total + value_int
+ print(total)
+ if condition:
+     statement_if_true1
+     statement_if_true2
+     statement_if_true3
+     ...
+ else:
+     statement_if_false1
+     statement_if_false2
+     ...
+ statement_regardless1
+ statement_regardless2
+ ...
+ value = int(input('Enter an integer:'))
+ if value < 0:
+     print('Negative')
+ else:
+     print('Positive')
+ print('Done')
+ password = input('Enter the password: ')
+ if password == '1234':
+     print('Correct!')
+     print('Your account has $1000000 in it.')
+ else:
+     print('Incorrect.')
+ print('Have a nice day.')
+ num1_str = input('Enter an integer: ')
+ num2_str = input('Enter a different integer: ')
+ num1_int = int(num1_str)
+ num2_int = int(num2_str)
+ if num1_str == num2_str:
+     print('The numbers were supposed to be different...')
+     print('But you entered ' + num1_str + ' twice!')
+ else:
+     print(num2_str + ' divided by ' + num1_str + ' is...')
+     print(num2_int / num1_int) # Divide by zero would be error, btw
+ print('Done...')
+ language = input('What is your favorite language? ')
+ if language == 'Python':
+     print('Mine too!')
+ print('But there sure are a lot of languages out there....')
+ value = int(input('Enter an integer between 0 and 100: '))
+ if value < 0:
+     print('No negative numbers!')
+ elif value > 100:
+     print('That value is too large!')
+ elif value == 42:
+     print('That was the number I was thinking of!')
+ else:
+     print('Guess again.')
+
+ value = int(input('Enter an integer between 0 and 100: '))
+ if value < 0:
+     print('No negative numbers!')
+ elif value > 100:
+     print('That value is too large!')
+ elif value >= 50:
+     print('Big!')
+ else:
+     print('Small!')
+ value = int(input('Enter an integer between 0 and 100: '))
+ if value < 0:
+     print('No negative numbers!')
+ else:
+     if value > 100:
+         print('That value is too large!')
+     else:
+         if value >= 50:
+             print('Big!')
+         else:
+             print('Small!')
+ age = int(input('Enter your age: '))
+ if age < 18:
+     if age < 5:
+         print('Just a toddler, then.')
+     elif age < 12:
+         print('Not quite a teenager, then.')
+     else:
+         print('Teenage years ... a difficult time!')
+ else:
+     print('An adult, then.')
+     if age >= 55:
+         print('And a senior citizen, too!')
+ num1 = int(input('First number: '))
+ num2 = int(input('Second number: '))
+ num3 = int(input('Third number: '))
+ my_max = max(num1, num2, num3)
+ my_min = min(num1, num2, num3)
+ my_mean = (num1+num2+num3)/3 # Note importance of parens!
+ print('Min: ' + str(my_min))
+ print('Max: ' + str(my_max))
+ print('Mean: ' + str(my_mean))
+ if num1 == num2:
+     print(str(num1) + ' was repeated')
+ elif num2 == num3:
+     print(str(num2) + ' was repeated')
+ elif num1 == num3:
+     print(str(num3) + ' was repeated')
+ else:
+     print('The numbers were unique')
+ # Lecture4WhileAndLists.py
+ string = input('Enter a number: ')
+ while string != 'stop':
+     print(string + ' squared is ' + str(int(string) ** 2))
+     string = input('Enter a number: ')
+ print('Done.')
+ counter = 0
+ while counter < 21:
+     print(counter)
+     counter = counter + 1
+ print(counter)
+ counter = 1
+ print('We will now iterate three times...')
+ while counter < 4:
+     print('Iteration ' + str(counter))
+     counter = counter + 1
+ total = 0
+ count = 0
+ value_str = input('Enter a number, or "done" if done: ')
+ while value_str != 'done':
+     count = count + 1
+     value_int = int(value_str)
+     total = total + value_int
+     value_str = input('Enter a number, or "done" if done: ')
+ if count > 0:
+     print('The average is ' + str(total/count))
+ total = 0
+ count = 0
+ value_str = input('Enter a number, or "done" if done: ')
+ while value_str != 'done':
+     count += 1
+     value_int = int(value_str)
+     total += value_int
+     value_str = input('Enter a number, or "done" if done: ')
+ if count > 0:
+     print('The average is ' + str(total/count))
+ while(True):
+     input('Enter any input to get a compliment: ')
+     print('That is so clever of you!')
+ my_list = ['duck', 'duck', 'goose'] # A list with 3 items
+ print(my_list[0])
+ print(my_list[1])
+ print(my_list[2])
+ my_list = ['duck', 'duck', 'goose']
+ my_list[2] = 'bear'
+ print(my_list)
+ my_list = [1, 2, 3]
+ my_list.append(4)
+ print(my_list) # my_list has changed...
+ print(my_list.append(5))
+ print(my_list)
+ shopping_list = []
+ item = input('Add an item to the shopping list (or "done"): ')
+ while item.lower() != 'done':
+     shopping_list.append(item)
+     item = input('Add an item to the shopping list (or "done"): ')
+ print('Okay, so that was: ')
+ print(shopping_list)
+ [1, 2, 3] + [4, 5, 6]
+ print(len('Hello'))
+ print(len([1, 2, 3]))
+ my_items = ['eggs', 'flour', 'milk']
+ print(len(my_items), 'items')
+ print(my_items[2])
+ print(my_items[len(my_items)-1])
+ planet_diameter_km = [4879, 12104, 12756, 6792, 142984, 120536, 51118, 49528, 2377]
+ planet_diameter_km.sort()
+ planet_diameter_km
+ my_list1 = [3, 2, 1]
+ my_list2 = my_list1
+ my_list1.sort()
+ print(my_list1)
+ print(my_list2)
+ my_list1 = [3, 2, 1]
+ my_list2 = my_list1.copy()
+ my_list1.sort()
+ print(my_list1)
+ print(my_list2)
+ honors = ['Albert', 'Berenice', 'Chen', 'Dominique']
+ mentioned_honors = []
+ nonhonors = []
+ student = input('Enter a name (or "done"): ')
+ while (student != 'done'):
+     if student in honors:
+         print('Honors!')
+         mentioned_honors.append(student)
+     else:
+         print('Not honors...')
+         nonhonors.append(student)
+     student = input('Enter a name (or "done"): ')
+ print('Honors mentioned: ' + str(mentioned_honors))
+ print('Nonhonors mentioned: ' + str(nonhonors))
+ # Lecture5MorePower.py
+ percent = input('Enter a percentage between 0 and 100:')
+ if float(percent) >= 0 and float(percent) <= 100:
+     if float(percent) >= 10:
+         print('A decent return on investment....')
+     else:
+         print('Not a great return on investment....')
+ else:
+     print('That is not in the requested range!')
+ vip = False
+ spent = 10
+ if vip or spent >= 10000:
+     print('Send this person a loyalty reward!')
+ else:
+     print('This person deserves nothing!')
+ vip = False
+ if not vip:
+     print('Have you considered signing up to join the VIP program?')
+ else:
+     print('Welcome back, VIP customer!')
+ vip = False
+ spent = 0
+ if not vip or spent < 10000: # "not" applied to vip before "or"
+     print('Please spend more')
+ else:
+     print('Hello, valued patron!')
+ vip = False
+ spent = 0
+ if not (vip or spent < 10000): # within parens evaluates to True
+     print('Please spend more')
+ else:
+     print('Hello, valued patron!')
+ my_list = [1,2,3]
+ my_list2 = [7,8,9]
+ if not 4 in my_list and not 4 in my_list2:
+     print('No 4 found')
+ my_list = [1,2,3]
+ my_list2 = [7,8,9]
+ if 4 not in my_list and not in my_list2:
+     print('This will actually cause an error - not how "in" works')
+ import math
+ math.sqrt(2)
+ import math as m
+ m.sqrt(2)
+ from math import sqrt as my_sqrt
+ my_sqrt(2)
+ get_ipython().system('python3 -m ensurepip --upgrade')
+ get_ipython().system('pip install seaborn')
+ import seaborn as sns
+ df = sns.load_dataset("penguins") # Load a dataset about penguins
+ sns.jointplot(data=df, x="flipper_length_mm", y="bill_length_mm", hue="species")
+ import statistics
+ statistics.median([1, 2, 3, 4])
+ import statistics
+ statistics.median([1, 2, 3, 4])
+ total = 0
+ count = 0
+ value_str = input('Enter a number, or "done" if done: ')
+ while value_str != 'done':
+     count = count + 1
+     value_int = int(value_str)
+     total = total + value_int
+     value_str = input('Enter a number, or "done" if done: ')
+ if count > 0:
+     print('The average is ' + str(total/count))
+ total = 0
+ count = 0
+ value_str = input('Enter a non-negative integer, or "done" if done: ')
+ while value_str != 'done':
+     if not value_str.isdigit():
+         print('Non-negative integers only!')
+     else:
+         count = count + 1
+         value_int = int(value_str)
+         total = total + value_int
+     value_str = input('Enter a non-negative integer, or "done" if done: ')
+ if count > 0:
+     print('The average is ' + str(total/count))
+ total = 0
+ count = 0
+ value_str = input('Enter a number, or "done" if done: ')
+ while value_str != 'done':
+     count = count + 1
+     value_int = int(value_str)
+     total = total + value_int
+     print(value_str)
+ if count > 0:
+     print('The average is ' + str(total/count))
+ 3 = my_list
+ total = 0
+ count = 0
+ value_str = input('Enter a number, or "done" if done: ')
+ count = count + 1
+ value_int = int(value_str)
+ total = total + value_int
+ if count > 0:
+     print('The average is ' + str(total/count))
+ # Lecture6and7Iteration.py
+ people = ['Alice', 'Bob', 'Che']
+ index = 0
+ while index < len(people):
+     person = people[index]
+     print('Hooray for ' + person + '!')
+     index += 1
+ people = ['Alice', 'Bob', 'Che']
+ for person in people:
+     print('Hooray for ' + person + '!')
+ running_total = 0
+ numbers = [1,2,3,4,10]
+ for n in numbers:
+     running_total = running_total + n # Could be abbreviated running_total += n
+     print('Sum so far: ' + str(running_total))
+ print('Sum: ' + str(running_total))
+ my_grades = [4, 3, 2, 3, 4]
+ letter_grades = []
+ for g in my_grades:
+     if g == 4:
+         letter_grades.append('A')
+     elif g == 3:
+         letter_grades.append('B')
+     elif g == 2:
+         letter_grades.append('C')
+ print(letter_grades)
+ temps_f = [36, 39, 45, 56, 66, 76, 81, 80, 72, 61, 51, 41] # Jan through Dec
+ temps_c = []
+ for t in temps_f:
+     degrees_c = (t - 32)*5/9
+     temps_c.append(round(degrees_c, 2)) # Round to 2 decimal places
+ temps_c
+ my_car = ("Honda Fit", 2010, 30, 10000)
+ print(my_car)
+ car_type, year, mpg, price = my_car
+ print(mpg)
+ print(my_car[0] + ' prints successfully') # OK
+ my_car[0] = 'bad value' # Not OK, trying to change the tuple
+ my_movies = [("No", 4), ("Rogue One", 4.5), ("Casablanca", 5)]
+ for moviename, stars in my_movies: # Notice the two variable names
+     print ('I would rate ' + moviename + ' ' + str(stars) + ' stars')
+ my_movies = [("No", 4), ("Rogue One", 4.5), ("Casablanca", 5)]
+ best_rating = 0 # Initialize with a value that is definitely beat
+ best_movie = "none"
+ for movie, rating in my_movies:
+     if rating > best_rating:
+         best_rating = rating
+         best_movie = movie
+ print("Best movie: " + best_movie + "...rating = " + str(best_rating))
+ movies = ['Fall Guy', 'Free Guy', 'Cable Guy']
+ ratings = [5, 4, 3]
+ for movie, rating in zip(movies, ratings):
+     print("I'd rate " + movie + " a " + str(rating))
+ sw_movies = [('The Phantom Menace', 52),
+              ('Attack of the Clones', 65),
+              ('Revenge of the Sith', 80),
+              ('Rogue One', 84),
+              ('Solo', 70),
+              ('Star Wars', 92),
+              ('The Empire Strikes Back',94),
+              ('Return of the Jedi', 82),
+              ('The Force Awakens', 93),
+              ('The Last Jedi', 90),
+              ('The Rise of Skywalker', 51)]
+ my_list = []
+ for movie, score in sw_movies:
+     if score >= 80:
+         my_list.append(movie)
+ print(my_list)
+ for i in range(5):
+     print ("Iteration " + str(i))
+ for i in range(1,6):
+     print(i)
+ my_itinerary = ['Boston', 'Atlanta', 'LA', 'Seattle']
+ for idx in range(len(my_itinerary)-1): # Avoid indexing out of bounds
+     print(my_itinerary[idx] + '-' + my_itinerary[idx+1])
+ names = ['Alice', 'Bob', 'Charlie', 'Dora']
+ for number, name in enumerate(names):
+     print(name + ' ' + str(number))
+ for movie, rating in sw_movies:
+     print('Looking at ' + movie)
+     if movie == 'Rogue One':
+         print('The rating of Rogue One is ' + str(rating))
+         break # We don't need to look at any other entries
+ print('Done')
+ my_two_stock_histories = [[40.1, 40.2, 39.9, 40.2],
+                           [100.2, 99.9, 100.0, 103.1]]
+ my_two_stock_histories = [[40.1, 40.2, 39.9, 40.2],
+                           [100.2, 99.9, 100.0, 103.1]]
+ my_two_stock_histories[1]
+ my_two_stock_histories = [[40.1, 40.2, 39.9, 40.2],
+                           [100.2, 99.9, 100.0, 103.1]]
+ my_two_stock_histories[1][2]
+ my_stock_histories = my_two_stock_histories.copy()
+ my_stock_histories.append([5.0, 9.0, 6.0, 7.0])
+ print(my_stock_histories)
+ print('Stock 0 closing prices: ')
+ for price in my_stock_histories[0]:
+     print(price)
+ print('Starting prices for all stocks:')
+ for stock_list in my_stock_histories:
+     print(stock_list[0])
+ letters = ['a', 'b', 'c','d','e','f','g','h','i','j']
+ print('All possible coordinates in Battleship:')
+ for l in letters:
+     for n in range(1,11):
+         print(l + str(n))
+ bills = [[1, 2, 3], [4,5,6], [7,8,9]]
+ my_totals = [] # empty list
+ for l in bills:
+     print('new list')
+     listsum = 0
+     for l2 in l: # iterating over the list we got from the outer foreach
+         print('adding ' + str(l2))
+         listsum += l2
+     my_totals.append(listsum)
+ print('Bill sums:' + str(my_totals))
+ print('Possible matchups:')
+ players = ['Alice', 'Bobby', 'Caspar', 'Dmitri', 'Eve']
+ for white_player in players:
+     for black_player in players:
+         print("White: " + white_player + "; Black player: " + black_player)
+ print('Possible matchups:')
+ players = ['Alice', 'Bobby', 'Caspar', 'Dmitri', 'Eve']
+ for white_player in players:
+     for black_player in players:
+         if not white_player == black_player:
+             print("White: " + white_player + "; Black player: " + black_player)
+ my_multiples_of_3 = [v * 3 for v in range(5)]
+ my_multiples_of_3
+ unrounded = [1.9, 5.3, 9.9]
+ rounded = [round(i,0) for i in unrounded]
+ rounded
+ unrounded = [1.9, 5.3, 9.9]
+ rounded = []
+ for item in unrounded:
+     rounded.append(round(item,0))
+ print(rounded)
+ temps_f = [36, 39, 45, 56, 66, 76, 81, 80, 72, 61, 51, 41] # Jan through Dec
+ temps_c = [round((t-32)*5/9,2) for t in temps_f]
+ temps_c
+ times = [(2,30), (4,10), (1, 30), (0,40), (0, 20)]
+ minutes = [t[0]*60 + t[1] for t in times]
+ minutes
+ # Lecture8and9Functions.py
+ def add_an_s(string):
+     new_string = string + 's'
+     return new_string
+ add_an_s('example') + '!'
+ records = read_customer_data('input.csv')
+ sales = 0
+ purchase_counts = []
+ s_names = []
+ for record in records:
+     name, purchase_list, sale_info = parse_record(record)
+     s_names.append(standardize_name(name))
+     sales = update_total_sales(sales, sale_info)
+     update_purchase_counts(purchase_counts, purchase_list)
+ write_to_file(s_names, purchase_counts, sales, 'output.csv')
+ def add_two(my_number):
+     # Adds two to the argument.
+     return my_number + 2
+ add_two(2)
+ def count_matches(to_match, my_list):
+     # Counts how many times to_match appears in my_list
+     count = 0
+     for m in my_list:
+         if to_match == m:
+             count += 1
+     return count
+ print(count_matches(5, [5, 6, 7, 5]))
+ print(count_matches("foo", ["foo","bar","baz"]))
+ def percent_gain(start, finish):
+     return (finish-start)/start * 100
+ print(percent_gain(36585.06, 33147.25))
+ print(percent_gain(4796.56, 3839.50))
+ print(percent_gain(15832.80, 10466.48))
+ def get_rating(movie_tuple):
+     # More readable way to access a movie rating
+     return movie_tuple[1]
+ get_rating(('Portrait of a Lady on Fire', 5))
+ def with_tax(price, tax):
+     return round(price * (1 + tax * .01), 2)
+ with_tax(1,8.6)
+ from datetime import date
+ def greet_user():
+     print("Hello, user!")
+     print("Today's date is " + str(date.today()))
+ greet_user()
+ def greet_user():
+     print("Hello, user!")
+     print("Today's date is " + str(date.today()))
+     return
+ print(greet_user())
+ def longest_customer_name(list_of_names):
+     # Find the longest customer name, and how long it is
+     # (maybe so we can display the names nicely later)
+     longest_len = 0
+     longest_name = ""
+     for n in list_of_names:
+         if len(n) > longest_len:
+             longest_len = len(n)
+             longest_name = n
+     return longest_name, longest_len
+ name, length = longest_customer_name(['Alice', 'Bob', 'Cassia'])
+ print(name)
+ print(length)
+ from statistics import mean
+ def min_mean_max(L):
+     return min(L), mean(L), max(L)
+ min_mean_max([1,2,3,4,5])
+ def count_items(lst):
+     # Count items but warn if the list is empty
+     if (len(lst) == 0):
+         print('Warning: empty list passed to count_items!')
+         return 0
+     print("We don't get here with an empty list")
+     return len(lst)
+ count_items([])
+ def is_prime(n):
+     for i in range(2, n): # Look for a divisor
+         if n % i == 0: # i divides n evenly, no remainder
+             return False
+     return True # didn't find a divisor
+ print(is_prime(11))
+ print(is_prime(4))
+ def longest_customer_name(list_of_names):
+     # Find the longest customer name, and how long it is
+     # (maybe so we can display the names nicely later)
+     longest_len = 0
+     longest_name = ""
+     for n in list_of_names:
+         if len(n) > longest_len:
+             longest_len = len(n)
+             longest_name = n
+     return longest_name, longest_len
+ def count_matches(to_match, my_list):
+     # Counts how many times to_match appears in my_list
+     count = 0
+     for m in my_list:
+         if to_match == m:
+             count += 1
+     return count
+ def count_longest_name(list_of_names):
+     # Count how many times the longest name appears in the list
+     # Makes use of functions defined above
+     word, length = longest_customer_name(list_of_names)
+     return count_matches(word,list_of_names)
+ count_longest_name(['Alice','Bob','Catherine','Catherine'])
+ def all_names_short_enough1(names, limit):
+     for name in names:
+         if len(name) > limit:
+             return False
+     return True
+ print(all_names_short_enough1(['Alice', 'Bob'], 3))
+ print(all_names_short_enough1(['Alice', 'Bob'], 5))
+ def all_names_short_enough2(names, limit):
+     name, length = longest_customer_name(names)
+     return length <= limit
+ print(all_names_short_enough2(['Alice', 'Bob'], 3))
+ print(all_names_short_enough2(['Alice', 'Bob'], 5))
+ def add5(arg):
+     b = arg + 5
+     return b
+ add5(7) # Return 12
+ def pattern_a(price, tax):
+     return price * (1 + 0.01 * tax) # Everything we need is in the arguments - good
+ tax = 20 # Global variable - this is worse style
+ def pattern_b(price):
+     return price * (1 + 0.01 * tax) # Works, but less flexible, hard to debug
+ print(pattern_a(100,20))
+ print(pattern_b(100))
+ def add_two(my_number):
+     a = my_number + 2 # Shadows outer "a", now we have two a's and see this one
+     print("a is " + str(a) + " inside add_two")
+     return a
+ a = 5
+ print("add_two(2) is " + str(add_two(2)))
+ print("a is " + str(a) + " outside add_two")
+ my_list = ['a','b','c']
+ def concatenate_all(my_list):
+     out = ''
+     for item in my_list:
+         out += item
+     return out
+ print(concatenate_all(['d','e'])) # ['d','e'] is called my_list in the function
+ print(concatenate_all(my_list)) # my_list is still a,b,c
+ names = ["Catherine", "Donovan", "alice", "BOB"]
+ standardized_names = []
+ for name in names:
+     name = name.capitalize() # Capitalize first letter, lc others
+     standardized_names.append(name)
+ standardized_names.sort()
+ jobs = ['Pilot', 'teacheR', 'firefighter', 'LIBRARIAN']
+ standardized_jobs = []
+ for job in jobs:
+     job = job.capitalize()
+     standardized_jobs.append(job)
+ standardized_jobs.sort()
+ print(standardized_names)
+ print(standardized_jobs)
+ names = ["Catherine", "Donovan", "alice", "BOB"]
+ jobs = ['Pilot', 'teacheR', 'firefighter', 'LIBRARIAN']
+ def standardize_strings(string_list):
+     out = []
+     for s in string_list:
+         s = s.capitalize()
+         out.append(s)
+     out.sort()
+     return out
+ standard_names = standardize_strings(names)
+ standard_jobs = standardize_strings(jobs)
+ print(standard_names)
+ print(standard_jobs)
+ def get_first_letter(word):
+     """ Returns the first letter of a string.
+     word (str): The string to get the letter from.
+     A simple function just for demo purposes. Probably
+     not useful since get_first_letter takes more characters
+     to type than string[0].
+     """
+     return word[0]
+ get_ipython().run_line_magic('pinfo', 'get_first_letter')
+ print(get_first_letter("Shibboleth") == "S")
+ print(pattern_a(100,20) == 120)
+ print(pattern_a(0, 20) == 0)
+ print(count_matches("A",[]) == 0)
+ print(count_matches("A", ["A","A","A"]) == 3)
+ # Lecture10Hashes.py
+ my_menu_dict = {
+     "Salmon": 25,
+     "Steak": 30,
+     "Mac and cheese" : 18
+ }
+ print(my_menu_dict["Salmon"])
+ my_menu_dict = {} # empty dictionary
+ my_menu_dict["Salmon"] = 25
+ my_menu_dict["Steak"] = 30
+ my_menu_dict["Mac and cheese"] = 18
+ print(my_menu_dict["Salmon"])
+ my_dict = {}
+ my_dict.get('sushi', 0)
+ two_cities = """It was the best of times, it was the worst of times,
+ it was the age of wisdom, it was the age of foolishness, it was the epoch of belief,
+ it was the epoch of incredulity, it was the season of light, it was the season of darkness,
+ it was the spring of hope, it was the winter of despair."""
+ worddict = {}
+ wordlist = two_cities.split()
+ for word in wordlist:
+     if word in worddict: # Check for presence of key
+         worddict[word] += 1
+     else:
+         worddict[word] = 1
+ print(worddict["age"])
+ print(worddict["of"])
+ for word, count in worddict.items():
+     print(word + ":" + str(count))
+ def word_prob(word, worddict):
+     numerator = worddict.get(word, 0)
+     denominator = 0
+     for word, count in worddict.items():
+         denominator += count
+     return numerator / denominator
+ print(word_prob('winter', worddict)) # Should be 1/60 = 0.0167 or so
+ print(word_prob('season', worddict)) # Should be 2/60 = 0.0333 or so
+ print(word_prob('Pokemon', worddict)) # Should be 0 with no errors
+ bigIPs = {"209.85.231.104", "207.46.170.123", "72.30.2.43"}
+ bigIPs.add("208.80.152.2")
+ len(bigIPs)
+ newset = set()
+ newset.add("First item")
+ print("First item" in newset)
+ myset = set(range(123456789)) # {0, 1, 2, ...}
+ mylist = list(range(123456789)) # [0, 1, 2, ...]
+ 12345678 in myset # Fast, uses hash
+ 12345678 in mylist # Slower, check each item
+ two_cities_extended = """It was the best of times,
+ it was the worst of times, it was the age of wisdom,
+ it was the age of foolishness, it was the epoch of belief,
+ it was the epoch of incredulity, it was the season of Light,
+ it was the season of Darkness, it was the spring of hope,
+ it was the winter of despair, we had everything before us,
+ we had nothing before us, we were all going direct to Heaven,
+ we were all going direct the other way--in short, the period was
+ so far like the present period that some of its noisiest authorities
+ insisted on its being received, for good or for evil, in the superlative
+ degree of comparison only.
+ There were a king with a large jaw and a queen with a plain face,
769
+ on the throne of England; there were a king with a large jaw and a
770
+ queen with a fair face, on the throne of France. In both countries
771
+ it was clearer than crystal to the lords of the State preserves of
772
+ loaves and fishes, that things in general were settled for ever.
773
+ It was the year of Our Lord one thousand seven hundred and seventy-five.
774
+ Spiritual revelations were conceded to England at that favoured period,
775
+ as at this. Mrs. Southcott had recently attained her five-and-twentieth
776
+ blessed birthday, of whom a prophetic private in the Life Guards had heralded
777
+ the sublime appearance by announcing that arrangements were made for the
778
+ swallowing up of London and Westminster. Even the Cock-lane ghost had been
779
+ laid only a round dozen of years, after rapping out its messages, as the
780
+ spirits of this very year last past (supernaturally deficient in originality)
781
+ rapped out theirs. Mere messages in the earthly order of events had lately
782
+ come to the English Crown and People, from a congress of British subjects
783
+ in America: which, strange to relate, have proved more important to the human
784
+ race than any communications yet received through any of the chickens of the
785
+ Cock-lane brood.
786
+ """
787
+ wordlist = two_cities_extended.split()
788
+ def find_by_list(wordlist):
789
+ for word in wordlist:
790
+ if word in wordlist:
791
+ continue # Move on to next loop
792
+ get_ipython().run_line_magic('time', 'find_by_list(wordlist)')
793
+ worddict = {}
794
+ for word in wordlist:
795
+ if word in worddict:
796
+ worddict[word] += 1
797
+ else:
798
+ worddict[word] = 1
799
+ def find_by_dict(wordlist, dict):
800
+ for word in wordlist:
801
+ if word in dict:
802
+ continue # Move on to next iteration of the for loop
803
+ get_ipython().run_line_magic('time', 'find_by_dict(wordlist,worddict)')
804
+ mydict = {"a":1000}
805
+ dict2 = mydict # dict2 refers to the same dictionary object, so changes made through it also show up in mydict
806
+ dict2["b"] = 500
807
+ print(mydict)
808
+ print(dict2)
809
+ dict3 = dict2.copy()
810
+ dict3["c"] = 40
811
+ print(dict2)
812
+ print(dict3)
813
+ from string import ascii_lowercase
814
+ myset = set()
815
+ for i in range(len(two_cities_extended)):
816
+ myset.add(two_cities_extended[i].lower())
817
+ def checkletters(myset):
818
+ for c in ascii_lowercase:
819
+ # TODO check whether this letter appeared in myset, maybe return a value
820
+ if c not in myset:
821
+ print("Missing: " + c)
822
+ return False
823
+ print("All found")
824
+ return True
825
+ checkletters(myset)
826
+ # Lecture11and12NumpyMatplotlib.py
827
+ import numpy as np
828
+ v = np.array([1, 2 ,3])
829
+ print(v)
830
+ A = np.array([[1, 0, 0],
831
+ [0 ,2, 0],
832
+ [0, 0, 3]]) # 3x3 with 1,2,3 along the diagonal
833
+ print(A)
834
+ print(A.shape) # Tuples: like lists, but use () instead of []
835
+ print(v.shape) # 1d outputs a comma to indicate it's still a tuple
836
+ v1 = v
837
+ print(v1)
838
+ v2 = np.array([4, 5, 6])
839
+ print(v2)
840
+ print("Adding 1D arrays: ", v1 + v2)
841
+ print("Subtracting 1D arrays: ", v1 - v2)
842
+ print("Multiplying 1D arrays: ", v1 * v2)
843
+ print("Dividing 1D arrays: ", v1 / v2)
844
+ print(v1)
845
+ print("Adding by a constant: ", v1 + 2)
846
+ print("Subtracting by a constant: ", v1 - 2)
847
+ print("Multiplying by a constant: ", v1 * 2)
848
+ print("Dividing by a constant: ", v1 / 2)
849
+ my_array = np.array([[1,2,3],
850
+ [4,5,6]])
851
+ print(np.min(my_array, axis=0))
852
+ print(np.mean(my_array, axis=1))
853
+ B = np.array([[3, 2],
854
+ [4, -1]])
855
+ w = np.array([1, -1])
856
+ z = B @ w
857
+ print(z)
858
+ my_array = np.array([8, 6, 7, 5, 3, 0, 9])
859
+ print(my_array[1:3]) # prints index 1 and 2, not 3
860
+ print(my_array)
861
+ print(my_array[1:])
862
+ my_array[:3]
863
+ my_matrix = np.array([[42.3, 71.1, 92],
864
+ [40.7, 70.0, 85],
865
+ [47.6, 122.0, 82]])
866
+ print(my_matrix)
867
+ two_by_two_square = my_matrix[1:, :2]
868
+ print(two_by_two_square)
869
+ no_last_column = my_matrix[:, :2] # no temperature
870
+ print(no_last_column)
871
+ import numpy as np
872
+ a = np.array([0, 1, 2, 3, 4, 5])
873
+ print(a)
874
+ b = a[1:3]
875
+ print(b)
876
+ b[1] = 100 # modify the slice...
877
+ print(a) # ...and see the original change
878
+ print(np.zeros(3)) #create an array of zeros with length 3
879
+ print(np.zeros((2, 3))) # create a 2x3 matrix of zeros
880
+ import matplotlib.pyplot as plt
881
+ x = [1, 2, 3]
882
+ y = [1, 4, 9]
883
+ plt.plot(x, y)
884
+ plt.show()
885
+ import numpy as np
886
+ my_points = np.array([[2, 1],
887
+ [3, 4],
888
+ [5, 6]]) # Each list is a point
889
+ print(my_points)
890
+ plt.plot(my_points[:, 0], my_points[:,1]) # Slice to get x values separate from y values
891
+ plt.show()
892
+ plt.plot(my_points[:, 0], my_points[:, 1], 'ro') # 'r' is for red, 'o' is for circles
893
+ plt.show()
894
+ distances_millions_miles = [35, 67, 93, 142, 484, 889, 1790, 2880]
895
+ plt.plot(np.arange(1, 9), distances_millions_miles, 'o')
896
+ plt.show()
897
+ np.arange(1,9)
898
+ xpoints = np.linspace(0, 10, 100)
899
+ ypoints = xpoints ** 2 + 1
900
+ plt.plot(xpoints, ypoints)
901
+ plt.show()
902
+ plt.plot(my_points[:, 0], my_points[:, 1], 'ro')
903
+ myfit_x = np.linspace(1, 5, 100)
904
+ myfit_y = np.linspace(1.5, 5.5, 100) # Same y/x slope for all segments - so, a line
905
+ plt.plot(myfit_x,myfit_y)
906
+ plt.show()
907
+ import matplotlib.pyplot as plt
908
+ x = [1, 2, 3]
909
+ y1 = [1, 2, 3]
910
+ y2 = [3, 2, 1]
911
+ plt.plot(x, y1, label='Sales')
912
+ plt.plot(x, y2, label='Quality')
913
+ plt.legend()
914
+ plt.title('Trends')
915
+ plt.grid(True)
916
+ customers = ['Oliver', 'Sophia', 'Liam', 'Arielle', 'Noah']
917
+ total_purchases = [56, 73, 24, 48, 88]
918
+ plt.bar(customers, total_purchases)
919
+ plt.xlabel("Customer name", fontsize=14)
920
+ plt.ylabel("Total purchases", fontsize=14)
921
+ plt.title("Total purchases for 5 Amazon customers", fontsize=16)
922
+ plt.tick_params(axis='x', labelsize=14)
923
+ plt.tick_params(axis='y', labelsize=14)
924
+ plt.show()
925
+ # Lecture13BiggerPrograms.py
926
+ """
927
+ Compute f-measure for each item in a list.
928
+
929
+ Argument: stats_list (list): a list of tuples of four ints, (tp, fp, tn, fn)
930
+ (these stand for true positive, false positive, etc)
931
+ Returns: a list of floats, the f-measures.
932
+ """
933
+ """
934
+ Compute the f-measure, a performance measure that ignores true negatives.
935
+
936
+ Arguments: tp (int): the count of true positives
937
+ fp (int): the count of false positives
938
+ tn (int): the count of true negatives
939
+ fn (int): the count of false negatives
940
+ Returns: a float, the f-measure.
941
+ """
942
+ def f_measures(stats_list):
943
+ """
944
+ Compute f-measure for each item in a list.
945
+
946
+ Argument: stats_list (list): a list of tuples of four ints, (tp, fp, tn, fn)
947
+ (these stand for true positive, false positive, etc)
948
+ Returns: a list of floats, the f-measures.
949
+ """
950
+ for tp, fp, tn, fn in stats_list:
951
+ f = f_measure(tp, fp, tn, fn)
952
+ def f_measures(stats_list):
953
+ """
954
+ Compute f-measure for each item in a list.
955
+
956
+ Argument: stats_list (list): a list of tuples of four ints, (tp, fp, tn, fn)
957
+ (these stand for true positive, false positive, etc)
958
+ Returns: a list of floats, the f-measures.
959
+ """
960
+ out = []
961
+ for tp, fp, tn, fn in stats_list:
962
+ f = f_measure(tp, fp, tn, fn)
963
+ out.append(f)
964
+ return out
965
+ def f_measure(tp, fp, tn, fn):
966
+ """
967
+ Compute the f-measure, a performance measure that ignores true negatives.
968
+
969
+ Arguments: tp (int): the count of true positives
970
+ fp (int): the count of false positives
971
+ tn (int): the count of true negatives
972
+ fn (int): the count of false negatives
973
+ Returns: a float, the f-measure.
974
+ """
975
+ precision = tp/(tp + fp)
976
+ recall = tp/(tp + fn)
977
+ return (2 * precision * recall)/(precision + recall)
978
+ def f_measure(precision, recall):
979
+ """
980
+ Compute the f-measure, a performance measure that ignores true negatives.
981
+
982
+ Arguments: precision (float): proportion of positive classifications that are correct
983
+ recall (float): proportion of positive examples that were found
984
+ Returns: a float, the f-measure.
985
+ """
986
+ return (2 * precision * recall)/(precision + recall)
987
+ def precision(tp, fp):
988
+ return tp/(tp + fp)
989
+ def recall(tp, fn):
990
+ tp/(tp + fn)
991
+
992
+ def f_measures(stats_list):
993
+ """
994
+ Compute f-measure for each item in a list.
995
+
996
+ Argument: stats_list (list): a list of tuples of four ints, (tp, fp, tn, fn)
997
+ (these stand for true positive, false positive, etc)
998
+ Returns: a list of floats, the f-measures.
999
+ """
1000
+ out = []
1001
+ for tp, fp, tn, fn in stats_list:
1002
+ f = f_measure(precision(tp, fp), recall(tp, fn))
1003
+ out.append(f)
1004
+ return out
1005
+ print(precision(4,4)) # Expect 0.5
1006
+ print(recall(4,4)) # Expect 0.5
1007
+ print(f_measure(1, 1)) # Expect 1
1008
+ def recall(tp, fn):
1009
+ print(tp/(tp + fn))
1010
+
1011
+ recall(4,4)
1012
+ def recall(tp, fn):
1013
+ print(tp/(tp + fn))
1014
+ return tp/(tp + fn)
1015
+
1016
+ recall(4,4)
1017
+ def f_measure(precision, recall):
1018
+ """
1019
+ Compute the f-measure, a performance measure that ignores true negatives.
1020
+
1021
+ Arguments: precision (float): proportion of positive classifications that are correct
1022
+ recall (float): proportion of positive examples that were found
1023
+ Returns: a float, the f-measure.
1024
+ """
1025
+ return (2 * precision * recall)/(precision + recall)
1026
+ def precision(tp, fp):
1027
+ return tp/(tp + fp)
1028
+ def recall(tp, fn):
1029
+ return tp/(tp + fn)
1030
+
1031
+ def f_measures(stats_list):
1032
+ """
1033
+ Compute f-measure for each item in a list.
1034
+
1035
+ Argument: stats_list (list): a list of tuples of four ints, (tp, fp, tn, fn)
1036
+ (these stand for true positive, false positive, etc)
1037
+ Returns: a list of floats, the f-measures.
1038
+ """
1039
+ out = []
1040
+ for tp, fp, tn, fn in stats_list:
1041
+ f = f_measure(precision(tp, fp), recall(tp, fn))
1042
+ out.append(f)
1043
+ return out
1044
+ print(precision(4,4)) # Expect 0.5
1045
+ print(recall(4,4)) # Expect 0.5
1046
+ print(f_measure(1, 1)) # Expect 1
1047
+ print(precision(0, 4)) # Expect 0
1048
+ print(precision(0, 0)) # Expect ... oh, I guess we didn't think about this. 0?
1049
+ print(precision(4, 0)) # Expect 1
1050
+ print(recall(0, 4)) # Expect 0
1051
+ print(recall(0, 0)) # Similarly to precision, let's return 0
1052
+ print(recall(4, 0)) # Expect 1
1053
+ print(f_measure(0, 0)) # Expect 0
1054
+ print(f_measure(0.5, 0.5)) # Expect 0.5
1055
+ def f_measure(precision, recall):
1056
+ """
1057
+ Compute the f-measure, a performance measure that ignores true negatives.
1058
+
1059
+ Arguments: precision (float): proportion of positive classifications that are correct
1060
+ recall (float): proportion of positive examples that were found
1061
+ Returns: a float, the f-measure.
1062
+ """
1063
+ return (2 * precision * recall)/(precision + recall)
1064
+ def precision(tp, fp):
1065
+ if tp + fp == 0:
1066
+ return 0
1067
+ return tp/(tp + fp)
1068
+ def recall(tp, fn):
1069
+ if tp + fn == 0:
1070
+ return 0
1071
+ return tp/(tp + fn)
1072
+
1073
+ def f_measures(stats_list):
1074
+ """
1075
+ Compute f-measure for each item in a list.
1076
+
1077
+ Argument: stats_list (list): a list of tuples of four ints, (tp, fp, tn, fn)
1078
+ (these stand for true positive, false positive, etc)
1079
+ Returns: a list of floats, the f-measures.
1080
+ """
1081
+ out = []
1082
+ for tp, fp, tn, fn in stats_list:
1083
+ f = f_measure(precision(tp, fp), recall(tp, fn))
1084
+ out.append(f)
1085
+ return out
1086
+ print(precision(4,4)) # Expect 0.5
1087
+ print(recall(4,4)) # Expect 0.5
1088
+ print(f_measure(1, 1)) # Expect 1
1089
+ print(precision(0, 4)) # Expect 0
1090
+ print(precision(0, 0)) # Expect 0
1091
+ print(precision(4, 0)) # Expect 1
1092
+ print(recall(0, 4)) # Expect 0
1093
+ print(recall(0, 0)) # Similarly to precision, let's return 0
1094
+ print(recall(4, 0)) # Expect 1
1095
+ print(f_measure(0, 0)) # Expect 0
1096
+ print(f_measure(0.5, 0.5)) # Expect 0.5
1097
+ def f_measure(precision, recall):
1098
+ """
1099
+ Compute the f-measure, a performance measure that ignores true negatives.
1100
+
1101
+ Arguments: precision (float): proportion of positive classifications that are correct
1102
+ recall (float): proportion of positive examples that were found
1103
+ Returns: a float, the f-measure.
1104
+ """
1105
+ if precision + recall == 0:
1106
+ return 0
1107
+ return (2 * precision * recall)/(precision + recall)
1108
+ def precision(tp, fp):
1109
+ if tp + fp == 0:
1110
+ return 0
1111
+ return tp/(tp + fp)
1112
+ def recall(tp, fn):
1113
+ if tp + fn == 0:
1114
+ return 0
1115
+ return tp/(tp + fn)
1116
+
1117
+ def f_measures(stats_list):
1118
+ """
1119
+ Compute f-measure for each item in a list.
1120
+
1121
+ Argument: stats_list (list): a list of tuples of four ints, (tp, fp, tn, fn)
1122
+ (these stand for true positive, false positive, etc)
1123
+ Returns: a list of floats, the f-measures.
1124
+ """
1125
+ out = []
1126
+ for tp, fp, tn, fn in stats_list:
1127
+ f = f_measure(precision(tp, fp), recall(tp, fn))
1128
+ out.append(f)
1129
+ return out
1130
+ print(precision(4,4)) # Expect 0.5
1131
+ print(recall(4,4)) # Expect 0.5
1132
+ print(f_measure(1, 1)) # Expect 1
1133
+ print(precision(0, 4)) # Expect 0
1134
+ print(precision(0, 0)) # Expect 0
1135
+ print(precision(4, 0)) # Expect 1
1136
+ print(recall(0, 4)) # Expect 0
1137
+ print(recall(0, 0)) # Similarly to precision, let's return 0
1138
+ print(recall(4, 0)) # Expect 1
1139
+ print(f_measure(0, 0)) # Expect 0
1140
+ print(f_measure(0.5, 0.5)) # Expect 0.5
1141
+ # Lecture14Pandas.py
1142
+ import pandas as pd
1143
+ import numpy as np
1144
+ s1 = pd.Series([-3, -1, 1, 3, 5])
1145
+ print(s1)
1146
+ print(s1.index)
1147
+ s1[:2] # First 2 elements
1148
+ print(s1[[2,1,0]]) # Elements out of order
1149
+ type(s1)
1150
+ s1[s1 > 0]
1151
+ s2 = pd.Series(np.random.rand(5), index=['a', 'b', 'c', 'd', 'e'])
1152
+ print(s2)
1153
+ print(s2.index)
1154
+ print(s2['a'])
1155
+ data = {'pi': 3.14159, 'e': 2.71828} # dictionary
1156
+ print(data)
1157
+ s3 = pd.Series(data)
1158
+ print(s3)
1159
+ my_array = s3.values
1160
+ print(my_array)
1161
+ import numpy as np
1162
+ my_data = np.array([[5, 5, 4],
1163
+ [2, 3, 4]])
1164
+ hotels = pd.DataFrame(my_data, index = ["Alice rating", "Bob rating"],
1165
+ columns = ["Hilton", "Marriott", "Four Seasons"])
1166
+ hotels
1167
+ from google.colab import files
1168
+ uploaded = files.upload() # pick starbucks_drinkMenu_expanded.csv
1169
+ get_ipython().system('ls')
1170
+ import pandas as pd
1171
+ df = pd.read_csv('starbucks_drinkMenu_expanded.csv', index_col = 'Beverage')
1172
+ df.head()
1173
+ sorted_df = df.sort_values(by = "Calories", ascending=False)
1174
+ sorted_df.head()
1175
+ hotels
1176
+ hotels['Hilton']
1177
+ sum = 0
1178
+ for i in hotels['Hilton']:
1179
+ sum += i
1180
+ print('Average Hilton Rating: ' + str(sum/len(hotels['Hilton'])))
1181
+ hotels.loc['Bob rating']
1182
+ hotels.loc['Bob rating', 'Marriott']
1183
+ hotels.iloc[1, 1]
1184
+ print(hotels.iloc[0, 1:2])
1185
+ print(hotels.loc['Bob rating', ['Marriott', 'Hilton']])
1186
+ (df['Calories'] > 300)
1187
+ df[df['Calories'] > 300].head()
1188
+ df[(df['Calories'] > 300) & (df['Beverage_prep'] == 'Soymilk')].head()
1189
+ df['bad_fat'] = df['Trans_Fat_g'] + df['Saturated_Fat_g']
1190
+ df.head()
1191
+ size_ounces_dict = {'Short': 8, 'Tall': 12, 'Grande': 16, 'Venti': 20}
1192
+ ounces_list = []
1193
+ for drink in df['Beverage_prep']:
1194
+ ounces_list.append(size_ounces_dict.get(drink, -1))
1195
+ df['ounces'] = ounces_list
1196
+ df.head()
1197
+ def size_to_ml(size_name):
1198
+ size_ounces_dict = {'Short': 8, 'Tall': 12, 'Grande': 16, 'Venti': 20}
1199
+ return size_ounces_dict.get(size_name,0) * 29.5735
1200
+ ml = df['Beverage_prep'].map(size_to_ml)
1201
+ print(ml)
1202
+ # Lecture15Pandas.py
1203
+ import pandas as pd
1204
+ df = pd.read_csv('starbucks_drinkMenu_expanded.csv', index_col = 'Beverage')
1205
+ df.head()
1206
+ print(df.loc[:, "Protein_g"].mean())
1207
+ print(df.loc[:, "Protein_g"].max())
1208
+ print(df.loc[:, "Protein_g"].idxmax()) # "argmax," gives index with biggest value
1209
+ df.describe()
1210
+ df.corr(numeric_only=True) # New to pandas 2.0.0: chokes on strings without added arg
1211
+ df.columns
1212
+ df.dtypes
1213
+ string = 'string'
1214
+ string[:-1]
1215
+ df['Vitamin_A'] = df['Vitamin_A'].str[0:-1] # Remove the % at the end
1216
+ df['Vitamin_A']
1217
+ df['Vitamin_A'] = pd.to_numeric(df['Vitamin_A'])
1218
+ df.dtypes
1219
+ df['Vitamin_A'] = df['Vitamin_A'].astype('float64')
1220
+ df.dtypes
1221
+ df.corr(numeric_only=True)
1222
+ df.isnull().sum()
1223
+ df = df.dropna(axis=0, how="any") # Remove the offending row
1224
+ df.isnull().sum()
1225
+ calorie_max = 0
1226
+ best_name = ""
1227
+ for index, row in df.iterrows():
1228
+ if row['Calories'] > calorie_max:
1229
+ calorie_max = row['Calories']
1230
+ best_name = index
1231
+ print(best_name)
1232
+ protein = df.loc[:, "Protein_g"]
1233
+ protein.hist(bins=20); # Create a histogram with 20 equally spaced bins for the data
1234
+ subplot = df[["Protein_g", "Vitamin_A"]] # Notice another way to get desired columns
1235
+ subplot.boxplot(); # Boxplots give median value, middle 50% of data, and range of non-outliers
1236
+ from google.colab import files
1237
+ uploaded = files.upload() # pick titanic.csv
1238
+ df = pd.read_csv('titanic.csv', index_col = 'PassengerId')
1239
+ df.head()
1240
+ df.columns
1241
+ df.dtypes
1242
+ df.describe()
1243
+ df.corr(numeric_only=True)
1244
+ males = df[df['Sex'] == 'male']
1245
+ males.head()
1246
+ males.describe()
1247
+ females = df[df['Sex'] == 'female']
1248
+ females.describe()
1249
+ df['sex_numeric'] = df['Sex'] == 'female'
1250
+ df.corr(numeric_only=True)
1251
+ third_class = df[df['Pclass'] == 3]
1252
+ second_class = df[df['Pclass'] == 2]
1253
+ first_class = df[df['Pclass'] == 1]
1254
+ third_class['Survived'].hist();
1255
+ second_class['Survived'].hist();
1256
+ first_class['Survived'].hist();
1257
+ # Lecture16Strings.py
1258
+ my_cost = 12.95821
1259
+ print(f'The total cost was {my_cost} dollars')
1260
+ print(f'The total cost was {my_cost:.2f} dollars')
1261
+ groceries = "milk,eggs,yogurt"
1262
+ grocerieslist = groceries.split(',')
1263
+ print(grocerieslist)
1264
+ ','.join(['milk', 'eggs', 'yogurt'])
1265
+ ' milk,eggs,yogurt '.strip()
1266
+ lines = "SERVANT: Sir, there are ten thousand--\nMACBETH: Geese, villain?"
1267
+ linelist = lines.splitlines() # A shortcut for split('\n')
1268
+ for line in linelist:
1269
+ if line.startswith("MACBETH"):
1270
+ print(line.split(": ")[1])
1271
+ print('Wow\n\twow!')
1272
+ print("foo" in "food")
1273
+ print("foodfood".replace("foo", "ra"))
1274
+ import numpy as np
1275
+ import pandas as pd
1276
+ my_data = np.array([["Excellent", " Okay ", " Okay"], ["Great ", " Good", " Good"]])
1277
+ df = pd.DataFrame(my_data, columns = ["Hilton", "Marriott", "Four Seasons"], index = ["Alice", "Bob"])
1278
+ df
1279
+ marriott = df['Marriott']
1280
+ for s in marriott:
1281
+ print(s)
1282
+ print('---')
1283
+ for s in marriott.str.strip():
1284
+ print(s) # Look, no extra whitespace
1285
+ marriott.str.match(r"\s*Okay\s*")
1286
+ import re
1287
+ pattern = '02143'
1288
+ longstring = 'Somerville, MA 02143'
1289
+ result = re.search(pattern, longstring)
1290
+ if result: # (if it's not None)
1291
+ print(result.group())
1292
+ longstring = '0132428190214200'
1293
+ pattern2 = '02143'
1294
+ result2 = re.search(pattern2, longstring)
1295
+ print(result2)
1296
+ pattern3 = r'\d\d\d\d\d'
1297
+ longstring = 'Somerville, MA 02143'
1298
+ result3 = re.search(pattern3, longstring)
1299
+ if result3:
1300
+ print(result3.group())
1301
+ longstring = 'My phone number is 5555555'
1302
+ pattern4 = r'phone number is \d+'
1303
+ result4 = re.search(pattern4, longstring)
1304
+ if result4:
1305
+ print(result4.group())
1306
+ longstring = 'Call me at 555-5555'
1307
+ pattern5 = r'\d\d\d-?\d\d\d\d'
1308
+ result5 = re.search(pattern5, longstring)
1309
+ if result5:
1310
+ print(result5.group())
1311
+ longstring = "Call me at 1-800-555-5555."
1312
+ pattern = r"(\d-)?(\d\d\d-)?\d\d\d-?\d\d\d\d"
1313
+ result = re.search(pattern, longstring)
1314
+ if result:
1315
+ print(result.group())
1316
+ longstring2 = "Call me at 555-5555."
1317
+ result = re.search(pattern, longstring2)
1318
+ if result:
1319
+ print(result.group())
1320
+ pattern = "Somerville, (MA|NJ)"
1321
+ longstring = "Somerville, NJ 02143"
1322
+ result = re.search(pattern, longstring)
1323
+ if result:
1324
+ print(result.group())
1325
+ longstring = "States with a Somerville: AL, IN, ME, MA, NJ, OH, TN, TX"
1326
+ pattern = "[A-Z][A-Z]" # Get capital letters within A-Z range
1327
+ result = re.findall(pattern, longstring)
1328
+ print(result)
1329
+ longstring = "The stock NVDA went down 4.54 points"
1330
+ pattern = r"stock (\w+) went down (\d+\.\d+) points"
1331
+ result = re.search(pattern, longstring)
1332
+ if result:
1333
+ print(result.group())
1334
+ print(result.group(1)) # Subgroup 1, the first () in the pattern
1335
+ print(result.group(2))
1336
+ import re
1337
+ longstring = "We paid $100 for those shoes"
1338
+ pattern = r'\$\d+'
1339
+ result = re.search(pattern, longstring)
1340
+ print(result.group())
1341
+ # Lecture18Objects.py
1342
+ class Car:
1343
+ pass
1344
+ car1 = Car()
1345
+ car2 = Car()
1346
+ car3 = Car()
1347
+ print(isinstance(car1,Car))
1348
+ car1.year = 2010
1349
+ car1.make = "Honda"
1350
+ car1.model = "Fit"
1351
+ car1.color = "blue"
1352
+ car2.year = 2013
1353
+ car2.make = "Toyota"
1354
+ car2.model = "Camry"
1355
+ car2.color = "silver"
1356
+ print(f"This car is a {car1.year} {car1.color} {car1.make} {car1.model}")
1357
+ my_car = (2010, 'Honda', 'Fit', 'blue')
1358
+ print(f"This car is a {my_car[0]} {my_car[3]} {my_car[1]} {my_car[2]}")
1359
+ class Car:
1360
+ def print_facts(self):
1361
+ print(f"This car is a {self.year} {self.color} {self.make} {self.model}")
1362
+ car1 = Car()
1363
+ car2 = Car()
1364
+ car1.year = 2010
1365
+ car1.make = "Honda"
1366
+ car1.model = "Fit"
1367
+ car1.color = "blue"
1368
+ car2.year = 2013
1369
+ car2.make = "Toyota"
1370
+ car2.model = "Camry"
1371
+ car2.color = "silver"
1372
+ car1.print_facts()
1373
+ car2.print_facts()
1374
+ class Car:
1375
+ def __init__(self, year, make, model, color):
1376
+ # It's common for the constructor's arguments
1377
+ # to have similar or identical names to the attributes they set
1378
+ # (but we still have to say one should be set to the other)
1379
+ self.year = year
1380
+ self.make = make
1381
+ self.model = model
1382
+ self.color = color
1383
+
1384
+ def print_facts(self):
1385
+ print(f"This car is a {self.year} {self.color} {self.make} {self.model}")
1386
+ car1 = Car(2010, "Honda", "Fit", "blue")
1387
+ car2 = Car(2013, "Toyota", "Camry", "silver")
1388
+ car1.print_facts()
1389
+ car2.print_facts()
1390
+ def newest_car(list_of_cars):
1391
+ if not list_of_cars: # ie, empty list
1392
+ return None
1393
+ best_year = list_of_cars[0].year
1394
+ best_car = list_of_cars[0]
1395
+ for car in list_of_cars:
1396
+ # This warning message could prevent a bug if we try
1397
+ # to hand this function the wrong list
1398
+ if not isinstance(car, Car):
1399
+ print('Warning, list had non-car items!')
1400
+ elif car.year > best_year:
1401
+ best_year = car.year
1402
+ best_car = car
1403
+ return best_car
1404
+ newest_car([car1, car2]).print_facts()
1405
+ class Bill:
1406
+ """ Represents a bill at a restaurant.
1407
+ _items (list of tuples): list of (item name, cost) tuples
1408
+ """
1409
+ def __init__(self, items):
1410
+ self._items = items
1411
+ # "Getter"
1412
+ def items(self):
1413
+ return self._items
1414
+ # "Setter"
1415
+ def set_items(self, items):
1416
+ self._items = items
1417
+
1418
+ def total_cost_pretax(self):
1419
+ total = 0
1420
+ for name, cost in self._items:
1421
+ total += cost
1422
+ return total
1423
+ def total_cost_with_tax(self, tax_rate):
1424
+ return round(self.total_cost_pretax() * (1 + tax_rate), 2)
1425
+ my_lunch = [("Ham Sandwich", 9), ("Coke", 2)]
1426
+ new_bill = Bill(my_lunch)
1427
+ cost_with_tax = new_bill.total_cost_with_tax(0.08)
1428
+ print(f"Total cost: {cost_with_tax}")
1429
+ new_bill.items() # could have said new_bill._items, but we were told not to
1430
+ class Bill:
1431
+ """ Represents a bill at a restaurant.
1432
+ _item_names (list of strings): list of items on bill
1433
+ _item_costs (list of ints): list of prices of items on bill
1434
+ _items is not here anymore! sorry anybody who wrote code that uses it, we warned you!
1435
+ """
1436
+ def __init__(self, items):
1437
+ self._item_names = [item[0] for item in items]
1438
+ self._item_costs = [item[1] for item in items]
1439
+ # "Getter"
1440
+ def items(self):
1441
+ # list(zip(a, b)) returns a list of tuples combining a and b
1442
+ return list(zip(self._item_names, self._item_costs))
1443
+ # "Setter"
1444
+ def set_items(self, items):
1445
+ self._item_names = [item[0] for item in items]
1446
+ self._item_costs = [item[1] for item in items]
1447
+
1448
+ def total_cost_pretax(self):
1449
+ total = 0
1450
+ for cost in self._item_costs:
1451
+ total += cost
1452
+ return total
1453
+ # Notice that we can call another method with this one
1454
+ def total_cost_with_tax(self, tax_rate):
1455
+ return round(self.total_cost_pretax() * (1 + tax_rate), 2)
1456
+ my_lunch = [("Ham Sandwich", 9), ("Coke", 2)]
1457
+ new_bill = Bill(my_lunch)
1458
+ print(new_bill.items()) # this still works, but _items would have broken
1459
+ class Circle:
1460
+ def __init__(self, radius):
1461
+ if radius < 0:
1462
+ raise ValueError("Can't have negative circle radius")
1463
+ self.radius=radius
1464
+ Circle(-1)
1465
+ class Circle2:
1466
+ def __init__(self,radius=2):
1467
+ self.radius = radius
1468
+ Circle2().radius
1469
+ class Student:
1470
+ def __init__(self, age, major, year):
1471
+ self.age = age
1472
+ self.major = major
1473
+ self.year = year
1474
+
1475
+ def get_older(self, amount):
1476
+ self.age += amount
1477
+ bob = Student(20,"Biology","Sophomore")
1478
+ bob.get_older(2)
1479
+ print(bob.age)
1480
+ car1 = Car(2010, "Honda", "Fit", "blue")
1481
+ car2 = car1
1482
+ car2.color = "black"
1483
+ car1.print_facts() # It's black now
1484
+ car2.print_facts()
1485
+ import copy
1486
+ car2 = copy.copy(car1)
1487
+ car2.color = "white"
1488
+ car1.print_facts()
1489
+ car2.print_facts()
1490
+ from google.colab import files
1491
+ uploaded = files.upload() # import books.csv
1492
+ import pandas as pd
1493
+ df = pd.read_csv('books.csv', index_col = 'title')
1494
+ df.head()
1495
+ class Book:
1496
+ def __init__(self, title, author, average_rating):
1497
+ self.title = title
1498
+ self.author = author
1499
+ self.average_rating = average_rating
1500
+ # Could add more fields from the dataset if desired
1501
+
1502
+ class Publisher:
1503
+ def __init__(self, df, publisher_name):
1504
+ self.name = publisher_name
1505
+ self.books = []
1506
+ for row in df.itertuples():
1507
+ if row.publisher == publisher_name:
1508
+ self.books.append(Book(row.Index, row.authors, row.average_rating))
1509
+
1510
+ def average_rating(self):
1511
+ total = 0
1512
+ for book in self.books:
1513
+ total += book.average_rating
1514
+ return total/len(self.books)
1515
+ scholastic = Publisher(df,'Scholastic Inc.')
1516
+ scholastic.average_rating()
1517
+ # Lecture19MoreOO.py
1518
+ class Client: # both Faculty and Students
1519
+ def __init__(self, birthyear, uid):
1520
+ self.birthyear = birthyear
1521
+ self.uid = uid
1522
+ def get_uid(self):
1523
+ return self.uid
1524
+
1525
+ def get_birthyear(self):
1526
+ return self.birthyear
1527
+ class Student(Client): # inherit from Client
1528
+ def __init__(self, birthyear, uid, gradyear):
1529
+ self.birthyear = birthyear
1530
+ self.uid = uid
1531
+ self.gradyear = gradyear
1532
+ def get_gradyear(self):
1533
+ return self.gradyear
1534
+
1535
+ class Faculty(Client):
1536
+ pass # Nothing else we want to do for Faculty
1537
+
1538
+ alice = Student(2003, 123456789, 2024)
1539
+ print(alice.get_birthyear()) # Inherited from Client
1540
+ print(alice.get_uid()) # Inherited from Client
1541
+ print(alice.get_gradyear()) # Specific to Student
1542
+ person1 = Student(2000,123456,2025)
1543
+ if not isinstance(person1, Faculty):
1544
+ print("Hey, this person doesn't have permission to do this!")
1545
+ else:
1546
+ print("Welcome, Faculty number " + str(person1.uid) + "!")
1547
+ student1 = Student(2000,123456,2025)
1548
+ print(isinstance(student1,Student))
1549
+ print(isinstance(student1,Client))
1550
+ print(isinstance(student1,object)) # Every class inherits from object
1551
+ class Student(Client): # inherit from Client
1552
+ def __init__(self, birthyear, uid, gradyear):
1553
+ super().__init__(birthyear, uid)
1554
+ self.gradyear = gradyear
1555
+ def get_gradyear(self):
1556
+ return self.gradyear
1557
+ bob = Student(2002,987654321,2022)
1558
+ print(bob.get_uid()) # inherited from Client
1559
+ class Trip:
1560
+ def __init__(self,cost,start_date,end_date):
1561
+ self._cost = cost # underscore keeps the attribute from shadowing the cost() method
1562
+ self.start_date = start_date
1563
+ self.end_date = end_date
1564
+ self.reimbursed = False
1565
+ def cost(self):
1566
+ return self._cost
1567
+
1568
+ def reimburse(self):
1569
+ self.reimbursed = True
1570
+
1571
+ def dates(self):
1572
+ return self.start_date, self.end_date
1573
+ class EquipmentOrder:
1574
+ def __init__(self,cost,domestic_seller):
1575
+ self.cost = cost
1576
+ self.reimbursed = False
1577
+ self.domestic_seller = domestic_seller
1578
+ def get_cost(self): # Renamed: self.cost (the attribute) would shadow a method called cost
1579
+ return self.cost
1580
+
1581
+ def reimburse(self):
1582
+ self.reimbursed = True
1583
+
1584
+ def get_domestic_seller(self): # Renamed: the attribute of the same name would shadow it
1585
+ return self.domestic_seller
1586
+ class Expense:
1587
+ def __init__(self,cost):
1588
+ self.cost = cost
1589
+ self.reimbursed = False
1590
+
1591
+ def get_cost(self): # Renamed: self.cost (the attribute) would shadow a method called cost
1592
+ return self.cost
1593
+
1594
+ def reimburse(self):
1595
+ self.reimbursed = True
1596
+ class Trip(Expense):
1597
+ def __init__(self,cost,start_date,end_date):
1598
+ super().__init__(cost)
1599
+ self.start_date = start_date
1600
+ self.end_date = end_date
1601
+
1602
+ # inherit cost, reimburse
1603
+ def dates(self):
1604
+ return self.start_date, self.end_date
1605
+ class EquipmentOrder(Expense):
1606
+ def __init__(self,cost,domestic_seller):
1607
+ super().__init__(cost)
1608
+ self.domestic_seller = domestic_seller
1609
+ # inherit cost, reimburse
1610
+ def get_domestic_seller(self): # Renamed: the attribute of the same name would shadow it
1611
+ return self.domestic_seller
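The expense classes above store a value in an attribute and also expose it through a method. A minimal sketch of why the accessor usually gets a different name from the attribute it returns: on an instance, the attribute wins the lookup, so a method with the same name cannot be called. The class name `ShadowedExpense` and the accessor name `get_cost` are illustrative choices (the latter mirrors the `get_uid` style used earlier), not part of the lecture code.

```python
class ShadowedExpense:
    def __init__(self, cost):
        self.cost = cost      # instance attribute named "cost"

    def cost(self):           # same name as the attribute: hidden on every instance
        return self.cost

    def get_cost(self):       # illustrative accessor with a distinct name
        return self.cost


lunch = ShadowedExpense(12.5)
print(lunch.cost)             # attribute lookup finds the number, not the method
print(lunch.get_cost())       # distinct name, so the method is reachable

shadow_error = None
try:
    lunch.cost()              # tries to call the float stored in the attribute
except TypeError as err:
    shadow_error = str(err)
    print("TypeError:", shadow_error)
```

Giving the method its own name (or dropping the accessor and reading the attribute directly) avoids the clash.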
1612
+ class Employee:
1613
+ def __init__(self, name, salary, title, years_of_service):
1614
+ self.name = name
1615
+ self.salary = salary
1616
+ self.title = title
1617
+ self.years_of_service = years_of_service
1618
+
1619
+ def give_raise(self, raise_amount):
1620
+ self.salary += raise_amount
1621
+
1622
+ def change_title(self, new_title):
1623
+ self.title = new_title
1624
+
1625
+ def update_years_of_service(self, increase):
1626
+ self.years_of_service += increase
1627
+ class Contractor:
1628
+ def __init__(self, name, salary, contract_duration):
1629
+ self.name = name
1630
+ self.salary = salary
1631
+ self.contract_duration = contract_duration
1632
+
1633
+ def give_raise(self, raise_amount):
1634
+ self.salary += raise_amount
1635
+
1636
+ alice = Employee("Alice", 90000, "Manager", 7)
1637
+ alice.give_raise(10000)
1638
+ print(alice.salary)
1639
+ bob = Contractor("Bob", 80000, 2)
1640
+ bob.give_raise(10000)
1641
+ print(bob.salary)
1642
+ class Worker:
1643
+ def __init__(self, name, salary):
1644
+ self.name = name
1645
+ self.salary = salary
1646
+
1647
+ def give_raise(self, raise_amount):
1648
+ self.salary += raise_amount
1649
+
1650
+ class Employee(Worker):
1651
+ def __init__(self, name, salary, title, years_of_service):
1652
+ super().__init__(name, salary)
1653
+ self.title = title
1654
+ self.years_of_service = years_of_service
1655
+
1656
+ def change_title(self, new_title):
1657
+ self.title = new_title
1658
+
1659
+ def update_years_of_service(self, increase):
1660
+ self.years_of_service += increase
1661
+ class Contractor(Worker):
1662
+ def __init__(self, name, salary, contract_duration):
1663
+ super().__init__(name, salary)
1664
+ self.contract_duration = contract_duration
1665
+
1666
+ alice = Employee("Alice", 90000, "Manager", 7)
1667
+ alice.give_raise(10000)
1668
+ print(alice.salary)
1669
+ bob = Contractor("Bob", 80000, 2)
1670
+ bob.give_raise(10000)
1671
+ print(bob.salary)
1672
+ class Gradyear:
1673
+ def __init__(self, year):
1674
+ self.year = year
1675
+ year = Gradyear(2024)
1676
+ print(year)
1677
+ class Gradyear:
1678
+ def __init__(self, year):
1679
+ self.year = year
1680
+ def __str__(self): # Our own implementation
1681
+ return str(self.year)
1682
+ gradyear = Gradyear(2024)
1683
+ print(gradyear)
1684
+ gy1 = Gradyear(2024)
1685
+ gy2 = Gradyear(2024)
1686
+ print(gy1 == gy2)
1687
+ myset = set()
1688
+ myset.add(gy1)
1689
+ myset.add(gy2)
1690
+ len(myset)
1691
+ class Gradyear:
1692
+ def __init__(self, year):
1693
+ self.year = year
1694
+ def __str__(self): # Our own implementation
1695
+ return str(self.year)
1696
+
1697
+ def __eq__(self, other):
1698
+ return self.year == other.year
1699
+ def __hash__(self):
1700
+ return self.year # Just store by number itself
1701
+ gy1 = Gradyear(2024)
1702
+ gy2 = Gradyear(2024)
1703
+ print(gy1 == gy2)
1704
+ myset = set()
1705
+ myset.add(gy1)
1706
+ myset.add(gy2)
1707
+ len(myset)
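The last `Gradyear` version defines `__eq__` and `__hash__` together. A short sketch of why the pair matters for sets: in Python 3, a class that defines `__eq__` without `__hash__` becomes unhashable, so it cannot be added to a set at all. The class names `YearNoHash` and `YearWithHash` are made up for this illustration.

```python
class YearNoHash:
    def __init__(self, year):
        self.year = year

    def __eq__(self, other):
        return self.year == other.year
    # No __hash__ here: Python sets __hash__ to None for this class


class YearWithHash:
    def __init__(self, year):
        self.year = year

    def __eq__(self, other):
        return self.year == other.year

    def __hash__(self):
        return hash(self.year)  # equal objects must produce equal hashes


unhashable_message = None
try:
    broken = set()
    broken.add(YearNoHash(2024))
except TypeError as err:
    unhashable_message = str(err)
    print("TypeError:", unhashable_message)

unique_years = set()
unique_years.add(YearWithHash(2024))
unique_years.add(YearWithHash(2024))
print(len(unique_years))  # the two equal objects collapse into one entry
```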
1708
+ # Lecture20Recursion.py
1709
+ def bad_recursion():
1710
+ print("Bad!")
1711
+ bad_recursion()
1712
+ bad_recursion()
1713
+ def factorial(n):
1714
+ # Omitting checks that n is a natural number, etc.
1715
+ if n == 1:
1716
+ return 1
1717
+ return n * factorial(n-1)
1718
+ print(factorial(4))
1719
+ def factorial(n):
1720
+ # Omitting checks that n is a natural number, etc.
1721
+ print(f'Evaluating {n}!')
1722
+ if n == 1:
1723
+ print('Returning 1')
1724
+ return 1
1725
+ result = n * factorial(n-1)
1726
+ print(f'Returning {result}')
1727
+ return result
1728
+ print(factorial(4))
1729
+ def sum_m_to_n(m, n):
1730
+ if n == m:
1731
+ return m
1732
+ result = n + sum_m_to_n(m, n-1)
1733
+ return result
1734
+ sum_m_to_n(3, 7) # 3 + 4 + 5 + 6 + 7 = 25
1735
+ def sum_m_to_n(m, n):
1736
+ print(f'Evaluating sum from {m} to {n}')
1737
+ if n == m:
1738
+ print(f'Returning {m}')
1739
+ return m
1740
+ result = n + sum_m_to_n(m, n-1)
1741
+ print(f'Returning {result}')
1742
+ return result
1743
+ sum_m_to_n(3, 7) # 3 + 4 + 5 + 6 + 7 = 25
1744
+ def mypow(a, p):
1745
+ if p == 0:
1746
+ return 1
1747
+ result = a * mypow(a, p-1)
1748
+ return result
1749
+ mypow(2,8)
1750
+ def mypow(a, p):
1751
+ print(f'Evaluating {a}^{p}')
1752
+ if p == 0:
1753
+ print('Returning 1')
1754
+ return 1
1755
+ result = a * mypow(a, p-1)
1756
+ print(f'Returning {result}')
1757
+ return result
1758
+ mypow(2,8)
1759
+ def fib(n):
1760
+ if (n == 0):
1761
+ return 0
1762
+ if (n == 1):
1763
+ return 1
1764
+ return fib(n-1) + fib(n-2)
1765
+ for i in range(10):
1766
+ print(fib(i))
1767
+ def r_perm(r, n):
1768
+ if n == r+1:
1769
+ return n
1770
+ return n * r_perm(r,n-1)
1771
+ r_perm(5,7)
1772
+ def iter_factorial(n):
1773
+ running_fact = 1
1774
+ for i in range(1,n+1):
1775
+ running_fact *= i
1776
+ return running_fact
1777
+
1778
+ print(iter_factorial(4))
1779
+ import numpy as np
1780
+ def iter_fib(n):
1781
+ if n == 0 or n == 1:
1782
+ return n
1783
+ fibs = np.zeros(n+1)
1784
+ fibs[0] = 0
1785
+ fibs[1] = 1
1786
+ for i in range(2,n+1):
1787
+ fibs[i] = fibs[i-1] + fibs[i-2]
1788
+ return int(fibs[n])
1789
+ for i in range(10):
1790
+ print(iter_fib(i))
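The recursive `fib` and the loop-based `iter_fib` above return the same values, but they do very different amounts of work. A small sketch that counts how many times the recursive version calls itself; the names `counted_fib` and `call_count` are invented for this illustration.

```python
call_count = 0

def counted_fib(n):
    # Same recursion as fib, with a counter added
    global call_count
    call_count += 1
    if n == 0:
        return 0
    if n == 1:
        return 1
    return counted_fib(n - 1) + counted_fib(n - 2)

result = counted_fib(10)
print(result, call_count)  # prints 55 177
```

The loop version needs only about `n` additions for the same answer, because each Fibonacci number is computed once and stored.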
1791
+ def power_set(setstring):
1792
+ if len(setstring) == 0:
1793
+ return [""]
1794
+ subset_list = []
1795
+ # Recursive call gets all the subsets that don't involve the first character
1796
+ smaller_power_set = power_set(setstring[1:])
1797
+ # The starting character is either in the subset...
1798
+ for substring in smaller_power_set:
1799
+ subset_list.append(setstring[0] + substring)
1800
+ # ...or not.
1801
+ for substring in smaller_power_set:
1802
+ subset_list.append(substring)
1803
+ return subset_list
1804
+ power_set("abcd")
1805
+ def recursive_sum(lst):
1806
+ if not lst: # empty list
1807
+ return 0
1808
+ return lst[0] + recursive_sum(lst[1:])
1809
+ recursive_sum([1,2,3])
1810
+ def recursive_filter(min_val, lst):
1811
+ if not lst:
1812
+ return []
1813
+ if lst[0] >= min_val:
1814
+ return [lst[0]] + recursive_filter(min_val, lst[1:])
1815
+ else:
1816
+ return recursive_filter(min_val, lst[1:])
1817
+ recursive_filter(3, [1, 2, 3, 4, 5])
1818
+ def recursive_index(item, lst, index): # index tracks where we are in the list
1819
+ if not lst:
1820
+ return None # not found
1821
+ if lst[0] == item:
1822
+ return index
1823
+ return recursive_index(item,lst[1:],index+1)
1824
+ recursive_index(5, [0, 1, 2, 5], 0)
1825
+ def recursive_skiplist(lst):
1826
+ if len(lst) == 0:
1827
+ return []
1828
+ if len(lst) == 1:
1829
+ return lst
1830
+ return [lst[0]] + recursive_skiplist(lst[2:])
1831
+ recursive_skiplist([5,3,7,2,9])
1832
+ # Lecture21DataStructures.py
1833
+ class ll_node:
1834
+ def __init__(self, num):
1835
+ self.number = num
1836
+ self.next = None
1837
+ def append(self, num):
1838
+ if self.next is None: # End of the list - add the node
1839
+ self.next = ll_node(num)
1840
+ else:
1841
+ self.next.append(num) # Recursively append to rest of list
1842
+
1843
+ def contains(self, othernum):
1844
+ if self.number == othernum: # We found it
1845
+ return True
1846
+ elif self.next is None: # We reached the end, didn't find it
1847
+ return False
1848
+ # Not here, there's more list - so, keep looking (recursively)
1849
+ return self.next.contains(othernum)
1850
+ def __str__(self):
1851
+ if self.next is None: # Last number
1852
+ return str(self.number)
1853
+ # Print this and print the rest (more recursion)
1854
+ return str(self.number) + ' ' + str(self.next)
1855
+ mylist = ll_node(6)
1856
+ mylist.append(1)
1857
+ mylist.append(7)
1858
+ print(mylist)
1859
+ print('Contains 7: ' + str(mylist.contains(7)))
1860
+ print('Contains 5: ' + str(mylist.contains(5)))
1861
+ import numpy as np
1862
+ class dynamic_array: # Showing how Python lists work
1863
+ def __init__(self, initial_size):
1864
+ self.memory = np.zeros(initial_size)
1865
+ self.occupied = 0
1866
+ self.size = initial_size
1867
+ def __str__(self):
1868
+ return str(self.memory)
1869
+
1870
+ def append(self, val):
1871
+ if self.occupied == self.size:
1872
+ print('Resizing...')
1873
+ new_memory = np.zeros(self.size*2)
1874
+ # A "hiccup" in running time as everything's copied
1875
+ for i in range(len(self.memory)):
1876
+ new_memory[i] = self.memory[i]
1877
+ self.memory = new_memory
1878
+ self.size = self.size*2
1879
+ print('Adding ' + str(val))
1880
+ self.memory[self.occupied] = val
1881
+ self.occupied += 1
1882
+ my_array = dynamic_array(2)
1883
+ print(my_array)
1884
+ my_array.append(1)
1885
+ my_array.append(1)
1886
+ print(my_array)
1887
+ my_array.append(1)
1888
+ print(my_array)
1889
+ my_array.append(1)
1890
+ print(my_array)
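The `dynamic_array` class copies every element each time it resizes, which is the "hiccup" its comment mentions. A rough sketch of why doubling keeps the total copying small: this helper only simulates the bookkeeping (no real array), and the name `count_copies` is made up for the illustration.

```python
def count_copies(num_appends, initial_size=1):
    # Mirror dynamic_array.append: when full, copy everything and double the size
    size = initial_size
    occupied = 0
    copies = 0
    for _ in range(num_appends):
        if occupied == size:
            copies += size   # every existing slot is copied to the new memory
            size *= 2
        occupied += 1
    return copies

print(count_copies(4, initial_size=2))  # prints 2: one resize, as in the run above
print(count_copies(1000))               # prints 1023: 1 + 2 + 4 + ... + 512
```

Under these assumptions the total number of copies stays below twice the number of appends, so the average cost per append is constant even though an individual append is occasionally slow.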
1891
+ class FolderTree:
1892
+ # Binary tree node: its fields are the left and right children plus a value
1893
+ def __init__(self, val):
1894
+ self.left = None
1895
+ self.right = None
1896
+ self.val = val
1897
+
1898
+ def addLeft(self, node):
1899
+ self.left = node
1900
+
1901
+ def addRight(self, node):
1902
+ self.right = node
1903
+
1904
+ def find(self, v):
1905
+ if self.val == v:
1906
+ return True
1907
+ # "if self.left" is checking that self.left exists -
1908
+ # else error when we run self.left.find()
1909
+ if self.left and self.left.find(v):
1910
+ return True
1911
+ if self.right and self.right.find(v):
1912
+ return True
1913
+ return False
1914
+ leftleftchild = FolderTree("wow.exe")
1915
+ leftrightchild = FolderTree("xls.exe")
1916
+ rightleftchild = FolderTree("lec12.pdf")
1917
+ rightrightchild = FolderTree("lec14.pdf")
1918
+ leftparent = FolderTree("apps")
1919
+ rightparent = FolderTree("lecs")
1920
+ leftparent.addLeft(leftleftchild)
1921
+ leftparent.addRight(leftrightchild)
1922
+ rightparent.addLeft(rightleftchild)
1923
+ rightparent.addRight(rightrightchild)
1924
+ root = FolderTree("root")
1925
+ root.addLeft(leftparent)
1926
+ root.addRight(rightparent)
1927
+ print(root.find("wow.exe"))
1928
+ print(root.find("lec13.exe"))
1929
+ def count_nodes(tree):
1930
+ if tree is None:
1931
+ return 0
1932
+ return 1 + count_nodes(tree.left) + count_nodes(tree.right)
1933
+ count_nodes(root)
1934
+ def calc_depth(tree):
1935
+ if tree is None:
1936
+ return 0
1937
+ if tree.left is None and tree.right is None:
1938
+ return 0 # Leaf has depth 0 in its subtree
1939
+ return 1 + max(calc_depth(tree.left), calc_depth(tree.right))
1940
+ calc_depth(root)
1941
+ class BinarySearchTree:
1942
+ # Binary tree node: its fields are the left and right children plus a value
1943
+ def __init__(self, val):
1944
+ self.left = None
1945
+ self.right = None
1946
+ self.val = val
1947
+
1948
+ def addLeft(self, node):
1949
+ self.left = node
1950
+
1951
+ def addRight(self, node):
1952
+ self.right = node
1953
+
1954
+ def find(self, v):
1955
+ if self.val == v:
1956
+ return True
1957
+ if v < self.val:
1958
+ if self.left:
1959
+ print("Going Left")
1960
+ return self.left.find(v)
1961
+ else:
1962
+ return False
1963
+ else:
1964
+ if self.right:
1965
+ print("Going Right")
1966
+ return self.right.find(v)
1967
+ else:
1968
+ return False
1969
+ root = BinarySearchTree("m")
1970
+ leftparent = BinarySearchTree("f")
1971
+ rightparent = BinarySearchTree("q")
1972
+ leftleftchild = BinarySearchTree("a")
1973
+ leftrightchild = BinarySearchTree("h")
1974
+ rightleftchild = BinarySearchTree("o")
1975
+ rightrightchild = BinarySearchTree("u")
1976
+ leftparent.addLeft(leftleftchild)
1977
+ leftparent.addRight(leftrightchild)
1978
+ rightparent.addLeft(rightleftchild)
1979
+ rightparent.addRight(rightrightchild)
1980
+ root.addLeft(leftparent)
1981
+ root.addRight(rightparent)
1982
+ print(root.find("h"))
1983
+ print(root.find("d"))
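The `BinarySearchTree` above is wired together by hand with `addLeft` and `addRight`, which works only if the caller puts every value on the correct side. A possible sketch of an `insert` method that picks the side itself; `BSTNode` and `insert` are hypothetical names and this is not the lecture's class.

```python
class BSTNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

    def insert(self, v):
        # Smaller values go left, larger values go right; duplicates are ignored
        if v == self.val:
            return
        if v < self.val:
            if self.left is None:
                self.left = BSTNode(v)
            else:
                self.left.insert(v)
        else:
            if self.right is None:
                self.right = BSTNode(v)
            else:
                self.right.insert(v)

    def find(self, v):
        if v == self.val:
            return True
        child = self.left if v < self.val else self.right
        if child is None:
            return False
        return child.find(v)


bst_root = BSTNode("m")
for letter in ["f", "q", "a", "h", "o", "u"]:
    bst_root.insert(letter)

print(bst_root.find("h"))  # True
print(bst_root.find("d"))  # False
```

With this insertion order the sketch builds the same shape as the hand-built tree: "f" and "q" under "m", then "a", "h", "o", "u" as leaves.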
1984
+ class infect_tree:
1985
+ # name is a string; infects is a list of the infect_tree nodes this person infected
1986
+ def __init__(self, name, infects):
1987
+ self.name = name
1988
+ self.infects = infects
1989
+ jake = infect_tree('jake', [])
1990
+ eric = infect_tree('eric', [])
1991
+ fifi = infect_tree('fifi', [])
1992
+ ged = infect_tree('ged', [])
1993
+ hao = infect_tree('hao', [])
1994
+ idris = infect_tree('idris', [jake])
1995
+ bob = infect_tree('bob', [eric])
1996
+ che = infect_tree('che', [])
1997
+ daphne = infect_tree('daphne', [fifi, ged, hao, idris])
1998
+ alice = infect_tree('alice', [bob, che, daphne])
1999
+ def find_most_infections(my_tree):
2000
+ best_infects = len(my_tree.infects)
2001
+ best_name = my_tree.name
2002
+ for infect in my_tree.infects:
2003
+ name, infects = find_most_infections(infect) # Recursion...
2004
+ if infects > best_infects:
2005
+ best_infects = infects
2006
+ best_name = name
2007
+ return best_name, best_infects
2008
+ find_most_infections(alice)
2009
+ def find_all_descendants(my_tree):
2010
+ my_list = [my_tree.name]
2011
+ for infect in my_tree.infects:
2012
+ my_list += find_all_descendants(infect) # More recursion
2013
+ return my_list
2014
+ find_all_descendants(daphne)
2015
+ # Lecture22ScikitLearn.py
2016
+ from sklearn.datasets import load_digits
2017
+ import matplotlib.pyplot as plt
2018
+ digits = load_digits()
2019
+ print(digits.data.shape) # Examples x 64 pixels
2020
+ import matplotlib.pyplot as plt
2021
+ plt.gray()
2022
+ plt.matshow(digits.images[0]) # Notice images[0] is 2D
2023
+ from warnings import simplefilter
2024
+ simplefilter(action='ignore', category=FutureWarning)
2025
+ from sklearn.neighbors import KNeighborsClassifier
2026
+ nbrs = KNeighborsClassifier(n_neighbors=3).fit(digits.data, digits.target)
2027
+ nbrs.score(digits.data, digits.target) # Find accuracy on the training dataset
2028
+ from sklearn.model_selection import train_test_split
2029
+ data_train, data_test, label_train, label_test = train_test_split(digits.data, digits.target, test_size=0.2)
2030
+ nbrs = KNeighborsClassifier(n_neighbors=3).fit(data_train, label_train)
2031
+ nbrs.score(data_test,label_test)
2032
+ print(nbrs.predict(data_test[0:3]))
2033
+ def reshape_and_show(num, data_test):
2034
+ image = data_test[num].reshape(8,8)
2035
+ plt.matshow(image)
2036
+ reshape_and_show(0,data_test)
2037
+ reshape_and_show(1,data_test)
2038
+ reshape_and_show(2,data_test)
2039
+ from sklearn.datasets import fetch_lfw_people
2040
+ faces = fetch_lfw_people(min_faces_per_person = 100)
2041
+ plt.imshow(faces.images[5], cmap="gray")
2042
+ data_train, data_test, label_train, label_test = train_test_split(faces.data, faces.target, test_size=0.2)
2043
+ nbrs = KNeighborsClassifier(n_neighbors=3).fit(data_train, label_train)
2044
+
2045
+ nbrs.score(data_test,label_test)
2046
+ import random
2047
+ random.seed(110) # Set seed - comment this out to get different rolls
2048
+ print(random.randint(1,8)) # Normally produces random integer 1-8
2049
+ print(random.randint(1,8))
2050
+ data_train, data_test, label_train, label_test = train_test_split(faces.data,
2051
+ faces.target, test_size=0.2,
2052
+ random_state=110) # Set the seed
2053
+ nbrs = KNeighborsClassifier(n_neighbors=3).fit(data_train, label_train)
2054
+
2055
+ nbrs.score(data_test,label_test)
2056
+ from sklearn.model_selection import cross_val_score
2057
+ cross_val_score(nbrs, data_train, label_train)
2058
+ import numpy as np
2059
+ for i in range(1,10):
2060
+ nbrs = KNeighborsClassifier(n_neighbors=i)
2061
+ print(np.mean(cross_val_score(nbrs, data_train, label_train)))
2062
+ # Lecture23DecisionTrees.py
2063
+ import math
2064
+ yes_branch_entropy = 0
2065
+ no_branch_entropy = -0.2 * math.log(0.2,2) - 0.8 * math.log(0.8, 2)
2066
+ pr_yes = 5/2005
2067
+ pr_no = 2000/2005
2068
+ print(pr_yes * yes_branch_entropy + pr_no * no_branch_entropy)
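The calculation above weights each branch's entropy by the fraction of examples that reach it. A sketch of the same arithmetic as reusable helpers, assuming the "yes" branch holds 5 examples of a single class (entropy 0) and the "no" branch holds 2000 examples split 20%/80%; the function names `entropy` and `weighted_entropy` are illustrative.

```python
import math

def entropy(probabilities):
    # Shannon entropy in bits; zero-probability classes contribute nothing
    return -sum(p * math.log(p, 2) for p in probabilities if p > 0)

def weighted_entropy(branches):
    # branches is a list of (example_count, class_probabilities) pairs
    total = sum(count for count, _ in branches)
    return sum(count / total * entropy(probs) for count, probs in branches)

split = [(5, [1.0]), (2000, [0.2, 0.8])]
split_entropy = weighted_entropy(split)
print(round(split_entropy, 4))
```

A decision tree trained with `criterion="entropy"` prefers the split whose weighted entropy is lowest.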
2069
+ from sklearn.datasets import load_iris
2070
+ from sklearn.model_selection import train_test_split
2071
+ import numpy as np
2072
+ iris = load_iris()
2073
+ iris.feature_names
2074
+ iris.target_names
2075
+ iris.data[0]
2076
+ features_train, features_test, labels_train, labels_test = \
2077
+ train_test_split(iris.data, iris.target, test_size=0.1, random_state=110)
2078
+ from sklearn.tree import DecisionTreeClassifier
2079
+ from sklearn.model_selection import cross_val_score
2080
+ dtree = DecisionTreeClassifier(criterion="entropy", random_state=110)
2081
+ dtree.fit(features_train, labels_train)
2082
+ dtree.score(features_test, labels_test) # Gives accuracy
2083
+ import matplotlib.pyplot as plt
2084
+ from sklearn import tree
2085
+ plt.figure(figsize=(14,10))
2086
+ tree.plot_tree(dtree, feature_names = iris.feature_names, class_names = iris.target_names)
2087
+ # Lecture24RandomForestsOnly.py
2088
+ from sklearn.datasets import load_iris
2089
+ from sklearn.ensemble import RandomForestClassifier
2090
+ from sklearn.model_selection import train_test_split
2091
+ import numpy as np
2092
+ iris = load_iris()
2093
+ iris["feature_names"]
2094
+ features_train, features_test, labels_train, labels_test = \
2095
+ train_test_split(iris['data'], iris['target'],
2096
+ test_size=0.1,random_state=110)
2097
+ irisforest = RandomForestClassifier(n_estimators=200,criterion="entropy",random_state=110)
2098
+ irisforest.fit(features_train, labels_train)
2099
+ irisforest.score(features_test, labels_test)
2100
+ irisforest.feature_importances_
2101
+ # Lecture25Regression.py
2102
+ import numpy as np
2103
+ x = np.linspace(1984, 2016, 33)
2104
+ y = [48.0, 47.3, 47.2, 47.4, 47.2, 46.7,
2105
+ 49.7, 49.6, 46.4, 47.3, 47.7, 47.8, 47.3, 47.4, 50.4, 49.8,
2106
+ 47.5, 49.1, 49.4, 47.1, 47.6, 48.4, 50.1, 48.3, 48.6, 47.8,
2107
+ 50.4, 49.7, 51.4, 48.8, 47.7, 48.5, 50.3]
2108
+ import matplotlib.pyplot as plt
2109
+ plt.plot(x,y,'o')
2110
+ import sklearn.linear_model as lm
2111
+ from sklearn.linear_model import LinearRegression
2112
+ linear_model = LinearRegression()
2113
+ x = x.reshape(-1,1)
2114
+ linear_model.fit(x,y)
2115
+ y_hat = linear_model.predict(x)
2116
+ plt.plot(x,y,'o')
2117
+ plt.plot(x,y_hat,'r')
2118
+ print(f'The temperature is rising {linear_model.coef_[0]:.4f} degrees F per year')
2119
+ print(f'{linear_model.intercept_:.2f}')
2120
+ linear_model.score(x,y)
2121
+ methane = np.array([12.81, 25.15, 38.06, 49.47, 60.24, 71.32,
2122
+ 80.08, 94.14, 96.49, 100.32, 107.54, 111.50, 113.97, 120.26, 132.39, 134.82,
2123
+ 133.30, 132.60, 135.91, 140.65, 135.76, 136.14, 138.11, 145.90, 152.41, 157.13,
2124
+ 162.33, 167.15, 172.17, 177.86, 190.62, 200.65, 207.73])
2125
+ mass_co = [84, 82.7, 84.9, 81.7, 81.9, 79.2, 79.9, 85.9, 84.3, 81.9,
2126
+ 82.9, 82.8,83.7, 85, 83.6, 85, 77.1, 80.4, 77.2, 70.6,
2127
+ 72.0, 68.1, 61.9, 65.7, 63.8, 65.6, 63.9]
2128
+ y_from_90 = y[6:] # From the last example, these are the temperatures
2129
+ methane_from_90 = methane[6:]
2130
+ x = np.transpose(np.array([mass_co, methane_from_90]))
2131
+ x
2132
+ temp_model = LinearRegression()
2133
+ temp_model.fit(x,y_from_90)
2134
+ print(temp_model.coef_)
2135
+ print(temp_model.intercept_)
2136
+ from sklearn.tree import DecisionTreeRegressor
2137
+ import numpy as np
2138
+ import matplotlib.pyplot as plt
2139
+ model = DecisionTreeRegressor() # no pruning of any kind, so expect overfitting
2140
+ x = np.linspace(1984, 2016, 33)
2141
+ x = x.reshape(-1,1)
2142
+ y = [48.0, 47.3, 47.2, 47.4, 47.2, 46.7,
2143
+ 49.7, 49.6, 46.4, 47.3, 47.7, 47.8, 47.3, 47.4, 50.4, 49.8,
2144
+ 47.5, 49.1, 49.4, 47.1, 47.6, 48.4, 50.1, 48.3, 48.6, 47.8,
2145
+ 50.4, 49.7, 51.4, 48.8, 47.7, 48.5, 50.3]
2146
+ xtrain = x[:30]
2147
+ ytrain = y[:30]
2148
+ model.fit(xtrain,ytrain)
2149
+ yhat = model.predict(x)
2150
+ plt.plot(x,y,'o')
2151
+ plt.plot(x[:30],yhat[:30])
2152
+ plt.plot(x[29:],yhat[29:],'r') # Plot line to test predictions in red
2153
+ model = DecisionTreeRegressor(max_depth = 3) # maybe overdoing it on the pruning
2154
+ x = np.linspace(1984, 2016, 33)
2155
+ prev_value_features = [0] + y.copy()[:-1] # shift y values so we see the previous one; discard last
2156
+ combined_features = np.array([x, prev_value_features]).transpose()
2157
+ print(combined_features)
2158
+ xtrain = combined_features[:30,:]
2159
+ model.fit(xtrain,ytrain)
2160
+ yhat = model.predict(combined_features)
2161
+ plt.plot(x,y,'o')
2162
+ plt.plot(x[:30],yhat[:30])
2163
+ plt.plot(x[29:],yhat[29:],'r')
2164
+ from sklearn.ensemble import RandomForestRegressor
2165
+ model = RandomForestRegressor()
2166
+ model.fit(xtrain,ytrain) # xtrain has the matrix we made in the previous code box
2167
+ yhat = model.predict(combined_features)
2168
+ plt.plot(x,y,'o')
2169
+ plt.plot(x[:30],yhat[:30])
2170
+ plt.plot(x[29:],yhat[29:],'r')
2171
+ from sklearn.neighbors import KNeighborsRegressor
2172
+ model = KNeighborsRegressor(n_neighbors=3)
2173
+ model.fit(xtrain,ytrain) # xtrain has the matrix we made in the previous code box
2174
+ yhat = model.predict(combined_features)
2175
+ plt.plot(x,y,'o')
2176
+ plt.plot(x[:30],yhat[:30])
2177
+ plt.plot(x[29:],yhat[29:],'r')
2178
+ # Lecture26ModernNLPandML.py
2179
+ import pandas as pd
2180
+ SST2_LOC = 'https://github.com/clairett/pytorch-sentiment-classification/raw/master/data/SST2/train.tsv'
2181
+ df = pd.read_csv(SST2_LOC, delimiter='\t', header=None)
2182
+ df
2183
+ import nltk
2184
+ from nltk.tokenize import word_tokenize
2185
+ nltk.download('punkt') # Name means 'period' in German; from Kiss and Strunk 2006
2186
+ word_tokenize("I won't sell my cat for even $1,000,000,000.")
2187
+ def wordset(raw_text):
2188
+ tokenized = word_tokenize(raw_text.lower())
2189
+ return set(tokenized)
2190
+ def all_words_set(df_column):
2191
+ set_of_all = set()
2192
+ dict_of_all = {}
2193
+ for row in df_column:
2194
+ textset = wordset(row)
2195
+ set_of_all = set_of_all.union(textset)
2196
+ dict_of_all[row] = textset
2197
+ return set_of_all, dict_of_all
2198
+ def one_hot_columns(df_column):
2199
+ all_words, all_tokenizations = all_words_set(df_column)
2200
+ word_dict = {}
2201
+ for word in all_words:
2202
+ word_present_list = []
2203
+ for line_num in range(len(df_column)):
2204
+ if word in all_tokenizations[df_column[line_num]]:
2205
+ word_present_list.append(1)
2206
+ else:
2207
+ word_present_list.append(0)
2208
+ word_dict[word] = word_present_list
2209
+ # We can create a dataframe from a dictionary of column header
2210
+ # to list of column values
2211
+ return pd.DataFrame.from_dict(word_dict)
2212
+ one_hot_cols = one_hot_columns(df.iloc[:,0])
2213
+ one_hot_cols
2214
+ from sklearn.model_selection import train_test_split
2215
+ from sklearn.ensemble import RandomForestClassifier
2216
+ labels = df[1]
2217
+ features = one_hot_cols
2218
+ X_train, X_test, y_train, y_test = train_test_split(features, labels, random_state=42)
2219
+ clf = RandomForestClassifier(n_estimators=200, random_state=42)
2220
+ clf.fit(X_train, y_train)
2221
+ clf.score(X_test, y_test)
2222
+ one_hot_cols.sum()
2223
+ import gensim.downloader as api
2224
+ wv = api.load('word2vec-google-news-300')
2225
+ wv['king']
2226
+ print(wv.most_similar('king')) # Prints words and cosines of angles with 'king'
2227
+ import numpy as np
2228
+ def find_cosine(vec1, vec2):
2229
+ # Scale vectors to both have unit length
2230
+ unit_vec1 = vec1/np.linalg.norm(vec1)
2231
+ unit_vec2 = vec2/np.linalg.norm(vec2)
2232
+ # The dot product of unit vectors gives the cosine of their angle
2233
+ return np.dot(unit_vec1,unit_vec2)
2234
+ print(find_cosine(wv['king'], wv['faucet']))
2235
+ wv.similarity('king', 'faucet')
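`find_cosine` is easier to sanity-check on tiny vectors than on 300-dimensional embeddings. A minimal sketch of the same formula on plain Python lists, so it runs without NumPy or the word2vec download; the function name `cosine` and the toy vectors are assumptions made for this example.

```python
import math

def cosine(vec1, vec2):
    # Dot product divided by the product of the two lengths
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    return dot / (norm1 * norm2)

east = [1.0, 0.0]
north = [0.0, 1.0]
northeast = [1.0, 1.0]

print(cosine(east, [3.0, 0.0]))  # same direction: 1.0
print(cosine(east, north))       # perpendicular: 0.0
print(cosine(east, northeast))   # 45 degrees apart: about 0.707
```

Only direction matters: scaling either vector leaves the cosine unchanged, which is why word vectors of different lengths can still be compared.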
2236
+ def find_avg_vector(txt, embedding):
2237
+ words = word_tokenize(txt)
2238
+ vec_sum = None
2239
+ count = 0
2240
+ for word in words:
2241
+ if word in embedding:
2242
+ count += 1
2243
+ if vec_sum is not None:
2244
+ vec_sum += embedding[word]
2245
+ else:
2246
+ # The embeddings are read-only unless you copy them
2247
+ vec_sum = embedding[word].copy()
2248
+ if vec_sum is None:
2249
+ return pd.Series(np.zeros((300,))) # Treat no word found in embedding as zero vector
2250
+ return pd.Series(vec_sum/count)
2251
+ find_avg_vector('Long live the king and queen!', wv)
2252
+ df_embeddings = df[0].apply(lambda txt: find_avg_vector(txt, wv))
2253
+ df_embeddings.rename(columns=lambda x: 'feature'+str(x), inplace=True)
2254
+ df_augmented = pd.concat([df, df_embeddings], axis=1)
2255
+ df_augmented
2256
+ from sklearn.model_selection import train_test_split
2257
+ from sklearn.ensemble import RandomForestClassifier
2258
+ labels = df_augmented[1]
2259
+ features = df_augmented.iloc[:,2:]
2260
+ X_train, X_test, y_train, y_test = train_test_split(features, labels, random_state=42)
2261
+ clf = RandomForestClassifier(n_estimators=200, random_state=42)
2262
+ clf.fit(X_train, y_train)
2263
+ clf.score(X_test, y_test)