EddyGiusepe committed on
Commit
4752b1c
·
1 Parent(s): cf5b1ec

Using PyCaret for ML

Files changed (1)
  1. Multiclass_Classification.ipynb +1012 -0
Multiclass_Classification.ipynb ADDED
@@ -0,0 +1,1012 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {},
6
+ "source": [
7
+ "# <h1 align=\"center\"><font color=\"red\">Multiclass Classification</font></h1>"
8
+ ]
9
+ },
10
+ {
11
+ "cell_type": "markdown",
12
+ "metadata": {},
13
+ "source": [
14
+ "<font color=\"yellow\">Data Scientist: Dr. Eddy Giusepe Chirinos Isidro</font>"
15
+ ]
16
+ },
17
+ {
18
+ "cell_type": "markdown",
19
+ "metadata": {},
20
+ "source": [
21
+ "Study link:\n",
22
+ "\n",
23
+ "* [pycaret 3.2.0](https://pypi.org/project/pycaret/)"
24
+ ]
25
+ },
26
+ {
27
+ "cell_type": "code",
28
+ "execution_count": 1,
29
+ "metadata": {},
30
+ "outputs": [
31
+ {
32
+ "data": {
33
+ "text/html": [
34
+ "<div>\n",
35
+ "<style scoped>\n",
36
+ " .dataframe tbody tr th:only-of-type {\n",
37
+ " vertical-align: middle;\n",
38
+ " }\n",
39
+ "\n",
40
+ " .dataframe tbody tr th {\n",
41
+ " vertical-align: top;\n",
42
+ " }\n",
43
+ "\n",
44
+ " .dataframe thead th {\n",
45
+ " text-align: right;\n",
46
+ " }\n",
47
+ "</style>\n",
48
+ "<table border=\"1\" class=\"dataframe\">\n",
49
+ " <thead>\n",
50
+ " <tr style=\"text-align: right;\">\n",
51
+ " <th></th>\n",
52
+ " <th>sepal_length</th>\n",
53
+ " <th>sepal_width</th>\n",
54
+ " <th>petal_length</th>\n",
55
+ " <th>petal_width</th>\n",
56
+ " <th>species</th>\n",
57
+ " </tr>\n",
58
+ " </thead>\n",
59
+ " <tbody>\n",
60
+ " <tr>\n",
61
+ " <th>0</th>\n",
62
+ " <td>5.1</td>\n",
63
+ " <td>3.5</td>\n",
64
+ " <td>1.4</td>\n",
65
+ " <td>0.2</td>\n",
66
+ " <td>Iris-setosa</td>\n",
67
+ " </tr>\n",
68
+ " <tr>\n",
69
+ " <th>1</th>\n",
70
+ " <td>4.9</td>\n",
71
+ " <td>3.0</td>\n",
72
+ " <td>1.4</td>\n",
73
+ " <td>0.2</td>\n",
74
+ " <td>Iris-setosa</td>\n",
75
+ " </tr>\n",
76
+ " <tr>\n",
77
+ " <th>2</th>\n",
78
+ " <td>4.7</td>\n",
79
+ " <td>3.2</td>\n",
80
+ " <td>1.3</td>\n",
81
+ " <td>0.2</td>\n",
82
+ " <td>Iris-setosa</td>\n",
83
+ " </tr>\n",
84
+ " <tr>\n",
85
+ " <th>3</th>\n",
86
+ " <td>4.6</td>\n",
87
+ " <td>3.1</td>\n",
88
+ " <td>1.5</td>\n",
89
+ " <td>0.2</td>\n",
90
+ " <td>Iris-setosa</td>\n",
91
+ " </tr>\n",
92
+ " <tr>\n",
93
+ " <th>4</th>\n",
94
+ " <td>5.0</td>\n",
95
+ " <td>3.6</td>\n",
96
+ " <td>1.4</td>\n",
97
+ " <td>0.2</td>\n",
98
+ " <td>Iris-setosa</td>\n",
99
+ " </tr>\n",
100
+ " </tbody>\n",
101
+ "</table>\n",
102
+ "</div>"
103
+ ],
104
+ "text/plain": [
105
+ " sepal_length sepal_width petal_length petal_width species\n",
106
+ "0 5.1 3.5 1.4 0.2 Iris-setosa\n",
107
+ "1 4.9 3.0 1.4 0.2 Iris-setosa\n",
108
+ "2 4.7 3.2 1.3 0.2 Iris-setosa\n",
109
+ "3 4.6 3.1 1.5 0.2 Iris-setosa\n",
110
+ "4 5.0 3.6 1.4 0.2 Iris-setosa"
111
+ ]
112
+ },
113
+ "metadata": {},
114
+ "output_type": "display_data"
115
+ }
116
+ ],
117
+ "source": [
118
+ "from pycaret.datasets import get_data\n",
119
+ "\n",
120
+ "\n",
121
+ "dataset = get_data(\"iris\")"
122
+ ]
123
+ },
124
+ {
125
+ "cell_type": "code",
126
+ "execution_count": 2,
127
+ "metadata": {},
128
+ "outputs": [
129
+ {
130
+ "data": {
131
+ "text/plain": [
132
+ "(150, 5)"
133
+ ]
134
+ },
135
+ "execution_count": 2,
136
+ "metadata": {},
137
+ "output_type": "execute_result"
138
+ }
139
+ ],
140
+ "source": [
141
+ "dataset.shape"
142
+ ]
143
+ },
144
+ {
145
+ "cell_type": "code",
146
+ "execution_count": 3,
147
+ "metadata": {},
148
+ "outputs": [
149
+ {
150
+ "data": {
151
+ "text/plain": [
152
+ "array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype=object)"
153
+ ]
154
+ },
155
+ "execution_count": 3,
156
+ "metadata": {},
157
+ "output_type": "execute_result"
158
+ }
159
+ ],
160
+ "source": [
161
+ "dataset[\"species\"].unique()"
162
+ ]
163
+ },
164
+ {
165
+ "cell_type": "code",
166
+ "execution_count": 4,
167
+ "metadata": {},
168
+ "outputs": [
169
+ {
170
+ "data": {
171
+ "text/html": [
172
+ "<div>\n",
173
+ "<style scoped>\n",
174
+ " .dataframe tbody tr th:only-of-type {\n",
175
+ " vertical-align: middle;\n",
176
+ " }\n",
177
+ "\n",
178
+ " .dataframe tbody tr th {\n",
179
+ " vertical-align: top;\n",
180
+ " }\n",
181
+ "\n",
182
+ " .dataframe thead th {\n",
183
+ " text-align: right;\n",
184
+ " }\n",
185
+ "</style>\n",
186
+ "<table border=\"1\" class=\"dataframe\">\n",
187
+ " <thead>\n",
188
+ " <tr style=\"text-align: right;\">\n",
189
+ " <th></th>\n",
190
+ " <th>sepal_length</th>\n",
191
+ " <th>sepal_width</th>\n",
192
+ " <th>petal_length</th>\n",
193
+ " <th>petal_width</th>\n",
194
+ " <th>species</th>\n",
195
+ " </tr>\n",
196
+ " </thead>\n",
197
+ " <tbody>\n",
198
+ " <tr>\n",
199
+ " <th>145</th>\n",
200
+ " <td>6.7</td>\n",
201
+ " <td>3.0</td>\n",
202
+ " <td>5.2</td>\n",
203
+ " <td>2.3</td>\n",
204
+ " <td>Iris-virginica</td>\n",
205
+ " </tr>\n",
206
+ " <tr>\n",
207
+ " <th>146</th>\n",
208
+ " <td>6.3</td>\n",
209
+ " <td>2.5</td>\n",
210
+ " <td>5.0</td>\n",
211
+ " <td>1.9</td>\n",
212
+ " <td>Iris-virginica</td>\n",
213
+ " </tr>\n",
214
+ " <tr>\n",
215
+ " <th>147</th>\n",
216
+ " <td>6.5</td>\n",
217
+ " <td>3.0</td>\n",
218
+ " <td>5.2</td>\n",
219
+ " <td>2.0</td>\n",
220
+ " <td>Iris-virginica</td>\n",
221
+ " </tr>\n",
222
+ " <tr>\n",
223
+ " <th>148</th>\n",
224
+ " <td>6.2</td>\n",
225
+ " <td>3.4</td>\n",
226
+ " <td>5.4</td>\n",
227
+ " <td>2.3</td>\n",
228
+ " <td>Iris-virginica</td>\n",
229
+ " </tr>\n",
230
+ " <tr>\n",
231
+ " <th>149</th>\n",
232
+ " <td>5.9</td>\n",
233
+ " <td>3.0</td>\n",
234
+ " <td>5.1</td>\n",
235
+ " <td>1.8</td>\n",
236
+ " <td>Iris-virginica</td>\n",
237
+ " </tr>\n",
238
+ " </tbody>\n",
239
+ "</table>\n",
240
+ "</div>"
241
+ ],
242
+ "text/plain": [
243
+ " sepal_length sepal_width petal_length petal_width species\n",
244
+ "145 6.7 3.0 5.2 2.3 Iris-virginica\n",
245
+ "146 6.3 2.5 5.0 1.9 Iris-virginica\n",
246
+ "147 6.5 3.0 5.2 2.0 Iris-virginica\n",
247
+ "148 6.2 3.4 5.4 2.3 Iris-virginica\n",
248
+ "149 5.9 3.0 5.1 1.8 Iris-virginica"
249
+ ]
250
+ },
251
+ "execution_count": 4,
252
+ "metadata": {},
253
+ "output_type": "execute_result"
254
+ }
255
+ ],
256
+ "source": [
257
+ "# Show the last 5 rows:\n",
258
+ "dataset.tail()"
259
+ ]
260
+ },
261
+ {
262
+ "cell_type": "code",
263
+ "execution_count": 5,
264
+ "metadata": {},
265
+ "outputs": [
266
+ {
267
+ "data": {
268
+ "text/html": [
269
+ "<div>\n",
270
+ "<style scoped>\n",
271
+ " .dataframe tbody tr th:only-of-type {\n",
272
+ " vertical-align: middle;\n",
273
+ " }\n",
274
+ "\n",
275
+ " .dataframe tbody tr th {\n",
276
+ " vertical-align: top;\n",
277
+ " }\n",
278
+ "\n",
279
+ " .dataframe thead th {\n",
280
+ " text-align: right;\n",
281
+ " }\n",
282
+ "</style>\n",
283
+ "<table border=\"1\" class=\"dataframe\">\n",
284
+ " <thead>\n",
285
+ " <tr style=\"text-align: right;\">\n",
286
+ " <th></th>\n",
287
+ " <th>sepal_length</th>\n",
288
+ " <th>sepal_width</th>\n",
289
+ " <th>petal_length</th>\n",
290
+ " <th>petal_width</th>\n",
291
+ " </tr>\n",
292
+ " </thead>\n",
293
+ " <tbody>\n",
294
+ " <tr>\n",
295
+ " <th>count</th>\n",
296
+ " <td>150.000000</td>\n",
297
+ " <td>150.000000</td>\n",
298
+ " <td>150.000000</td>\n",
299
+ " <td>150.000000</td>\n",
300
+ " </tr>\n",
301
+ " <tr>\n",
302
+ " <th>mean</th>\n",
303
+ " <td>5.843333</td>\n",
304
+ " <td>3.054000</td>\n",
305
+ " <td>3.758667</td>\n",
306
+ " <td>1.198667</td>\n",
307
+ " </tr>\n",
308
+ " <tr>\n",
309
+ " <th>std</th>\n",
310
+ " <td>0.828066</td>\n",
311
+ " <td>0.433594</td>\n",
312
+ " <td>1.764420</td>\n",
313
+ " <td>0.763161</td>\n",
314
+ " </tr>\n",
315
+ " <tr>\n",
316
+ " <th>min</th>\n",
317
+ " <td>4.300000</td>\n",
318
+ " <td>2.000000</td>\n",
319
+ " <td>1.000000</td>\n",
320
+ " <td>0.100000</td>\n",
321
+ " </tr>\n",
322
+ " <tr>\n",
323
+ " <th>25%</th>\n",
324
+ " <td>5.100000</td>\n",
325
+ " <td>2.800000</td>\n",
326
+ " <td>1.600000</td>\n",
327
+ " <td>0.300000</td>\n",
328
+ " </tr>\n",
329
+ " <tr>\n",
330
+ " <th>50%</th>\n",
331
+ " <td>5.800000</td>\n",
332
+ " <td>3.000000</td>\n",
333
+ " <td>4.350000</td>\n",
334
+ " <td>1.300000</td>\n",
335
+ " </tr>\n",
336
+ " <tr>\n",
337
+ " <th>75%</th>\n",
338
+ " <td>6.400000</td>\n",
339
+ " <td>3.300000</td>\n",
340
+ " <td>5.100000</td>\n",
341
+ " <td>1.800000</td>\n",
342
+ " </tr>\n",
343
+ " <tr>\n",
344
+ " <th>max</th>\n",
345
+ " <td>7.900000</td>\n",
346
+ " <td>4.400000</td>\n",
347
+ " <td>6.900000</td>\n",
348
+ " <td>2.500000</td>\n",
349
+ " </tr>\n",
350
+ " </tbody>\n",
351
+ "</table>\n",
352
+ "</div>"
353
+ ],
354
+ "text/plain": [
355
+ " sepal_length sepal_width petal_length petal_width\n",
356
+ "count 150.000000 150.000000 150.000000 150.000000\n",
357
+ "mean 5.843333 3.054000 3.758667 1.198667\n",
358
+ "std 0.828066 0.433594 1.764420 0.763161\n",
359
+ "min 4.300000 2.000000 1.000000 0.100000\n",
360
+ "25% 5.100000 2.800000 1.600000 0.300000\n",
361
+ "50% 5.800000 3.000000 4.350000 1.300000\n",
362
+ "75% 6.400000 3.300000 5.100000 1.800000\n",
363
+ "max 7.900000 4.400000 6.900000 2.500000"
364
+ ]
365
+ },
366
+ "execution_count": 5,
367
+ "metadata": {},
368
+ "output_type": "execute_result"
369
+ }
370
+ ],
371
+ "source": [
372
+ "dataset.describe()"
373
+ ]
374
+ },
375
+ {
376
+ "cell_type": "code",
377
+ "execution_count": 6,
378
+ "metadata": {},
379
+ "outputs": [
380
+ {
381
+ "data": {
382
+ "text/html": [
383
+ "<div>\n",
384
+ "<style scoped>\n",
385
+ " .dataframe tbody tr th:only-of-type {\n",
386
+ " vertical-align: middle;\n",
387
+ " }\n",
388
+ "\n",
389
+ " .dataframe tbody tr th {\n",
390
+ " vertical-align: top;\n",
391
+ " }\n",
392
+ "\n",
393
+ " .dataframe thead th {\n",
394
+ " text-align: right;\n",
395
+ " }\n",
396
+ "</style>\n",
397
+ "<table border=\"1\" class=\"dataframe\">\n",
398
+ " <thead>\n",
399
+ " <tr style=\"text-align: right;\">\n",
400
+ " <th></th>\n",
401
+ " <th>sepal_length</th>\n",
402
+ " <th>sepal_width</th>\n",
403
+ " <th>petal_length</th>\n",
404
+ " <th>petal_width</th>\n",
405
+ " <th>species</th>\n",
406
+ " </tr>\n",
407
+ " </thead>\n",
408
+ " <tbody>\n",
409
+ " <tr>\n",
410
+ " <th>73</th>\n",
411
+ " <td>6.1</td>\n",
412
+ " <td>2.8</td>\n",
413
+ " <td>4.7</td>\n",
414
+ " <td>1.2</td>\n",
415
+ " <td>Iris-versicolor</td>\n",
416
+ " </tr>\n",
417
+ " <tr>\n",
418
+ " <th>18</th>\n",
419
+ " <td>5.7</td>\n",
420
+ " <td>3.8</td>\n",
421
+ " <td>1.7</td>\n",
422
+ " <td>0.3</td>\n",
423
+ " <td>Iris-setosa</td>\n",
424
+ " </tr>\n",
425
+ " <tr>\n",
426
+ " <th>118</th>\n",
427
+ " <td>7.7</td>\n",
428
+ " <td>2.6</td>\n",
429
+ " <td>6.9</td>\n",
430
+ " <td>2.3</td>\n",
431
+ " <td>Iris-virginica</td>\n",
432
+ " </tr>\n",
433
+ " <tr>\n",
434
+ " <th>78</th>\n",
435
+ " <td>6.0</td>\n",
436
+ " <td>2.9</td>\n",
437
+ " <td>4.5</td>\n",
438
+ " <td>1.5</td>\n",
439
+ " <td>Iris-versicolor</td>\n",
440
+ " </tr>\n",
441
+ " <tr>\n",
442
+ " <th>76</th>\n",
443
+ " <td>6.8</td>\n",
444
+ " <td>2.8</td>\n",
445
+ " <td>4.8</td>\n",
446
+ " <td>1.4</td>\n",
447
+ " <td>Iris-versicolor</td>\n",
448
+ " </tr>\n",
449
+ " </tbody>\n",
450
+ "</table>\n",
451
+ "</div>"
452
+ ],
453
+ "text/plain": [
454
+ " sepal_length sepal_width petal_length petal_width species\n",
455
+ "73 6.1 2.8 4.7 1.2 Iris-versicolor\n",
456
+ "18 5.7 3.8 1.7 0.3 Iris-setosa\n",
457
+ "118 7.7 2.6 6.9 2.3 Iris-virginica\n",
458
+ "78 6.0 2.9 4.5 1.5 Iris-versicolor\n",
459
+ "76 6.8 2.8 4.8 1.4 Iris-versicolor"
460
+ ]
461
+ },
462
+ "execution_count": 6,
463
+ "metadata": {},
464
+ "output_type": "execute_result"
465
+ }
466
+ ],
467
+ "source": [
468
+ "# Training data (random 90% sample of the rows):\n",
469
+ "\n",
470
+ "data_train = dataset.sample(frac=0.9, random_state=42)\n",
471
+ "\n",
472
+ "data_train.head()"
473
+ ]
474
+ },
475
+ {
476
+ "cell_type": "code",
477
+ "execution_count": 7,
478
+ "metadata": {},
479
+ "outputs": [
480
+ {
481
+ "data": {
482
+ "text/plain": [
483
+ "(135, 5)"
484
+ ]
485
+ },
486
+ "execution_count": 7,
487
+ "metadata": {},
488
+ "output_type": "execute_result"
489
+ }
490
+ ],
491
+ "source": [
492
+ "data_train.shape"
493
+ ]
494
+ },
495
+ {
496
+ "cell_type": "code",
497
+ "execution_count": 8,
498
+ "metadata": {},
499
+ "outputs": [
500
+ {
501
+ "data": {
502
+ "text/html": [
503
+ "<div>\n",
504
+ "<style scoped>\n",
505
+ " .dataframe tbody tr th:only-of-type {\n",
506
+ " vertical-align: middle;\n",
507
+ " }\n",
508
+ "\n",
509
+ " .dataframe tbody tr th {\n",
510
+ " vertical-align: top;\n",
511
+ " }\n",
512
+ "\n",
513
+ " .dataframe thead th {\n",
514
+ " text-align: right;\n",
515
+ " }\n",
516
+ "</style>\n",
517
+ "<table border=\"1\" class=\"dataframe\">\n",
518
+ " <thead>\n",
519
+ " <tr style=\"text-align: right;\">\n",
520
+ " <th></th>\n",
521
+ " <th>sepal_length</th>\n",
522
+ " <th>sepal_width</th>\n",
523
+ " <th>petal_length</th>\n",
524
+ " <th>petal_width</th>\n",
525
+ " <th>species</th>\n",
526
+ " </tr>\n",
527
+ " </thead>\n",
528
+ " <tbody>\n",
529
+ " <tr>\n",
530
+ " <th>14</th>\n",
531
+ " <td>5.8</td>\n",
532
+ " <td>4.0</td>\n",
533
+ " <td>1.2</td>\n",
534
+ " <td>0.2</td>\n",
535
+ " <td>Iris-setosa</td>\n",
536
+ " </tr>\n",
537
+ " <tr>\n",
538
+ " <th>20</th>\n",
539
+ " <td>5.4</td>\n",
540
+ " <td>3.4</td>\n",
541
+ " <td>1.7</td>\n",
542
+ " <td>0.2</td>\n",
543
+ " <td>Iris-setosa</td>\n",
544
+ " </tr>\n",
545
+ " <tr>\n",
546
+ " <th>52</th>\n",
547
+ " <td>6.9</td>\n",
548
+ " <td>3.1</td>\n",
549
+ " <td>4.9</td>\n",
550
+ " <td>1.5</td>\n",
551
+ " <td>Iris-versicolor</td>\n",
552
+ " </tr>\n",
553
+ " <tr>\n",
554
+ " <th>71</th>\n",
555
+ " <td>6.1</td>\n",
556
+ " <td>2.8</td>\n",
557
+ " <td>4.0</td>\n",
558
+ " <td>1.3</td>\n",
559
+ " <td>Iris-versicolor</td>\n",
560
+ " </tr>\n",
561
+ " <tr>\n",
562
+ " <th>74</th>\n",
563
+ " <td>6.4</td>\n",
564
+ " <td>2.9</td>\n",
565
+ " <td>4.3</td>\n",
566
+ " <td>1.3</td>\n",
567
+ " <td>Iris-versicolor</td>\n",
568
+ " </tr>\n",
569
+ " </tbody>\n",
570
+ "</table>\n",
571
+ "</div>"
572
+ ],
573
+ "text/plain": [
574
+ " sepal_length sepal_width petal_length petal_width species\n",
575
+ "14 5.8 4.0 1.2 0.2 Iris-setosa\n",
576
+ "20 5.4 3.4 1.7 0.2 Iris-setosa\n",
577
+ "52 6.9 3.1 4.9 1.5 Iris-versicolor\n",
578
+ "71 6.1 2.8 4.0 1.3 Iris-versicolor\n",
579
+ "74 6.4 2.9 4.3 1.3 Iris-versicolor"
580
+ ]
581
+ },
582
+ "execution_count": 8,
583
+ "metadata": {},
584
+ "output_type": "execute_result"
585
+ }
586
+ ],
587
+ "source": [
588
+ "# Test data (the rows not sampled for training):\n",
589
+ "\n",
590
+ "data_test = dataset.drop(data_train.index)\n",
591
+ "\n",
592
+ "data_test.head()"
593
+ ]
594
+ },
595
+ {
596
+ "cell_type": "code",
597
+ "execution_count": 9,
598
+ "metadata": {},
599
+ "outputs": [
600
+ {
601
+ "data": {
602
+ "text/plain": [
603
+ "(15, 5)"
604
+ ]
605
+ },
606
+ "execution_count": 9,
607
+ "metadata": {},
608
+ "output_type": "execute_result"
609
+ }
610
+ ],
611
+ "source": [
612
+ "data_test.shape"
613
+ ]
614
+ },
615
+ {
616
+ "cell_type": "code",
617
+ "execution_count": 10,
618
+ "metadata": {},
619
+ "outputs": [
620
+ {
621
+ "data": {
622
+ "text/html": [
623
+ "<div>\n",
624
+ "<style scoped>\n",
625
+ " .dataframe tbody tr th:only-of-type {\n",
626
+ " vertical-align: middle;\n",
627
+ " }\n",
628
+ "\n",
629
+ " .dataframe tbody tr th {\n",
630
+ " vertical-align: top;\n",
631
+ " }\n",
632
+ "\n",
633
+ " .dataframe thead th {\n",
634
+ " text-align: right;\n",
635
+ " }\n",
636
+ "</style>\n",
637
+ "<table border=\"1\" class=\"dataframe\">\n",
638
+ " <thead>\n",
639
+ " <tr style=\"text-align: right;\">\n",
640
+ " <th></th>\n",
641
+ " <th>sepal_length</th>\n",
642
+ " <th>sepal_width</th>\n",
643
+ " <th>petal_length</th>\n",
644
+ " <th>petal_width</th>\n",
645
+ " <th>species</th>\n",
646
+ " </tr>\n",
647
+ " </thead>\n",
648
+ " <tbody>\n",
649
+ " <tr>\n",
650
+ " <th>0</th>\n",
651
+ " <td>6.1</td>\n",
652
+ " <td>2.8</td>\n",
653
+ " <td>4.7</td>\n",
654
+ " <td>1.2</td>\n",
655
+ " <td>Iris-versicolor</td>\n",
656
+ " </tr>\n",
657
+ " <tr>\n",
658
+ " <th>1</th>\n",
659
+ " <td>5.7</td>\n",
660
+ " <td>3.8</td>\n",
661
+ " <td>1.7</td>\n",
662
+ " <td>0.3</td>\n",
663
+ " <td>Iris-setosa</td>\n",
664
+ " </tr>\n",
665
+ " <tr>\n",
666
+ " <th>2</th>\n",
667
+ " <td>7.7</td>\n",
668
+ " <td>2.6</td>\n",
669
+ " <td>6.9</td>\n",
670
+ " <td>2.3</td>\n",
671
+ " <td>Iris-virginica</td>\n",
672
+ " </tr>\n",
673
+ " <tr>\n",
674
+ " <th>3</th>\n",
675
+ " <td>6.0</td>\n",
676
+ " <td>2.9</td>\n",
677
+ " <td>4.5</td>\n",
678
+ " <td>1.5</td>\n",
679
+ " <td>Iris-versicolor</td>\n",
680
+ " </tr>\n",
681
+ " <tr>\n",
682
+ " <th>4</th>\n",
683
+ " <td>6.8</td>\n",
684
+ " <td>2.8</td>\n",
685
+ " <td>4.8</td>\n",
686
+ " <td>1.4</td>\n",
687
+ " <td>Iris-versicolor</td>\n",
688
+ " </tr>\n",
689
+ " </tbody>\n",
690
+ "</table>\n",
691
+ "</div>"
692
+ ],
693
+ "text/plain": [
694
+ " sepal_length sepal_width petal_length petal_width species\n",
695
+ "0 6.1 2.8 4.7 1.2 Iris-versicolor\n",
696
+ "1 5.7 3.8 1.7 0.3 Iris-setosa\n",
697
+ "2 7.7 2.6 6.9 2.3 Iris-virginica\n",
698
+ "3 6.0 2.9 4.5 1.5 Iris-versicolor\n",
699
+ "4 6.8 2.8 4.8 1.4 Iris-versicolor"
700
+ ]
701
+ },
702
+ "execution_count": 10,
703
+ "metadata": {},
704
+ "output_type": "execute_result"
705
+ }
706
+ ],
707
+ "source": [
708
+ "# Reset the index of the training data:\n",
709
+ "\n",
710
+ "data_train.reset_index(drop=True, inplace=True)\n",
711
+ "\n",
712
+ "data_train.head()"
713
+ ]
714
+ },
715
+ {
716
+ "cell_type": "code",
717
+ "execution_count": 11,
718
+ "metadata": {},
719
+ "outputs": [
720
+ {
721
+ "data": {
722
+ "text/html": [
723
+ "<div>\n",
724
+ "<style scoped>\n",
725
+ " .dataframe tbody tr th:only-of-type {\n",
726
+ " vertical-align: middle;\n",
727
+ " }\n",
728
+ "\n",
729
+ " .dataframe tbody tr th {\n",
730
+ " vertical-align: top;\n",
731
+ " }\n",
732
+ "\n",
733
+ " .dataframe thead th {\n",
734
+ " text-align: right;\n",
735
+ " }\n",
736
+ "</style>\n",
737
+ "<table border=\"1\" class=\"dataframe\">\n",
738
+ " <thead>\n",
739
+ " <tr style=\"text-align: right;\">\n",
740
+ " <th></th>\n",
741
+ " <th>sepal_length</th>\n",
742
+ " <th>sepal_width</th>\n",
743
+ " <th>petal_length</th>\n",
744
+ " <th>petal_width</th>\n",
745
+ " <th>species</th>\n",
746
+ " </tr>\n",
747
+ " </thead>\n",
748
+ " <tbody>\n",
749
+ " <tr>\n",
750
+ " <th>0</th>\n",
751
+ " <td>5.8</td>\n",
752
+ " <td>4.0</td>\n",
753
+ " <td>1.2</td>\n",
754
+ " <td>0.2</td>\n",
755
+ " <td>Iris-setosa</td>\n",
756
+ " </tr>\n",
757
+ " <tr>\n",
758
+ " <th>1</th>\n",
759
+ " <td>5.4</td>\n",
760
+ " <td>3.4</td>\n",
761
+ " <td>1.7</td>\n",
762
+ " <td>0.2</td>\n",
763
+ " <td>Iris-setosa</td>\n",
764
+ " </tr>\n",
765
+ " <tr>\n",
766
+ " <th>2</th>\n",
767
+ " <td>6.9</td>\n",
768
+ " <td>3.1</td>\n",
769
+ " <td>4.9</td>\n",
770
+ " <td>1.5</td>\n",
771
+ " <td>Iris-versicolor</td>\n",
772
+ " </tr>\n",
773
+ " <tr>\n",
774
+ " <th>3</th>\n",
775
+ " <td>6.1</td>\n",
776
+ " <td>2.8</td>\n",
777
+ " <td>4.0</td>\n",
778
+ " <td>1.3</td>\n",
779
+ " <td>Iris-versicolor</td>\n",
780
+ " </tr>\n",
781
+ " <tr>\n",
782
+ " <th>4</th>\n",
783
+ " <td>6.4</td>\n",
784
+ " <td>2.9</td>\n",
785
+ " <td>4.3</td>\n",
786
+ " <td>1.3</td>\n",
787
+ " <td>Iris-versicolor</td>\n",
788
+ " </tr>\n",
789
+ " </tbody>\n",
790
+ "</table>\n",
791
+ "</div>"
792
+ ],
793
+ "text/plain": [
794
+ " sepal_length sepal_width petal_length petal_width species\n",
795
+ "0 5.8 4.0 1.2 0.2 Iris-setosa\n",
796
+ "1 5.4 3.4 1.7 0.2 Iris-setosa\n",
797
+ "2 6.9 3.1 4.9 1.5 Iris-versicolor\n",
798
+ "3 6.1 2.8 4.0 1.3 Iris-versicolor\n",
799
+ "4 6.4 2.9 4.3 1.3 Iris-versicolor"
800
+ ]
801
+ },
802
+ "execution_count": 11,
803
+ "metadata": {},
804
+ "output_type": "execute_result"
805
+ }
806
+ ],
807
+ "source": [
808
+ "# Reset the index of the test data:\n",
809
+ "\n",
810
+ "data_test.reset_index(drop=True, inplace=True)\n",
811
+ "\n",
812
+ "data_test.head()"
813
+ ]
814
+ },
815
+ {
816
+ "cell_type": "markdown",
817
+ "metadata": {},
818
+ "source": [
819
+ "<font color=\"orange\">Importing our `Classification` module:</font>"
820
+ ]
821
+ },
822
+ {
823
+ "cell_type": "code",
824
+ "execution_count": 12,
825
+ "metadata": {},
826
+ "outputs": [
827
+ {
828
+ "data": {
829
+ "text/html": [
830
+ "<style type=\"text/css\">\n",
831
+ "#T_6b0ab_row9_col1 {\n",
832
+ " background-color: lightgreen;\n",
833
+ "}\n",
834
+ "</style>\n",
835
+ "<table id=\"T_6b0ab\">\n",
836
+ " <thead>\n",
837
+ " <tr>\n",
838
+ " <th class=\"blank level0\" >&nbsp;</th>\n",
839
+ " <th id=\"T_6b0ab_level0_col0\" class=\"col_heading level0 col0\" >Description</th>\n",
840
+ " <th id=\"T_6b0ab_level0_col1\" class=\"col_heading level0 col1\" >Value</th>\n",
841
+ " </tr>\n",
842
+ " </thead>\n",
843
+ " <tbody>\n",
844
+ " <tr>\n",
845
+ " <th id=\"T_6b0ab_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
846
+ " <td id=\"T_6b0ab_row0_col0\" class=\"data row0 col0\" >Session id</td>\n",
847
+ " <td id=\"T_6b0ab_row0_col1\" class=\"data row0 col1\" >123</td>\n",
848
+ " </tr>\n",
849
+ " <tr>\n",
850
+ " <th id=\"T_6b0ab_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
851
+ " <td id=\"T_6b0ab_row1_col0\" class=\"data row1 col0\" >Target</td>\n",
852
+ " <td id=\"T_6b0ab_row1_col1\" class=\"data row1 col1\" >species</td>\n",
853
+ " </tr>\n",
854
+ " <tr>\n",
855
+ " <th id=\"T_6b0ab_level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
856
+ " <td id=\"T_6b0ab_row2_col0\" class=\"data row2 col0\" >Target type</td>\n",
857
+ " <td id=\"T_6b0ab_row2_col1\" class=\"data row2 col1\" >Multiclass</td>\n",
858
+ " </tr>\n",
859
+ " <tr>\n",
860
+ " <th id=\"T_6b0ab_level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
861
+ " <td id=\"T_6b0ab_row3_col0\" class=\"data row3 col0\" >Target mapping</td>\n",
862
+ " <td id=\"T_6b0ab_row3_col1\" class=\"data row3 col1\" >Iris-setosa: 0, Iris-versicolor: 1, Iris-virginica: 2</td>\n",
863
+ " </tr>\n",
864
+ " <tr>\n",
865
+ " <th id=\"T_6b0ab_level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
866
+ " <td id=\"T_6b0ab_row4_col0\" class=\"data row4 col0\" >Original data shape</td>\n",
867
+ " <td id=\"T_6b0ab_row4_col1\" class=\"data row4 col1\" >(135, 5)</td>\n",
868
+ " </tr>\n",
869
+ " <tr>\n",
870
+ " <th id=\"T_6b0ab_level0_row5\" class=\"row_heading level0 row5\" >5</th>\n",
871
+ " <td id=\"T_6b0ab_row5_col0\" class=\"data row5 col0\" >Transformed data shape</td>\n",
872
+ " <td id=\"T_6b0ab_row5_col1\" class=\"data row5 col1\" >(135, 5)</td>\n",
873
+ " </tr>\n",
874
+ " <tr>\n",
875
+ " <th id=\"T_6b0ab_level0_row6\" class=\"row_heading level0 row6\" >6</th>\n",
876
+ " <td id=\"T_6b0ab_row6_col0\" class=\"data row6 col0\" >Transformed train set shape</td>\n",
877
+ " <td id=\"T_6b0ab_row6_col1\" class=\"data row6 col1\" >(94, 5)</td>\n",
878
+ " </tr>\n",
879
+ " <tr>\n",
880
+ " <th id=\"T_6b0ab_level0_row7\" class=\"row_heading level0 row7\" >7</th>\n",
881
+ " <td id=\"T_6b0ab_row7_col0\" class=\"data row7 col0\" >Transformed test set shape</td>\n",
882
+ " <td id=\"T_6b0ab_row7_col1\" class=\"data row7 col1\" >(41, 5)</td>\n",
883
+ " </tr>\n",
884
+ " <tr>\n",
885
+ " <th id=\"T_6b0ab_level0_row8\" class=\"row_heading level0 row8\" >8</th>\n",
886
+ " <td id=\"T_6b0ab_row8_col0\" class=\"data row8 col0\" >Numeric features</td>\n",
887
+ " <td id=\"T_6b0ab_row8_col1\" class=\"data row8 col1\" >4</td>\n",
888
+ " </tr>\n",
889
+ " <tr>\n",
890
+ " <th id=\"T_6b0ab_level0_row9\" class=\"row_heading level0 row9\" >9</th>\n",
891
+ " <td id=\"T_6b0ab_row9_col0\" class=\"data row9 col0\" >Preprocess</td>\n",
892
+ " <td id=\"T_6b0ab_row9_col1\" class=\"data row9 col1\" >True</td>\n",
893
+ " </tr>\n",
894
+ " <tr>\n",
895
+ " <th id=\"T_6b0ab_level0_row10\" class=\"row_heading level0 row10\" >10</th>\n",
896
+ " <td id=\"T_6b0ab_row10_col0\" class=\"data row10 col0\" >Imputation type</td>\n",
897
+ " <td id=\"T_6b0ab_row10_col1\" class=\"data row10 col1\" >simple</td>\n",
898
+ " </tr>\n",
899
+ " <tr>\n",
900
+ " <th id=\"T_6b0ab_level0_row11\" class=\"row_heading level0 row11\" >11</th>\n",
901
+ " <td id=\"T_6b0ab_row11_col0\" class=\"data row11 col0\" >Numeric imputation</td>\n",
902
+ " <td id=\"T_6b0ab_row11_col1\" class=\"data row11 col1\" >mean</td>\n",
903
+ " </tr>\n",
904
+ " <tr>\n",
905
+ " <th id=\"T_6b0ab_level0_row12\" class=\"row_heading level0 row12\" >12</th>\n",
906
+ " <td id=\"T_6b0ab_row12_col0\" class=\"data row12 col0\" >Categorical imputation</td>\n",
907
+ " <td id=\"T_6b0ab_row12_col1\" class=\"data row12 col1\" >mode</td>\n",
908
+ " </tr>\n",
909
+ " <tr>\n",
910
+ " <th id=\"T_6b0ab_level0_row13\" class=\"row_heading level0 row13\" >13</th>\n",
911
+ " <td id=\"T_6b0ab_row13_col0\" class=\"data row13 col0\" >Fold Generator</td>\n",
912
+ " <td id=\"T_6b0ab_row13_col1\" class=\"data row13 col1\" >StratifiedKFold</td>\n",
913
+ " </tr>\n",
914
+ " <tr>\n",
915
+ " <th id=\"T_6b0ab_level0_row14\" class=\"row_heading level0 row14\" >14</th>\n",
916
+ " <td id=\"T_6b0ab_row14_col0\" class=\"data row14 col0\" >Fold Number</td>\n",
917
+ " <td id=\"T_6b0ab_row14_col1\" class=\"data row14 col1\" >10</td>\n",
918
+ " </tr>\n",
919
+ " <tr>\n",
920
+ " <th id=\"T_6b0ab_level0_row15\" class=\"row_heading level0 row15\" >15</th>\n",
921
+ " <td id=\"T_6b0ab_row15_col0\" class=\"data row15 col0\" >CPU Jobs</td>\n",
922
+ " <td id=\"T_6b0ab_row15_col1\" class=\"data row15 col1\" >-1</td>\n",
923
+ " </tr>\n",
924
+ " <tr>\n",
925
+ " <th id=\"T_6b0ab_level0_row16\" class=\"row_heading level0 row16\" >16</th>\n",
926
+ " <td id=\"T_6b0ab_row16_col0\" class=\"data row16 col0\" >Use GPU</td>\n",
927
+ " <td id=\"T_6b0ab_row16_col1\" class=\"data row16 col1\" >False</td>\n",
928
+ " </tr>\n",
929
+ " <tr>\n",
930
+ " <th id=\"T_6b0ab_level0_row17\" class=\"row_heading level0 row17\" >17</th>\n",
931
+ " <td id=\"T_6b0ab_row17_col0\" class=\"data row17 col0\" >Log Experiment</td>\n",
932
+ " <td id=\"T_6b0ab_row17_col1\" class=\"data row17 col1\" >False</td>\n",
933
+ " </tr>\n",
934
+ " <tr>\n",
935
+ " <th id=\"T_6b0ab_level0_row18\" class=\"row_heading level0 row18\" >18</th>\n",
936
+ " <td id=\"T_6b0ab_row18_col0\" class=\"data row18 col0\" >Experiment Name</td>\n",
937
+ " <td id=\"T_6b0ab_row18_col1\" class=\"data row18 col1\" >clf-default-name</td>\n",
938
+ " </tr>\n",
939
+ " <tr>\n",
940
+ " <th id=\"T_6b0ab_level0_row19\" class=\"row_heading level0 row19\" >19</th>\n",
941
+ " <td id=\"T_6b0ab_row19_col0\" class=\"data row19 col0\" >USI</td>\n",
942
+ " <td id=\"T_6b0ab_row19_col1\" class=\"data row19 col1\" >c071</td>\n",
943
+ " </tr>\n",
944
+ " </tbody>\n",
945
+ "</table>\n"
946
+ ],
947
+ "text/plain": [
948
+ "<pandas.io.formats.style.Styler at 0x7fa2c73cad40>"
949
+ ]
950
+ },
951
+ "metadata": {},
952
+ "output_type": "display_data"
953
+ }
954
+ ],
955
+ "source": [
956
+ "from pycaret.classification import compare_models, setup\n",
957
+ "\n",
958
+ "\n",
959
+ "Mult_clf = setup(data=data_train,\n",
960
+ "                 target=\"species\",\n",
961
+ "                 session_id=123\n",
962
+ "                 )"
963
+ ]
964
+ },
965
+ {
966
+ "cell_type": "code",
967
+ "execution_count": null,
968
+ "metadata": {},
969
+ "outputs": [],
970
+ "source": [
971
+ "best_Model = compare_models()"
972
+ ]
973
+ },
974
+ {
975
+ "cell_type": "markdown",
976
+ "metadata": {},
977
+ "source": [
978
+ "NOTE:\n",
979
+ "\n",
980
+ "I had to run this step in Google Colab because it took too long to produce results.\n",
981
+ "\n",
982
+ "Request access to my shared file [here](https://colab.research.google.com/drive/1bELHHfpLFZ7SyRifMpn_4h34Nir1y37D#scrollTo=NelzMSccoXnI)"
983
+ ]
984
+ },
985
+ {
986
+ "cell_type": "markdown",
987
+ "metadata": {},
988
+ "source": []
989
+ }
990
+ ],
991
+ "metadata": {
992
+ "kernelspec": {
993
+ "display_name": "venv_pycaret",
994
+ "language": "python",
995
+ "name": "python3"
996
+ },
997
+ "language_info": {
998
+ "codemirror_mode": {
999
+ "name": "ipython",
1000
+ "version": 3
1001
+ },
1002
+ "file_extension": ".py",
1003
+ "mimetype": "text/x-python",
1004
+ "name": "python",
1005
+ "nbconvert_exporter": "python",
1006
+ "pygments_lexer": "ipython3",
1007
+ "version": "3.10.12"
1008
+ }
1009
+ },
1010
+ "nbformat": 4,
1011
+ "nbformat_minor": 2
1012
+ }
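
The row counts in the notebook outputs (150 rows, then 135 and 15, then 94 and 41 in the `setup` summary) follow from two nested splits. The sketch below reproduces only that arithmetic in plain Python, without pandas or PyCaret. It assumes that `setup` was run with PyCaret's default `train_size` of 0.7 and that the training count is rounded down; `holdout_split` is a hypothetical helper written for this illustration, and Python's `random` module will not select the same rows that `DataFrame.sample(random_state=42)` selected in the notebook.

```python
import math
import random


def holdout_split(n_rows, frac, seed):
    """Pick round(n_rows * frac) row positions at random and return
    (sampled, remaining), mirroring sample(frac=...) followed by drop(...)."""
    rng = random.Random(seed)
    n_sampled = round(n_rows * frac)
    sampled = rng.sample(range(n_rows), n_sampled)
    sampled_set = set(sampled)
    remaining = [i for i in range(n_rows) if i not in sampled_set]
    return sampled, remaining


# First split: 90% of the 150 iris rows for modelling, 10% kept unseen.
train_rows, test_rows = holdout_split(n_rows=150, frac=0.9, seed=42)

# Second split (assumed): 70% of the modelling rows, rounded down, are used
# for fitting; the rest form the internal hold-out set.
n_fit = math.floor(len(train_rows) * 0.7)
n_holdout = len(train_rows) - n_fit

print(len(train_rows), len(test_rows), n_fit, n_holdout)  # 135 15 94 41
```

Under these assumptions the models are cross-validated on 94 rows, the 41-row hold-out is used for the internal evaluation, and the 15 rows in `data_test` are never seen by `setup`.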