1
00:00:00,570 --> 00:00:06,360
Hi, welcome to chapter 6.1, which is basically a machine learning crash course overview.
2
00:00:06,390 --> 00:00:06,690
All right.
3
00:00:06,690 --> 00:00:09,980
So let's get started on this. So what is machine learning? Now,
4
00:00:10,050 --> 00:00:14,910
machine learning has been synonymous with artificial intelligence because basically it's a field of study
5
00:00:15,030 --> 00:00:21,620
that studies how algorithms or software actually learn from data.
6
00:00:21,620 --> 00:00:27,920
So basically, as I said, it's a subfield of artificial intelligence that uses statistical techniques
7
00:00:27,920 --> 00:00:33,050
to give computers the ability to learn from data without being explicitly programmed, and explicitly
8
00:00:33,050 --> 00:00:39,440
means like "if this is this, then do that", basically a hard lookup table of criteria. Machine learning
9
00:00:39,440 --> 00:00:40,180
does not do that.
10
00:00:40,190 --> 00:00:46,890
It learns from the data, learns its own model of how it should answer, and basically over the last
11
00:00:46,890 --> 00:00:52,010
five to 10 years, maybe even 15 years or so, machine learning has exploded.
12
00:00:52,110 --> 00:00:58,860
Basically a number of master's and PhD programs all over the world have specializations in machine learning
13
00:00:58,860 --> 00:01:05,430
now, and this has mainly been brought about because processing power, and also that from GPUs,
14
00:01:05,910 --> 00:01:12,280
has basically caught up to the processing intensity required for machine learning.
15
00:01:13,830 --> 00:01:19,320
So there are four types of machine learning, with neural networks being basically one type, or rather
16
00:01:19,320 --> 00:01:20,010
a subtype,
17
00:01:20,190 --> 00:01:27,630
belonging to one of these four types. These four are basically supervised, unsupervised, self-supervised
18
00:01:27,680 --> 00:01:29,640
and reinforcement learning.
19
00:01:29,640 --> 00:01:36,350
And I'm going to talk a little bit about each one. So firstly, supervised learning. Now supervised learning
20
00:01:36,350 --> 00:01:39,660
is by far the most popular form of AI and ML used today,
21
00:01:39,800 --> 00:01:48,060
ML being machine learning, basically because it's relatively easy compared to other things to
22
00:01:48,070 --> 00:01:48,910
implement.
23
00:01:49,150 --> 00:01:56,230
All we need is a labelled dataset, and we feed this dataset into a machine learning algorithm and
24
00:01:56,230 --> 00:01:59,330
it develops a model to fit this data to some outputs.
25
00:01:59,380 --> 00:02:06,910
So basically an example here is, let's say we have 10,000 emails that are labeled spam and 10,000
26
00:02:06,910 --> 00:02:10,120
that are not spam, and we give this to a model.
27
00:02:10,120 --> 00:02:17,520
Basically we give it the text, the subject and the sender of each email, and now the
28
00:02:17,860 --> 00:02:23,910
machine learning algorithm is going to figure out what is spam based on that.
29
00:02:23,960 --> 00:02:31,870
So we have input data being an email, the model that we just trained, and it outputs whether it's spam or not.
30
00:02:31,930 --> 00:02:38,480
So that in a nutshell is supervised learning.
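The spam example above can be sketched in a few lines. This is only a toy illustration, not a real spam filter: the four emails are made up, and the word-counting model and the names `train` and `predict` are assumptions chosen for brevity.

```python
from collections import Counter

# Made-up labelled examples: (email text, label). A real dataset would be far larger.
labelled_emails = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts

def predict(model, text):
    """Label an email by which class has seen its words more often."""
    words = text.split()
    spam_score = sum(model["spam"][w] for w in words)
    ham_score = sum(model["not spam"][w] for w in words)
    return "spam" if spam_score > ham_score else "not spam"

model = train(labelled_emails)
print(predict(model, "claim your free money"))        # input email -> model -> label
print(predict(model, "agenda for the team meeting"))
```

The shape is the same as in the lecture: labelled data goes in, a model comes out, and the model maps a new email to "spam" or "not spam".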
31
00:02:38,620 --> 00:02:42,260
And here are some examples of supervised learning in computer vision.
32
00:02:42,640 --> 00:02:48,160
Basically it's used heavily in image classification, even object detection and segmentation.
33
00:02:48,220 --> 00:02:54,310
Basically all of these involve feeding some labeled data into our deep learning model and training it and
34
00:02:54,310 --> 00:03:00,420
getting a model that is accurate enough to take unseen data and classify it correctly.
35
00:03:01,650 --> 00:03:07,500
So what about unsupervised learning? Now unsupervised learning is concerned with finding interesting
36
00:03:07,500 --> 00:03:12,030
clusters in the data, and it does so without any help from data labeling.
37
00:03:12,030 --> 00:03:18,630
So you just feed some data into it and the unsupervised learning algorithm basically finds interesting
38
00:03:18,630 --> 00:03:21,240
patterns and clusters in the information.
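As a rough sketch of what "finding clusters without labels" can look like, here is a minimal k-means on plain numbers. K-means is just one clustering method, picked here as an assumption; the lecture does not name a specific algorithm, and the data and starting centers are made up.

```python
def kmeans_1d(values, centers, steps=10):
    """Plain k-means on numbers: assign each value to its nearest center, then move centers."""
    for _ in range(steps):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Made-up unlabelled measurements; nothing tells the algorithm which group a value belongs to.
data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers, clusters = kmeans_1d(data, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

No labels are passed in anywhere; the two groups come purely from how the values sit relative to each other.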
39
00:03:24,510 --> 00:03:29,520
It is actually very important in data analytics when you're trying to understand vast amounts
40
00:03:29,520 --> 00:03:31,320
of data.
41
00:03:31,410 --> 00:03:37,350
Basically when you have huge datasets with a huge number of columns and rows and different information,
42
00:03:37,770 --> 00:03:43,430
using unsupervised learning can help you understand very quickly what is important in your data.
43
00:03:44,130 --> 00:03:46,720
This is the best example of it here.
44
00:03:47,190 --> 00:03:48,970
Let's go back to it.
45
00:03:48,980 --> 00:03:54,010
So now imagine we have basically a mix of food items here.
46
00:03:54,770 --> 00:04:00,500
And we take these, just as pictures here, and we give them to an unsupervised machine learning algorithm,
47
00:04:01,070 --> 00:04:05,270
and it's going to actually pick out clusters of what's interesting, and I'm willing to bet the clusters
48
00:04:05,270 --> 00:04:13,150
it finds are going to be food groups such as meats. That is the interesting pattern unsupervised
49
00:04:13,220 --> 00:04:17,190
machine learning algorithms help us pick out.
50
00:04:17,190 --> 00:04:19,460
So what about self-supervised learning?
51
00:04:19,870 --> 00:04:23,880
Now self-supervised learning is the same concept as supervised learning.
52
00:04:23,920 --> 00:04:31,650
However, the data is not labeled by humans, which is pretty interesting. So how are the labels generated?
53
00:04:31,690 --> 00:04:34,100
Basically it's done using heuristic algorithms.
54
00:04:34,180 --> 00:04:40,030
A good example of that is basically trying to predict the next frame in a video
55
00:04:40,240 --> 00:04:42,130
given the previous frames.
56
00:04:42,130 --> 00:04:44,980
That's a very good example of self-supervised learning.
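To make the next-frame idea concrete, here is a small sketch of how labels can come from the data itself. The function name `make_next_frame_pairs`, the `context` parameter and the stand-in "video" are all hypothetical, chosen only for illustration.

```python
def make_next_frame_pairs(frames, context=2):
    """Turn an unlabelled sequence into (previous frames, next frame) training pairs."""
    pairs = []
    for i in range(context, len(frames)):
        pairs.append((frames[i - context:i], frames[i]))
    return pairs

# Stand-in "video": each frame is just a number here; real frames would be image arrays.
video = [10, 11, 12, 13, 14]
for inputs, label in make_next_frame_pairs(video):
    print(inputs, "->", label)
```

No human wrote any of these labels: each label is simply the frame that follows, so the resulting pairs can be fed to an ordinary supervised model.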
57
00:04:44,980 --> 00:04:49,660
We're not going to deal with any self-supervised or unsupervised learning in this class, but it's good
58
00:04:49,660 --> 00:04:50,070
to know
59
00:04:50,140 --> 00:04:54,570
if you want to have an overview of machine learning and what they're about.
60
00:04:54,610 --> 00:04:59,890
And lastly we have reinforcement learning. Now reinforcement learning is potentially very interesting.
61
00:04:59,960 --> 00:05:05,190
However, it's still in its infancy and it's still a bit tricky to actually do.
62
00:05:05,200 --> 00:05:11,500
I actually did a couple of classes on this at my university in Edinburgh and it wasn't that
63
00:05:11,500 --> 00:05:11,790
fun.
64
00:05:11,800 --> 00:05:17,220
It was actually very challenging, but once I got it working it was actually quite fun, quite cool.
65
00:05:18,980 --> 00:05:20,730
So the concept is pretty simple.
66
00:05:21,050 --> 00:05:23,280
However, it's not that simple to implement.
67
00:05:23,660 --> 00:05:30,850
But we basically teach the algorithm something by giving it bad examples or penalties against something.
68
00:05:31,060 --> 00:05:32,800
So it's sort of like learning to play games.
69
00:05:32,840 --> 00:05:35,370
It's a very good example of reinforcement learning.
70
00:05:36,050 --> 00:05:40,880
You're basically trying different things and getting punished or dying or losing points for something
71
00:05:41,420 --> 00:05:45,500
until you come up with a strategy where you're basically minimizing your loss.
72
00:05:45,500 --> 00:05:49,530
That is what reinforcement learning technically is.
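The trial-and-penalty loop described above can be sketched with a very stripped-down example. This is a bandit-style toy, much simpler than full reinforcement learning (there are no states or delayed rewards), and the moves, penalties and function name are hypothetical.

```python
import random

# Hypothetical game moves and the points each one costs (the "punishment").
PENALTY = {"run_into_wall": -10, "jump": -1, "duck": 0}

def learn_best_move(trials=200, explore=0.1, seed=0):
    rng = random.Random(seed)
    totals = {move: 0.0 for move in PENALTY}
    counts = {move: 0 for move in PENALTY}

    def value(move):
        # Average penalty observed so far for this move.
        return totals[move] / counts[move]

    def play(move):
        totals[move] += PENALTY[move]
        counts[move] += 1

    # Try every move once so each one has an estimate.
    for move in PENALTY:
        play(move)
    for _ in range(trials):
        if rng.random() < explore:
            move = rng.choice(list(PENALTY))   # explore: try something at random
        else:
            move = max(PENALTY, key=value)     # exploit: best move found so far
        play(move)
    return max(PENALTY, key=value)

print(learn_best_move())
```

The agent is never told which move is right; it only sees the points it loses and settles on the move that loses the least.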
73
00:05:49,540 --> 00:05:55,810
So in machine learning, in supervised machine learning, there is a basic process which
74
00:05:55,810 --> 00:06:02,320
you follow in basically every use of every algorithm, whether it be deep learning, convolutional neural
75
00:06:02,350 --> 00:06:04,080
nets, SVMs.
76
00:06:04,120 --> 00:06:04,910
Blah blah blah.
77
00:06:05,110 --> 00:06:06,650
They all follow this pattern.
78
00:06:06,700 --> 00:06:12,070
So in step one you obtain a labeled dataset.
79
00:06:12,170 --> 00:06:13,970
Step two is to split the dataset,
80
00:06:13,970 --> 00:06:20,840
and this is very important, into a training portion and a validation or test portion. Note there is technically
81
00:06:20,840 --> 00:06:24,400
a little difference between the validation and test portions.
82
00:06:24,590 --> 00:06:30,620
However, for all intents and purposes, which is a phrase I shouldn't be using but whatever,
83
00:06:31,050 --> 00:06:36,430
basically the validation and test portion is the unseen data.
84
00:06:36,430 --> 00:06:43,130
Your model never sees this data. The model only sees the training data, and we test performance on the validation
85
00:06:43,160 --> 00:06:45,700
or test portion, the test datasets.
86
00:06:46,240 --> 00:06:47,790
So then in step three,
87
00:06:47,990 --> 00:06:51,380
we take this training dataset that we split from the original
88
00:06:51,860 --> 00:06:54,280
and we feed it to our model.
89
00:06:54,370 --> 00:06:59,990
So the model takes this in, the inputs and labels, and basically learns, tries to figure out patterns:
90
00:07:00,320 --> 00:07:01,270
how do we predict this?
91
00:07:01,280 --> 00:07:07,670
How do we know what this is? After some time it develops a fully trained model. And step four:
92
00:07:07,670 --> 00:07:12,700
basically we run this model on our test or validation dataset to see how effective it is.
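The four steps can be sketched end to end as follows. Everything here is a toy under stated assumptions: the dataset is made up, the "model" is a single threshold rather than a real learning algorithm, and a real split would normally shuffle the data first.

```python
# Step 1: obtain a labelled dataset (made-up numbers; label is 1 for "large", 0 for "small").
dataset = [(1, 0), (2, 0), (3, 0), (4, 0), (7, 1), (8, 1), (9, 1), (10, 1),
           (2.5, 0), (8.5, 1)]

# Step 2: split into a training portion and an unseen test portion.
# (No shuffling here, to keep the example easy to follow.)
train_set, test_set = dataset[:8], dataset[8:]

# Step 3: train. This toy "model" is a single threshold halfway between the class means.
def train(examples):
    small = [x for x, y in examples if y == 0]
    large = [x for x, y in examples if y == 1]
    return (sum(small) / len(small) + sum(large) / len(large)) / 2

def predict(threshold, x):
    return 1 if x > threshold else 0

threshold = train(train_set)

# Step 4: evaluate on data the model never saw during training.
correct = sum(predict(threshold, x) == y for x, y in test_set)
accuracy = correct / len(test_set)
print(threshold, accuracy)
```

The key point is that `test_set` is never passed to `train`, so the accuracy reflects how the model does on unseen data.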
93
00:07:14,380 --> 00:07:19,710
So here's some machine learning terminology that you're probably going to hear in this course. First
94
00:07:19,720 --> 00:07:27,280
is target. Basically the target data refers to the ground truth labels. Technically in programming
95
00:07:27,280 --> 00:07:27,780
languages
96
00:07:27,790 --> 00:07:37,010
you'll see that referred to as the y, or in mathematics the y labels, x being the training dataset.
97
00:07:37,030 --> 00:07:37,990
Sorry about that.
98
00:07:38,620 --> 00:07:45,250
And prediction basically being what our model predicted from some input data. Classes will be basically
99
00:07:45,250 --> 00:07:47,170
the categories of your data.
100
00:07:47,400 --> 00:07:53,770
So if you were talking about the handwritten digit MNIST dataset, there were 10 classes; they would
101
00:07:53,770 --> 00:08:02,810
be 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Regression, now: when you're doing classification we are outputting basically
102
00:08:02,880 --> 00:08:05,560
"this image belongs to this class". In regression,
103
00:08:05,610 --> 00:08:09,980
we're outputting basically a continuous value, a number.
104
00:08:10,290 --> 00:08:15,450
So let's say we're taking some inputs and trying to predict someone's height or weight; that would be a
105
00:08:15,450 --> 00:08:16,820
regression model.
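A small sketch of the difference in outputs, assuming made-up data (age in years against height in cm) and a simple least-squares line as the regression model. The 110 cm cutoff used for the class label is a hypothetical stand-in for a classifier, not something from the lecture.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Made-up toy data: x is the input (age), y is the target (height).
ages = [2, 4, 6, 8]
heights = [86, 100, 114, 128]

slope, intercept = fit_line(ages, heights)
predicted_height = slope * 5 + intercept                         # regression: a continuous number
predicted_class = "tall" if predicted_height > 110 else "short"  # classification: a category
print(predicted_height, predicted_class)
```

The regression output can be any number on a continuous scale, while the classification output is always one of a fixed set of classes.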
106
00:08:17,340 --> 00:08:21,290
And as I mentioned before, the validation and test datasets can be different.
107
00:08:21,300 --> 00:08:25,950
But in reality they're both unseen data that we test our trained model on.