1 00:00:00,570 --> 00:00:06,360 Hi, welcome to chapter 6.1, which is basically a machine learning crash course overview. 2 00:00:06,390 --> 00:00:06,690 All right. 3 00:00:06,690 --> 00:00:09,980 So let's get started. What is machine learning? 4 00:00:10,050 --> 00:00:14,910 Machine learning has become almost synonymous with artificial intelligence because basically it's a field of study 5 00:00:15,030 --> 00:00:21,620 that studies how algorithms or software actually learn from data. 6 00:00:21,620 --> 00:00:27,920 So basically, as I said, it's a subfield of artificial intelligence that uses statistical techniques 7 00:00:27,920 --> 00:00:33,050 to give computers the ability to learn from data without being explicitly programmed. And explicitly 8 00:00:33,050 --> 00:00:39,440 means like "if this, then that": basically a hard-coded lookup table of criteria. Machine learning 9 00:00:39,440 --> 00:00:40,180 does not do that. 10 00:00:40,190 --> 00:00:46,890 It learns from the data; it learns its own model of how it should answer. And basically over the last 11 00:00:46,890 --> 00:00:52,010 five to 10 years, maybe even 15 years or so, machine learning has exploded. 12 00:00:52,110 --> 00:00:58,860 Basically a number of master's and PhD programs all over the world have specializations in machine learning 13 00:00:58,860 --> 00:01:05,430 now, and this has mainly been brought about because processing power, including from GPUs, 14 00:01:05,910 --> 00:01:12,280 has basically caught up to the processing intensity required for machine learning. 15 00:01:13,830 --> 00:01:19,320 So there are four types of machine learning, with neural networks being basically a subtype belonging 16 00:01:19,320 --> 00:01:20,010 to one of them. 17 00:01:20,190 --> 00:01:27,630 These four types are basically supervised, unsupervised, self-supervised, 18 00:01:27,680 --> 00:01:29,640 and reinforcement learning.
19 00:01:29,640 --> 00:01:36,350 And I'm going to talk a little bit about each one. So firstly, supervised learning. Now, supervised learning 20 00:01:36,350 --> 00:01:39,660 is by far the most popular form of AI and ML used today, 21 00:01:39,800 --> 00:01:48,060 ML being machine learning, basically because it's relatively easy, compared to the other types, to 22 00:01:48,070 --> 00:01:48,910 implement. 23 00:01:49,150 --> 00:01:56,230 All we need is a labeled dataset, and we feed this dataset into a machine learning algorithm and 24 00:01:56,230 --> 00:01:59,330 it develops a model to fit this data to some outputs. 25 00:01:59,380 --> 00:02:06,910 So basically, an example here is: let's say we have 10,000 emails that are labeled spam and 10,000 26 00:02:06,910 --> 00:02:10,120 that are not spam, and we give this to a model. 27 00:02:10,120 --> 00:02:17,520 Basically we give it the text, the subject, and the sender of the email, and 28 00:02:17,860 --> 00:02:23,910 the machine learning algorithm is now going to figure out what is spam based on that. 29 00:02:23,960 --> 00:02:31,870 So we have the input data, being an email, going into the model that we just trained, and it outputs whether it's spam or not. 30 00:02:31,930 --> 00:02:38,480 So that, in a nutshell, is supervised learning. 31 00:02:38,620 --> 00:02:42,260 And here are some examples of supervised learning in computer vision. 32 00:02:42,640 --> 00:02:48,160 Basically it's used heavily in image classification, and even object detection and segmentation. 33 00:02:48,220 --> 00:02:54,310 Basically all of these involve feeding some labeled data into our deep learning model, training it, and 34 00:02:54,310 --> 00:03:00,420 getting a model that is accurate enough to take unseen data and classify it correctly.
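As a rough sketch of the spam example above: the snippet below assumes a tiny made-up labeled dataset and a hypothetical word-counting model (the names `LABELED_EMAILS`, `train`, and `predict` are illustrative, and this is not the method used in the course). It only shows the supervised pattern of labeled data in, trained model out, prediction on a new email.

```python
from collections import Counter

# Tiny, made-up labeled dataset: (email text, label). Purely illustrative.
LABELED_EMAILS = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("cheap pills free offer", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch with the team tomorrow", "not spam"),
    ("project report attached", "not spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(model, text):
    """Pick the label whose training vocabulary overlaps more with the email."""
    words = text.lower().split()
    spam_score = sum(model["spam"][w] for w in words)
    ham_score = sum(model["not spam"][w] for w in words)
    return "spam" if spam_score > ham_score else "not spam"


model = train(LABELED_EMAILS)
print(predict(model, "claim your free money"))        # spam
print(predict(model, "agenda for the team meeting"))  # not spam
```

A real spam filter would use far more data and a proper learning algorithm; the point here is only that the labels drive what the model learns.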
35 00:03:01,650 --> 00:03:07,500 So what about unsupervised learning? Now, unsupervised learning is concerned with finding interesting 36 00:03:07,500 --> 00:03:12,030 clusters in the data, and it does so without any help from data labeling. 37 00:03:12,030 --> 00:03:18,630 So you just feed some data into it and the unsupervised learning algorithm basically finds interesting 38 00:03:18,630 --> 00:03:21,240 patterns and clusters in the information. 39 00:03:24,510 --> 00:03:29,520 It is actually very important in data analytics when you're trying to understand vast amounts 40 00:03:29,520 --> 00:03:31,320 of data. 41 00:03:31,410 --> 00:03:37,350 Basically, when you have huge datasets with a huge number of columns and rows and different information, 42 00:03:37,770 --> 00:03:43,430 using unsupervised learning can help you understand very quickly what is important in your data. 43 00:03:44,130 --> 00:03:46,720 This is the best example of it here. 44 00:03:47,190 --> 00:03:48,970 Go back to it. 45 00:03:48,980 --> 00:03:54,010 So now imagine we have basically a mix of food items here. 46 00:03:54,770 --> 00:04:00,500 And let's say we take these pictures here and give them to an unsupervised machine learning algorithm, 47 00:04:01,070 --> 00:04:05,270 and it's going to actually pick out clusters that are interesting, and I'm willing to bet one cluster 48 00:04:05,270 --> 00:04:13,150 is going to be the first three, grouped together maybe because they're meats. That is the kind of interesting pattern unsupervised 49 00:04:13,220 --> 00:04:17,190 machine learning algorithms help us pick out. 50 00:04:17,190 --> 00:04:19,460 So what about self-supervised learning? 51 00:04:19,870 --> 00:04:23,880 Now, self-supervised learning is the same concept as supervised learning. 52 00:04:23,920 --> 00:04:31,650 However, the data is not labeled by humans, which is pretty interesting. So how are the labels generated?
53 00:04:31,690 --> 00:04:34,100 Basically it's done using heuristic algorithms. 54 00:04:34,180 --> 00:04:40,030 A good example of that is basically trying to predict the next frame in a video 55 00:04:40,240 --> 00:04:42,130 given the previous frames. 56 00:04:42,130 --> 00:04:44,980 That's a very good example of self-supervised learning. 57 00:04:44,980 --> 00:04:49,660 We're not going to deal with any self-supervised or unsupervised learning in this class, but it's good 58 00:04:49,660 --> 00:04:50,070 to know, 59 00:04:50,140 --> 00:04:54,570 if you want to have an overview of machine learning, what they're about. 60 00:04:54,610 --> 00:04:59,890 And lastly we have reinforcement learning. Now, reinforcement learning is potentially very interesting. 61 00:04:59,960 --> 00:05:05,190 However, it's still in its infancy and it's still a bit tricky to actually do. 62 00:05:05,200 --> 00:05:11,500 I actually did a couple of classes on this at my university in Edinburgh and it wasn't that 63 00:05:11,500 --> 00:05:11,790 fun. 64 00:05:11,800 --> 00:05:17,220 It was actually very challenging, but once I got it working it was actually quite fun, quite cool. 65 00:05:18,980 --> 00:05:20,730 So the concept is pretty simple. 66 00:05:21,050 --> 00:05:23,280 However, it's not that simple to implement. 67 00:05:23,660 --> 00:05:30,850 We basically teach the algorithm something by giving it bad examples, or penalties, for something it does. 68 00:05:31,060 --> 00:05:32,800 So it's sort of like learning to play games. 69 00:05:32,840 --> 00:05:35,370 It's a very good example of reinforcement learning. 70 00:05:36,050 --> 00:05:40,880 You're basically trying different things and getting punished, or dying, or losing points for something, 71 00:05:41,420 --> 00:05:45,500 until you come up with a strategy where you're basically minimizing your loss. 72 00:05:45,500 --> 00:05:49,530 That is what reinforcement learning technically is.
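To make the "try things, get penalized, settle on a strategy that minimizes loss" idea concrete, here is a heavily simplified sketch. Everything in it is an assumption for illustration: the toy game, the `PENALTIES` table, and the `play` and `learn` helpers are hypothetical, and real reinforcement learning also has to deal with states, delayed rewards, and randomness.

```python
# Hypothetical toy "game": each action always costs a fixed number of points.
PENALTIES = {"jump": 5.0, "duck": 1.0, "run": 3.0}


def play(action):
    """The environment: return the penalty (points lost) for an action."""
    return PENALTIES[action]


def learn(actions, rounds=30, explore_every=5):
    """Trial and error: mostly repeat the least-penalized action so far,
    but periodically try the other actions again."""
    total = {a: 0.0 for a in actions}
    tries = {a: 0 for a in actions}
    for step in range(rounds):
        untried = [a for a in actions if tries[a] == 0]
        if untried:
            action = untried[0]  # try every action at least once
        elif step % explore_every == 0:
            action = actions[(step // explore_every) % len(actions)]  # explore
        else:
            # exploit: lowest average penalty observed so far
            action = min(actions, key=lambda a: total[a] / tries[a])
        total[action] += play(action)
        tries[action] += 1
    # The learned "strategy": the action with the lowest average penalty.
    return min(actions, key=lambda a: total[a] / tries[a])


best_action = learn(list(PENALTIES))
print(best_action)  # duck
```

The agent is never told which action is best; it only sees the penalties it receives and adjusts its choice from them.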
73 00:05:49,540 --> 00:05:55,810 So in machine learning, and mainly in supervised machine learning, there is a basic process which 74 00:05:55,810 --> 00:06:02,320 you follow in basically every use of every algorithm, whether it be deep learning, convolutional neural 75 00:06:02,350 --> 00:06:04,080 nets, SVMs, 76 00:06:04,120 --> 00:06:04,910 blah blah blah. 77 00:06:05,110 --> 00:06:06,650 They all follow this pattern. 78 00:06:06,700 --> 00:06:12,070 So in step one you obtain a labeled dataset. 79 00:06:12,170 --> 00:06:13,970 Step two is to split the dataset, 80 00:06:13,970 --> 00:06:20,840 and this is very important, into a training portion and a validation or test portion. Note that there is technically 81 00:06:20,840 --> 00:06:24,400 a little difference between the validation and test portion. 82 00:06:24,590 --> 00:06:30,620 However, for all intents and purposes, which is a phrase I shouldn't be using, but whatever, 83 00:06:31,050 --> 00:06:36,430 the validation and test portion is basically the unseen data. 84 00:06:36,430 --> 00:06:43,130 Your model never sees this data. The model only sees the training data, and we test performance on the validation 85 00:06:43,160 --> 00:06:45,700 or test portion, the test dataset. 86 00:06:46,240 --> 00:06:47,790 So then, in step three, 87 00:06:47,990 --> 00:06:51,380 we take this training dataset that we split from the original 88 00:06:51,860 --> 00:06:54,280 and we feed it to our model. 89 00:06:54,370 --> 00:06:59,990 So the model takes this data, the inputs and the labels, and basically learns. It tries to figure out patterns: 90 00:07:00,320 --> 00:07:01,270 how do we predict this? 91 00:07:01,280 --> 00:07:07,670 How do we know what this is? After some time it develops a fully trained model. And in step four, 92 00:07:07,670 --> 00:07:12,700 basically we run this model on our test or validation dataset to see how effective it is.
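The four steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the dataset is made up, the 80/20 split ratio is just a common choice, and the "model" is a hypothetical nearest-class-mean rule standing in for a real learning algorithm. The helper names (`split_dataset`, `train`, `predict`, `accuracy`) are illustrative, not from the course.

```python
import random


def split_dataset(examples, test_fraction=0.2, seed=0):
    """Step 2: shuffle a copy of the labeled data, then split it."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def train(train_set):
    """Step 3: 'train' a toy model, here the mean input value per class."""
    sums, counts = {}, {}
    for x, y in train_set:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(model, x):
    """Predict the class whose mean is closest to x."""
    return min(model, key=lambda y: abs(x - model[y]))


def accuracy(model, dataset):
    """Step 4: fraction of examples the model labels correctly."""
    correct = sum(1 for x, y in dataset if predict(model, x) == y)
    return correct / len(dataset)


# Step 1: a made-up labeled dataset; small numbers are class 0, large are class 1.
dataset = [(float(x), 0 if x < 10 else 1) for x in range(20)]

train_set, test_set = split_dataset(dataset)  # step 2
model = train(train_set)                      # step 3: sees training data only
print(f"held-out accuracy: {accuracy(model, test_set):.2f}")  # step 4
```

The key design point is that `test_set` is never passed to `train`, so the reported accuracy reflects data the model has not seen.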
93 00:07:14,380 --> 00:07:19,710 So here's some machine learning terminology that you're probably going to hear in this course. 94 00:07:19,720 --> 00:07:27,280 Target basically refers to the ground truth labels. Technically, in programming 95 00:07:27,280 --> 00:07:27,780 languages 96 00:07:27,790 --> 00:07:37,010 you'll see that referred to as y, or in mathematics, the y labels, X being the training dataset. 97 00:07:37,030 --> 00:07:37,990 Sorry about that. 98 00:07:38,620 --> 00:07:45,250 And prediction is basically what our model predicted from some input data. Classes will be basically 99 00:07:45,250 --> 00:07:47,170 the categories of your data. 100 00:07:47,400 --> 00:07:53,770 So if we were talking about the handwritten digit MNIST dataset, there were 10 classes. They would 101 00:07:53,770 --> 00:08:02,810 be 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. As for regression: when we're doing classification, we are outputting basically 102 00:08:02,880 --> 00:08:05,560 "this image belongs to this class". In regression, 103 00:08:05,610 --> 00:08:09,980 we're outputting basically a continuous value, a number. 104 00:08:10,290 --> 00:08:15,450 So let's say we're taking some inputs and trying to predict someone's height or weight: that would be a 105 00:08:15,450 --> 00:08:16,820 regression model. 106 00:08:17,340 --> 00:08:21,290 And as I mentioned before, validation and test, yes, the two can be different, 107 00:08:21,300 --> 00:08:25,950 but generally they refer to unseen data that we test our trained model on.
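A small sketch may help separate the two kinds of output just described. It uses the `X` / `y` naming mentioned above; all numbers, the `fit_line` helper, the `classify` rule, and its hand-picked threshold are assumptions for illustration only. A real classifier would learn its decision rule from labeled data rather than use a fixed threshold.

```python
def fit_line(X, y):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(X)
    mean_x, mean_y = sum(X) / n, sum(y) / n
    covariance = sum((x - mean_x) * (t - mean_y) for x, t in zip(X, y))
    variance = sum((x - mean_x) ** 2 for x in X)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Regression: the target y is a continuous number.
X_reg = [1.0, 2.0, 3.0, 4.0]   # inputs
y_reg = [3.0, 5.0, 7.0, 9.0]   # targets (made up, exactly y = 2x + 1)
slope, intercept = fit_line(X_reg, y_reg)
regression_prediction = slope * 5.0 + intercept
print(regression_prediction)   # 11.0, a continuous value

# Classification: the prediction is one of a fixed set of classes.
CLASSES = ["short", "tall"]


def classify(height_cm, threshold=170.0):
    """Toy rule with a hand-picked threshold, shown only for the output type."""
    return CLASSES[1] if height_cm >= threshold else CLASSES[0]


print(classify(182.0))         # tall, a class label
```

For MNIST the same idea applies with ten classes, 0 through 9, instead of two.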