1 00:00:00,920 --> 00:00:06,160 OK, so before we actually dive into the notebook, I'm just going to talk a bit about loading our data.
2 00:00:06,980 --> 00:00:10,130 So Keras has some built-in data loaders that are quite easy to use.
3 00:00:10,130 --> 00:00:15,650 However, we still have to do some manipulation to our data afterward, and we'll get into that shortly.
4 00:00:15,650 --> 00:00:20,510 But let me just show you quickly what this is: from keras.datasets,
5 00:00:20,510 --> 00:00:22,040 we import this.
6 00:00:22,100 --> 00:00:28,550 There are a few datasets Keras can automatically import; MNIST is one of them and CIFAR-10 is the
7 00:00:28,550 --> 00:00:29,170 other.
8 00:00:29,510 --> 00:00:35,510 So this is how the dataset from the load function comes out; it comes out in this form, and we can
9 00:00:35,510 --> 00:00:38,250 give these variables any names we want.
10 00:00:38,270 --> 00:00:40,000 However, this is a standard naming convention
11 00:00:40,010 --> 00:00:40,890 I like to use.
12 00:00:40,940 --> 00:00:49,100 I realize that François Chollet actually uses these same variable names as well, and in a lot of tutorials
13 00:00:49,140 --> 00:00:56,600 you'll sometimes see slightly different naming conventions, but generally x_train and x_test are basically
14 00:00:56,610 --> 00:01:01,040 the training data and the test data, and y_train
15 00:01:01,090 --> 00:01:03,020 and y_test are the class labels.
16 00:01:03,050 --> 00:01:05,500 So we have y_train tied to x_train.
17 00:01:05,510 --> 00:01:08,740 This is going to be the same length, so if this is, like, 60,000,
18 00:01:08,750 --> 00:01:10,000 this will be 60,000.
19 00:01:10,280 --> 00:01:14,090 And if this is 10,000, this will be 10,000: 10,000 records.
20 00:01:14,090 --> 00:01:18,230 I mean, this is obviously an image here and an image here,
21 00:01:18,470 --> 00:01:20,580 so it's going to be a slightly different shape.
22 00:01:20,900 --> 00:01:23,800 So we can actually print x_train.shape.
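The loading step described above can be sketched roughly as follows. This is a minimal sketch, assuming the `keras` package is installed and the machine can download the data on first use. The wrapper name `load_mnist` is hypothetical; it exists only so that nothing is imported or downloaded until the function is actually called.

```python
def load_mnist():
    """Return ((x_train, y_train), (x_test, y_test)) from Keras' built-in MNIST loader.

    Assumption: the `keras` package is installed. The import is done lazily, so
    defining this function needs neither Keras nor a network connection.
    """
    from keras.datasets import mnist  # built-in dataset module mentioned in the video

    # load_data() returns two (images, labels) pairs: training first, then test.
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    return (x_train, y_train), (x_test, y_test)
```

Calling `(x_train, y_train), (x_test, y_test) = load_mnist()` and then `print(x_train.shape)` reproduces the check that the video runs next.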
23 00:01:23,810 --> 00:01:32,110 And it gives you this up here, which tells you that we have 60,000 images and each image has 28 by 28 dimensions.
24 00:01:32,150 --> 00:01:35,420 So let's do this now in an IPython notebook.
25 00:01:35,890 --> 00:01:36,190 All right.
26 00:01:36,190 --> 00:01:37,930 So this is the IPython notebook here.
27 00:01:38,050 --> 00:01:41,990 Eight point twenty eight point two one zero, in one notebook.
28 00:01:42,100 --> 00:01:45,200 It's in that same deep learning folder we've seen before.
29 00:01:45,580 --> 00:01:47,970 So this is the code I just showed you in the previous slide.
30 00:01:48,040 --> 00:01:53,550 So let's just go ahead and run this code by pressing Shift+Enter, and it takes a little while to load.
31 00:01:53,550 --> 00:01:57,540 If you don't have the data saved on your machine, you'll see it downloading; some download bars come up
32 00:01:57,540 --> 00:01:58,380 here below.
33 00:01:58,740 --> 00:02:00,570 So we just printed the shape for y_train.
34 00:02:00,630 --> 00:02:05,890 Let's take a look at x_train. x_train is what we saw in the previous slide:
35 00:02:06,000 --> 00:02:07,840 60,000 by 28 by 28.
36 00:02:08,090 --> 00:02:09,620 And let's see what y_train looks like.
37 00:02:09,620 --> 00:02:11,620 I think we just looked at it: 60,000.
38 00:02:11,960 --> 00:02:14,480 Let's see what x_test looks like.
39 00:02:14,480 --> 00:02:15,870 10,000 by 28 by 28.
40 00:02:15,890 --> 00:02:19,950 And this, we assume, is going to be 10,000. Good.
41 00:02:20,040 --> 00:02:21,990 So I just did it here.
42 00:02:22,070 --> 00:02:24,250 But additionally, we can also do it here.
43 00:02:24,620 --> 00:02:30,890 So this is step 2A: we examine the size and image dimensions. It's not required, but it's good practice
44 00:02:31,280 --> 00:02:33,860 just to check and make sure all your data is correct.
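The shape check walked through above could be written as a small helper. This is a sketch, not the notebook's own code: `check_split` is a hypothetical name, and the arrays below are zero-filled stand-ins with the shapes quoted in the video, so the example runs without downloading anything.

```python
import numpy as np


def check_split(images, labels, image_shape=(28, 28)):
    """Return the number of samples if images and labels are consistent.

    Raises ValueError when the counts differ or the images have an unexpected shape.
    """
    n_images, n_labels = images.shape[0], labels.shape[0]
    if n_images != n_labels:
        raise ValueError(f"{n_images} images but {n_labels} labels")
    if images.shape[1:] != tuple(image_shape):
        raise ValueError(
            f"expected images of shape {tuple(image_shape)}, got {images.shape[1:]}"
        )
    return n_images


# Stand-in arrays (all zeros, NOT real MNIST data) with the shapes quoted in the video.
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)
y_train = np.zeros((60000,), dtype=np.uint8)
x_test = np.zeros((10000, 28, 28), dtype=np.uint8)
y_test = np.zeros((10000,), dtype=np.uint8)

print("training samples:", check_split(x_train, y_train))
print("test samples:", check_split(x_test, y_test))
```

With real data, the same helper would be called on the arrays returned by the Keras loader. Raising an error rather than only printing makes a mismatch hard to miss.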
45 00:02:33,860 --> 00:02:39,560 So we know our data consists of 60,000 samples of training data and 10,000 of test data, and our labels are
46 00:02:39,560 --> 00:02:40,750 appropriately sized.
47 00:02:40,760 --> 00:02:45,710 So we actually should check this here; "appropriately sized" meaning that they're in the correct format
48 00:02:45,920 --> 00:02:47,720 that Keras requires.
49 00:02:48,080 --> 00:02:54,770 And as I mentioned before, our images are 28 by 28 and there's no third dimension, or fourth dimension if you want to
50 00:02:54,770 --> 00:02:57,600 consider this as being the first dimension here,
51 00:02:57,830 --> 00:03:02,340 this being the number of images stored here.
52 00:03:02,790 --> 00:03:09,090 So let's take a look at what happens when we run this code: it prints out this here, which is the shape
53 00:03:09,240 --> 00:03:10,580 we saw before,
54 00:03:10,800 --> 00:03:12,820 how many samples are in our training data,
55 00:03:13,110 --> 00:03:14,740 how many labels are in our training data,
56 00:03:14,820 --> 00:03:19,320 how many samples are in our test data, and samples and labels in our test data here as well.
57 00:03:19,320 --> 00:03:21,720 So these all match up. That's good.
58 00:03:21,720 --> 00:03:27,080 And the dimensions here are 28 by 28, 28 by 28, so everything is all good.
59 00:03:27,090 --> 00:03:30,930 You can take a look at this code and try to decipher it if you want to.
60 00:03:31,260 --> 00:03:32,580 It's quite basic and simple.
61 00:03:32,580 --> 00:03:37,720 Basically, these are all NumPy arrays when it's loaded in here.
62 00:03:38,600 --> 00:03:45,780 And basically we can use the .shape attribute, which is an extremely useful one, to just get the shape,
63 00:03:46,020 --> 00:03:48,680 basically the dimensions, of your data array.
64 00:03:49,200 --> 00:03:52,380 So now let's visualize some of this information.
65 00:03:52,400 --> 00:03:54,230 So I'm going to do this in two different ways.
66 00:03:54,240 --> 00:03:54,910 OK.
67 00:03:55,250 --> 00:04:00,500 We're going to do this with OpenCV, then we're going to do it with matplotlib.
68 00:04:00,720 --> 00:04:02,840 And you can use either one going forward.
69 00:04:03,000 --> 00:04:08,520 I tend to use matplotlib if I'm plotting multiple images in the IPython notebook, and OpenCV if
70 00:04:08,520 --> 00:04:12,500 I'm testing and wanting to display text on an image.
71 00:04:12,630 --> 00:04:15,000 So let's try this.
72 00:04:15,150 --> 00:04:17,610 There we go, it pops up in a window here.
73 00:04:17,890 --> 00:04:20,200 So we take a look at this: nine, two, nine again,
74 00:04:20,200 --> 00:04:23,430 three, two, zero, one.
75 00:04:23,560 --> 00:04:28,140 So we just brought up all these windows here and all these digits; these are random digits.
76 00:04:28,140 --> 00:04:34,980 We use this function, np.random.randint, and this gives a random number between 0 and the length of our
77 00:04:34,980 --> 00:04:37,150 training data. The length of our training data,
78 00:04:37,150 --> 00:04:39,260 if you remember, was 60,000.
79 00:04:39,280 --> 00:04:45,230 So this generates a number from zero to 60,000 and displays the image here using OpenCV functions.
80 00:04:46,430 --> 00:04:48,710 So let's now do the same with matplotlib.
81 00:04:49,010 --> 00:04:52,220 This is the code to basically plot in matplotlib.
82 00:04:52,430 --> 00:04:54,750 This is not actually the most efficient way to do it.
83 00:04:54,770 --> 00:04:59,480 The most efficient way would be doing it in a loop, but I just left it here for you so you get an
84 00:04:59,480 --> 00:05:02,430 idea of how we use subplots to plot it.
85 00:05:02,450 --> 00:05:03,560 So let's do it.
86 00:05:07,290 --> 00:05:08,110 This is good.
87 00:05:08,410 --> 00:05:12,880 Actually, it's not defined, and that's because I may have changed it before.
88 00:05:12,900 --> 00:05:14,650 But let's run it again.
89 00:05:14,660 --> 00:05:17,060 It's the x_train right here.
90 00:05:18,440 --> 00:05:19,700 There we go.
91 00:05:19,790 --> 00:05:22,910 So this plotted six images in a nice small grid here.
92 00:05:23,150 --> 00:05:26,360 So this is why I sort of use matplotlib to display images in the
93 00:05:26,360 --> 00:05:30,710 IPython notebook; I find it easier and nicer to work with.
94 00:05:30,980 --> 00:05:36,130 So that's how we basically import our dataset and visualize some data from the dataset.
95 00:05:36,150 --> 00:05:40,190 Next, we're going to prepare our dataset for training.
96 00:05:40,220 --> 00:05:41,760 So let's take a look at that shortly.
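The video notes that a loop would be the tidier way to build the image grid. One possible version is sketched below; it is not the notebook's code. `pick_random_indices` and `plot_samples` are hypothetical names, the 2-by-3 layout is just one way to show six images, and matplotlib is imported lazily so the index-picking part runs without it.

```python
import numpy as np


def pick_random_indices(n_samples, count=6, seed=None):
    """Pick `count` random indices in the range [0, n_samples)."""
    rng = np.random.default_rng(seed)
    # The upper bound is exclusive, so every result is a valid array index.
    return rng.integers(0, n_samples, size=count)


def plot_samples(images, indices, rows=2, cols=3):
    """Show images[idx] for each idx in a rows-by-cols grid (requires matplotlib)."""
    import matplotlib.pyplot as plt  # lazy import: only needed when plotting

    # squeeze=False always returns a 2-D array of axes, even for a 1x1 grid.
    fig, axes = plt.subplots(rows, cols, squeeze=False)
    for ax, idx in zip(axes.ravel(), indices):
        ax.imshow(images[idx], cmap="gray")
        ax.set_title(f"index {idx}")
        ax.axis("off")
    plt.show()


# Six random positions in a dataset of 60,000 images, as in the video.
indices = pick_random_indices(60000, count=6, seed=0)
print(indices)
```

With the training images loaded, `plot_samples(x_train, indices)` would draw the six digits in one figure. Passing a `seed` makes the selection repeatable, which the notebook's random draws are not.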