| 1 | |
| 00:00:00,920 --> 00:00:06,160 | |
| OK, so before we actually dive into the notebook, I'm just going to talk a bit about loading our data. | |
| 2 | |
| 00:00:06,980 --> 00:00:10,130 | |
| So Keras has some built-in data loaders that are quite easy to use. | |
| 3 | |
| 00:00:10,130 --> 00:00:15,650 | |
| However, we still have to do some manipulation to our data afterward, and we'll get into that shortly. | |
| 4 | |
| 00:00:15,650 --> 00:00:20,510 | |
| But let me just show you quickly what this is. From keras.datasets | |
| 5 | |
| 00:00:20,510 --> 00:00:22,040 | |
| We import this. | |
| 6 | |
| 00:00:22,100 --> 00:00:28,550 | |
| There are a few datasets Keras can automatically import. MNIST is one of them and CIFAR-10 is | |
| 7 | |
| 00:00:28,550 --> 00:00:29,170 | |
| another. | |
| 8 | |
| 00:00:29,510 --> 00:00:35,510 | |
| So this is how the dataset from the load function comes out, in this form, and we can | |
| 9 | |
| 00:00:35,510 --> 00:00:38,250 | |
| give these variables any names we want. | |
| 10 | |
| 00:00:38,270 --> 00:00:40,000 | |
| However, this is a standard naming convention | |
| 11 | |
| 00:00:40,010 --> 00:00:40,890 | |
| I like to use. | |
| 12 | |
| 00:00:40,940 --> 00:00:49,100 | |
| I realize that François Chollet actually uses these same variable names as well, and in a lot of tutorials | |
| 13 | |
| 00:00:49,140 --> 00:00:56,600 | |
| you'll sometimes see slightly different naming conventions, but generally x_train and x_test are basically | |
| 14 | |
| 00:00:56,610 --> 00:01:01,040 | |
| the training data and the test data, and y_train and | |
| 15 | |
| 00:01:01,090 --> 00:01:03,020 | |
| y_test are the class labels. | |
| 16 | |
| 00:01:03,050 --> 00:01:05,500 | |
| So we have y_train tied to x_train. | |
| 17 | |
| 00:01:05,510 --> 00:01:08,740 | |
| These are going to be the same length, so if this is, like, 60000, | |
| 18 | |
| 00:01:08,750 --> 00:01:10,000 | |
| this will be 60000. | |
| 19 | |
| 00:01:10,280 --> 00:01:14,090 | |
| And if this is 10000, this will be 10000. 10000 records, | |
| 20 | |
| 00:01:14,090 --> 00:01:18,230 | |
| I mean. This is obviously an image here and an image here, | |
| 21 | |
| 00:01:18,470 --> 00:01:20,580 | |
| so it's going to be a slightly different shape. | |
| 22 | |
| 00:01:20,900 --> 00:01:23,800 | |
| So we can actually print x_train.shape up here. | |
| 23 | |
| 00:01:23,810 --> 00:01:32,110 | |
| And it gives you this up here, so it tells you that we have 60000 images and each image has 28 by 28 dimensions. | |
| 24 | |
| 00:01:32,150 --> 00:01:35,420 | |
| So let's do this now in our IPython notebook. | |
| 25 | |
| 00:01:35,890 --> 00:01:36,190 | |
| All right. | |
| 26 | |
| 00:01:36,190 --> 00:01:37,930 | |
| So this is our IPython notebook here. | |
| 27 | |
| 00:01:38,050 --> 00:01:41,990 | |
| Eight point twenty eight point two one zero and one book. | |
| 28 | |
| 00:01:42,100 --> 00:01:45,200 | |
| It's in the same deep learning folder that we've seen before. | |
| 29 | |
| 00:01:45,580 --> 00:01:47,970 | |
| So this is the code I just showed you in the previous slide. | |
| 30 | |
| 00:01:48,040 --> 00:01:53,550 | |
| So let's just go ahead and run this code by pressing Shift+Enter, and it takes a little while to load. | |
| 31 | |
| 00:01:53,550 --> 00:01:57,540 | |
| If you don't have the data saved on your machine, you'll see it downloading; some download bars come up | |
| 32 | |
| 00:01:57,540 --> 00:01:58,380 | |
| here below. | |
| 33 | |
| 00:01:58,740 --> 00:02:00,570 | |
| So we just printed the shape for y_train. | |
| 34 | |
| 00:02:00,630 --> 00:02:05,890 | |
| Let's take a look at x_train. x_train is what we saw in the previous slide: | |
| 35 | |
| 00:02:06,000 --> 00:02:07,840 | |
| 60000 by 28 by 28. | |
| 36 | |
| 00:02:08,090 --> 00:02:09,620 | |
| And let's see what y_train looks like. | |
| 37 | |
| 00:02:09,620 --> 00:02:11,620 | |
| I think we just looked at it: 60000. | |
| 38 | |
| 00:02:11,960 --> 00:02:14,480 | |
| Let's see what x_test looks like. | |
| 39 | |
| 00:02:14,480 --> 00:02:15,870 | |
| 10000 by 28 by 28. | |
| 40 | |
| 00:02:15,890 --> 00:02:19,950 | |
| And this, we assume, is going to be 10000. Good. | |
| 41 | |
| 00:02:20,040 --> 00:02:21,990 | |
| So I just did it here. | |
| 42 | |
| 00:02:22,070 --> 00:02:24,250 | |
| But initially we can always do it here. | |
| 43 | |
| 00:02:24,620 --> 00:02:30,890 | |
| So in this step we examine the size and image dimensions. It's not required, but it's good practice | |
| 44 | |
| 00:02:31,280 --> 00:02:33,860 | |
| just to check and make sure all your data is correct. | |
| 45 | |
| 00:02:33,860 --> 00:02:39,560 | |
| So we know our data consists of 60000 samples of training data and 10000 of test data, and all labels are | |
| 46 | |
| 00:02:39,560 --> 00:02:40,750 | |
| appropriately sized. | |
| 47 | |
| 00:02:40,760 --> 00:02:45,710 | |
| So we actually should check this here. Appropriately sized means that they're in the correct format | |
| 48 | |
| 00:02:45,920 --> 00:02:47,720 | |
| that Keras requires. | |
| 49 | |
| 00:02:48,080 --> 00:02:54,770 | |
| And as I mentioned before, our images are 28 by 28 and there's no third dimension, or fourth dimension if you want to | |
| 50 | |
| 00:02:54,770 --> 00:02:57,600 | |
| consider this as being the first dimension here, | |
| 51 | |
| 00:02:57,830 --> 00:03:02,340 | |
| this meaning the number of images stored here. | |
| 52 | |
| 00:03:02,790 --> 00:03:09,090 | |
| So let's take a look at what happens when we run this code. It prints out this here, which is the shape | |
| 53 | |
| 00:03:09,240 --> 00:03:10,580 | |
| we saw before. | |
| 54 | |
| 00:03:10,800 --> 00:03:12,820 | |
| How many samples are in our training data. | |
| 55 | |
| 00:03:13,110 --> 00:03:14,740 | |
| How many labels are in our training data. | |
| 56 | |
| 00:03:14,820 --> 00:03:19,320 | |
| How many samples are in our test data, and the labels in our test data here as well. | |
| 57 | |
| 00:03:19,320 --> 00:03:21,720 | |
| So these all match up; that's good. | |
| 58 | |
| 00:03:21,720 --> 00:03:27,080 | |
| And the dimensions here are 28 by 28, 28 by 28, so everything is all good. | |
| 59 | |
| 00:03:27,090 --> 00:03:30,930 | |
| You can take a look at this code and try to decipher it if you want to. | |
| 60 | |
| 00:03:31,260 --> 00:03:32,580 | |
| It's quite basic and simple. | |
| 61 | |
| 00:03:32,580 --> 00:03:37,720 | |
| Basically, these are all NumPy arrays when they're loaded in here. | |
| 62 | |
| 00:03:38,600 --> 00:03:45,780 | |
| And basically we can use .shape, which is an extremely useful attribute, to just get the shape, | |
| 63 | |
| 00:03:46,020 --> 00:03:48,680 | |
| basically the dimensions, of your data array. | |
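The shape check described above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: `describe_split` is a hypothetical helper name, and the arrays are zero-filled stand-ins that only reproduce the shapes quoted in the video, so the example runs without downloading anything.

```python
import numpy as np


def describe_split(name, images, labels):
    """Summarize one split and confirm images and labels line up.

    `describe_split` is a hypothetical helper, not part of Keras.
    """
    if images.shape[0] != labels.shape[0]:
        raise ValueError(
            f"{name}: {images.shape[0]} images but {labels.shape[0]} labels"
        )
    return (
        f"{name}: {images.shape[0]} samples, "
        f"image size {images.shape[1]}x{images.shape[2]}, "
        f"labels shape {labels.shape}"
    )


# Stand-in arrays (assumption): all zeros, same shapes as reported in the video.
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)
y_train = np.zeros((60000,), dtype=np.uint8)
x_test = np.zeros((10000, 28, 28), dtype=np.uint8)
y_test = np.zeros((10000,), dtype=np.uint8)

print(describe_split("train", x_train, y_train))
print(describe_split("test", x_test, y_test))
```

In the notebook itself, the four arrays would instead come from `(x_train, y_train), (x_test, y_test) = mnist.load_data()` after `from keras.datasets import mnist`, which downloads the data on first use.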
| 64 | |
| 00:03:49,200 --> 00:03:52,380 | |
| So now let's visualize some of this information. | |
| 65 | |
| 00:03:52,400 --> 00:03:54,230 | |
| So I'm going to do this in two different ways. | |
| 66 | |
| 00:03:54,240 --> 00:03:54,910 | |
| OK. | |
| 67 | |
| 00:03:55,250 --> 00:04:00,500 | |
| We're going to do this with OpenCV, then we're going to do it with matplotlib. | |
| 68 | |
| 00:04:00,720 --> 00:04:02,840 | |
| And you can use either one going forward. | |
| 69 | |
| 00:04:03,000 --> 00:04:08,520 | |
| I tend to use matplotlib if I'm plotting multiple images in the IPython notebook, and OpenCV if | |
| 70 | |
| 00:04:08,520 --> 00:04:12,500 | |
| I'm testing and wanting to display text on an image. | |
| 71 | |
| 00:04:12,630 --> 00:04:15,000 | |
| So let's try this. | |
| 72 | |
| 00:04:15,150 --> 00:04:17,610 | |
| There we go, it pops up in a window here. | |
| 73 | |
| 00:04:17,890 --> 00:04:20,200 | |
| So we take a look at this: nine, two, nine again, | |
| 74 | |
| 00:04:20,200 --> 00:04:23,430 | |
| three, two, zero, one. | |
| 75 | |
| 00:04:23,560 --> 00:04:28,140 | |
| So we just brought up all these windows here and all these digits. These are random digits. | |
| 76 | |
| 00:04:28,140 --> 00:04:34,980 | |
| We use this random function, and this picks any random number between 0 and the length of our | |
| 77 | |
| 00:04:34,980 --> 00:04:37,150 | |
| training data. The length of our training data, | |
| 78 | |
| 00:04:37,150 --> 00:04:39,260 | |
| if you remember, was 60000. | |
| 79 | |
| 00:04:39,280 --> 00:04:45,230 | |
| So this generates a number from zero to 60000 and displays the image here using OpenCV functions. | |
| 80 | |
| 00:04:46,430 --> 00:04:48,710 | |
| So let's now do the same with matplotlib. | |
| 81 | |
| 00:04:49,010 --> 00:04:52,220 | |
| This is the code to basically plot in matplotlib. | |
| 82 | |
| 00:04:52,430 --> 00:04:54,750 | |
| This is not actually the most efficient way to do it. | |
| 83 | |
| 00:04:54,770 --> 00:04:59,480 | |
| The most efficient way would be doing it in a loop, but I just left it here for you so you get an | |
| 84 | |
| 00:04:59,480 --> 00:05:02,430 | |
| idea of how we use subplots to plot it. | |
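The loop version mentioned above might look something like this. It is only a sketch under stated assumptions: `plot_random_digits` and its parameters are hypothetical names, the images are random noise standing in for `x_train`, and the `Agg` backend line is there so the example runs without a display (drop it in a notebook).

```python
import numpy as np
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line inside a notebook
import matplotlib.pyplot as plt


def plot_random_digits(images, n_rows=2, n_cols=3, seed=None):
    """Plot randomly chosen images in an n_rows x n_cols grid using a loop.

    `plot_random_digits` is a hypothetical helper, not the notebook's code.
    """
    rng = np.random.default_rng(seed)
    # Random indices in [0, len(images)), one per grid cell.
    indices = rng.integers(0, len(images), size=n_rows * n_cols)
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(6, 4), squeeze=False)
    for ax, idx in zip(axes.flat, indices):
        ax.imshow(images[idx], cmap="gray")
        ax.set_title(f"index {idx}")
        ax.axis("off")
    fig.tight_layout()
    return fig, indices


# Stand-in data (assumption): 100 random 28x28 grayscale images instead of x_train.
images = np.random.default_rng(0).integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fig, indices = plot_random_digits(images, seed=1)
```

With the real data, passing `x_train` in place of `images` and calling `plt.show()` afterwards would give a grid of six digits like the one shown in the video.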
| 85 | |
| 00:05:02,450 --> 00:05:03,560 | |
| So let's do it. | |
| 86 | |
| 00:05:07,290 --> 00:05:08,110 | |
| This is good. | |
| 87 | |
| 00:05:08,410 --> 00:05:12,880 | |
| Actually, it's not defined, and that's because I may have changed it before. | |
| 88 | |
| 00:05:12,900 --> 00:05:14,650 | |
| But let's run it again. | |
| 89 | |
| 00:05:14,660 --> 00:05:17,060 | |
| It's x_train right here. | |
| 90 | |
| 00:05:18,440 --> 00:05:19,700 | |
| There we go. | |
| 91 | |
| 00:05:19,790 --> 00:05:22,910 | |
| So this plotted six images in a nice small grid here. | |
| 92 | |
| 00:05:23,150 --> 00:05:26,360 | |
| So this is why I sort of use matplotlib to display images in the | |
| 93 | |
| 00:05:26,360 --> 00:05:30,710 | |
| IPython notebook. I find it easier and nicer to work with. | |
| 94 | |
| 00:05:30,980 --> 00:05:36,130 | |
| So that's how we basically import our dataset and visualize some data from the dataset. | |
| 95 | |
| 00:05:36,150 --> 00:05:40,190 | |
| Next we're going to prepare our dataset for training. | |
| 96 | |
| 00:05:40,220 --> 00:05:41,760 | |
| So let's take a look at that shortly. | |