1
00:00:00,920 --> 00:00:06,160
OK, so before we actually dive into the notebook, I'm just going to talk a bit about loading our data.
2
00:00:06,980 --> 00:00:10,130
So Keras has some built-in data loaders that are quite easy to use.
3
00:00:10,130 --> 00:00:15,650
However, we still have to do some manipulation to our data afterward, and we'll get into that shortly.
4
00:00:15,650 --> 00:00:20,510
But let me just show you quickly what this is. From keras.datasets,
5
00:00:20,510 --> 00:00:22,040
we import this.
6
00:00:22,100 --> 00:00:28,550
There are a few datasets Keras can automatically import. MNIST is one of them and CIFAR-10 is
7
00:00:28,550 --> 00:00:29,170
another.
8
00:00:29,510 --> 00:00:35,510
So this is how the dataset comes out of the load function, in this form, and we can
9
00:00:35,510 --> 00:00:38,250
give these variables any names we want.
10
00:00:38,270 --> 00:00:40,000
However, this is a standard naming convention
11
00:00:40,010 --> 00:00:40,890
that I like to use.
12
00:00:40,940 --> 00:00:49,100
I realize that François Chollet actually uses these same variable names as well, and in a lot of tutorials
13
00:00:49,140 --> 00:00:56,600
you'll sometimes see slightly different naming conventions, but generally x_train and x_test are basically
14
00:00:56,610 --> 00:01:01,040
the training data and the test data, and y_train and
15
00:01:01,090 --> 00:01:03,020
y_test are the class labels.
16
00:01:03,050 --> 00:01:05,500
So we have y_train tied to x_train.
17
00:01:05,510 --> 00:01:08,740
These are going to be the same length, so if this is, like, 60,000,
18
00:01:08,750 --> 00:01:10,000
this will be 60,000.
19
00:01:10,280 --> 00:01:14,090
And if this is 10,000, this will be 10,000 records.
20
00:01:14,090 --> 00:01:18,230
I mean, this is obviously an image here and an image here,
21
00:01:18,470 --> 00:01:20,580
so it's going to be a slightly different shape.
22
00:01:20,900 --> 00:01:23,800
So we can actually print x_train.shape,
23
00:01:23,810 --> 00:01:32,110
and it gives you this up here, which tells you that we have 60,000 images and each image is 28 by 28 pixels.
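The loading step described on this slide can be sketched roughly as follows. This is a minimal sketch, assuming the `keras` package is installed and that the dataset can be downloaded on first use; the wrapper name `load_mnist_splits` is hypothetical and only there for illustration.

```python
def load_mnist_splits():
    """Load MNIST through Keras and return the four arrays.

    The import sits inside the function so that defining it does not
    require Keras; calling it does (and may trigger a download).
    """
    from keras.datasets import mnist

    # load_data() returns two (images, labels) tuples: training first, then test.
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    return x_train, y_train, x_test, y_test
```

Calling `x_train, y_train, x_test, y_test = load_mnist_splits()` and then `print(x_train.shape)` should report 60,000 images of 28 by 28 pixels, matching the slide.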
24
00:01:32,150 --> 00:01:35,420
So let's do this now in our IPython notebook.
25
00:01:35,890 --> 00:01:36,190
All right.
26
00:01:36,190 --> 00:01:37,930
So this is our IPython notebook here.
27
00:01:38,050 --> 00:01:41,990
Eight point twenty eight point two one zero and one book.
28
00:01:42,100 --> 00:01:45,200
It's in that same deep learning folder that you've seen before.
29
00:01:45,580 --> 00:01:47,970
So this is the code I just showed you in the previous slide.
30
00:01:48,040 --> 00:01:53,550
So let's just go ahead and run this code by pressing Shift+Enter, and it takes a little while to load.
31
00:01:53,550 --> 00:01:57,540
If you don't have the data saved on your machine, you'll see it downloading; some download bars come up
32
00:01:57,540 --> 00:01:58,380
here below.
33
00:01:58,740 --> 00:02:00,570
So we just printed the shape for y_train.
34
00:02:00,630 --> 00:02:05,890
Let's take a look at x_train. x_train is what we saw in the previous slide:
35
00:02:06,000 --> 00:02:07,840
60,000 by 28 by 28.
36
00:02:08,090 --> 00:02:09,620
And let's see what y_train looks like.
37
00:02:09,620 --> 00:02:11,620
I think we just looked at it: 60,000.
38
00:02:11,960 --> 00:02:14,480
Let's see what x_test looks like:
39
00:02:14,480 --> 00:02:15,870
10,000 by 28 by 28.
40
00:02:15,890 --> 00:02:19,950
And this one, we assume, is going to be 10,000. Good.
41
00:02:20,040 --> 00:02:21,990
So I just did it here.
42
00:02:22,070 --> 00:02:24,250
But initially we can always do it here.
43
00:02:24,620 --> 00:02:30,890
So in step two we examine the size and image dimensions. It's not required, but it's good practice
44
00:02:31,280 --> 00:02:33,860
just to check and make sure all your data is correct.
45
00:02:33,860 --> 00:02:39,560
So we know our data consists of 60,000 samples of training data and 10,000 of test data, and all labels are
46
00:02:39,560 --> 00:02:40,750
appropriately sized.
47
00:02:40,760 --> 00:02:45,710
So we actually should check this here. Appropriately sized means that they're in the correct format
48
00:02:45,920 --> 00:02:47,720
that Keras requires.
49
00:02:48,080 --> 00:02:54,770
And as I mentioned before, our images are 28 by 28, and there's no third dimension, or fourth dimension if you want to
50
00:02:54,770 --> 00:02:57,600
consider this as being the first dimension here,
51
00:02:57,830 --> 00:03:02,340
this being the number of images stored here.
52
00:03:02,790 --> 00:03:09,090
So let's take a look at what happens when we run this code. It prints out this here, which is the shape
53
00:03:09,240 --> 00:03:10,580
we saw before.
54
00:03:10,800 --> 00:03:12,820
How many samples are in our training data,
55
00:03:13,110 --> 00:03:14,740
how many labels are in our training data,
56
00:03:14,820 --> 00:03:19,320
and how many samples and labels are in our test data here as well.
57
00:03:19,320 --> 00:03:21,720
So these all match up. That's good.
58
00:03:21,720 --> 00:03:27,080
And the dimensions here are 28 by 28 and 28 by 28, so everything is all good.
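The sanity check walked through above can be sketched like this. It is only an illustration: the helper name `describe_split` is hypothetical, and the arrays are zero-filled stand-ins with MNIST's shapes (an assumption, so the sketch runs without downloading anything). In the notebook the real arrays come from the Keras load call.

```python
import numpy as np


def describe_split(name, images, labels):
    """Check that images and labels line up and return a short summary."""
    # One label per image: the first axis of both arrays must match.
    if images.shape[0] != labels.shape[0]:
        raise ValueError(
            f"{name}: {images.shape[0]} images but {labels.shape[0]} labels"
        )
    # Grayscale images carry no channel axis, so the array is 3-D overall.
    if images.ndim != 3:
        raise ValueError(
            f"{name}: expected (samples, height, width), got {images.shape}"
        )
    return {"samples": images.shape[0], "image_size": images.shape[1:]}


# Stand-in arrays with MNIST's shapes (assumption: real data comes from Keras).
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)
y_train = np.zeros((60000,), dtype=np.uint8)
x_test = np.zeros((10000, 28, 28), dtype=np.uint8)
y_test = np.zeros((10000,), dtype=np.uint8)

train_info = describe_split("train", x_train, y_train)
test_info = describe_split("test", x_test, y_test)
print(train_info)
print(test_info)
```

Raising an error on a mismatch, rather than only printing, means a bad download or a wrong slice stops the notebook before any training starts.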
59
00:03:27,090 --> 00:03:30,930
You can take a look at this code and try to decipher it if you want to.
60
00:03:31,260 --> 00:03:32,580
It's quite basic and simple.
61
00:03:32,580 --> 00:03:37,720
Basically, these are all NumPy arrays when the data is loaded in here.
62
00:03:38,600 --> 00:03:45,780
And basically we can use the .shape attribute, which is an extremely useful way to just get the shape,
63
00:03:46,020 --> 00:03:48,680
basically the dimensions, of your data array.
64
00:03:49,200 --> 00:03:52,380
So now let's visualize some of this information.
65
00:03:52,400 --> 00:03:54,230
So I'm going to do this in two different ways.
66
00:03:54,240 --> 00:03:54,910
OK.
67
00:03:55,250 --> 00:04:00,500
We're going to do this with OpenCV, then we're going to do it with matplotlib.
68
00:04:00,720 --> 00:04:02,840
And you can use either one going forward.
69
00:04:03,000 --> 00:04:08,520
I tend to use matplotlib if I'm plotting multiple images in the IPython notebook, and OpenCV if
70
00:04:08,520 --> 00:04:12,500
I'm testing and wanting to display text on an image.
71
00:04:12,630 --> 00:04:15,000
So let's try this.
72
00:04:15,150 --> 00:04:17,610
There we go, it pops up in a window here.
73
00:04:17,890 --> 00:04:20,200
So we took a look at this nine to nine again.
74
00:04:20,200 --> 00:04:23,430
Three, two, zero, one.
75
00:04:23,560 --> 00:04:28,140
So we just brought up all these windows here with all these digits. These are random digits.
76
00:04:28,140 --> 00:04:34,980
We use this NumPy random function, and this picks a random number between 0 and the length of our
77
00:04:34,980 --> 00:04:37,150
training data. The length of our training data,
78
00:04:37,150 --> 00:04:39,260
if you remember, was 60,000.
79
00:04:39,280 --> 00:04:45,230
So this generates a number from zero to 60,000 and displays the image here using OpenCV functions.
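The random selection just described can be sketched as below. This is a rough illustration, not the notebook's exact code: the helper names `pick_random_indices` and `show_with_opencv` are hypothetical, NumPy's `default_rng` is an assumed choice of random generator, and the display helper assumes the `opencv-python` package plus a desktop session, so it is defined but not called here.

```python
import numpy as np


def pick_random_indices(num_samples, count, seed=None):
    """Return `count` random indices that are valid for an array of length num_samples."""
    rng = np.random.default_rng(seed)
    # integers() excludes the upper bound, so 0 <= index < num_samples always holds.
    return rng.integers(0, num_samples, size=count)


def show_with_opencv(images, indices):
    """Open one window per chosen image; press any key to move to the next."""
    import cv2  # imported lazily; needs opencv-python and a display

    for i in indices:
        cv2.imshow(f"Sample {i}", images[i])
        cv2.waitKey(0)
    cv2.destroyAllWindows()


indices = pick_random_indices(60000, count=6, seed=0)
print(indices)
```

Note that the upper bound has to be exclusive: with 60,000 training images the last valid index is 59,999.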
80
00:04:46,430 --> 00:04:48,710
So let's now do the same with matplotlib.
81
00:04:49,010 --> 00:04:52,220
This is the code to basically plot in matplotlib.
82
00:04:52,430 --> 00:04:54,750
This is not actually the most efficient way to do it.
83
00:04:54,770 --> 00:04:59,480
The most efficient way would be doing it in a loop, but I just left it here for you so you get an
84
00:04:59,480 --> 00:05:02,430
idea of how we use subplots to plot it.
85
00:05:02,450 --> 00:05:03,560
So let's do it.
86
00:05:07,290 --> 00:05:08,110
This is good.
87
00:05:08,410 --> 00:05:12,880
Actually it's defined and that's because this one may have changed it before.
88
00:05:12,900 --> 00:05:14,650
But let's run it again.
89
00:05:14,660 --> 00:05:17,060
It's the x_train right here.
90
00:05:18,440 --> 00:05:19,700
There we go.
91
00:05:19,790 --> 00:05:22,910
So this plotted six images in a nice small grid here.
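The loop version mentioned earlier could look something like this. It is a sketch under stated assumptions: the function name `plot_grid` is hypothetical, the images are random stand-ins rather than real digits, and the `Agg` backend line is only there so the sketch runs without a display (in a notebook you would drop it).

```python
import matplotlib

matplotlib.use("Agg")  # off-screen backend; remove this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np


def plot_grid(images, rows=2, cols=3):
    """Draw up to rows * cols grayscale images in a grid of subplots."""
    # squeeze=False keeps `axes` a 2-D array even for a single row or column.
    fig, axes = plt.subplots(rows, cols, figsize=(cols * 2, rows * 2), squeeze=False)
    for ax, image in zip(axes.flat, images):
        ax.imshow(image, cmap="gray")
        ax.axis("off")
    return fig


# Random 28 by 28 stand-in images (assumption: real digits come from x_train).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)
fig = plot_grid(fake_images)
```

Looping over `axes.flat` replaces six near-identical subplot calls with one short loop, and changing `rows` and `cols` resizes the grid.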
92
00:05:23,150 --> 00:05:26,360
So this is why I sort of use matplotlib to display images in the
93
00:05:26,360 --> 00:05:30,710
IPython notebook. I find it easier and nicer to work with.
94
00:05:30,980 --> 00:05:36,130
So that's how we basically import our dataset and visualize some data from the dataset.
95
00:05:36,150 --> 00:05:40,190
Next, we're going to prepare our dataset for training.
96
00:05:40,220 --> 00:05:41,760
So let's take a look at that shortly.