| 1 | |
| 00:00:01,080 --> 00:00:08,280 | |
| Welcome back to 7.5, which is about pooling. This is the next sequence of layers in our CNN. So far | |
| 2 | |
| 00:00:08,490 --> 00:00:15,240 | |
| we've dealt with the convolution part and the ReLU. Now let's look at pooling, also | |
| 3 | |
| 00:00:15,420 --> 00:00:22,660 | |
| known as subsampling. Pooling, as I just said, also called subsampling or downsampling, is a | |
| 4 | |
| 00:00:22,660 --> 00:00:27,170 | |
| simple process where we reduce the size, or dimensionality, of the feature map. | |
| 5 | |
| 00:00:27,280 --> 00:00:31,690 | |
| The purpose of this reduction is to reduce the number of parameters that we need to train, whilst retaining | |
| 6 | |
| 00:00:31,690 --> 00:00:36,670 | |
| most of the important features and information in the image. | |
| 7 | |
| 00:00:36,870 --> 00:00:39,100 | |
| There are basically three types of pooling we can apply. | |
| 8 | |
| 00:00:39,100 --> 00:00:43,800 | |
| There are actually some more, but let's take a look at the three main types that are used. | |
| 9 | |
| 00:00:43,870 --> 00:00:46,250 | |
| So here's an example of Max pooling. | |
| 10 | |
| 00:00:46,300 --> 00:00:52,900 | |
| Imagine this is the ReLU output. This output here was produced from the ReLU | |
| 11 | |
| 00:00:52,940 --> 00:00:53,510 | |
| layer. | |
| 12 | |
| 00:00:53,800 --> 00:00:57,430 | |
| So you can imagine the zeros here were actually negative values. | |
| 13 | |
| 00:00:57,820 --> 00:01:03,790 | |
| So max pool basically uses a two by two kernel here. We can define the kernel size as anything we want, just | |
| 14 | |
| 00:01:03,790 --> 00:01:09,520 | |
| like we did with the stride and size of the kernels we used in the convolutional layer. Basically, using | |
| 15 | |
| 00:01:09,520 --> 00:01:10,530 | |
| a two by two. | |
| 16 | |
| 00:01:10,600 --> 00:01:15,250 | |
| It splits the input up into two by two blocks. | |
| 17 | |
| 00:01:15,580 --> 00:01:24,190 | |
| So what max pooling does is take the max value out of each two by two block, for example 167, 241, and 235, and puts them | |
| 18 | |
| 00:01:24,190 --> 00:01:25,380 | |
| into this block here. | |
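The two by two max pooling walkthrough above can be sketched in NumPy. The input matrix below is invented for illustration, with 167, 241, and 235 placed so they win their two by two blocks as in the example:

```python
import numpy as np

def max_pool_2x2(x):
    # Split the matrix into non-overlapping 2x2 blocks (stride 2, no padding)
    # and keep only the maximum of each block.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Hypothetical 4x4 ReLU output (values made up; 167, 241 and 235 placed
# to mirror the lecture's example).
relu_out = np.array([[ 12, 167,   0,  41],
                     [ 80,  20, 241,   9],
                     [  0,  13,  30,   5],
                     [235,  66,   7,   0]])

print(max_pool_2x2(relu_out))
# [[167 241]
#  [235  30]]
```

The reshape trick simply regroups the matrix into its two by two blocks, so the max is taken over each block at once.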
| 19 | |
| 00:01:25,750 --> 00:01:29,270 | |
| So this is what we call downsampling or subsampling. | |
| 20 | |
| 00:01:29,320 --> 00:01:35,440 | |
| Basically, we have sort of compressed the image here and retained the most important features, | |
| 21 | |
| 00:01:36,680 --> 00:01:37,470 | |
| actually. | |
| 22 | |
| 00:01:37,470 --> 00:01:40,160 | |
| Let's go back to the previous slide. Previously, | |
| 23 | |
| 00:01:40,210 --> 00:01:42,810 | |
| we mentioned average and sum pooling. | |
| 24 | |
| 00:01:42,850 --> 00:01:48,850 | |
| Now, as you can imagine, average pooling would simply be the average of these values here, | |
| 25 | |
| 00:01:49,120 --> 00:01:53,130 | |
| and sum pooling would just be the sum of these values. | |
| 26 | |
| 00:01:53,460 --> 00:01:55,090 | |
| So those are also ways we can use pooling. | |
| 27 | |
| 00:01:55,090 --> 00:02:01,900 | |
| However, in the majority of convolutional neural nets, we always use max pooling. | |
| 28 | |
| 00:02:01,940 --> 00:02:04,740 | |
| So this is our CNN so far, just to do a recap. | |
| 29 | |
| 00:02:04,880 --> 00:02:10,370 | |
| We have an input image with our kernel that is being slid across this image, producing multiple | |
| 30 | |
| 00:02:10,370 --> 00:02:11,380 | |
| different filters here. | |
| 31 | |
| 00:02:11,450 --> 00:02:15,920 | |
| All of these are the same size as the input image, and that's because of zero padding. | |
| 32 | |
| 00:02:16,250 --> 00:02:22,430 | |
| Then we have a ReLU output, which is basically the same size of matrix as this, except all the negative | |
| 33 | |
| 00:02:22,430 --> 00:02:23,850 | |
| values are turned into zeros. | |
| 34 | |
| 00:02:24,230 --> 00:02:30,470 | |
| And then we have the subsampling or pooling layer, also called downsampling, which basically reduces this image, | |
| 35 | |
| 00:02:30,530 --> 00:02:37,220 | |
| sorry, this matrix, by half, to 14 by 14, because as you can see, using a two by two, we have four by four | |
| 36 | |
| 00:02:37,360 --> 00:02:41,570 | |
| and we get a two by two, and there are still 12 filters. | |
| 37 | |
| 00:02:41,750 --> 00:02:44,540 | |
| However, they have now been downsampled. | |
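The recap's pipeline, convolution with zero padding, then ReLU, then two by two pooling, can be traced shape by shape. The 12 filters come from the running example; a 28x28 input is assumed (so that halving gives the 14 by 14 mentioned), and the feature-map values are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

# 12 hypothetical feature maps from the convolutional layer; zero padding
# keeps each one the same 28x28 size as the input (28 is an assumption).
feature_maps = rng.normal(size=(12, 28, 28))

# ReLU output: same-sized matrices, with every negative value turned to zero.
relu_out = np.maximum(feature_maps, 0)

# 2x2 max pooling halves width and height; all 12 maps are kept.
pooled = relu_out.reshape(12, 14, 2, 14, 2).max(axis=(2, 4))

print(relu_out.shape)  # (12, 28, 28)
print(pooled.shape)    # (12, 14, 14)
```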
| 38 | |
| 00:02:44,540 --> 00:02:45,880 | |
| So let's move on now. | |
| 39 | |
| 00:02:46,310 --> 00:02:52,100 | |
| So let's talk a bit more about pooling. Typically, pooling is done using two by two windows with a stride | |
| 40 | |
| 00:02:52,100 --> 00:02:54,540 | |
| of two and no padding applied. | |
| 41 | |
| 00:02:54,560 --> 00:02:58,280 | |
| That's how we actually get this four by four here. | |
| 42 | |
| 00:02:58,280 --> 00:03:01,920 | |
| It takes a two by two, jumps two, takes another two by two, jumps two, and so on. | |
| 43 | |
| 00:03:04,060 --> 00:03:08,170 | |
| So for smaller input images or larger images we can use larger pools, | |
| 44 | |
| 00:03:09,020 --> 00:03:14,530 | |
| or smaller pools, whichever you want to do. Using the above settings, pooling has the effect of reducing | |
| 45 | |
| 00:03:14,530 --> 00:03:16,890 | |
| dimensionality, width and height. | |
| 46 | |
| 00:03:16,930 --> 00:03:18,330 | |
| Those are the only two dimensions we have. | |
| 47 | |
| 00:03:18,340 --> 00:03:22,150 | |
| We reduce the width and height of the previous layer by half, | |
| 48 | |
| 00:03:22,330 --> 00:03:26,950 | |
| and thus remove three quarters, or 75 percent, of the activations seen in the previous layer. | |
| 49 | |
| 00:03:31,290 --> 00:03:32,940 | |
| So, let's keep moving on. | |
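The arithmetic behind "half the width and height, 75 percent of the activations removed" can be checked directly, using the standard no-padding output-size formula:

```python
def pooled_size(n, window=2, stride=2):
    # Output size of pooling with no padding: floor((n - window) / stride) + 1.
    return (n - window) // stride + 1

n = 28                            # e.g. a 28x28 ReLU output (illustrative)
out = pooled_size(n)
print(out)                        # 14: width and height are halved

removed = 1 - (out * out) / (n * n)
print(removed)                    # 0.75: three quarters of the activations removed
```

With a two by two window and stride two, this halving holds for any even input size.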
| 50 | |
| 00:03:32,940 --> 00:03:39,470 | |
| This makes our model more invariant to small or minor transformations or distortions in the input image. | |
| 51 | |
| 00:03:39,570 --> 00:03:45,000 | |
| Since we're now averaging or taking the max output from a small area of an image, what this actually means | |
| 52 | |
| 00:03:45,000 --> 00:03:51,020 | |
| is that instead of looking at specific pixels in an image, because we're actually | |
| 53 | |
| 00:03:51,050 --> 00:03:57,480 | |
| downsampling and looking at the max in an area, we sort of add some sort of invariance, or spatial invariance, | |
| 54 | |
| 00:03:57,480 --> 00:03:58,150 | |
| so to speak. | |
| 55 | |
| 00:03:58,170 --> 00:04:04,800 | |
| So the filters aren't super specific to certain areas, and remember, they are being slid across the image. | |
| 56 | |
| 00:04:04,800 --> 00:04:10,410 | |
| So imagine this filter being slid across the image, looking for a specific edge or whatever; this can actually | |
| 57 | |
| 00:04:11,310 --> 00:04:13,160 | |
| add some invariance now to it. | |
| 58 | |
| 00:04:13,170 --> 00:04:20,490 | |
| So this actually increases, basically, the ability of our convolutional model to generalize to information | |
| 59 | |
| 00:04:20,490 --> 00:04:21,790 | |
| it has never seen before. | |
| 60 | |
| 00:04:23,540 --> 00:04:29,450 | |
| So now let's move on to what is kind of the final layer. There are some layers in between; we'll discuss them | |
| 61 | |
| 00:04:29,450 --> 00:04:30,250 | |
| later on. | |
| 62 | |
| 00:04:30,410 --> 00:04:35,990 | |
| But for now, this is of course our CNN, and this is the last layer, the fully connected | |
| 63 | |
| 00:04:36,030 --> 00:04:36,730 | |
| FC layer. | |