WEBVTT

00:00:00.240 --> 00:00:03.760
we know humans learn from their past

00:00:02.320 --> 00:00:05.680
experiences

00:00:03.760 --> 00:00:07.359
and machines follow instructions given

00:00:05.680 --> 00:00:09.599
by humans

00:00:07.359 --> 00:00:11.519
but what if humans can train the

00:00:09.599 --> 00:00:14.000
machines to learn from the past data and

00:00:11.519 --> 00:00:15.839
do what humans can do and much faster

00:00:14.000 --> 00:00:17.760
well that's called machine learning but

00:00:15.839 --> 00:00:20.000
it's a lot more than just learning it's

00:00:17.760 --> 00:00:22.400
also about understanding and reasoning

00:00:20.000 --> 00:00:24.240
so today we will learn about the basics

00:00:22.400 --> 00:00:26.800
of machine learning

00:00:24.240 --> 00:00:28.800
so that's paul he loves listening to new

00:00:26.800 --> 00:00:30.880
songs

00:00:28.800 --> 00:00:33.120
he either likes them or dislikes them

00:00:30.880 --> 00:00:34.880
paul decides this on the basis of the

00:00:33.120 --> 00:00:36.000
song's tempo

00:00:34.880 --> 00:00:39.040
genre

00:00:36.000 --> 00:00:41.440
intensity and the gender of voice for

00:00:39.040 --> 00:00:44.559
simplicity let's just use tempo and

00:00:41.440 --> 00:00:47.680
intensity for now so here tempo is on

00:00:44.559 --> 00:00:50.320
the x axis ranging from relaxed to fast

00:00:47.680 --> 00:00:53.280
whereas intensity is on the y axis

00:00:50.320 --> 00:00:56.879
ranging from light to soaring we see

00:00:53.280 --> 00:00:59.840
that paul likes the song with fast tempo

00:00:56.879 --> 00:01:02.800
and soaring intensity while he dislikes

00:00:59.840 --> 00:01:05.280
the song with relaxed tempo and light

00:01:02.800 --> 00:01:07.360
intensity so now we know paul's choices

00:01:05.280 --> 00:01:10.720
let's say paul listens to a new song

00:01:07.360 --> 00:01:13.680
let's name it as song a song a has fast

00:01:10.720 --> 00:01:15.840
tempo and a soaring intensity so it lies

00:01:13.680 --> 00:01:17.759
somewhere here looking at the data can

00:01:15.840 --> 00:01:20.560
you guess whether paul will like the

00:01:17.759 --> 00:01:23.040
song or not correct so paul likes this

00:01:20.560 --> 00:01:25.119
song by looking at paul's past choices

00:01:23.040 --> 00:01:28.400
we were able to classify the unknown

00:01:25.119 --> 00:01:30.880
song very easily right let's say now

00:01:28.400 --> 00:01:33.439
paul listens to a new song let's label

00:01:30.880 --> 00:01:36.720
it as song b so song b

00:01:33.439 --> 00:01:39.439
lies somewhere here with medium tempo

00:01:36.720 --> 00:01:42.400
and medium intensity neither relaxed nor

00:01:39.439 --> 00:01:44.479
fast neither light nor soaring now can

00:01:42.400 --> 00:01:46.560
you guess whether paul likes it or not

00:01:44.479 --> 00:01:49.200
not able to guess whether paul will like

00:01:46.560 --> 00:01:52.159
it or dislike it are the choices unclear

00:01:49.200 --> 00:01:54.640
correct we could easily classify song a

00:01:52.159 --> 00:01:57.200
but when the choice became complicated

00:01:54.640 --> 00:01:59.119
as in the case of song b yes and that's

00:01:57.200 --> 00:02:01.920
where machine learning comes in let's

00:01:59.119 --> 00:02:04.240
see how in the same example for song b

00:02:01.920 --> 00:02:06.719
if we draw a circle around the song b we

00:02:04.240 --> 00:02:09.440
see that there are four votes for like

00:02:06.719 --> 00:02:11.760
whereas one vote for dislike if we go

00:02:09.440 --> 00:02:13.440
for the majority votes we can say that

00:02:11.760 --> 00:02:15.120
paul will definitely like the song

00:02:13.440 --> 00:02:17.120
that's all this was a basic machine

00:02:15.120 --> 00:02:19.200
learning algorithm also it's called k

00:02:17.120 --> 00:02:21.599
nearest neighbors so this is just a

00:02:19.200 --> 00:02:24.319
small example of one of the many machine

00:02:21.599 --> 00:02:27.440
learning algorithms quite easy right
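
NOTE
A minimal sketch of the majority vote described for song b, assuming made-up
(tempo, intensity) values on a 0-10 scale. The data and the name knn_predict
are illustrative assumptions, not taken from the video.
```python
import math
from collections import Counter
# Hypothetical songs: ((tempo, intensity), Paul's verdict).
songs = [
    ((8.0, 9.0), "like"),
    ((9.0, 8.0), "like"),
    ((7.0, 7.5), "like"),
    ((6.5, 6.0), "like"),
    ((6.0, 7.0), "like"),
    ((2.0, 1.5), "dislike"),
    ((1.0, 3.0), "dislike"),
    ((4.0, 4.5), "dislike"),
]
def knn_predict(point, data, k=5):
    # Sort known songs by Euclidean distance to the new song, keep the k closest.
    nearest = sorted(data, key=lambda item: math.dist(point, item[0]))[:k]
    # Majority vote among the labels of those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0], dict(votes)
song_b = (5.5, 5.5)  # medium tempo, medium intensity
label, votes = knn_predict(song_b, songs, k=5)
print(label, votes)
```
With this toy data the five nearest songs give four votes for like and one for
dislike, so song b is predicted as like. The choice of k=5 is an assumption.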

00:02:24.319 --> 00:02:29.840
believe me it is but what happens when

00:02:27.440 --> 00:02:31.760
the choices become complicated as in the

00:02:29.840 --> 00:02:33.920
case of song b that's when machine

00:02:31.760 --> 00:02:35.920
learning comes in it learns the data

00:02:33.920 --> 00:02:38.160
builds the prediction model and when the

00:02:35.920 --> 00:02:40.640
new data point comes in it can easily

00:02:38.160 --> 00:02:43.200
predict for it the more the data the better the

00:02:40.640 --> 00:02:45.360
model and the higher the accuracy there

00:02:43.200 --> 00:02:47.599
are many ways in which the machine

00:02:45.360 --> 00:02:49.599
learns it could be either supervised

00:02:47.599 --> 00:02:51.280
learning unsupervised learning or

00:02:49.599 --> 00:02:53.680
reinforcement learning let's first

00:02:51.280 --> 00:02:55.519
quickly understand supervised learning

00:02:53.680 --> 00:02:57.280
suppose your friend gives you one

00:02:55.519 --> 00:03:00.000
million coins of three different

00:02:57.280 --> 00:03:02.080
currencies say one rupee one euro and

00:03:00.000 --> 00:03:04.480
one dirham each coin has different

00:03:02.080 --> 00:03:07.120
weights for example a coin of one rupee

00:03:04.480 --> 00:03:09.519
weighs three grams one euro weighs seven

00:03:07.120 --> 00:03:11.440
grams and one dirham weighs four grams

00:03:09.519 --> 00:03:13.920
your model will predict the currency of

00:03:11.440 --> 00:03:16.400
the coin here the weight becomes the

00:03:13.920 --> 00:03:18.400
feature of coins while currency becomes

00:03:16.400 --> 00:03:21.040
the label when you feed this data to the

00:03:18.400 --> 00:03:23.680
machine learning model it learns which

00:03:21.040 --> 00:03:26.319
feature is associated with which label

00:03:23.680 --> 00:03:28.959
for example it will learn that if a coin

00:03:26.319 --> 00:03:30.560
is of 3 grams it will be a 1 rupee coin

00:03:28.959 --> 00:03:32.879
let's give a new coin to the machine on

00:03:30.560 --> 00:03:34.959
the basis of the weight of the new coin

00:03:32.879 --> 00:03:37.599
your model will predict the currency
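
NOTE
A minimal sketch of the coin example, assuming a nearest-average rule as the
model (the video does not name an algorithm). The small weight variations and
the names fit and predict are illustrative assumptions.
```python
from collections import defaultdict
# Labeled training data: (weight in grams, currency).
# Centered on the video's 3 g, 7 g and 4 g; the variations are made up.
training = [(3.0, "rupee"), (3.1, "rupee"), (2.9, "rupee"),
            (7.0, "euro"), (7.2, "euro"), (6.9, "euro"),
            (4.0, "dirham"), (4.1, "dirham"), (3.9, "dirham")]
def fit(samples):
    # "Training": average the feature (weight) for each label (currency).
    grouped = defaultdict(list)
    for weight, currency in samples:
        grouped[currency].append(weight)
    return {currency: sum(ws) / len(ws) for currency, ws in grouped.items()}
def predict(model, weight):
    # Pick the currency whose learned average weight is closest to the new coin.
    return min(model, key=lambda currency: abs(model[currency] - weight))
model = fit(training)
print(predict(model, 3.05))
print(predict(model, 6.8))
```
The weight is the feature and the currency is the label, so a new coin of
about 3 grams is predicted to be a rupee.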

00:03:34.959 --> 00:03:40.000
hence supervised learning uses labeled

00:03:37.599 --> 00:03:42.400
data to train the model here the machine

00:03:40.000 --> 00:03:44.159
knew the features of the object and also

00:03:42.400 --> 00:03:46.159
the labels associated with those

00:03:44.159 --> 00:03:47.760
features on this note let's move to

00:03:46.159 --> 00:03:49.760
unsupervised learning and see the

00:03:47.760 --> 00:03:51.440
difference suppose you have a cricket data

00:03:49.760 --> 00:03:53.760
set of various players with their

00:03:51.440 --> 00:03:56.319
respective scores and wickets taken when

00:03:53.760 --> 00:03:58.640
you feed this data set to the machine

00:03:56.319 --> 00:04:00.959
the machine identifies the pattern of

00:03:58.640 --> 00:04:02.319
player performance so it plots this data

00:04:00.959 --> 00:04:04.799
with the respective wickets on the

00:04:02.319 --> 00:04:06.799
x-axis while runs on the y-axis while

00:04:04.799 --> 00:04:08.879
looking at the data you'll clearly see

00:04:06.799 --> 00:04:10.879
that there are two clusters the one

00:04:08.879 --> 00:04:13.280
cluster is of the players who scored

00:04:10.879 --> 00:04:13.280
more runs and took fewer wickets while

00:04:13.280 --> 00:04:15.680
the other cluster is of the players who

00:04:15.680 --> 00:04:18.000
scored fewer runs but took many wickets
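
NOTE
A minimal sketch of finding the two clusters without labels, assuming k-means
as the clustering method (the video does not name one). The player numbers,
the hand-picked starting centroids and the name kmeans are assumptions.
```python
import math
# Hypothetical players as (wickets, runs); no batsman or bowler labels are given.
players = [(2, 850), (1, 920), (3, 780), (0, 1010),
           (40, 120), (35, 90), (45, 150), (38, 60)]
def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centroids
# Starting centroids are picked by hand here; they are often chosen randomly.
clusters, centroids = kmeans(players, centroids=[players[0], players[4]])
print(clusters)
print(centroids)
```
The algorithm only returns two groups of points. Reading them as batsmen and
bowlers is an interpretation made by a person afterwards.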

00:04:18.000 --> 00:04:22.800
so here we interpret these two clusters

00:04:20.560 --> 00:04:24.800
as batsmen and bowlers the important

00:04:22.800 --> 00:04:27.520
point to note here is that there were no

00:04:24.800 --> 00:04:29.759
labels of batsmen and bowlers hence the

00:04:27.520 --> 00:04:31.360
learning with unlabeled data is

00:04:29.759 --> 00:04:33.199
unsupervised learning so we saw

00:04:31.360 --> 00:04:35.199
supervised learning where the data was

00:04:33.199 --> 00:04:37.520
labeled and the unsupervised learning

00:04:35.199 --> 00:04:39.360
where the data was unlabeled and then

00:04:37.520 --> 00:04:41.280
there is reinforcement learning which is

00:04:39.360 --> 00:04:42.560
a reward based learning or we can say

00:04:41.280 --> 00:04:44.639
that it works on the principle of

00:04:42.560 --> 00:04:46.960
feedback here let's say you provide the

00:04:44.639 --> 00:04:49.919
system with an image of a dog and ask it

00:04:46.960 --> 00:04:52.080
to identify it the system identifies it

00:04:49.919 --> 00:04:54.000
as a cat so you give negative feedback

00:04:52.080 --> 00:04:55.600
to the machine saying that it's a dog's

00:04:54.000 --> 00:04:57.759
image the machine will learn from the

00:04:55.600 --> 00:04:59.919
feedback and finally if it comes across

00:04:57.759 --> 00:05:01.919
any other image of a dog it will be able

00:04:59.919 --> 00:05:03.840
to classify it correctly that is

00:05:01.919 --> 00:05:05.520
reinforcement learning to generalize

00:05:03.840 --> 00:05:07.680
the machine learning model let's see a

00:05:05.520 --> 00:05:09.280
flowchart input is given to a machine

00:05:07.680 --> 00:05:10.960
learning model which then gives the

00:05:09.280 --> 00:05:13.520
output according to the algorithm

00:05:10.960 --> 00:05:16.000
applied if it's right we take the output

00:05:13.520 --> 00:05:18.080
as a final result else we provide

00:05:16.000 --> 00:05:20.639
feedback to the training model and ask

00:05:18.080 --> 00:05:22.160
it to predict until it learns i hope

00:05:20.639 --> 00:05:23.919
you've understood supervised and

00:05:22.160 --> 00:05:26.240
unsupervised learning so let's have a

00:05:23.919 --> 00:05:28.720
quick quiz you have to determine whether

00:05:26.240 --> 00:05:30.560
the given scenarios use supervised or

00:05:28.720 --> 00:05:32.880
unsupervised learning simple right

00:05:30.560 --> 00:05:35.039
scenario one facebook recognizes your

00:05:32.880 --> 00:05:37.520
friend in a picture from an album of

00:05:35.039 --> 00:05:40.639
tagged photographs

00:05:37.520 --> 00:05:43.840
scenario 2 netflix recommends new movies

00:05:40.639 --> 00:05:46.400
based on someone's past movie choices

00:05:43.840 --> 00:05:48.800
scenario 3 analyzing bank data for

00:05:46.400 --> 00:05:51.120
suspicious transactions and flagging the

00:05:48.800 --> 00:05:53.360
fraudulent transactions think wisely and

00:05:51.120 --> 00:05:55.440
comment below your answers moving on

00:05:53.360 --> 00:05:57.680
don't you sometimes wonder how is

00:05:55.440 --> 00:05:59.280
machine learning possible in today's era

00:05:57.680 --> 00:06:02.000
well that's because today we have

00:05:59.280 --> 00:06:04.479
humongous data available everybody is

00:06:02.000 --> 00:06:06.240
online either making a transaction or

00:06:04.479 --> 00:06:08.560
just surfing the internet and that's

00:06:06.240 --> 00:06:10.960
generating a huge amount of data every

00:06:08.560 --> 00:06:13.440
minute and that data my friend is the

00:06:10.960 --> 00:06:15.520
key to analysis also the memory handling

00:06:13.440 --> 00:06:17.360
capabilities of computers have largely

00:06:15.520 --> 00:06:20.479
increased which helps them to process

00:06:17.360 --> 00:06:23.280
such a huge amount of data at hand without

00:06:20.479 --> 00:06:25.360
any delay and yes computers now have

00:06:23.280 --> 00:06:27.280
great computational powers so there are

00:06:25.360 --> 00:06:29.520
a lot of applications of machine

00:06:27.280 --> 00:06:31.280
learning out there to name a few machine

00:06:29.520 --> 00:06:33.440
learning is used in healthcare where

00:06:31.280 --> 00:06:35.440
diagnostics are predicted for doctor's

00:06:33.440 --> 00:06:37.759
review the sentiment analysis that the

00:06:35.440 --> 00:06:39.600
tech giants are doing on social media is

00:06:37.759 --> 00:06:41.360
another interesting application of

00:06:39.600 --> 00:06:43.280
machine learning fraud detection in the

00:06:41.360 --> 00:06:45.520
finance sector and also to predict

00:06:43.280 --> 00:06:47.120
customer churn in the e-commerce sector

00:06:45.520 --> 00:06:49.759
while booking a cab you must have

00:06:47.120 --> 00:06:51.520
encountered surge pricing often where it

00:06:49.759 --> 00:06:54.240
says the fare of your trip has been

00:06:51.520 --> 00:06:56.000
updated continue booking yes please i'm

00:06:54.240 --> 00:06:58.160
getting late for office

00:06:56.000 --> 00:07:00.240
well that's an interesting machine

00:06:58.160 --> 00:07:02.639
learning model which is used by global

00:07:00.240 --> 00:07:04.639
taxi giant uber and others where they

00:07:02.639 --> 00:07:06.560
have differential pricing in real time

00:07:04.639 --> 00:07:10.000
based on demand the number of cars

00:07:06.560 --> 00:07:12.560
available bad weather rush hour etc so they

00:07:10.000 --> 00:07:14.800
use the surge pricing model to ensure

00:07:12.560 --> 00:07:17.280
that those who need a cab can get one

00:07:14.800 --> 00:07:19.599
also it uses predictive modeling to

00:07:17.280 --> 00:07:21.680
predict where the demand will be high

00:07:19.599 --> 00:07:23.759
with the goal that drivers can take care

00:07:21.680 --> 00:07:26.319
of the demand and surge pricing can be

00:07:23.759 --> 00:07:29.280
minimized great hey siri can you remind

00:07:26.319 --> 00:07:30.400
me to book a cab at 6 pm today ok i'll

00:07:29.280 --> 00:07:33.120
remind you

00:07:30.400 --> 00:07:35.520
thanks no problem comment below some

00:07:33.120 --> 00:07:37.360
interesting everyday examples around you

00:07:35.520 --> 00:07:39.840
where machines are learning and doing

00:07:37.360 --> 00:07:41.840
amazing jobs so that's all for machine

00:07:39.840 --> 00:07:43.680
learning basics today from my side keep

00:07:41.840 --> 00:07:48.199
watching this space for more interesting

00:07:43.680 --> 00:07:48.199
videos until then happy learning