WEBVTT - Subtitles by: DownloadYoutubeSubtitles.com
00:00:00.240 --> 00:00:03.760
we know humans learn from their past
00:00:02.320 --> 00:00:05.680
experiences
00:00:03.760 --> 00:00:07.359
and machines follow instructions given
00:00:05.680 --> 00:00:09.599
by humans
00:00:07.359 --> 00:00:11.519
but what if humans can train the
00:00:09.599 --> 00:00:14.000
machines to learn from the past data and
00:00:11.519 --> 00:00:15.839
do what humans can do and much faster
00:00:14.000 --> 00:00:17.760
well that's called machine learning but
00:00:15.839 --> 00:00:20.000
it's a lot more than just learning it's
00:00:17.760 --> 00:00:22.400
also about understanding and reasoning
00:00:20.000 --> 00:00:24.240
so today we will learn about the basics
00:00:22.400 --> 00:00:26.800
of machine learning
00:00:24.240 --> 00:00:28.800
so that's paul he loves listening to new
00:00:26.800 --> 00:00:30.880
songs
00:00:28.800 --> 00:00:33.120
he either likes them or dislikes them
00:00:30.880 --> 00:00:34.880
paul decides this on the basis of the
00:00:33.120 --> 00:00:36.000
song's tempo
00:00:34.880 --> 00:00:39.040
genre
00:00:36.000 --> 00:00:41.440
intensity and the gender of voice for
00:00:39.040 --> 00:00:44.559
simplicity let's just use tempo and
00:00:41.440 --> 00:00:47.680
intensity for now so here tempo is on
00:00:44.559 --> 00:00:50.320
the x axis ranging from relaxed to fast
00:00:47.680 --> 00:00:53.280
whereas intensity is on the y axis
00:00:50.320 --> 00:00:56.879
ranging from light to soaring we see
00:00:53.280 --> 00:00:59.840
that paul likes the song with fast tempo
00:00:56.879 --> 00:01:02.800
and soaring intensity while he dislikes
00:00:59.840 --> 00:01:05.280
the song with relaxed tempo and light
00:01:02.800 --> 00:01:07.360
intensity so now we know paul's choices
00:01:05.280 --> 00:01:10.720
let's say paul listens to a new song
00:01:07.360 --> 00:01:13.680
let's name it as song a song a has fast
00:01:10.720 --> 00:01:15.840
tempo and a soaring intensity so it lies
00:01:13.680 --> 00:01:17.759
somewhere here looking at the data can
00:01:15.840 --> 00:01:20.560
you guess whether paul will like the
00:01:17.759 --> 00:01:23.040
song or not correct so paul likes this
00:01:20.560 --> 00:01:25.119
song by looking at paul's past choices
00:01:23.040 --> 00:01:28.400
we were able to classify the unknown
00:01:25.119 --> 00:01:30.880
song very easily right let's say now
00:01:28.400 --> 00:01:33.439
paul listens to a new song let's label
00:01:30.880 --> 00:01:36.720
it as song b so song b
00:01:33.439 --> 00:01:39.439
lies somewhere here with medium tempo
00:01:36.720 --> 00:01:42.400
and medium intensity neither relaxed nor
00:01:39.439 --> 00:01:44.479
fast neither light nor soaring now can
00:01:42.400 --> 00:01:46.560
you guess whether paul likes it or not
00:01:44.479 --> 00:01:49.200
not able to guess whether paul will like
00:01:46.560 --> 00:01:52.159
it or dislike it are the choices unclear
00:01:49.200 --> 00:01:54.640
correct we could easily classify song a
00:01:52.159 --> 00:01:57.200
but when the choice became complicated
00:01:54.640 --> 00:01:59.119
as in the case of song b yes and that's
00:01:57.200 --> 00:02:01.920
where machine learning comes in let's
00:01:59.119 --> 00:02:04.240
see how in the same example for song b
00:02:01.920 --> 00:02:06.719
if we draw a circle around the song b we
00:02:04.240 --> 00:02:09.440
see that there are four votes for like
00:02:06.719 --> 00:02:11.760
whereas one vote for dislike if we go
00:02:09.440 --> 00:02:13.440
for the majority votes we can say that
00:02:11.760 --> 00:02:15.120
paul will definitely like the song
00:02:13.440 --> 00:02:17.120
that's all this was a basic machine
00:02:15.120 --> 00:02:19.200
learning algorithm also it's called k
00:02:17.120 --> 00:02:21.599
nearest neighbors so this is just a
00:02:19.200 --> 00:02:24.319
small example in one of the many machine
00:02:21.599 --> 00:02:27.440
learning algorithms quite easy right
00:02:24.319 --> 00:02:29.840
believe me it is but what happens when
00:02:27.440 --> 00:02:31.760
the choices become complicated as in the
00:02:29.840 --> 00:02:33.920
case of song b that's when machine
00:02:31.760 --> 00:02:35.920
learning comes in it learns the data
00:02:33.920 --> 00:02:38.160
builds the prediction model and when the
00:02:35.920 --> 00:02:40.640
new data point comes in it can easily
00:02:38.160 --> 00:02:43.200
predict for it more the data better the
00:02:40.640 --> 00:02:45.360
model higher will be the accuracy there
00:02:43.200 --> 00:02:47.599
are many ways in which the machine
00:02:45.360 --> 00:02:49.599
learns it could be either supervised
00:02:47.599 --> 00:02:51.280
learning unsupervised learning or
00:02:49.599 --> 00:02:53.680
reinforcement learning let's first
00:02:51.280 --> 00:02:55.519
quickly understand supervised learning
00:02:53.680 --> 00:02:57.280
suppose your friend gives you one
00:02:55.519 --> 00:03:00.000
million coins of three different
00:02:57.280 --> 00:03:02.080
currencies say one rupee one euro and
00:03:00.000 --> 00:03:04.480
one dirham each coin has different
00:03:02.080 --> 00:03:07.120
weights for example a coin of one rupee
00:03:04.480 --> 00:03:09.519
weighs three grams one euro weighs seven
00:03:07.120 --> 00:03:11.440
grams and one dirham weighs four grams
00:03:09.519 --> 00:03:13.920
your model will predict the currency of
00:03:11.440 --> 00:03:16.400
the coin here your weight becomes the
00:03:13.920 --> 00:03:18.400
feature of coins while currency becomes
00:03:16.400 --> 00:03:21.040
the label when you feed this data to the
00:03:18.400 --> 00:03:23.680
machine learning model it learns which
00:03:21.040 --> 00:03:26.319
feature is associated with which label
00:03:23.680 --> 00:03:28.959
for example it will learn that if a coin
00:03:26.319 --> 00:03:30.560
is of 3 grams it will be a 1 rupee coin
00:03:28.959 --> 00:03:32.879
let's give a new coin to the machine on
00:03:30.560 --> 00:03:34.959
the basis of the weight of the new coin
00:03:32.879 --> 00:03:37.599
your model will predict the currency
00:03:34.959 --> 00:03:40.000
hence supervised learning uses labeled
00:03:37.599 --> 00:03:42.400
data to train the model here the machine
00:03:40.000 --> 00:03:44.159
knew the features of the object and also
00:03:42.400 --> 00:03:46.159
the labels associated with those
00:03:44.159 --> 00:03:47.760
features on this note let's move to
00:03:46.159 --> 00:03:49.760
unsupervised learning and see the
00:03:47.760 --> 00:03:51.440
difference suppose you have cricket data
00:03:49.760 --> 00:03:53.760
set of various players with their
00:03:51.440 --> 00:03:56.319
respective scores and wickets taken when
00:03:53.760 --> 00:03:58.640
you feed this data set to the machine
00:03:56.319 --> 00:04:00.959
the machine identifies the pattern of
00:03:58.640 --> 00:04:02.319
player performance so it plots this data
00:04:00.959 --> 00:04:04.799
with the respective wickets on the
00:04:02.319 --> 00:04:06.799
x-axis while runs on the y-axis while
00:04:04.799 --> 00:04:08.879
looking at the data you'll clearly see
00:04:06.799 --> 00:04:10.879
that there are two clusters the one
00:04:08.879 --> 00:04:13.280
cluster are the players who scored
00:04:10.879 --> 00:04:15.680
higher runs and took less wickets while
00:04:13.280 --> 00:04:18.000
the other cluster is of the players who
00:04:15.680 --> 00:04:20.560
scored less runs but took many wickets
00:04:18.000 --> 00:04:22.800
so here we interpret these two clusters
00:04:20.560 --> 00:04:24.800
as batsmen and bowlers the important
00:04:22.800 --> 00:04:27.520
point to note here is that there were no
00:04:24.800 --> 00:04:29.759
labels of batsmen and bowlers hence the
00:04:27.520 --> 00:04:31.360
learning with unlabeled data is
00:04:29.759 --> 00:04:33.199
unsupervised learning so we saw
00:04:31.360 --> 00:04:35.199
supervised learning where the data was
00:04:33.199 --> 00:04:37.520
labeled and the unsupervised learning
00:04:35.199 --> 00:04:39.360
where the data was unlabeled and then
00:04:37.520 --> 00:04:41.280
there is reinforcement learning which is
00:04:39.360 --> 00:04:42.560
a reward based learning or we can say
00:04:41.280 --> 00:04:44.639
that it works on the principle of
00:04:42.560 --> 00:04:46.960
feedback here let's say you provide the
00:04:44.639 --> 00:04:49.919
system with an image of a dog and ask it
00:04:46.960 --> 00:04:52.080
to identify it the system identifies it
00:04:49.919 --> 00:04:54.000
as a cat so you give a negative feedback
00:04:52.080 --> 00:04:55.600
to the machine saying that it's a dog's
00:04:54.000 --> 00:04:57.759
image the machine will learn from the
00:04:55.600 --> 00:04:59.919
feedback and finally if it comes across
00:04:57.759 --> 00:05:01.919
any other image of a dog it will be able
00:04:59.919 --> 00:05:03.840
to classify it correctly that is
00:05:01.919 --> 00:05:05.520
reinforcement learning to generalize
00:05:03.840 --> 00:05:07.680
machine learning model let's see a
00:05:05.520 --> 00:05:09.280
flowchart input is given to a machine
00:05:07.680 --> 00:05:10.960
learning model which then gives the
00:05:09.280 --> 00:05:13.520
output according to the algorithm
00:05:10.960 --> 00:05:16.000
applied if it's right we take the output
00:05:13.520 --> 00:05:18.080
as a final result else we provide
00:05:16.000 --> 00:05:20.639
feedback to the training model and ask
00:05:18.080 --> 00:05:22.160
it to predict until it learns i hope
00:05:20.639 --> 00:05:23.919
you've understood supervised and
00:05:22.160 --> 00:05:26.240
unsupervised learning so let's have a
00:05:23.919 --> 00:05:28.720
quick quiz you have to determine whether
00:05:26.240 --> 00:05:30.560
the given scenarios use supervised or
00:05:28.720 --> 00:05:32.880
unsupervised learning simple right
00:05:30.560 --> 00:05:35.039
scenario one facebook recognizes your
00:05:32.880 --> 00:05:37.520
friend in a picture from an album of
00:05:35.039 --> 00:05:40.639
tagged photographs
00:05:37.520 --> 00:05:43.840
scenario 2 netflix recommends new movies
00:05:40.639 --> 00:05:46.400
based on someone's past movie choices
00:05:43.840 --> 00:05:48.800
scenario 3 analyzing bank data for
00:05:46.400 --> 00:05:51.120
suspicious transactions and flagging the
00:05:48.800 --> 00:05:53.360
fraud transactions think wisely and
00:05:51.120 --> 00:05:55.440
comment below your answers moving on
00:05:53.360 --> 00:05:57.680
don't you sometimes wonder how is
00:05:55.440 --> 00:05:59.280
machine learning possible in today's era
00:05:57.680 --> 00:06:02.000
well that's because today we have
00:05:59.280 --> 00:06:04.479
humongous data available everybody is
00:06:02.000 --> 00:06:06.240
online either making a transaction or
00:06:04.479 --> 00:06:08.560
just surfing the internet and that's
00:06:06.240 --> 00:06:10.960
generating a huge amount of data every
00:06:08.560 --> 00:06:13.440
minute and that data my friend is the
00:06:10.960 --> 00:06:15.520
key to analysis also the memory handling
00:06:13.440 --> 00:06:17.360
capabilities of computers have largely
00:06:15.520 --> 00:06:20.479
increased which helps them to process
00:06:17.360 --> 00:06:23.280
such huge amount of data at hand without
00:06:20.479 --> 00:06:25.360
any delay and yes computers now have
00:06:23.280 --> 00:06:27.280
great computational powers so there are
00:06:25.360 --> 00:06:29.520
a lot of applications of machine
00:06:27.280 --> 00:06:31.280
learning out there to name a few machine
00:06:29.520 --> 00:06:33.440
learning is used in healthcare where
00:06:31.280 --> 00:06:35.440
diagnostics are predicted for doctor's
00:06:33.440 --> 00:06:37.759
review the sentiment analysis that the
00:06:35.440 --> 00:06:39.600
tech giants are doing on social media is
00:06:37.759 --> 00:06:41.360
another interesting application of
00:06:39.600 --> 00:06:43.280
machine learning fraud detection in the
00:06:41.360 --> 00:06:45.520
finance sector and also to predict
00:06:43.280 --> 00:06:47.120
customer churn in the e-commerce sector
00:06:45.520 --> 00:06:49.759
while booking a cab you must have
00:06:47.120 --> 00:06:51.520
encountered surge pricing often where it
00:06:49.759 --> 00:06:54.240
says the fare of your trip has been
00:06:51.520 --> 00:06:56.000
updated continue booking yes please i'm
00:06:54.240 --> 00:06:58.160
getting late for office
00:06:56.000 --> 00:07:00.240
well that's an interesting machine
00:06:58.160 --> 00:07:02.639
learning model which is used by global
00:07:00.240 --> 00:07:04.639
taxi giant uber and others where they
00:07:02.639 --> 00:07:06.560
have differential pricing in real time
00:07:04.639 --> 00:07:10.000
based on demand the number of cars
00:07:06.560 --> 00:07:12.560
available bad weather rush hour etc so they
00:07:10.000 --> 00:07:14.800
use the surge pricing model to ensure
00:07:12.560 --> 00:07:17.280
that those who need a cab can get one
00:07:14.800 --> 00:07:19.599
also it uses predictive modeling to
00:07:17.280 --> 00:07:21.680
predict where the demand will be high
00:07:19.599 --> 00:07:23.759
with the goal that drivers can take care
00:07:21.680 --> 00:07:26.319
of the demand and surge pricing can be
00:07:23.759 --> 00:07:29.280
minimized great hey siri can you remind
00:07:26.319 --> 00:07:30.400
me to book a cab at 6 pm today ok i'll
00:07:29.280 --> 00:07:33.120
remind you
00:07:30.400 --> 00:07:35.520
thanks no problem comment below some
00:07:33.120 --> 00:07:37.360
interesting everyday examples around you
00:07:35.520 --> 00:07:39.840
where machines are learning and doing
00:07:37.360 --> 00:07:41.840
amazing jobs so that's all for machine
00:07:39.840 --> 00:07:43.680
learning basics today from my side keep
00:07:41.840 --> 00:07:48.199
watching this space for more interesting
00:07:43.680 --> 00:07:48.199
videos until then happy learning