Prince-1 committed on
Commit
17e2002
·
verified ·
1 Parent(s): e62bc71

Add files using upload-large-folder tool

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. 25. Face Recognition with VGGFace/2.1 Download VGG Face Weights.html +1 -0
  2. 25. Face Recognition with VGGFace/2.2 Download VGG Face Weights.html +1 -0
  3. 25. Face Recognition with VGGFace/3. Face Recognition using WebCam & Identifying Friends TV Show Characters in Video.html +1 -0
  4. 26. Credit Card/26. Credit Card Reader.ipynb +1066 -0
  5. 26. Credit Card/credit_card_color.jpg +0 -0
  6. 26. Credit Card/credit_card_extracted_digits.jpg +0 -0
  7. 26. Credit Card/creditcard_digits1.jpg +0 -0
  8. 26. Credit Card/creditcard_digits2.jpg +0 -0
  9. 26. Credit Card/test_card.jpg +0 -0
  10. 26. The Computer Vision World/4. Popular Computer Vision Conferences & Finding Datasets.srt +127 -0
  11. 26. The Computer Vision World/5. Building a Deep Learning Machine vs. Cloud GPUs.srt +219 -0
  12. 27. BONUS - Build a Credit Card Number Reader/1. Step 1 - Creating a Credit Card Number Dataset.html +246 -0
  13. 27. BONUS - Build a Credit Card Number Reader/1.1 Download Credit Card Files and Data.html +1 -0
  14. 27. BONUS - Build a Credit Card Number Reader/2. Step 2 - Training Our Model.html +116 -0
  15. 27. BONUS - Build a Credit Card Number Reader/3. Step 3 - Extracting A Credit Card from the Background.html +139 -0
  16. 27. BONUS - Build a Credit Card Number Reader/4. Step 4 - Use our Model to Identify the Digits & Display it onto our Credit Card.html +60 -0
  17. 28. BONUS - Use Cloud GPUs on PaperSpace/1. Why use Cloud GPUs and How to Setup a PaperSpace Gradient Notebook.html +1 -0
  18. 28. BONUS - Use Cloud GPUs on PaperSpace/2. Train a AlexNet on PaperSpace.html +100 -0
  19. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/1. Install and Run Flask.html +9 -0
  20. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/1.1 Download CV API Files.html +1 -0
  21. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/2. Running Your Computer Vision Web App on Flask Locally.html +86 -0
  22. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/3. Running Your Computer Vision API.html +76 -0
  23. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/4. Setting Up An AWS Account.html +1 -0
  24. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/5. Setting Up Your AWS EC2 Instance & Installing Keras, TensorFlow, OpenCV & Flask.html +9 -0
  25. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/6. Changing your EC2 Security Group.html +1 -0
  26. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/7. Using FileZilla to transfer files to your EC2 Instance.html +2 -0
  27. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/8. Running your CV Web App on EC2.html +1 -0
  28. 29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/9. Running your CV API on EC2.html +4 -0
  29. 3. Installation Guide/3. Setting up your Deep Learning Virtual Machine (Download Code, VM & Slides here!).srt +651 -0
  30. 3. Installation Guide/3.1 Download Your Deep Learning Virtual Machine HERE.html +1 -0
  31. 3. Installation Guide/3.1 Slides.html +1 -0
  32. 3. Installation Guide/3.2 Code.html +1 -0
  33. 3. Installation Guide/3.2 Slides.html +1 -0
  34. 3. Installation Guide/3.3 Code.html +1 -0
  35. 3. Installation Guide/3.3 Download Your Deep Learning Virtual Machine HERE.html +1 -0
  36. 3. Installation Guide/3.4 MIRROR - Download Your Deep Learning Virtual Machine HERE.html +1 -0
  37. 3. Installation Guide/4. Optional - Troubleshooting Guide for VM Setup & for resolving some MacOS Issues.html +1 -0
  38. 4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/4.1 - Handwritten Digit Classification Demo (MNIST).ipynb +656 -0
  39. 4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/4.2 - Image Classifier - CIFAR10.ipynb +167 -0
  40. 4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/4.3. Live Sketching.ipynb +126 -0
  41. 4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/Test - Imports Keras, OpenCV and tests webcam.ipynb +122 -0
  42. 4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/preprocessors.py +65 -0
  43. 4. Handwriting Recognition/1. Get Started! Handwriting Recognition, Simple Object Classification OpenCV Demo.srt +35 -0
  44. 4. Handwriting Recognition/2. Experiment with a Handwriting Classifier.srt +363 -0
  45. 4. Handwriting Recognition/3. Experiment with a Image Classifier.srt +195 -0
  46. 4. Handwriting Recognition/4. OpenCV Demo – Live Sketch with Webcam.srt +251 -0
  47. 5. OpenCV Tutorial - Learn Classic Computer Vision & Face Detection (OPTIONAL)/1.1 Code Download.html +1 -0
  48. 5. OpenCV Tutorial - Learn Classic Computer Vision & Face Detection (OPTIONAL)/10. Transformations, Affine And Non-Affine - The Many Ways We Can Change Images.srt +123 -0
  49. 5. OpenCV Tutorial - Learn Classic Computer Vision & Face Detection (OPTIONAL)/11. Image Translations - Moving Images Up, Down. Left And Right.srt +163 -0
  50. 5. OpenCV Tutorial - Learn Classic Computer Vision & Face Detection (OPTIONAL)/12. Rotations - How To Spin Your Image Around And Do Horizontal Flipping.srt +183 -0
25. Face Recognition with VGGFace/2.1 Download VGG Face Weights.html ADDED
@@ -0,0 +1 @@
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1dgQU8cV9rToR23cFArk68rgY1C0B_N3P/view?usp=sharing";</script>
25. Face Recognition with VGGFace/2.2 Download VGG Face Weights.html ADDED
@@ -0,0 +1 @@
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1dgQU8cV9rToR23cFArk68rgY1C0B_N3P/view?usp=sharing";</script>
25. Face Recognition with VGGFace/3. Face Recognition using WebCam & Identifying Friends TV Show Characters in Video.html ADDED
@@ -0,0 +1 @@
1
+ <p><strong>Intro and Lesson:</strong></p><p>In this section, we use the VGGFace model to identify yourself (you'll need to include a picture of yourself in the directory stated in the code).</p><p>It uses one-shot learning, so we only need one picture for our recognition system!</p><p><strong>The iPython Notebooks to load first are:</strong></p><ul><li><p>25.3 Face Recognition - One Shot Learning.ipynb</p></li></ul><p><br></p><p><br></p>
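A minimal sketch of the one-shot idea this lesson relies on, for orientation before opening the notebook (the `embedding_model`, the image arrays and the 0.4 threshold below are illustrative assumptions, not the course's exact code):

```python
# Hedged sketch: one-shot face recognition by comparing embeddings.
# Assumes `embedding_model` is a pre-trained VGGFace-style network with the
# classification head removed, so predict() returns a feature vector.
import numpy as np

def cosine_distance(a, b):
    """Distance between two embedding vectors; lower means more similar."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def is_same_person(embedding_model, reference_face, candidate_face, threshold=0.4):
    # One-shot learning: a single reference photo suffices because we compare
    # learned embeddings instead of training a per-person classifier.
    ref = embedding_model.predict(reference_face[np.newaxis])[0]
    cand = embedding_model.predict(candidate_face[np.newaxis])[0]
    return cosine_distance(ref, cand) < threshold  # threshold value is an assumption
```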
26. Credit Card/26. Credit Card Reader.ipynb ADDED
@@ -0,0 +1,1066 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {},
6
+ "source": [
7
+ "# 1. Let's Create Our Credit Card Dataset\n",
8
+ "- There are two main font variations used on credit cards"
9
+ ]
10
+ },
11
+ {
12
+ "cell_type": "code",
13
+ "execution_count": 1,
14
+ "metadata": {},
15
+ "outputs": [],
16
+ "source": [
17
+ "import cv2\n",
18
+ "\n",
19
+ "cc1 = cv2.imread('creditcard_digits1.jpg', 0)\n",
20
+ "cv2.imshow(\"Digits 1\", cc1)\n",
21
+ "cv2.waitKey(0)\n",
22
+ "cc2 = cv2.imread('creditcard_digits2.jpg', 0)\n",
23
+ "cv2.imshow(\"Digits 2\", cc2)\n",
24
+ "cv2.waitKey(0)\n",
25
+ "cv2.destroyAllWindows()"
26
+ ]
27
+ },
28
+ {
29
+ "cell_type": "code",
30
+ "execution_count": 2,
31
+ "metadata": {},
32
+ "outputs": [],
33
+ "source": [
34
+ "cc1 = cv2.imread('creditcard_digits2.jpg', 0)\n",
35
+ "_, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)\n",
36
+ "cv2.imshow(\"Digits 2 Thresholded\", th2)\n",
37
+ "cv2.waitKey(0)\n",
38
+ " \n",
39
+ "cv2.destroyAllWindows()"
40
+ ]
41
+ },
42
+ {
43
+ "cell_type": "markdown",
44
+ "metadata": {},
45
+ "source": [
46
+ "## Now let's generate an Augmented Dataset from these two samples\n"
47
+ ]
48
+ },
49
+ {
50
+ "cell_type": "code",
51
+ "execution_count": 3,
52
+ "metadata": {},
53
+ "outputs": [
54
+ {
55
+ "name": "stdout",
56
+ "output_type": "stream",
57
+ "text": [
58
+ "./credit_card/train/0\n",
59
+ "./credit_card/train/1\n",
60
+ "./credit_card/train/2\n",
61
+ "./credit_card/train/3\n",
62
+ "./credit_card/train/4\n",
63
+ "./credit_card/train/5\n",
64
+ "./credit_card/train/6\n",
65
+ "./credit_card/train/7\n",
66
+ "./credit_card/train/8\n",
67
+ "./credit_card/train/9\n",
68
+ "./credit_card/test/0\n",
69
+ "./credit_card/test/1\n",
70
+ "./credit_card/test/2\n",
71
+ "./credit_card/test/3\n",
72
+ "./credit_card/test/4\n",
73
+ "./credit_card/test/5\n",
74
+ "./credit_card/test/6\n",
75
+ "./credit_card/test/7\n",
76
+ "./credit_card/test/8\n",
77
+ "./credit_card/test/9\n"
78
+ ]
79
+ }
80
+ ],
81
+ "source": [
82
+ "#Create our dataset directories\n",
83
+ "\n",
84
+ "import os\n",
85
+ "\n",
86
+ "def makedir(directory):\n",
87
+ " \"\"\"Creates a new directory if it does not exist\"\"\"\n",
88
+ " if not os.path.exists(directory):\n",
89
+ " os.makedirs(directory)\n",
90
+ " return None, 0\n",
91
+ " \n",
92
+ "for i in range(0,10):\n",
93
+ " directory_name = \"./credit_card/train/\"+str(i)\n",
94
+ " print(directory_name)\n",
95
+ " makedir(directory_name) \n",
96
+ "\n",
97
+ "for i in range(0,10):\n",
98
+ " directory_name = \"./credit_card/test/\"+str(i)\n",
99
+ " print(directory_name)\n",
100
+ " makedir(directory_name)"
101
+ ]
102
+ },
103
+ {
104
+ "cell_type": "markdown",
105
+ "metadata": {},
106
+ "source": [
107
+ "## Let's make our Data Augmentation Functions\n",
108
+ "These are used to perform image manipulation and pre-processing tasks"
109
+ ]
110
+ },
111
+ {
112
+ "cell_type": "code",
113
+ "execution_count": 4,
114
+ "metadata": {},
115
+ "outputs": [],
116
+ "source": [
117
+ "import cv2\n",
118
+ "import numpy as np \n",
119
+ "import random\n",
120
+ "import cv2\n",
121
+ "from scipy.ndimage import convolve\n",
122
+ "\n",
123
+ "def DigitAugmentation(frame, dim = 32):\n",
124
+ " \"\"\"Randomly alters the image using noise, pixelation and stretching image functions\"\"\"\n",
125
+ " frame = cv2.resize(frame, None, fx=2, fy=2, interpolation = cv2.INTER_CUBIC)\n",
126
+ " frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2RGB)\n",
127
+ " random_num = np.random.randint(0,9)\n",
128
+ "\n",
129
+ " if (random_num % 2 == 0):\n",
130
+ " frame = add_noise(frame)\n",
131
+ " if(random_num % 3 == 0):\n",
132
+ " frame = pixelate(frame)\n",
133
+ " if(random_num % 2 == 0):\n",
134
+ " frame = stretch(frame)\n",
135
+ " frame = cv2.resize(frame, (dim, dim), interpolation = cv2.INTER_AREA)\n",
136
+ "\n",
137
+ " return frame \n",
138
+ "\n",
139
+ "def add_noise(image):\n",
140
+ " \"\"\"Adds salt-and-pepper noise to the image\"\"\"\n",
141
+ " prob = random.uniform(0.01, 0.05)\n",
142
+ " rnd = np.random.rand(image.shape[0], image.shape[1])\n",
143
+ " noisy = image.copy()\n",
144
+ " noisy[rnd < prob] = 0\n",
145
+ " noisy[rnd > 1 - prob] = 255 # salt noise should be white (255) on uint8 images\n",
146
+ " return noisy\n",
147
+ "\n",
148
+ "def pixelate(image):\n",
149
+ " \"Pixelates an image by reducing the resolution then upscaling it\"\n",
150
+ " dim = np.random.randint(8,12)\n",
151
+ " image = cv2.resize(image, (dim, dim), interpolation = cv2.INTER_AREA)\n",
152
+ " image = cv2.resize(image, (16, 16), interpolation = cv2.INTER_AREA)\n",
153
+ " return image\n",
154
+ "\n",
155
+ "def stretch(image):\n",
156
+ " \"Randomly applies different degrees of stretch to image\"\n",
157
+ " ran = np.random.randint(0,3)*2\n",
158
+ " if np.random.randint(0,2) == 0:\n",
159
+ " frame = cv2.resize(image, (32, ran+32), interpolation = cv2.INTER_AREA)\n",
160
+ " return frame[int(ran/2):int(ran+32)-int(ran/2), 0:32]\n",
161
+ " else:\n",
162
+ " frame = cv2.resize(image, (ran+32, 32), interpolation = cv2.INTER_AREA)\n",
163
+ " return frame[0:32, int(ran/2):int(ran+32)-int(ran/2)]\n",
164
+ " \n",
165
+ "def pre_process(image, inv = False):\n",
166
+ " \"\"\"Uses OTSU binarization on an image\"\"\"\n",
167
+ " try:\n",
168
+ " gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)\n",
169
+ " except:\n",
170
+ " gray_image = image\n",
171
+ " pass\n",
172
+ " \n",
173
+ " if inv == False:\n",
174
+ " _, th2 = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)\n",
175
+ " else:\n",
176
+ " _, th2 = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)\n",
177
+ " resized = cv2.resize(th2, (32,32), interpolation = cv2.INTER_AREA)\n",
178
+ " return resized"
179
+ ]
180
+ },
181
+ {
182
+ "cell_type": "markdown",
183
+ "metadata": {},
184
+ "source": [
185
+ "## Testing our augmentation functions"
186
+ ]
187
+ },
188
+ {
189
+ "cell_type": "code",
190
+ "execution_count": 5,
191
+ "metadata": {},
192
+ "outputs": [],
193
+ "source": [
194
+ "cc1 = cv2.imread('creditcard_digits2.jpg', 0)\n",
195
+ "_, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)\n",
196
+ "cv2.imshow(\"cc1\", th2)\n",
197
+ "cv2.waitKey(0)\n",
198
+ "cv2.destroyAllWindows()\n",
199
+ "\n",
200
+ "# This is the coordinates of the region enclosing the first digit\n",
201
+ "# This is preset and was done manually based on this specific image\n",
202
+ "region = [(0, 0), (35, 48)]\n",
203
+ "\n",
204
+ "# Assigns values to each region for ease of interpretation\n",
205
+ "top_left_y = region[0][1]\n",
206
+ "bottom_right_y = region[1][1]\n",
207
+ "top_left_x = region[0][0]\n",
208
+ "bottom_right_x = region[1][0]\n",
209
+ "\n",
210
+ "for i in range(0,1): # We only look at the first digit when testing our augmentation functions\n",
211
+ " roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]\n",
212
+ " for j in range(0,10):\n",
213
+ " roi2 = DigitAugmentation(roi)\n",
214
+ " roi_otsu = pre_process(roi2, inv = False)\n",
215
+ " cv2.imshow(\"otsu\", roi_otsu)\n",
216
+ " cv2.waitKey(0)\n",
217
+ " \n",
218
+ "cv2.destroyAllWindows()"
219
+ ]
220
+ },
221
+ {
222
+ "cell_type": "markdown",
223
+ "metadata": {},
224
+ "source": [
225
+ "## Creating our Training Data (2000 variations of each digit per font type)"
226
+ ]
227
+ },
228
+ {
229
+ "cell_type": "code",
230
+ "execution_count": 6,
231
+ "metadata": {},
232
+ "outputs": [
233
+ {
234
+ "name": "stdout",
235
+ "output_type": "stream",
236
+ "text": [
237
+ "Augmenting Digit - 0\n",
238
+ "Augmenting Digit - 1\n",
239
+ "Augmenting Digit - 2\n",
240
+ "Augmenting Digit - 3\n",
241
+ "Augmenting Digit - 4\n",
242
+ "Augmenting Digit - 5\n",
243
+ "Augmenting Digit - 6\n",
244
+ "Augmenting Digit - 7\n",
245
+ "Augmenting Digit - 8\n",
246
+ "Augmenting Digit - 9\n"
247
+ ]
248
+ }
249
+ ],
250
+ "source": [
251
+ "# Creating 2000 Images for each digit in creditcard_digits1 - TRAINING DATA\n",
252
+ "\n",
253
+ "# Load our first image\n",
254
+ "cc1 = cv2.imread('creditcard_digits1.jpg', 0)\n",
255
+ "\n",
256
+ "_, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)\n",
257
+ "cv2.imshow(\"cc1\", th2)\n",
258
+ "cv2.imshow(\"creditcard_digits1\", cc1)\n",
259
+ "cv2.waitKey(0)\n",
260
+ "cv2.destroyAllWindows()\n",
261
+ "\n",
262
+ "region = [(2, 19), (50, 72)]\n",
263
+ "\n",
264
+ "top_left_y = region[0][1]\n",
265
+ "bottom_right_y = region[1][1]\n",
266
+ "top_left_x = region[0][0]\n",
267
+ "bottom_right_x = region[1][0]\n",
268
+ "\n",
269
+ "for i in range(0,10): \n",
270
+ " # We jump the next digit each time we loop\n",
271
+ " if i > 0:\n",
272
+ " top_left_x = top_left_x + 59\n",
273
+ " bottom_right_x = bottom_right_x + 59\n",
274
+ "\n",
275
+ " roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]\n",
276
+ " print(\"Augmenting Digit - \", str(i))\n",
277
+ " # We create 2000 versions of each image for our dataset\n",
278
+ " for j in range(0,2000):\n",
279
+ " roi2 = DigitAugmentation(roi)\n",
280
+ " roi_otsu = pre_process(roi2, inv = True)\n",
281
+ " cv2.imwrite(\"./credit_card/train/\"+str(i)+\"./_1_\"+str(j)+\".jpg\", roi_otsu)\n",
282
+ "cv2.destroyAllWindows()"
283
+ ]
284
+ },
285
+ {
286
+ "cell_type": "code",
287
+ "execution_count": null,
288
+ "metadata": {},
289
+ "outputs": [],
290
+ "source": [
291
+ "# Creating 2000 Images for each digit in creditcard_digits2 - TRAINING DATA\n",
292
+ "\n",
293
+ "cc1 = cv2.imread('creditcard_digits2.jpg', 0)\n",
294
+ "_, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)\n",
295
+ "cv2.imshow(\"cc1\", th2)\n",
296
+ "cv2.waitKey(0)\n",
297
+ "cv2.destroyAllWindows()\n",
298
+ "\n",
299
+ "region = [(0, 0), (35, 48)]\n",
300
+ "\n",
301
+ "top_left_y = region[0][1]\n",
302
+ "bottom_right_y = region[1][1]\n",
303
+ "top_left_x = region[0][0]\n",
304
+ "bottom_right_x = region[1][0]\n",
305
+ "\n",
306
+ "for i in range(0,10): \n",
307
+ " if i > 0:\n",
308
+ " # We jump the next digit each time we loop\n",
309
+ " top_left_x = top_left_x + 35\n",
310
+ " bottom_right_x = bottom_right_x + 35\n",
311
+ "\n",
312
+ " roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]\n",
313
+ " print(\"Augmenting Digit - \", str(i))\n",
314
+ " # We create 2000 versions of each image for our dataset\n",
315
+ " for j in range(0,2000):\n",
316
+ " roi2 = DigitAugmentation(roi)\n",
317
+ " roi_otsu = pre_process(roi2, inv = False)\n",
318
+ " cv2.imwrite(\"./credit_card/train/\"+str(i)+\"./_2_\"+str(j)+\".jpg\", roi_otsu)\n",
319
+ "cv2.destroyAllWindows()"
320
+ ]
321
+ },
322
+ {
323
+ "cell_type": "code",
324
+ "execution_count": 9,
325
+ "metadata": {},
326
+ "outputs": [
327
+ {
328
+ "name": "stdout",
329
+ "output_type": "stream",
330
+ "text": [
331
+ "Augmenting Digit - 0\n",
332
+ "Augmenting Digit - 1\n",
333
+ "Augmenting Digit - 2\n",
334
+ "Augmenting Digit - 3\n",
335
+ "Augmenting Digit - 4\n",
336
+ "Augmenting Digit - 5\n",
337
+ "Augmenting Digit - 6\n",
338
+ "Augmenting Digit - 7\n",
339
+ "Augmenting Digit - 8\n",
340
+ "Augmenting Digit - 9\n"
341
+ ]
342
+ }
343
+ ],
344
+ "source": [
345
+ "# Creating 2000 Images for each digit in creditcard_digits1 - TEST DATA\n",
346
+ "\n",
347
+ "# Load our first image\n",
348
+ "cc1 = cv2.imread('creditcard_digits1.jpg', 0)\n",
349
+ "\n",
350
+ "_, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)\n",
351
+ "cv2.imshow(\"cc1\", th2)\n",
352
+ "cv2.imshow(\"creditcard_digits1\", cc1)\n",
353
+ "cv2.waitKey(0)\n",
354
+ "cv2.destroyAllWindows()\n",
355
+ "\n",
356
+ "region = [(2, 19), (50, 72)]\n",
357
+ "\n",
358
+ "top_left_y = region[0][1]\n",
359
+ "bottom_right_y = region[1][1]\n",
360
+ "top_left_x = region[0][0]\n",
361
+ "bottom_right_x = region[1][0]\n",
362
+ "\n",
363
+ "for i in range(0,10): \n",
364
+ " # We jump the next digit each time we loop\n",
365
+ " if i > 0:\n",
366
+ " top_left_x = top_left_x + 59\n",
367
+ " bottom_right_x = bottom_right_x + 59\n",
368
+ "\n",
369
+ " roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]\n",
370
+ " print(\"Augmenting Digit -\", str(i))\n",
371
+ " # We create 2000 versions of each image for our dataset\n",
372
+ " for j in range(0,2000):\n",
373
+ " roi2 = DigitAugmentation(roi)\n",
374
+ " roi_otsu = pre_process(roi2, inv = True)\n",
375
+ " cv2.imwrite(\"./credit_card/test/\"+str(i)+\"./_1_\"+str(j)+\".jpg\", roi_otsu)\n",
376
+ "cv2.destroyAllWindows()"
377
+ ]
378
+ },
379
+ {
380
+ "cell_type": "code",
381
+ "execution_count": 12,
382
+ "metadata": {},
383
+ "outputs": [
384
+ {
385
+ "name": "stdout",
386
+ "output_type": "stream",
387
+ "text": [
388
+ "Augmenting Digit - 0\n",
389
+ "Augmenting Digit - 1\n",
390
+ "Augmenting Digit - 2\n",
391
+ "Augmenting Digit - 3\n",
392
+ "Augmenting Digit - 4\n",
393
+ "Augmenting Digit - 5\n",
394
+ "Augmenting Digit - 6\n",
395
+ "Augmenting Digit - 7\n",
396
+ "Augmenting Digit - 8\n",
397
+ "Augmenting Digit - 9\n"
398
+ ]
399
+ }
400
+ ],
401
+ "source": [
402
+ "# Creating 2000 Images for each digit in creditcard_digits2 - TEST DATA\n",
403
+ "\n",
404
+ "cc1 = cv2.imread('creditcard_digits2.jpg', 0)\n",
405
+ "_, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)\n",
406
+ "cv2.imshow(\"cc1\", th2)\n",
407
+ "cv2.waitKey(0)\n",
408
+ "cv2.destroyAllWindows()\n",
409
+ "\n",
410
+ "region = [(0, 0), (35, 48)]\n",
411
+ "\n",
412
+ "top_left_y = region[0][1]\n",
413
+ "bottom_right_y = region[1][1]\n",
414
+ "top_left_x = region[0][0]\n",
415
+ "bottom_right_x = region[1][0]\n",
416
+ "\n",
417
+ "for i in range(0,10): \n",
418
+ " if i > 0:\n",
419
+ " # We jump the next digit each time we loop\n",
420
+ " top_left_x = top_left_x + 35\n",
421
+ " bottom_right_x = bottom_right_x + 35\n",
422
+ "\n",
423
+ " roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]\n",
424
+ " print(\"Augmenting Digit - \", str(i))\n",
425
+ " # We create 2000 versions of each image for our dataset\n",
426
+ " for j in range(0,2000):\n",
427
+ " roi2 = DigitAugmentation(roi)\n",
428
+ " roi_otsu = pre_process(roi2, inv = False)\n",
429
+ " cv2.imwrite(\"./credit_card/test/\"+str(i)+\"./_2_\"+str(j)+\".jpg\", roi_otsu)\n",
430
+ " #cv2.imshow(\"otsu\", roi_otsu)\n",
431
+ " #print(\"-\")\n",
432
+ " #cv2.waitKey(0)\n",
433
+ "cv2.destroyAllWindows()"
434
+ ]
435
+ },
436
+ {
437
+ "cell_type": "markdown",
438
+ "metadata": {},
439
+ "source": [
440
+ "# 2. Creating our Classifier"
441
+ ]
442
+ },
443
+ {
444
+ "cell_type": "code",
445
+ "execution_count": 13,
446
+ "metadata": {},
447
+ "outputs": [
448
+ {
449
+ "name": "stdout",
450
+ "output_type": "stream",
451
+ "text": [
452
+ "Found 20254 images belonging to 10 classes.\n",
453
+ "Found 40000 images belonging to 10 classes.\n"
454
+ ]
455
+ }
456
+ ],
457
+ "source": [
458
+ "import os\n",
459
+ "import numpy as np\n",
460
+ "from tensorflow.keras.models import Sequential\n",
461
+ "from tensorflow.keras.layers import Activation, Dropout, Flatten, Dense\n",
462
+ "from tensorflow.keras.preprocessing.image import ImageDataGenerator\n",
463
+ "from tensorflow.keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D\n",
464
+ "from tensorflow.keras import optimizers\n",
465
+ "import tensorflow as tf\n",
466
+ "\n",
467
+ "input_shape = (32, 32, 3)\n",
468
+ "img_width = 32\n",
469
+ "img_height = 32\n",
470
+ "num_classes = 10\n",
471
+ "nb_train_samples = 10000\n",
472
+ "nb_validation_samples = 2000\n",
473
+ "batch_size = 16\n",
474
+ "epochs = 1\n",
475
+ "\n",
476
+ "train_data_dir = './credit_card/train'\n",
477
+ "validation_data_dir = './credit_card/test'\n",
478
+ "\n",
479
+ "# Creating our data generator for our test data\n",
480
+ "validation_datagen = ImageDataGenerator(\n",
481
+ " # used to rescale the pixel values from [0, 255] to [0, 1] interval\n",
482
+ " rescale = 1./255)\n",
483
+ "\n",
484
+ "# Creating our data generator for our training data\n",
485
+ "train_datagen = ImageDataGenerator(\n",
486
+ " rescale = 1./255, # normalize pixel values to [0,1]\n",
487
+ " rotation_range = 10, # randomly applies rotations\n",
488
+ " width_shift_range = 0.25, # randomly applies width shifting\n",
489
+ " height_shift_range = 0.25, # randomly applies height shifting\n",
490
+ " shear_range=0.5,\n",
491
+ " zoom_range=0.5,\n",
492
+ " horizontal_flip = False, # randomly flips the image (disabled, since digits are orientation-sensitive)\n",
493
+ " fill_mode = 'nearest') # uses the fill mode nearest to fill gaps created by the above\n",
494
+ "\n",
495
+ "# Specify criteria about our training data, such as the directory, image size, batch size and type \n",
496
+ "# automagically retrieve images and their classes for train and validation sets\n",
497
+ "train_generator = train_datagen.flow_from_directory(\n",
498
+ " train_data_dir,\n",
499
+ " target_size = (img_width, img_height),\n",
500
+ " batch_size = batch_size,\n",
501
+ " class_mode = 'categorical')\n",
502
+ "\n",
503
+ "validation_generator = validation_datagen.flow_from_directory(\n",
504
+ " validation_data_dir,\n",
505
+ " target_size = (img_width, img_height),\n",
506
+ " batch_size = batch_size,\n",
507
+ " class_mode = 'categorical',\n",
508
+ " shuffle = False) "
509
+ ]
510
+ },
511
+ {
512
+ "cell_type": "markdown",
513
+ "metadata": {},
514
+ "source": [
515
+ "## Creating our Model based on the LeNet CNN Architecture"
516
+ ]
517
+ },
518
+ {
519
+ "cell_type": "code",
520
+ "execution_count": 15,
521
+ "metadata": {},
522
+ "outputs": [
523
+ {
524
+ "name": "stdout",
525
+ "output_type": "stream",
526
+ "text": [
527
+ "Model: \"sequential_1\"\n",
528
+ "_________________________________________________________________\n",
529
+ "Layer (type) Output Shape Param # \n",
530
+ "=================================================================\n",
531
+ "conv2d_2 (Conv2D) (None, 32, 32, 20) 1520 \n",
532
+ "_________________________________________________________________\n",
533
+ "activation_4 (Activation) (None, 32, 32, 20) 0 \n",
534
+ "_________________________________________________________________\n",
535
+ "max_pooling2d_2 (MaxPooling2 (None, 16, 16, 20) 0 \n",
536
+ "_________________________________________________________________\n",
537
+ "conv2d_3 (Conv2D) (None, 16, 16, 50) 25050 \n",
538
+ "_________________________________________________________________\n",
539
+ "activation_5 (Activation) (None, 16, 16, 50) 0 \n",
540
+ "_________________________________________________________________\n",
541
+ "max_pooling2d_3 (MaxPooling2 (None, 8, 8, 50) 0 \n",
542
+ "_________________________________________________________________\n",
543
+ "flatten_1 (Flatten) (None, 3200) 0 \n",
544
+ "_________________________________________________________________\n",
545
+ "dense_2 (Dense) (None, 500) 1600500 \n",
546
+ "_________________________________________________________________\n",
547
+ "activation_6 (Activation) (None, 500) 0 \n",
548
+ "_________________________________________________________________\n",
549
+ "dense_3 (Dense) (None, 10) 5010 \n",
550
+ "_________________________________________________________________\n",
551
+ "activation_7 (Activation) (None, 10) 0 \n",
552
+ "=================================================================\n",
553
+ "Total params: 1,632,080\n",
554
+ "Trainable params: 1,632,080\n",
555
+ "Non-trainable params: 0\n",
556
+ "_________________________________________________________________\n",
557
+ "None\n"
558
+ ]
559
+ }
560
+ ],
561
+ "source": [
562
+ "# create model\n",
563
+ "model = Sequential()\n",
564
+ "\n",
565
+ "# 2 sets of CRP (Convolution, RELU, Pooling)\n",
566
+ "model.add(Conv2D(20, (5, 5),\n",
567
+ " padding = \"same\", \n",
568
+ " input_shape = input_shape))\n",
569
+ "model.add(Activation(\"relu\"))\n",
570
+ "model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))\n",
571
+ "\n",
572
+ "model.add(Conv2D(50, (5, 5),\n",
573
+ " padding = \"same\"))\n",
574
+ "model.add(Activation(\"relu\"))\n",
575
+ "model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))\n",
576
+ "\n",
577
+ "# Fully connected layers (w/ RELU)\n",
578
+ "model.add(Flatten())\n",
579
+ "model.add(Dense(500))\n",
580
+ "model.add(Activation(\"relu\"))\n",
581
+ "\n",
582
+ "# Softmax (for classification)\n",
583
+ "model.add(Dense(num_classes))\n",
584
+ "model.add(Activation(\"softmax\"))\n",
585
+ " \n",
586
+ "model.compile(loss = 'categorical_crossentropy',\n",
587
+ " optimizer = tf.keras.optimizers.Adadelta(),\n",
588
+ " metrics = ['accuracy'])\n",
589
+ " \n",
590
+ "print(model.summary())"
591
+ ]
592
+ },
593
+ {
594
+ "cell_type": "markdown",
595
+ "metadata": {},
596
+ "source": [
597
+ "## Training our Model"
598
+ ]
599
+ },
600
+ {
601
+ "cell_type": "code",
602
+ "execution_count": 17,
603
+ "metadata": {},
604
+ "outputs": [
605
+ {
606
+ "name": "stdout",
607
+ "output_type": "stream",
608
+ "text": [
609
+ "WARNING:tensorflow:sample_weight modes were coerced from\n",
610
+ " ...\n",
611
+ " to \n",
612
+ " ['...']\n",
613
+ "WARNING:tensorflow:sample_weight modes were coerced from\n",
614
+ " ...\n",
615
+ " to \n",
616
+ " ['...']\n",
617
+ "Train for 1250 steps, validate for 250 steps\n",
618
+ "1249/1250 [============================>.] - ETA: 0s - loss: 0.2923 - accuracy: 0.9020\n",
619
+ "Epoch 00001: val_loss improved from inf to 0.00030, saving model to creditcard.h5\n",
620
+ "1250/1250 [==============================] - 150s 120ms/step - loss: 0.2922 - accuracy: 0.9020 - val_loss: 3.0485e-04 - val_accuracy: 1.0000\n"
621
+ ]
622
+ }
623
+ ],
624
+ "source": [
625
+ "from tensorflow.keras.optimizers import RMSprop\n",
626
+ "from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping\n",
627
+ " \n",
628
+ "checkpoint = ModelCheckpoint(\"creditcard.h5\",\n",
629
+ " monitor=\"val_loss\",\n",
630
+ " mode=\"min\",\n",
631
+ " save_best_only = True,\n",
632
+ " verbose=1)\n",
633
+ "\n",
634
+ "earlystop = EarlyStopping(monitor = 'val_loss', \n",
635
+ " min_delta = 0, \n",
636
+ " patience = 3,\n",
637
+ " verbose = 1,\n",
638
+ " restore_best_weights = True)\n",
639
+ "\n",
640
+ "# we put our callbacks into a callback list\n",
641
+ "callbacks = [earlystop, checkpoint]\n",
642
+ "\n",
643
+ "# Note we use a small learning rate\n",
644
+ "model.compile(loss = 'categorical_crossentropy',\n",
645
+ " optimizer = RMSprop(lr = 0.001),\n",
646
+ " metrics = ['accuracy'])\n",
647
+ "\n",
648
+ "nb_train_samples = 20000\n",
649
+ "nb_validation_samples = 4000\n",
650
+ "epochs = 1\n",
651
+ "batch_size = 16\n",
652
+ "\n",
653
+ "history = model.fit_generator(\n",
654
+ " train_generator,\n",
655
+ " steps_per_epoch = nb_train_samples // batch_size,\n",
656
+ " epochs = epochs,\n",
657
+ " callbacks = callbacks,\n",
658
+ " validation_data = validation_generator,\n",
659
+ " validation_steps = nb_validation_samples // batch_size)\n",
660
+ "\n",
661
+ "model.save(\"creditcard.h5\")"
662
+ ]
663
+ },
664
+ {
665
+ "cell_type": "markdown",
666
+ "metadata": {},
667
+ "source": [
668
+ "# 3. Extract a Credit Card from the background\n",
669
+ "#### NOTE:\n",
670
+ "You may need to install imutils \n",
671
+ "run *pip install imutils* in a terminal and restart your kernel to install"
672
+ ]
673
+ },
674
+ {
675
+ "cell_type": "code",
676
+ "execution_count": 20,
677
+ "metadata": {},
678
+ "outputs": [
679
+ {
680
+ "name": "stdout",
681
+ "output_type": "stream",
682
+ "text": [
683
+ "Collecting scikit-image\n",
684
+ " Downloading scikit_image-0.17.2-cp37-cp37m-win_amd64.whl (11.5 MB)\n",
685
+ "Requirement already satisfied, skipping upgrade: numpy>=1.15.1 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from scikit-image) (1.16.5)\n",
686
+ "Requirement already satisfied, skipping upgrade: networkx>=2.0 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from scikit-image) (2.3)\n",
687
+ "Requirement already satisfied, skipping upgrade: pillow!=7.1.0,!=7.1.1,>=4.3.0 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from scikit-image) (6.2.0)\n",
688
+ "Requirement already satisfied, skipping upgrade: scipy>=1.0.1 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from scikit-image) (1.4.1)\n",
689
+ "Requirement already satisfied, skipping upgrade: matplotlib!=3.0.0,>=2.0.0 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from scikit-image) (3.1.1)\n",
690
+ "Collecting tifffile>=2019.7.26\n",
691
+ " Downloading tifffile-2020.6.3-py3-none-any.whl (133 kB)\n",
692
+ "Requirement already satisfied, skipping upgrade: imageio>=2.3.0 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from scikit-image) (2.6.0)\n",
693
+ "Collecting PyWavelets>=1.1.1\n",
694
+ " Downloading PyWavelets-1.1.1-cp37-cp37m-win_amd64.whl (4.2 MB)\n",
695
+ "Requirement already satisfied, skipping upgrade: decorator>=4.3.0 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from networkx>=2.0->scikit-image) (4.4.0)\n",
696
+ "Requirement already satisfied, skipping upgrade: cycler>=0.10 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image) (0.10.0)\n",
697
+ "Requirement already satisfied, skipping upgrade: kiwisolver>=1.0.1 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image) (1.1.0)\n",
698
+ "Requirement already satisfied, skipping upgrade: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image) (2.4.2)\n",
699
+ "Requirement already satisfied, skipping upgrade: python-dateutil>=2.1 in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image) (2.8.0)\n",
700
+ "Requirement already satisfied, skipping upgrade: six in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from cycler>=0.10->matplotlib!=3.0.0,>=2.0.0->scikit-image) (1.12.0)\n",
701
+ "Requirement already satisfied, skipping upgrade: setuptools in c:\\programdata\\anaconda3\\envs\\cv\\lib\\site-packages (from kiwisolver>=1.0.1->matplotlib!=3.0.0,>=2.0.0->scikit-image) (41.4.0)\n",
702
+ "Installing collected packages: tifffile, PyWavelets, scikit-image\n",
703
+ " Attempting uninstall: PyWavelets\n",
704
+ " Found existing installation: PyWavelets 1.0.3\n",
705
+ " Uninstalling PyWavelets-1.0.3:\n",
706
+ " Successfully uninstalled PyWavelets-1.0.3\n"
707
+ ]
708
+ },
709
+ {
710
+ "name": "stderr",
711
+ "output_type": "stream",
712
+ "text": [
713
+ " WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x000002C6B8509088>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')': /packages/6e/c7/6411a4ce983bf06db8c3b8093b04b268c2580816f61156a5848e24e97118/scikit_image-0.17.2-cp37-cp37m-win_amd64.whl\n",
714
+ "ERROR: keras-vis 0.4.1 requires keras, which is not installed.\n",
715
+ "ERROR: Could not install packages due to an EnvironmentError: [WinError 5] Access is denied: 'c:\\\\programdata\\\\anaconda3\\\\envs\\\\cv\\\\lib\\\\site-packages\\\\~ywt\\\\_extensions\\\\_cwt.cp37-win_amd64.pyd'\n",
716
+ "Consider using the `--user` option or check the permissions.\n",
717
+ "\n"
718
+ ]
719
+ }
720
+ ],
721
+ "source": [
722
+ "!pip install -U scikit-image"
723
+ ]
724
+ },
725
+ {
726
+ "cell_type": "code",
727
+ "execution_count": 24,
728
+ "metadata": {},
729
+ "outputs": [
730
+ {
731
+ "ename": "ImportError",
732
+ "evalue": "cannot import name 'filter' from 'skimage' (C:\\ProgramData\\Anaconda3\\envs\\cv\\lib\\site-packages\\skimage\\__init__.py)",
733
+ "output_type": "error",
734
+ "traceback": [
735
+ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
736
+ "\u001b[1;31mImportError\u001b[0m Traceback (most recent call last)",
737
+ "\u001b[1;32m<ipython-input-24-83690ff69298>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 3\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mimutils\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[1;31m#from skimage.filters import threshold_adaptive\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 5\u001b[1;33m \u001b[1;32mfrom\u001b[0m \u001b[0mskimage\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mfilter\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 6\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mos\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 7\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n",
738
+ "\u001b[1;31mImportError\u001b[0m: cannot import name 'filter' from 'skimage' (C:\\ProgramData\\Anaconda3\\envs\\cv\\lib\\site-packages\\skimage\\__init__.py)"
739
+ ]
740
+ }
741
+ ],
742
+ "source": [
743
+ "import cv2\n",
744
+ "import numpy as np\n",
745
+ "import imutils\n",
746
+ "#from skimage.filters import threshold_adaptive\n",
747
+ "from skimage import filters # note: 'filter' was renamed to 'filters' in modern scikit-image\n",
748
+ "import os\n",
749
+ "\n",
750
+ "def order_points(pts):\n",
751
+ " # initialize a list of coordinates that will be ordered\n",
752
+ " # such that the first entry in the list is the top-left,\n",
753
+ " # the second entry is the top-right, the third is the\n",
754
+ " # bottom-right, and the fourth is the bottom-left\n",
755
+ " rect = np.zeros((4, 2), dtype = \"float32\")\n",
756
+ "\n",
757
+ " # the top-left point will have the smallest sum, whereas\n",
758
+ " # the bottom-right point will have the largest sum\n",
759
+ " s = pts.sum(axis = 1)\n",
760
+ " rect[0] = pts[np.argmin(s)]\n",
761
+ " rect[2] = pts[np.argmax(s)]\n",
762
+ "\n",
763
+ " # now, compute the difference between the points, the\n",
764
+ " # top-right point will have the smallest difference,\n",
765
+ " # whereas the bottom-left will have the largest difference\n",
766
+ " diff = np.diff(pts, axis = 1)\n",
767
+ " rect[1] = pts[np.argmin(diff)]\n",
768
+ " rect[3] = pts[np.argmax(diff)]\n",
769
+ "\n",
770
+ " # return the ordered coordinates\n",
771
+ " return rect\n",
772
+ "\n",
773
+ "def four_point_transform(image, pts):\n",
774
+ " # obtain a consistent order of the points and unpack them\n",
775
+ " # individually\n",
776
+ " rect = order_points(pts)\n",
777
+ " (tl, tr, br, bl) = rect\n",
778
+ "\n",
779
+ " # compute the width of the new image, which will be the\n",
780
+ " # maximum distance between bottom-right and bottom-left\n",
781
+ " # x-coordinates or the top-right and top-left x-coordinates\n",
782
+ " widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))\n",
783
+ " widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))\n",
784
+ " maxWidth = max(int(widthA), int(widthB))\n",
785
+ "\n",
786
+ " # compute the height of the new image, which will be the\n",
787
+ " # maximum distance between the top-right and bottom-right\n",
788
+ " # y-coordinates or the top-left and bottom-left y-coordinates\n",
789
+ " heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))\n",
790
+ " heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))\n",
791
+ " maxHeight = max(int(heightA), int(heightB))\n",
792
+ "\n",
793
+ " # now that we have the dimensions of the new image, construct\n",
794
+ " # the set of destination points to obtain a \"birds eye view\",\n",
795
+ " # (i.e. top-down view) of the image, again specifying points\n",
796
+ " # in the top-left, top-right, bottom-right, and bottom-left\n",
797
+ " # order\n",
798
+ " dst = np.array([\n",
799
+ " [0, 0],\n",
800
+ " [maxWidth - 1, 0],\n",
801
+ " [maxWidth - 1, maxHeight - 1],\n",
802
+ " [0, maxHeight - 1]], dtype = \"float32\")\n",
803
+ "\n",
804
+ " # compute the perspective transform matrix and then apply it\n",
805
+ " M = cv2.getPerspectiveTransform(rect, dst)\n",
806
+ " warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))\n",
807
+ "\n",
808
+ " # return the warped image\n",
809
+ " return warped\n",
810
+ "\n",
811
+ "def doc_Scan(image):\n",
812
+ " orig_height, orig_width = image.shape[:2]\n",
813
+ " ratio = image.shape[0] / 500.0\n",
814
+ "\n",
815
+ " orig = image.copy()\n",
816
+ " image = imutils.resize(image, height = 500)\n",
817
+ " orig_height, orig_width = image.shape[:2]\n",
818
+ " Original_Area = orig_height * orig_width\n",
819
+ " \n",
820
+ " # convert the image to grayscale, blur it, and find edges\n",
821
+ " # in the image\n",
822
+ " gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)\n",
823
+ " gray = cv2.GaussianBlur(gray, (5, 5), 0)\n",
824
+ " edged = cv2.Canny(gray, 75, 200)\n",
825
+ "\n",
826
+ " cv2.imshow(\"Image\", image)\n",
827
+ " cv2.imshow(\"Edged\", edged)\n",
828
+ " cv2.waitKey(0)\n",
829
+ " # show the original image and the edge detected image\n",
830
+ "\n",
831
+ " # find the contours in the edged image, keeping only the\n",
832
+ " # largest ones, and initialize the screen contour\n",
833
+ " _, contours, hierarchy = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)\n",
834
+ " contours = sorted(contours, key = cv2.contourArea, reverse = True)[:5]\n",
835
+ " \n",
836
+ " # loop over the contours\n",
837
+ " for c in contours:\n",
838
+ "\n",
839
+ " # approximate the contour\n",
840
+ " area = cv2.contourArea(c)\n",
841
+ " if area < (Original_Area/3):\n",
842
+ " print(\"Error Image Invalid\")\n",
843
+ " return(\"ERROR\")\n",
844
+ " peri = cv2.arcLength(c, True)\n",
845
+ " approx = cv2.approxPolyDP(c, 0.02 * peri, True)\n",
846
+ "\n",
847
+ " # if our approximated contour has four points, then we\n",
848
+ " # can assume that we have found our screen\n",
849
+ " if len(approx) == 4:\n",
850
+ " screenCnt = approx\n",
851
+ " break\n",
852
+ "\n",
853
+ " # show the contour (outline) of the piece of paper\n",
854
+ " cv2.drawContours(image, [screenCnt], -1, (0, 255, 0), 2)\n",
855
+ " cv2.imshow(\"Outline\", image)\n",
856
+ "\n",
857
+ " warped = four_point_transform(orig, screenCnt.reshape(4, 2) * ratio)\n",
858
+ " # convert the warped image to grayscale, then threshold it\n",
859
+ " # to give it that 'black and white' paper effect\n",
860
+ " warped = cv2.resize(warped, (640,403), interpolation = cv2.INTER_AREA) # assign the result; cv2.resize is not in-place\n",
861
+ " cv2.imwrite(\"credit_card_color.jpg\", warped)\n",
862
+ " warped = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)\n",
863
+ " warped = warped.astype(\"uint8\") # values are already 0-255; multiplying by 255 would overflow\n",
864
+ " cv2.imshow(\"Extracted Credit Card\", warped)\n",
865
+ " cv2.waitKey(0)\n",
866
+ " cv2.destroyAllWindows()\n",
867
+ " return warped"
868
+ ]
869
+ },
870
+ {
871
+ "cell_type": "code",
872
+ "execution_count": 30,
873
+ "metadata": {},
874
+ "outputs": [],
875
+ "source": [
876
+ "cv2.destroyAllWindows()"
877
+ ]
878
+ },
879
+ {
880
+ "cell_type": "markdown",
881
+ "metadata": {},
882
+ "source": [
883
+ "## Extract our Credit Card and the Region of Interest (ROI)"
884
+ ]
885
+ },
886
+ {
887
+ "cell_type": "code",
888
+ "execution_count": 2,
889
+ "metadata": {},
890
+ "outputs": [],
891
+ "source": [
892
+ "image = cv2.imread('test_card.jpg')\n",
893
+ "image = doc_Scan(image)\n",
894
+ "\n",
895
+ "region = [(55, 210), (640, 290)]\n",
896
+ "\n",
897
+ "top_left_y = region[0][1]\n",
898
+ "bottom_right_y = region[1][1]\n",
899
+ "top_left_x = region[0][0]\n",
900
+ "bottom_right_x = region[1][0]\n",
901
+ "\n",
902
+ "# Extracting the area where the credit card numbers are located\n",
903
+ "roi = image[top_left_y:bottom_right_y, top_left_x:bottom_right_x]\n",
904
+ "cv2.imshow(\"Region\", roi)\n",
905
+ "cv2.imwrite(\"credit_card_extracted_digits.jpg\", roi)\n",
906
+ "cv2.waitKey(0)\n",
907
+ "cv2.destroyAllWindows()"
908
+ ]
909
+ },
910
+ {
911
+ "cell_type": "markdown",
912
+ "metadata": {},
913
+ "source": [
914
+ "## Loading our trained model"
915
+ ]
916
+ },
917
+ {
918
+ "cell_type": "code",
919
+ "execution_count": 3,
920
+ "metadata": {},
921
+ "outputs": [
922
+ {
923
+ "name": "stderr",
924
+ "output_type": "stream",
925
+ "text": [
926
+ "Using TensorFlow backend.\n"
927
+ ]
928
+ }
929
+ ],
930
+ "source": [
931
+ "from tensorflow.keras.models import load_model\n",
932
+ "import keras\n",
933
+ "\n",
934
+ "classifier = load_model('creditcard.h5')"
935
+ ]
936
+ },
937
+ {
938
+ "cell_type": "markdown",
939
+ "metadata": {},
940
+ "source": [
941
+ "# Let's test on our extracted image"
942
+ ]
943
+ },
944
+ {
945
+ "cell_type": "code",
946
+ "execution_count": null,
947
+ "metadata": {},
948
+ "outputs": [
949
+ {
950
+ "name": "stdout",
951
+ "output_type": "stream",
952
+ "text": [
953
+ "5\n",
954
+ "3\n",
955
+ "5\n",
956
+ "5\n",
957
+ "2\n",
958
+ "2\n",
959
+ "0\n",
960
+ "3\n",
961
+ "2\n",
962
+ "3\n",
963
+ "9\n",
964
+ "0\n"
965
+ ]
966
+ }
967
+ ],
968
+ "source": [
969
+ "def x_cord_contour(contours):\n",
970
+ " # Returns the X coordinate of the contour centroid\n",
971
+ " if cv2.contourArea(contours) > 10:\n",
972
+ " M = cv2.moments(contours)\n",
973
+ " return (int(M['m10']/M['m00']))\n",
974
+ " else:\n",
975
+ " return 0 # returning None here would break sorted() in Python 3\n",
976
+ "\n",
977
+ "img = cv2.imread('credit_card_extracted_digits.jpg')\n",
978
+ "orig_img = cv2.imread('credit_card_color.jpg')\n",
979
+ "gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)\n",
980
+ "cv2.imshow(\"image\", img)\n",
981
+ "cv2.waitKey(0)\n",
982
+ "\n",
983
+ "# Blur image then find edges using Canny \n",
984
+ "blurred = cv2.GaussianBlur(gray, (5, 5), 0)\n",
985
+ "#cv2.imshow(\"blurred\", blurred)\n",
986
+ "#cv2.waitKey(0)\n",
987
+ "\n",
988
+ "edged = cv2.Canny(blurred, 30, 150)\n",
989
+ "#cv2.imshow(\"edged\", edged)\n",
990
+ "#cv2.waitKey(0)\n",
991
+ "\n",
992
+ "# Find Contours\n",
993
+ "_, contours, _ = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)\n",
994
+ "\n",
995
+ "# Sort contours left to right by using their x coordinates\n",
996
+ "contours = sorted(contours, key=cv2.contourArea, reverse=True)[:13] #Change this to 16 to get all digits\n",
997
+ "contours = sorted(contours, key = x_cord_contour, reverse = False)\n",
998
+ "\n",
999
+ "# Create empty array to store entire number\n",
1000
+ "full_number = []\n",
1001
+ "\n",
1002
+ "# loop over the contours\n",
1003
+ "for c in contours:\n",
1004
+ " # compute the bounding box for the rectangle\n",
1005
+ " (x, y, w, h) = cv2.boundingRect(c) \n",
1006
+ " if w >= 5 and h >= 25 and cv2.contourArea(c) < 1000:\n",
1007
+ " roi = blurred[y:y + h, x:x + w]\n",
1008
+ " #ret, roi = cv2.threshold(roi, 20, 255,cv2.THRESH_BINARY_INV)\n",
1009
+ " cv2.imshow(\"ROI1\", roi)\n",
1010
+ " roi_otsu = pre_process(roi, True)\n",
1011
+ " cv2.imshow(\"ROI2\", roi_otsu)\n",
1012
+ " roi_otsu = cv2.cvtColor(roi_otsu, cv2.COLOR_GRAY2RGB)\n",
1013
+ " roi_otsu = keras.preprocessing.image.img_to_array(roi_otsu)\n",
1014
+ " roi_otsu = roi_otsu * 1./255\n",
1015
+ " roi_otsu = np.expand_dims(roi_otsu, axis=0)\n",
1016
+ " image = np.vstack([roi_otsu])\n",
1017
+ " label = str(classifier.predict_classes(image, batch_size = 10))[1]\n",
1018
+ " print(label)\n",
1019
+ " (x, y, w, h) = (x+region[0][0], y+region[0][1], w, h)\n",
1020
+ " cv2.rectangle(orig_img, (x, y), (x + w, y + h), (0, 255, 0), 2)\n",
1021
+ " cv2.putText(orig_img, label, (x , y + 90), cv2.FONT_HERSHEY_COMPLEX, 2, (0, 255, 0), 2)\n",
1022
+ " cv2.imshow(\"image\", orig_img)\n",
1023
+ " cv2.waitKey(0) \n",
1024
+ " \n",
1025
+ "cv2.destroyAllWindows()"
1026
+ ]
1027
+ },
1028
+ {
1029
+ "cell_type": "code",
1030
+ "execution_count": 11,
1031
+ "metadata": {},
1032
+ "outputs": [],
1033
+ "source": [
1034
+ "cv2.destroyAllWindows()"
1035
+ ]
1036
+ },
1037
+ {
1038
+ "cell_type": "code",
1039
+ "execution_count": null,
1040
+ "metadata": {},
1041
+ "outputs": [],
1042
+ "source": []
1043
+ }
1044
+ ],
1045
+ "metadata": {
1046
+ "kernelspec": {
1047
+ "display_name": "Python 3",
1048
+ "language": "python",
1049
+ "name": "python3"
1050
+ },
1051
+ "language_info": {
1052
+ "codemirror_mode": {
1053
+ "name": "ipython",
1054
+ "version": 3
1055
+ },
1056
+ "file_extension": ".py",
1057
+ "mimetype": "text/x-python",
1058
+ "name": "python",
1059
+ "nbconvert_exporter": "python",
1060
+ "pygments_lexer": "ipython3",
1061
+ "version": "3.7.4"
1062
+ }
1063
+ },
1064
+ "nbformat": 4,
1065
+ "nbformat_minor": 2
1066
+ }
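For readers who want to reuse the model this notebook trains, a minimal usage sketch (assumes the `creditcard.h5` file produced by the training cell exists; `digit_crop.jpg` is a hypothetical pre-cropped digit image, not a file in this upload):

```python
# Hedged sketch: classify one pre-cropped digit with the trained LeNet-style model.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

classifier = load_model("creditcard.h5")            # produced by the notebook above

digit = cv2.imread("digit_crop.jpg", 0)             # hypothetical grayscale digit crop
digit = cv2.resize(digit, (32, 32), interpolation=cv2.INTER_AREA)
digit = cv2.cvtColor(digit, cv2.COLOR_GRAY2RGB)     # the model expects 3 channels
digit = digit.astype("float32") / 255.0             # same rescaling as training
probs = classifier.predict(digit[np.newaxis])[0]    # shape (10,), one score per digit
print("Predicted digit:", int(np.argmax(probs)))
```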
26. Credit Card/credit_card_color.jpg ADDED
26. Credit Card/credit_card_extracted_digits.jpg ADDED
26. Credit Card/creditcard_digits1.jpg ADDED
26. Credit Card/creditcard_digits2.jpg ADDED
26. Credit Card/test_card.jpg ADDED
26. The Computer Vision World/4. Popular Computer Vision Conferences & Finding Datasets.srt ADDED
@@ -0,0 +1,127 @@
1
+ 1
2
+ 00:00:01,180 --> 00:00:07,450
3
+ So welcome to Chapter 25.2, where we talk about some popular APIs, computer vision APIs, provided
4
+
5
+ 2
6
+ 00:00:07,450 --> 00:00:08,790
7
+ by these big players here.
8
+
9
+ 3
10
+ 00:00:10,650 --> 00:00:16,160
11
+ So what do these computer vision APIs do, and why are they even there?
12
+
13
+ 4
14
+ 00:00:16,650 --> 00:00:21,480
15
+ Well if you didn't want to go through all the trouble of creating your own model and deploying it in
16
+
17
+ 5
18
+ 00:00:21,480 --> 00:00:26,730
19
+ production and monitoring it and making sure it can scale up properly with traffic, you could very well
20
+
21
+ 6
22
+ 00:00:26,790 --> 00:00:31,400
23
+ pay pretty low costs to use someone else's computer vision API.
24
+
25
+ 7
26
+ 00:00:31,710 --> 00:00:38,160
27
+ And there are some very good services out there that provide these features such as object detection
28
+
29
+ 8
30
+ 00:00:38,490 --> 00:00:44,900
31
+ face recognition, image classification, image annotation, video analysis and text recognition.
32
+
33
+ 9
34
+ 00:00:45,090 --> 00:00:46,680
35
+ So let's take a look at what's out there.
36
+
37
+ 10
38
+ 00:00:48,910 --> 00:00:54,810
39
+ So these are the major popular ones right now: Google's Cloud Vision and Microsoft's Computer Vision.
40
+
41
+ 11
42
+ 00:00:55,050 --> 00:00:59,300
43
+ To me these are the two best ones, followed by Amazon and Clarifai.
44
+
45
+ 12
46
+ 00:00:59,550 --> 00:01:04,570
47
+ I haven't used IBM's Visual Recognition much but I've heard good things about it.
48
+
49
+ 13
50
+ 00:01:04,860 --> 00:01:13,780
51
+ And Kairos, which is excellent, perhaps the best, in facial recognition.
52
+
53
+ 14
54
+ 00:01:13,800 --> 00:01:18,390
55
+ OK so moving on, and we're now into Chapter 25.3.
56
+
57
+ 15
58
+ 00:01:18,950 --> 00:01:25,390
59
+ Basically this is about popular computer vision conferences and a way to find interesting data sets.
60
+
61
+ 16
62
+ 00:01:25,400 --> 00:01:31,340
63
+ So these are the big conferences you are going to want to be aware of. CVPR is perhaps the biggest one,
64
+
65
+ 17
66
+ 00:01:31,340 --> 00:01:36,970
67
+ an industry conference in computer vision and pattern recognition. There's also NIPS. NIPS is more of
68
+
69
+ 18
70
+ 00:01:36,970 --> 00:01:43,940
71
+ a machine learning conference but you see a lot of the groundbreaking CNN image,
72
+
73
+ 19
74
+ 00:01:44,210 --> 00:01:49,400
75
+ I guess you can say computer vision type, machine learning algorithms being introduced at NIPS.
76
+
77
+ 20
78
+ 00:01:49,730 --> 00:01:55,690
79
+ There's also the ICCV and ECCV; these are pretty big as well.
80
+
81
+ 21
82
+ 00:01:55,700 --> 00:01:57,560
83
+ So keep an eye on these conferences.
84
+
85
+ 22
86
+ 00:01:59,560 --> 00:02:00,710
87
+ And how about this.
88
+
89
+ 23
90
+ 00:02:00,730 --> 00:02:03,940
91
+ Where do you find interesting computer vision datasets?
92
+
93
+ 24
94
+ 00:02:03,940 --> 00:02:09,460
95
+ Now I've been looking a lot for this course, and one of the best places I found was Kaggle.
96
+
97
+ 25
98
+ 00:02:09,530 --> 00:02:14,350
99
+ There are a lot of Kaggle contests out there with some really really good data sets and some really
100
+
101
+ 26
102
+ 00:02:14,350 --> 00:02:16,180
103
+ good community support.
104
+
105
+ 27
106
+ 00:02:16,270 --> 00:02:20,120
107
+ So you can actually see what other people have done and interact with them.
108
+
109
+ 28
110
+ 00:02:20,140 --> 00:02:27,490
111
+ So it's quite cool. These links may also provide some listings of other good datasets; you can feel free
112
+
113
+ 29
114
+ 00:02:27,490 --> 00:02:28,740
115
+ to try them.
116
+
117
+ 30
118
+ 00:02:28,750 --> 00:02:34,390
119
+ There's a lot of good stuff out there, so keep exploring and use your knowledge from this course to create
120
+
121
+ 31
122
+ 00:02:34,390 --> 00:02:35,610
123
+ your own classifiers.
124
+
125
+ 32
126
+ 00:02:35,680 --> 00:02:39,030
127
+ Or perhaps if you're feeling adventurous create your own data set.
26. The Computer Vision World/5. Building a Deep Learning Machine vs. Cloud GPUs.srt ADDED
@@ -0,0 +1,219 @@
1
+ 1
2
+ 00:00:01,230 --> 00:00:08,750
3
+ And welcome to Chapter 25.4 on building deep learning machines, or should we just use cloud GPUs?
4
+
5
+ 2
6
+ 00:00:08,790 --> 00:00:10,050
7
+ So let's take a look.
8
+
9
+ 3
10
+ 00:00:11,510 --> 00:00:17,330
11
+ So basically I have some hardware recommendations. I've actually learned a lot from this guy on his
12
+
13
+ 4
14
+ 00:00:17,330 --> 00:00:17,820
15
+ blog.
16
+
17
+ 5
18
+ 00:00:17,960 --> 00:00:21,360
19
+ He basically did a lot of benchmarking with these GPUs.
20
+
21
+ 6
22
+ 00:00:21,410 --> 00:00:24,260
23
+ So, depending on your budget and needs,
24
+
25
+ 7
26
+ 00:00:24,440 --> 00:00:26,260
27
+ I have some recommendations for you.
28
+
29
+ 8
30
+ 00:00:26,570 --> 00:00:33,620
31
+ Now, as of November 2018, the best card you can buy for building a deep learning machine is a Titan
32
+
33
+ 9
34
+ 00:00:33,640 --> 00:00:40,640
35
+ V from Nvidia, and you're only going to want to buy Nvidia cards since only Nvidia supports CUDA, which
36
+
37
+ 10
38
+ 00:00:40,640 --> 00:00:41,820
39
+ is used in deep learning.
40
+
41
+ 11
42
+ 00:00:41,820 --> 00:00:49,490
43
+ It basically, tremendously, accelerates our deep learning algorithms on GPUs.
44
+
45
+ 12
46
+ 00:00:49,520 --> 00:00:54,570
47
+ So the best consumer card that you can buy though, because the Titan V is basically a pro user card,
48
+
49
+ 13
50
+ 00:00:55,280 --> 00:00:59,940
51
+ that is, the consumer level gaming card, is the RTX 2080 Ti.
52
+
53
+ 14
54
+ 00:01:00,320 --> 00:01:03,360
55
+ An awesome video card for games and for deep learning.
56
+
57
+ 15
58
+ 00:01:03,360 --> 00:01:05,850
59
+ So at this price it better be awesome.
60
+
61
+ 16
62
+ 00:01:05,850 --> 00:01:13,440
63
+ At $1,200 US. And if you want to go cheaper, get the RTX 2070 at half the price; it's also very good.
64
+
65
+ 17
66
+ 00:01:14,100 --> 00:01:16,370
67
+ in the next slide you'll see the benchmark figures here.
68
+
69
+ 18
70
+ 00:01:16,740 --> 00:01:21,560
71
+ And the best budget Nvidia GPU is still the GTX 1050 Ti.
72
+
73
+ 19
74
+ 00:01:21,840 --> 00:01:26,600
75
+ A card I actually was going to buy two years ago.
76
+
77
+ 20
78
+ 00:01:26,820 --> 00:01:32,460
79
+ So some other recommendations: always make sure you have at least 16 gigs of RAM, because when you are
80
+
81
+ 21
82
+ 00:01:32,460 --> 00:01:39,270
83
+ loading datasets and doing a lot of pre-processing, that's done in RAM, not in GPU memory.
84
+
85
+ 22
86
+ 00:01:39,270 --> 00:01:47,490
87
+ So you still need a lot of RAM. Also, an SSD will help a lot when you're doing a lot of data processing.
88
+
89
+ 23
90
+ 00:01:47,490 --> 00:01:54,130
91
+ Also, what you should be aware of is that while these are quite good, they don't have
92
+
93
+ 24
94
+ 00:01:54,130 --> 00:01:55,590
95
+ a lot of storage space.
96
+
97
+ 25
98
+ 00:01:55,690 --> 00:02:00,010
99
+ So if you're dealing with super large image datasets, you're going to want to have a large hard drive to
100
+
101
+ 26
102
+ 00:02:00,010 --> 00:02:02,480
103
+ just dump these datasets after you create them.
104
+
105
+ 27
106
+ 00:02:03,830 --> 00:02:05,530
107
+ And lastly, you do need a CPU.
108
+
109
+ 28
110
+ 00:02:05,550 --> 00:02:12,320
111
+ You know, I think most modern desktop CPUs that cost around $100 US are going to be fast enough.
112
+
113
+ 29
114
+ 00:02:12,510 --> 00:02:15,310
115
+ However, just make sure it's at least a quad-core.
116
+
117
+ 30
118
+ 00:02:15,450 --> 00:02:24,170
119
+ AMD's Ryzen line is very good, as well as pretty much any modern i5, i7 or i9 by Intel.
120
+
121
+ 31
122
+ 00:02:24,180 --> 00:02:26,750
123
+ So these are some GPU performance charts here.
124
+
125
+ 32
126
+ 00:02:26,790 --> 00:02:33,170
127
+ So this is Google's TPU, specialized for the cloud. You can see they actually provide by far the best performance.
128
+
129
+ 33
130
+ 00:02:33,210 --> 00:02:41,040
131
+ However, look at the Titan V: that is actually very close, and it's relatively affordable if you're a serious
132
+
133
+ 34
134
+ 00:02:41,040 --> 00:02:42,320
135
+ deep learner.
136
+
137
+ 35
138
+ 00:02:43,140 --> 00:02:45,600
139
+ Now what about the RTX 2080 Ti?
140
+
141
+ 36
142
+ 00:02:45,950 --> 00:02:48,690
143
+ It is very, very close to the Titan V.
144
+
145
+ 37
146
+ 00:02:48,810 --> 00:02:51,510
147
+ So I would definitely buy this card over the Titan V.
148
+
149
+ 38
150
+ 00:02:52,230 --> 00:02:56,270
151
+ And as you can see, this is normalized performance here.
152
+
153
+ 39
154
+ 00:02:56,520 --> 00:02:59,870
155
+ They perform decently, around the same.
156
+
157
+ 40
158
+ 00:03:00,150 --> 00:03:04,690
159
+ So if I were you, I would probably get a GTX 1080 Ti.
160
+
161
+ 41
162
+ 00:03:05,170 --> 00:03:14,100
163
+ Or if you don't want to spend that much, the GTX 1070 is actually a very good card, and
164
+
165
+ 42
166
+ 00:03:14,100 --> 00:03:14,840
167
+ this is its performance.
168
+
169
+ 43
170
+ 00:03:14,850 --> 00:03:22,260
171
+ But you can see the RTX 2070 coming up on top; next would be the GTX 1060 and these cards here.
172
+
173
+ 44
174
+ 00:03:26,290 --> 00:03:31,720
175
+ So there are also a number of places where we can find cloud GPUs, such as AWS, Google.
176
+
177
+ 45
178
+ 00:03:31,740 --> 00:03:37,420
179
+ There are tons of them actually, and most of their rates are relatively affordable.
180
+
181
+ 46
182
+ 00:03:37,420 --> 00:03:40,140
183
+ Some are as little as 20 cents per hour.
184
+
185
+ 47
186
+ 00:03:40,150 --> 00:03:45,490
187
+ However, if you think about it, if you're a serious person who's going to train models for days
188
+
189
+ 48
190
+ 00:03:45,490 --> 00:03:49,130
191
+ or weeks sometimes, you don't want to be doing this continuously.
192
+
193
+ 49
194
+ 00:03:49,180 --> 00:03:55,660
195
+ So you may actually want to invest in something affordable like this or perhaps one of these cards here.
196
+
197
+ 50
198
+ 00:03:56,200 --> 00:03:57,540
199
+ Not this one or this one.
200
+
201
+ 51
202
+ 00:03:57,550 --> 00:04:02,180
203
+ Probably these and that's mainly because of price.
204
+
205
+ 52
206
+ 00:04:02,180 --> 00:04:09,440
207
+ By the way, however, one of the best places I've seen for good value and good cloud GPUs,
208
+
209
+ 53
210
+ 00:04:09,760 --> 00:04:12,570
211
+ as of November 2018, is PaperSpace.
212
+
213
+ 54
214
+ 00:04:12,700 --> 00:04:16,230
215
+ Go to the site and check it out; they're actually very easy to use.
216
+
217
+ 55
218
+ 00:04:16,240 --> 00:04:21,200
219
+ They've got some very good tutorials, and their prices are some of the cheapest online that I've seen.
27. BONUS - Build a Credit Card Number Reader/1. Step 1 - Creating a Credit Card Number Dataset.html ADDED
@@ -0,0 +1,246 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <h4><strong>Step 1 - Exploring the Credit Card Font</strong></h4><p><br></p><p><strong>NOTE</strong>: Download all code in resources</p><p><strong>Open the file titled: </strong><code><strong><em>Credit Card Reader.ipynb</em></strong></code></p><p>Unfortunately, there isn't an official standard credit card number font - some of the fonts used go by the names <strong>Farrington 7B</strong>, <strong>OCR-B</strong>, <strong>SecurePay</strong>, <strong>OCR-A</strong> and <strong>MICR E13B. </strong>However, in my experience there seem to be two main font variations used in credit cards:</p><p><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_05-56-32-af03064a01e050776ae191c8c61fb2aa.jpg"></p><p>and</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_05-56-51-ac5823c68461a928b659677c5d7b438f.jpg"></figure><p>Note the differences, especially in the 1, 7 and 8.</p><p><br></p><p>Let's open these images in Python using OpenCV</p><pre class="prettyprint linenums">import cv2
2
+
3
+ cc1 = cv2.imread('creditcard_digits1.jpg', 0)
4
+ cv2.imshow("Digits 1", cc1)
5
+ cv2.waitKey(0)
6
+ cc2 = cv2.imread('creditcard_digits2.jpg', 0)
7
+ cv2.imshow("Digits 2", cc2)
8
+ cv2.waitKey(0)
9
+ cv2.destroyAllWindows()</pre><p><br></p><p>Let's experiment with OTSU Binarization. Remember, binarization converts a grayscale image to two colors, black and white: values under a certain threshold (typically 127 out of 255) are clipped to 0, while values greater than the threshold are clipped to 255. It looks like this below. </p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-00-29-855d880d0b635790c33d3f0402f8158f.JPG"></figure><p>The code to perform this is:</p><pre class="prettyprint linenums">cc1 = cv2.imread('creditcard_digits2.jpg', 0)
10
+ _, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
11
+ cv2.imshow("Digits 2 Thresholded", th2)
12
+ cv2.waitKey(0)
13
+
14
+ cv2.destroyAllWindows()</pre><p><br></p><h4><strong>Step 2 - Creating our Dataset Directories</strong></h4><p>This sets up our training and test directories for the digits (0 to 9).</p><pre class="prettyprint linenums">#Create our dataset directories
15
+
16
+ import os
17
+
18
+ def makedir(directory):
19
+ """Creates a new directory if it does not exist"""
20
+ if not os.path.exists(directory):
21
+ os.makedirs(directory)
22
+ return None, 0
23
+
24
+ for i in range(0,10):
25
+ directory_name = "./credit_card/train/"+str(i)
26
+ print(directory_name)
27
+ makedir(directory_name)
28
+
29
+ for i in range(0,10):
30
+ directory_name = "./credit_card/test/"+str(i)
31
+ print(directory_name)
32
+ makedir(directory_name)</pre><h4></h4><h4><strong>Step 3 - Creating our Data Augmentation Functions</strong></h4><p>Let's now create some functions to generate more data. What we're doing here is taking the two samples of each digit we saw above and adding small variations to each digit. This is very similar to Keras's Data Augmentation; however, we're using OpenCV to create an augmented dataset instead. We will then use Keras to augment this data even further. </p><p>We've created 5 functions here, let's discuss each: </p><ul><li><p><strong><em>DigitAugmentation()</em></strong> - This one simply uses the other image manipulating functions, but calls them randomly. Examine the code to see how it's done.</p></li><li><p><strong><em>add_noise() </em>-</strong> This function introduces some noise elements to the image</p></li><li><p><strong><em>pixelate()</em></strong> - This function reduces the image's resolution then upscales/upsamples it. This degrades the quality and is meant to simulate blur from either a shaky or poor-quality camera. </p></li><li><p><strong><em>stretch() </em></strong>- This simulates some variation in re-sizing, where it stretches the image by a small random amount</p></li><li><p><strong><em>pre_process()</em></strong> - This is a simple function that applies OTSU Binarization to the image and re-sizes it. We use this on the extracted digits to create a clean dataset akin to the MNIST-style format. </p><p><br></p></li></ul><pre class="prettyprint linenums">import cv2
33
+ import numpy as np
34
+ import random
35
+ import cv2
36
+ from scipy.ndimage import convolve
37
+
38
+ def DigitAugmentation(frame, dim = 32):
39
+ """Randomly alters the image using noise, pixelation and streching image functions"""
40
+ frame = cv2.resize(frame, None, fx=2, fy=2, interpolation = cv2.INTER_CUBIC)
41
+ frame = cv2.cvtColor(frame, cv2.COLOR_GRAY2RGB)
42
+ random_num = np.random.randint(0,9)
43
+
44
+ if (random_num % 2 == 0):
45
+ frame = add_noise(frame)
46
+ if(random_num % 3 == 0):
47
+ frame = pixelate(frame)
48
+ if(random_num % 2 == 0):
49
+ frame = stretch(frame)
50
+ frame = cv2.resize(frame, (dim, dim), interpolation = cv2.INTER_AREA)
51
+
52
+ return frame
53
+
54
+ def add_noise(image):
55
+ """Addings noise to image"""
56
+ prob = random.uniform(0.01, 0.05)
57
+ rnd = np.random.rand(image.shape[0], image.shape[1])
58
+ noisy = image.copy()
59
+ noisy[rnd &lt; prob] = 0
60
+ noisy[rnd &gt; 1 - prob] = 1
61
+ return noisy
62
+
63
+ def pixelate(image):
64
+ "Pixelates an image by reducing the resolution then upscaling it"
65
+ dim = np.random.randint(8,12)
66
+ image = cv2.resize(image, (dim, dim), interpolation = cv2.INTER_AREA)
67
+ image = cv2.resize(image, (16, 16), interpolation = cv2.INTER_AREA)
68
+ return image
69
+
70
+ def stretch(image):
71
+ "Randomly applies different degrees of stretch to image"
72
+ ran = np.random.randint(0,3)*2
73
+ if np.random.randint(0,2) == 0:
74
+ frame = cv2.resize(image, (32, ran+32), interpolation = cv2.INTER_AREA)
75
+ return frame[int(ran/2):int(ran+32)-int(ran/2), 0:32]
76
+ else:
77
+ frame = cv2.resize(image, (ran+32, 32), interpolation = cv2.INTER_AREA)
78
+ return frame[0:32, int(ran/2):int(ran+32)-int(ran/2)]
79
+
80
+ def pre_process(image, inv = False):
81
+ """Uses OTSU binarization on an image"""
82
+ try:
83
+ gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
84
+ except:
85
+ gray_image = image
86
+ pass
87
+
88
+ if inv == False:
89
+ _, th2 = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
90
+ else:
91
+ _, th2 = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
92
+ resized = cv2.resize(th2, (32,32), interpolation = cv2.INTER_AREA)
93
+ return resized
94
+ </pre><p>We can test our augmentation by using this bit of code:</p><pre class="prettyprint linenums">cc1 = cv2.imread('creditcard_digits2.jpg', 0)
95
+ _, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
96
+ cv2.imshow("cc1", th2)
97
+ cv2.waitKey(0)
98
+ cv2.destroyAllWindows()
99
+
100
+ # This is the coordinates of the region enclosing the first digit
101
+ # This is preset and was done manually based on this specific image
102
+ region = [(0, 0), (35, 48)]
103
+
104
+ # Assigns values to each region for ease of interpretation
105
+ top_left_y = region[0][1]
106
+ bottom_right_y = region[1][1]
107
+ top_left_x = region[0][0]
108
+ bottom_right_x = region[1][0]
109
+
110
+ for i in range(0,1): #We only look at the first digit in testing out augmentation functions
111
+ roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]
112
+ for j in range(0,10):
113
+ roi2 = DigitAugmentation(roi)
114
+ roi_otsu = pre_process(roi2, inv = False)
115
+ cv2.imshow("otsu", roi_otsu)
116
+ cv2.waitKey(0)
117
+
118
+ cv2.destroyAllWindows()</pre><p>Typically it looks like this:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-11-17-38e732423b5d5dcecc18371c022619eb.JPG"></figure><p><br></p><p>You can try more adventurous forms of varying the original image. My suggestion would be to try dilation and erosion on these. </p><h4><strong>Step 4 - Creating our dataset</strong></h4><p>Let's create 2000 variations of each digit of the first font we're sampling (note: 2000 is perhaps way too much, but the images are small and quick to train on, so why not use the arbitrary number of 2000).</p><pre class="prettyprint linenums"># Creating 2000 Images for each digit in creditcard_digits1 - TRAINING DATA
119
+
120
+ # Load our first image
121
+ cc1 = cv2.imread('creditcard_digits1.jpg', 0)
122
+
123
+ _, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
124
+ cv2.imshow("cc1", th2)
125
+ cv2.imshow("creditcard_digits1", cc1)
126
+ cv2.waitKey(0)
127
+ cv2.destroyAllWindows()
128
+
129
+ region = [(2, 19), (50, 72)]
130
+
131
+ top_left_y = region[0][1]
132
+ bottom_right_y = region[1][1]
133
+ top_left_x = region[0][0]
134
+ bottom_right_x = region[1][0]
135
+
136
+ for i in range(0,10):
137
+ # We jump the next digit each time we loop
138
+ if i &gt; 0:
139
+ top_left_x = top_left_x + 59
140
+ bottom_right_x = bottom_right_x + 59
141
+
142
+ roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]
143
+ print("Augmenting Digit - ", str(i))
144
+ # We create 2000 versions of each image for our dataset
145
+ for j in range(0,2000):
146
+ roi2 = DigitAugmentation(roi)
147
+ roi_otsu = pre_process(roi2, inv = True)
148
+ cv2.imwrite("./credit_card/train/"+str(i)+"/_1_"+str(j)+".jpg", roi_otsu)
149
+ cv2.destroyAllWindows()
150
+
151
+ </pre><p>Next, let's make 2000 variations of each digit of the second font type.</p><pre class="prettyprint linenums"># Creating 2000 Images for each digit in creditcard_digits2 - TRAINING DATA
152
+
153
+ cc1 = cv2.imread('creditcard_digits2.jpg', 0)
154
+ _, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
155
+ cv2.imshow("cc1", th2)
156
+ cv2.waitKey(0)
157
+ cv2.destroyAllWindows()
158
+
159
+ region = [(0, 0), (35, 48)]
160
+
161
+ top_left_y = region[0][1]
162
+ bottom_right_y = region[1][1]
163
+ top_left_x = region[0][0]
164
+ bottom_right_x = region[1][0]
165
+
166
+ for i in range(0,10):
167
+ if i &gt; 0:
168
+ # We jump the next digit each time we loop
169
+ top_left_x = top_left_x + 35
170
+ bottom_right_x = bottom_right_x + 35
171
+
172
+ roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]
173
+ print("Augmenting Digit - ", str(i))
174
+ # We create 2000 versions of each image for our dataset
175
+ for j in range(0,2000):
176
+ roi2 = DigitAugmentation(roi)
177
+ roi_otsu = pre_process(roi2, inv = False)
178
+ cv2.imwrite("./credit_card/train/"+str(i)+"/_2_"+str(j)+".jpg", roi_otsu)
179
+ cv2.imshow("otsu", roi_otsu)
180
+ print("-")
181
+ cv2.waitKey(0)
182
+
183
+ cv2.destroyAllWindows()</pre><p><br></p><p><strong>Making our Test Data </strong></p><p><strong>Note</strong>: it is a VERY bad practice to create a test dataset like this. Even though we're adding random variations, our test data here is too similar to our training data. Ideally, you'd want to use some real-life unseen data from another source. In our case, we're sampling from the same dataset.</p><pre class="prettyprint linenums"># Creating 2000 Images for each digit in creditcard_digits1 - TEST DATA
184
+
185
+ # Load our first image
186
+ cc1 = cv2.imread('creditcard_digits1.jpg', 0)
187
+
188
+ _, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
189
+ cv2.imshow("cc1", th2)
190
+ cv2.imshow("creditcard_digits1", cc1)
191
+ cv2.waitKey(0)
192
+ cv2.destroyAllWindows()
193
+
194
+ region = [(2, 19), (50, 72)]
195
+
196
+ top_left_y = region[0][1]
197
+ bottom_right_y = region[1][1]
198
+ top_left_x = region[0][0]
199
+ bottom_right_x = region[1][0]
200
+
201
+ for i in range(0,10):
202
+ # We jump the next digit each time we loop
203
+ if i &gt; 0:
204
+ top_left_x = top_left_x + 59
205
+ bottom_right_x = bottom_right_x + 59
206
+
207
+ roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]
208
+ print("Augmenting Digit - ", str(i))
209
+ # We create 2000 versions of each image for our dataset
210
+ for j in range(0,2000):
211
+ roi2 = DigitAugmentation(roi)
212
+ roi_otsu = pre_process(roi2, inv = True)
213
+ cv2.imwrite("./credit_card/test/"+str(i)+"/_1_"+str(j)+".jpg", roi_otsu)
214
+ cv2.destroyAllWindows()</pre><pre class="prettyprint linenums"># Creating 2000 Images for each digit in creditcard_digits2 - TEST DATA
215
+
216
+ cc1 = cv2.imread('creditcard_digits2.jpg', 0)
217
+ _, th2 = cv2.threshold(cc1, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
218
+ cv2.imshow("cc1", th2)
219
+ cv2.waitKey(0)
220
+ cv2.destroyAllWindows()
221
+
222
+ region = [(0, 0), (35, 48)]
223
+
224
+ top_left_y = region[0][1]
225
+ bottom_right_y = region[1][1]
226
+ top_left_x = region[0][0]
227
+ bottom_right_x = region[1][0]
228
+
229
+ for i in range(0,10):
230
+ if i &gt; 0:
231
+ # We jump the next digit each time we loop
232
+ top_left_x = top_left_x + 35
233
+ bottom_right_x = bottom_right_x + 35
234
+
235
+ roi = cc1[top_left_y:bottom_right_y, top_left_x:bottom_right_x]
236
+ print("Augmenting Digit - ", str(i))
237
+ # We create 2000 versions of each image for our dataset
238
+ for j in range(0,2000):
239
+ roi2 = DigitAugmentation(roi)
240
+ roi_otsu = pre_process(roi2, inv = False)
241
+ cv2.imwrite("./credit_card/test/"+str(i)+"/_2_"+str(j)+".jpg", roi_otsu)
242
+ cv2.imshow("otsu", roi_otsu)
243
+ print("-")
244
+ cv2.waitKey(0)
245
+ cv2.destroyAllWindows()
246
+ </pre><p><br></p>
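+ <p>Before moving on, it's worth sanity-checking what we just generated. Below is a minimal sketch (assuming the <code>./credit_card/train</code> and <code>./credit_card/test</code> layout created above) that simply counts the images written for each digit class:</p><pre class="prettyprint linenums">import os
+
+ # Count the generated images per digit class in each split
+ for split in ["train", "test"]:
+     for i in range(0, 10):
+         directory_name = "./credit_card/" + split + "/" + str(i)
+         print(directory_name, "-", len(os.listdir(directory_name)), "images")</pre>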
27. BONUS - Build a Credit Card Number Reader/1.1 Download Credit Card Files and Data.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1bMcHna6h6YWDmMQm1XYedYVcEPXJ_S3v/view?usp=sharing";</script>
27. BONUS - Build a Credit Card Number Reader/2. Step 2 - Training Our Model.html ADDED
@@ -0,0 +1,116 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <h4><strong>Creating our Classifier in Keras</strong></h4><p>Now that we have our dataset, we can dive right in using the Keras code you're now so familiar with :)</p><p>Let's now take advantage of Keras's Data Augmentation and apply some small rotations, shifts, shearing and zooming. </p><p><br></p><pre class="prettyprint linenums">import os
2
+ import numpy as np
3
+ from keras.models import Sequential
4
+ from keras.layers import Activation, Dropout, Flatten, Dense
5
+ from keras.preprocessing.image import ImageDataGenerator
6
+ from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D
7
+ from keras import optimizers
8
+ import keras
9
+
10
+ input_shape = (32, 32, 3)
11
+ img_width = 32
12
+ img_height = 32
13
+ num_classes = 10
14
+ nb_train_samples = 10000
15
+ nb_validation_samples = 2000
16
+ batch_size = 16
17
+ epochs = 1
18
+
19
+ train_data_dir = './credit_card/train'
20
+ validation_data_dir = './credit_card/test'
21
+
22
+ # Creating our data generator for our test data
23
+ validation_datagen = ImageDataGenerator(
24
+ # used to rescale the pixel values from [0, 255] to [0, 1] interval
25
+ rescale = 1./255)
26
+
27
+ # Creating our data generator for our training data
28
+ train_datagen = ImageDataGenerator(
29
+ rescale = 1./255, # normalize pixel values to [0,1]
30
+ rotation_range = 10, # randomly applies rotations
31
+ width_shift_range = 0.25, # randomly applies width shifting
32
+ height_shift_range = 0.25, # randomly applies height shifting
33
+ shear_range=0.5,
34
+ zoom_range=0.5,
35
+ horizontal_flip = False, # randomly flips the image
36
+ fill_mode = 'nearest') # uses the fill mode nearest to fill gaps created by the above
37
+
38
+ # Specify criteria about our training data, such as the directory, image size, batch size and type
39
+ # automagically retrieve images and their classes for train and validation sets
40
+ train_generator = train_datagen.flow_from_directory(
41
+ train_data_dir,
42
+ target_size = (img_width, img_height),
43
+ batch_size = batch_size,
44
+ class_mode = 'categorical')
45
+
46
+ validation_generator = validation_datagen.flow_from_directory(
47
+ validation_data_dir,
48
+ target_size = (img_width, img_height),
49
+ batch_size = batch_size,
50
+ class_mode = 'categorical',
51
+ shuffle = False) </pre><p><strong>Let's use the effective and famous (on MNIST) LeNet Convolutional Neural Network architecture</strong></p><pre class="prettyprint linenums"># create model
52
+ model = Sequential()
53
+
54
+ # 2 sets of CRP (Convolution, RELU, Pooling)
55
+ model.add(Conv2D(20, (5, 5),
56
+ padding = "same",
57
+ input_shape = input_shape))
58
+ model.add(Activation("relu"))
59
+ model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))
60
+
61
+ model.add(Conv2D(50, (5, 5),
62
+ padding = "same"))
63
+ model.add(Activation("relu"))
64
+ model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))
65
+
66
+ # Fully connected layers (w/ RELU)
67
+ model.add(Flatten())
68
+ model.add(Dense(500))
69
+ model.add(Activation("relu"))
70
+
71
+ # Softmax (for classification)
72
+ model.add(Dense(num_classes))
73
+ model.add(Activation("softmax"))
74
+
75
+ model.compile(loss = 'categorical_crossentropy',
76
+ optimizer = keras.optimizers.Adadelta(),
77
+ metrics = ['accuracy'])
78
+
79
+ print(model.summary())
80
+ </pre><p><strong>And now let's train our model for 5 EPOCHS.</strong></p><pre class="prettyprint linenums">from keras.optimizers import RMSprop
81
+ from keras.callbacks import ModelCheckpoint, EarlyStopping
82
+
83
+ checkpoint = ModelCheckpoint("/home/deeplearningcv/DeepLearningCV/Trained Models/creditcard.h5",
84
+ monitor="val_loss",
85
+ mode="min",
86
+ save_best_only = True,
87
+ verbose=1)
88
+
89
+ earlystop = EarlyStopping(monitor = 'val_loss',
90
+ min_delta = 0,
91
+ patience = 3,
92
+ verbose = 1,
93
+ restore_best_weights = True)
94
+
95
+ # We put our callbacks into a callback list
96
+ callbacks = [earlystop, checkpoint]
97
+
98
+ # Compile using RMSprop with a learning rate of 0.001
99
+ model.compile(loss = 'categorical_crossentropy',
100
+ optimizer = RMSprop(lr = 0.001),
101
+ metrics = ['accuracy'])
102
+
103
+ nb_train_samples = 20000
104
+ nb_validation_samples = 4000
105
+ epochs = 5
106
+ batch_size = 16
107
+
108
+ history = model.fit_generator(
109
+ train_generator,
110
+ steps_per_epoch = nb_train_samples // batch_size,
111
+ epochs = epochs,
112
+ callbacks = callbacks,
113
+ validation_data = validation_generator,
114
+ validation_steps = nb_validation_samples // batch_size)
115
+
116
+ model.save("/home/deeplearningcv/DeepLearningCV/Trained Models/creditcard.h5")</pre><p><br></p><p>Perfect, now you have an extremely accurate (on our limited test data) model. </p><p><br></p><p><strong>In the next section we will build a Credit Card Extractor using OpenCV</strong></p>
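+ <p>Before moving on, you can sanity-check the trained model against the validation generator. This is a minimal sketch (assuming the <code>validation_generator</code> and the variables defined above; <code>evaluate_generator</code> is the Keras 2.x counterpart of the <code>fit_generator</code> call used here):</p><pre class="prettyprint linenums"># Measure loss and accuracy on our (admittedly too similar) test set
+ scores = model.evaluate_generator(validation_generator,
+                                   steps = nb_validation_samples // batch_size)
+ print('Validation loss:', scores[0])
+ print('Validation accuracy:', scores[1])</pre>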
27. BONUS - Build a Credit Card Number Reader/3. Step 3 - Extracting A Credit Card from the Background.html ADDED
@@ -0,0 +1,139 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <h4><strong>Extracting a Credit Card</strong></h4><p><br></p><p><strong>NOTE</strong>: This example extracts only 12 of the 16 digits because this is a working credit card :) </p><p><br></p><p>The code below contains the functions we use to do the following:</p><p><br></p><p>1. It loads the image; we're using our mobile phone camera to take this picture. (Note: ideally place the card on a contrasting background)</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-31-06-6e0ea340958fbce841a07b78d48215f2.JPG"></figure><p>2. We use Canny Edge detection to identify the edges of the card</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-31-32-83ee0c50b79d9008b9823bfa98da7be9.JPG"></figure><p>3. We use cv2.findContours() to extract the largest contour (which we assume will be the credit card)</p><p><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-31-14-fee84b836e81142e2829fb266dec2244.JPG"></p><p>4. This is where we use the functions <strong>four_point_transform() </strong>and <strong>order_points() </strong>to adjust the perspective of the card. It creates a top-down type view that is useful because:</p><ul><li><p>It standardizes the view of the card so that the credit card digits are always roughly in the same area.</p></li><li><p>It removes/reduces skew and warped perspectives from the image when taken from a camera. All cameras, unless positioned exactly top-down, will introduce some skew to text/digits. This is why scanners always produce more realistic looking images.</p></li></ul><figure><img height="214" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-31-39-48dc83992e0158333fdffecfbeeaf78b.JPG" width="340"><img height="214" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-31-44-5c9e8bd1c39030b2f38f2dd72936c08d.JPG" width="341"></figure><pre class="prettyprint linenums">import cv2
2
+ import numpy as np
3
+ import imutils
4
+ # from skimage.filters import threshold_adaptive  # unused below
5
+ import os
6
+
7
+ def order_points(pts):
8
+ # initialize a list of coordinates that will be ordered
9
+ # such that the first entry in the list is the top-left,
10
+ # the second entry is the top-right, the third is the
11
+ # bottom-right, and the fourth is the bottom-left
12
+ rect = np.zeros((4, 2), dtype = "float32")
13
+
14
+ # the top-left point will have the smallest sum, whereas
15
+ # the bottom-right point will have the largest sum
16
+ s = pts.sum(axis = 1)
17
+ rect[0] = pts[np.argmin(s)]
18
+ rect[2] = pts[np.argmax(s)]
19
+
20
+ # now, compute the difference between the points, the
21
+ # top-right point will have the smallest difference,
22
+ # whereas the bottom-left will have the largest difference
23
+ diff = np.diff(pts, axis = 1)
24
+ rect[1] = pts[np.argmin(diff)]
25
+ rect[3] = pts[np.argmax(diff)]
26
+
27
+ # return the ordered coordinates
28
+ return rect
29
+
30
+ def four_point_transform(image, pts):
31
+ # obtain a consistent order of the points and unpack them
32
+ # individually
33
+ rect = order_points(pts)
34
+ (tl, tr, br, bl) = rect
35
+
36
+ # compute the width of the new image, which will be the
37
+ # maximum distance between bottom-right and bottom-left
38
+ # x-coordinates or the top-right and top-left x-coordinates
39
+ widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
40
+ widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
41
+ maxWidth = max(int(widthA), int(widthB))
42
+
43
+ # compute the height of the new image, which will be the
44
+ # maximum distance between the top-right and bottom-right
45
+ # y-coordinates or the top-left and bottom-left y-coordinates
46
+ heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
47
+ heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
48
+ maxHeight = max(int(heightA), int(heightB))
49
+
50
+ # now that we have the dimensions of the new image, construct
51
+ # the set of destination points to obtain a "birds eye view",
52
+ # (i.e. top-down view) of the image, again specifying points
53
+ # in the top-left, top-right, bottom-right, and bottom-left
54
+ # order
55
+ dst = np.array([
56
+ [0, 0],
57
+ [maxWidth - 1, 0],
58
+ [maxWidth - 1, maxHeight - 1],
59
+ [0, maxHeight - 1]], dtype = "float32")
60
+
61
+ # compute the perspective transform matrix and then apply it
62
+ M = cv2.getPerspectiveTransform(rect, dst)
63
+ warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
64
+
65
+ # return the warped image
66
+ return warped
67
+
68
+ def doc_Scan(image):
69
+ orig_height, orig_width = image.shape[:2]
70
+ ratio = image.shape[0] / 500.0
71
+
72
+ orig = image.copy()
73
+ image = imutils.resize(image, height = 500)
74
+ orig_height, orig_width = image.shape[:2]
75
+ Original_Area = orig_height * orig_width
76
+
77
+ # convert the image to grayscale, blur it, and find edges
78
+ # in the image
79
+ gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
80
+ gray = cv2.GaussianBlur(gray, (5, 5), 0)
81
+ edged = cv2.Canny(gray, 75, 200)
82
+
83
+ cv2.imshow("Image", image)
84
+ cv2.imshow("Edged", edged)
85
+ cv2.waitKey(0)
86
+ # show the original image and the edge detected image
87
+
88
+ # find the contours in the edged image, keeping only the
89
+ # largest ones, and initialize the screen contour
90
+ _, contours, hierarchy = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
91
+ contours = sorted(contours, key = cv2.contourArea, reverse = True)[:5]
92
+
93
+ # loop over the contours
94
+ for c in contours:
95
+
96
+ # approximate the contour
97
+ area = cv2.contourArea(c)
98
+ if area &lt; (Original_Area/3):
99
+ print("Error Image Invalid")
100
+ return("ERROR")
101
+ peri = cv2.arcLength(c, True)
102
+ approx = cv2.approxPolyDP(c, 0.02 * peri, True)
103
+
104
+ # if our approximated contour has four points, then we
105
+ # can assume that we have found our screen
106
+ if len(approx) == 4:
107
+ screenCnt = approx
108
+ break
109
+
110
+ # show the contour (outline) of the piece of paper
111
+ cv2.drawContours(image, [screenCnt], -1, (0, 255, 0), 2)
112
+ cv2.imshow("Outline", image)
113
+
114
+ warped = four_point_transform(orig, screenCnt.reshape(4, 2) * ratio)
115
+ # convert the warped image to grayscale, then threshold it
116
+ # to give it that 'black and white' paper effect
117
+ warped = cv2.resize(warped, (640,403), interpolation = cv2.INTER_AREA)
118
+ cv2.imwrite("credit_card_color.jpg", warped)
119
+ warped = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
120
+ warped = warped.astype("uint8")
121
+ cv2.imshow("Extracted Credit Card", warped)
122
+ cv2.waitKey(0)
123
+ cv2.destroyAllWindows()
124
+ return warped</pre><p><br></p><p>We now need to extract the credit card number region. If you looked at the code in the above example, we always re-size our extracted credit card image to 640 x 403. The odd choice of 403 was made because 640:403 matches the actual aspect ratio of a credit card. We are trying to keep the length and width dimensions as accurate as possible so that we don't warp the image too much.</p><p><strong>NOTE</strong>: I know we're re-sizing all extracted digits to 32 x 32, but even so, keeping the initial ratio correct will only help our classifier accuracy. </p><p><br></p><p>As such, because of the fixed size, we can now extract the region ([(55, 210), (640, 290)]) easily from our image. </p><pre class="prettyprint linenums">image = cv2.imread('test_card.jpg')
125
+ image = doc_Scan(image)
126
+
127
+ region = [(55, 210), (640, 290)]
128
+
129
+ top_left_y = region[0][1]
130
+ bottom_right_y = region[1][1]
131
+ top_left_x = region[0][0]
132
+ bottom_right_x = region[1][0]
133
+
134
+ # Extracting the area were the credit numbers are located
135
+ roi = image[top_left_y:bottom_right_y, top_left_x:bottom_right_x]
136
+ cv2.imshow("Region", roi)
137
+ cv2.imwrite("credit_card_extracted_digits.jpg", roi)
138
+ cv2.waitKey(0)
139
+ cv2.destroyAllWindows()</pre><p>This is what our extracted region looks like.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-42-43-3d2f27840197bd5ff18b47aa759b74b4.JPG"></figure><p>In the next section, we're going to use some OpenCV techniques to extract each digit and pass it to our classifier for identification. </p>
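+ <p>As a quick check on the fixed-size assumption, you can verify that the saved card really is 640 x 403. A minimal sketch, assuming <code>credit_card_color.jpg</code> was written out by <code>doc_Scan()</code> above:</p><pre class="prettyprint linenums">card = cv2.imread('credit_card_color.jpg')
+ h, w = card.shape[:2]
+ print("Extracted card size: {} x {}".format(w, h))  # expect 640 x 403
+ print("Aspect ratio: {:.3f}".format(w / h))  # a physical credit card is ~1.586</pre>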
27. BONUS - Build a Credit Card Number Reader/4. Step 4 - Use our Model to Identify the Digits & Display it onto our Credit Card.html ADDED
@@ -0,0 +1,60 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <h4><strong>Classifying Digits on a real Credit Card</strong></h4><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-46-01-1309a41efbd20f05579ceca7b62c163b.JPG"></figure><p>In this section we're going to write the code to produce the result above!</p><p>Firstly, let's load the model we created:</p><pre class="prettyprint linenums">from keras.models import load_model
2
+ import keras
3
+
4
+ classifier = load_model('/home/deeplearningcv/DeepLearningCV/Trained Models/creditcard.h5')</pre><p><br></p><p>Now let's do some OpenCV magic and extract the digits from the image below:</p><p> </p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-04_06-47-23-51bd1976506279715b6324297b86b22b.JPG"></figure><p>The algorithm we follow to do this is really quite simple:</p><ol><li><p>We first load our grayscale extracted image and the original color image (note we could have just loaded the color image and grayscaled it)</p></li><li><p>We apply the Canny Edge algorithm (typically we apply blur first to reduce noise in finding the edges). </p></li><li><p>We then use findContours to isolate the digits</p></li><li><p>We sort the contours by size (so that smaller irrelevant contours aren't used)</p></li><li><p>We then sort them left to right by creating a function that returns the x-coordinate of a contour. </p></li><li><p>Once we have our cleaned up contours, we find the bounding rectangle of each contour, which gives us an enclosed rectangle around the digit. (To ensure these contours are valid, we extract only contours meeting minimum width and height expectations.) Also, because I've created a black square around the last 4 digits, we discard contours of large area so that they aren't fed into our classifier. </p></li><li><p>We then take each extracted digit, use our pre_process() function (which applies OTSU Binarization and re-sizes it), then break down that image array so that it can be loaded into our classifier.</p></li></ol><p>The full code to do this is shown below:</p><pre class="prettyprint linenums">def x_cord_contour(contours):
5
+ #Returns the X coordinate for the contour centroid
6
+ if cv2.contourArea(contours) &gt; 10:
7
+ M = cv2.moments(contours)
8
+ return (int(M['m10']/M['m00']))
9
+ else:
10
+ pass
11
+
12
+ img = cv2.imread('credit_card_extracted_digits.jpg')
13
+ orig_img = cv2.imread('credit_card_color.jpg')
14
+ gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
15
+ cv2.imshow("image", img)
16
+ cv2.waitKey(0)
17
+
18
+ # Blur image then find edges using Canny
19
+ blurred = cv2.GaussianBlur(gray, (5, 5), 0)
20
+ #cv2.imshow("blurred", blurred)
21
+ #cv2.waitKey(0)
22
+
23
+ edged = cv2.Canny(blurred, 30, 150)
24
+ #cv2.imshow("edged", edged)
25
+ #cv2.waitKey(0)
26
+
27
+ # Find Contours
28
+ _, contours, _ = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
29
+
30
+ #Sort contours left to right by using their x coordinates
31
+ contours = sorted(contours, key=cv2.contourArea, reverse=True)[:13] #Change this to 16 to get all digits
32
+ contours = sorted(contours, key = x_cord_contour, reverse = False)
33
+
34
+ # Create empty array to store entire number
35
+ full_number = []
36
+
37
+ # loop over the contours
38
+ for c in contours:
39
+ # compute the bounding box for the rectangle
40
+ (x, y, w, h) = cv2.boundingRect(c)
41
+ if w &gt;= 5 and h &gt;= 25 and cv2.contourArea(c) &lt; 1000:
42
+ roi = blurred[y:y + h, x:x + w]
43
+ #ret, roi = cv2.threshold(roi, 20, 255,cv2.THRESH_BINARY_INV)
44
+ cv2.imshow("ROI1", roi)
45
+ roi_otsu = pre_process(roi, True)
46
+ cv2.imshow("ROI2", roi_otsu)
47
+ roi_otsu = cv2.cvtColor(roi_otsu, cv2.COLOR_GRAY2RGB)
48
+ roi_otsu = keras.preprocessing.image.img_to_array(roi_otsu)
49
+ roi_otsu = roi_otsu * 1./255
50
+ roi_otsu = np.expand_dims(roi_otsu, axis=0)
51
+ image = np.vstack([roi_otsu])
52
+ label = str(classifier.predict_classes(image, batch_size = 10))[1]
53
+ print(label)
+ full_number.append(label)
54
+ (x, y, w, h) = (x+region[0][0], y+region[0][1], w, h)
55
+ cv2.rectangle(orig_img, (x, y), (x + w, y + h), (0, 255, 0), 2)
56
+ cv2.putText(orig_img, label, (x , y + 90), cv2.FONT_HERSHEY_COMPLEX, 2, (255, 0, 0), 2)
57
+ cv2.imshow("image", orig_img)
58
+ cv2.waitKey(0)
59
+
60
+ cv2.destroyAllWindows()</pre>
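+ <p>Because we appended each predicted digit to <code>full_number</code> as we moved left to right, we can print the assembled number at the end. A minimal sketch:</p><pre class="prettyprint linenums"># Join the digits predicted left to right into a single string
+ print("Credit Card Number: " + ''.join(full_number))</pre>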
28. BONUS - Use Cloud GPUs on PaperSpace/1. Why use Cloud GPUs and How to Setup a PaperSpace Gradient Notebook.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <h4><strong>Why GPUs are better for Deep Learning</strong> </h4><p><a href="https://www.quora.com/Why-are-GPUs-well-suited-to-deep-learning" rel="noopener noreferrer" target="_blank">https://www.quora.com/Why-are-GPUs-well-suited-to-deep-learning</a></p><p><br></p><p>As many have said, GPUs are so fast because they are so efficient for matrix multiplication and convolution, but nobody gave a real explanation for why this is so. The real reason for this is memory bandwidth and not necessarily parallelism.</p><p>First of all, you have to understand that CPUs are latency optimized while GPUs are bandwidth optimized. You can visualize this as a CPU being a Ferrari and a GPU being a big truck. The task of both is to pick up packages from a random location A and to transport those packages to another random location B. The CPU (Ferrari) can fetch some memory (packages) in your RAM quickly while the GPU (big truck) is slower in doing that (much higher latency). However, the CPU (Ferrari) needs to go back and forth many times to do its job (location A → pick up 2 packages → location B ... repeat) while the GPU can fetch much more memory at once (location A → pick up 100 packages → location B ... repeat).</p><p>So, in other words, the CPU is good at fetching small amounts of memory quickly (5 * 3 * 7) while the GPU is good at fetching large amounts of memory (Matrix multiplication: (A*B)*C). The best CPUs have about 50GB/s while the best GPUs have 750GB/s memory bandwidth. So the more memory your computational operations require, the more significant the advantage of GPUs over CPUs. But there is still the latency that may hurt performance in the case of the GPU. A big truck may be able to pick up a lot of packages with each tour, but the problem is that you are waiting a long time until the next set of packages arrives. Without solving this problem, GPUs would be very slow even for large amounts of data. So how is this solved?</p><p>If you ask a big truck to make many tours to fetch packages you will always wait for a long time for the next load of packages once the truck has departed to do the next tour — the truck is just slow. However, if you now use a fleet of either Ferraris or big trucks (thread parallelism), and you have a big job with many packages (large chunks of memory such as matrices) then you will wait for the first truck a bit, but after that you will have no waiting time at all — unloading the packages takes so much time that all the trucks will queue in unloading location B so that you always have direct access to your packages (memory). This effectively hides latency so that GPUs offer high bandwidth while hiding their latency under thread parallelism — so for large chunks of memory GPUs provide the best memory bandwidth while having almost no drawback due to latency via thread parallelism. This is the second reason why GPUs are faster than CPUs for deep learning. As a side note, you will also see why more threads do not make sense for CPUs: A fleet of Ferraris has no real benefit in any scenario.</p><p>But the advantages for the GPU do not end here. So far we covered the first step, where memory is fetched from the main memory (RAM) to the local memory on the chip (L1 cache and registers). This second step is less critical for performance but still adds to the lead for GPUs. All computation that is ever executed happens in registers which are directly attached to the execution unit (a core for CPUs, a stream processor for GPUs). 
Usually, you have the fast L1 and register memory very close to the execution engine, and you want to keep these memories small so that access is fast. Increased distance to the execution engine dramatically reduces memory access speed, so the larger the distance to access it the slower it gets. If you make your memory larger and larger, then, in turn, it gets slower to access its memory (on average, finding what you want to buy in a small store is faster than finding what you want to buy in a huge store, even if you know where that item is). So the size is limited for register files - we are just at the limits of physics here and every nanometer counts, so we want to keep them small.</p><p>The advantage of the GPU here is that it can have a small pack of registers for every processing unit (stream processor, or SM), of which it has many. Thus we can have in total a lot of register memory, which is very small and thus very fast. This leads to the aggregate GPU register size being more than 30 times larger compared to CPUs and still twice as fast, which translates to up to 14MB register memory that operates at a whopping 80TB/s. As a comparison, the CPU L1 cache only operates at about 5TB/s, which is quite slow, and has the size of roughly 1MB; CPU registers usually have sizes of around 64-128KB and operate at 10-20TB/s. Of course, this comparison of numbers is a bit flawed because CPU registers operate a bit differently than GPU registers (a bit like apples and oranges), but the difference in size here is more crucial than the difference in speed, and it does make a difference.</p><p>As a side note, full register utilization in GPUs seems to be difficult to achieve at first because it is the smallest unit of computation which needs to be fine-tuned by hand for good performance. However, NVIDIA has developed helpful compiler tools which indicate when you are using too many or too few registers per stream processor. It is easy to tweak your GPU code to make use of the right amount of registers and L1 cache for fast performance. This gives GPUs an advantage over other architectures like Xeon Phis where this utilization is complicated to achieve and painful to debug, which in the end makes it difficult to maximize performance on a Xeon Phi.</p><p>What this means, in the end, is that you can store a lot of data in your L1 caches and register files on GPUs to reuse convolutional and matrix multiplication tiles. For example, the best matrix multiplication algorithms use 2 tiles of 64x32 to 96x64 numbers for 2 matrices in L1 cache, and a 16x16 to 32x32 number register tile for the output sums per thread block (1 thread block = up to 1024 threads; you have 8 thread blocks per stream processor, there are 60 stream processors in total for the entire GPU). If you have a 100MB matrix, you can split it up into smaller matrices that fit into your cache and registers, and then do matrix multiplication with three matrix tiles at speeds of 10-80TB/s — that is fast! This is the third reason why GPUs are so much faster than CPUs, and why they are so well suited for deep learning.</p><p>Keep in mind that the slower memory always dominates performance bottlenecks. 
If 95% of your memory movements take place in registers (80TB/s), and 5% in your main memory (0.75TB/s), then you still spend most of the time on memory access of main memory (about six times as much).</p><p>Thus, in order of importance: (1) high bandwidth main memory, (2) hiding memory access latency under thread parallelism, and (3) large and fast register and L1 memory which is easily programmable, are the components which make GPUs so well suited for deep learning.</p><p>However, building your own deep learning rig is a pricey affair. Factor in the costs of a fast and powerful GPU, CPU, SSD, compatible motherboard and power supply, air-conditioning bills, maintenance and damage to components. On top of it, you run the risk of falling behind on the latest hardware in this rapidly advancing industry.</p><p>Moreover, just assembling the components is not enough. You need to set up all the required libraries and compatible drivers before you can start training your first model. <a href="https://medium.com/towards-data-science/building-your-own-deep-learning-box-47b918aea1eb" rel="noopener noreferrer" target="_blank">People</a> still go along this route, and if you plan to use deep learning extensively (&gt;150 hrs/mo), <a href="https://medium.com/towards-data-science/build-a-deep-learning-rig-for-800-4434e21a424f" rel="noopener noreferrer" target="_blank">building your own deep learning workstation</a> might be the right move.</p><p>A better and cheaper alternative is to use cloud-based GPU servers provided by the likes of Amazon, Google, Microsoft and others, especially if you are just breaking into this domain and plan to use the computing power for learning and experimenting. I have been using AWS, Paperspace and FloydHub for the past 4–5 months. Google Cloud Platform and Microsoft Azure were similar to AWS in their pricing and offerings, hence I stuck to the previously mentioned three.</p><p><br></p><p>However, the reason you WANT to use Cloud GPUs comes down to GPU architecture:</p><p><img src="https://cdn-images-1.medium.com/max/1200/1*kG_TBX339Kv-s4QtJuDpFg.png"></p><p>Source - <a href="https://www.nvidia.com/en-us/data-center/tesla-v100/" rel="noopener noreferrer" target="_blank">https://www.nvidia.com/en-us/data-center/tesla-v100/</a></p><p><br></p><h4><strong>Now that we know we want to use GPUs - Sign up and Create a Gradient Notebook with Keras and TensorFlow pre-installed</strong></h4><p><br></p><p>In my opinion, <strong>PaperSpace's </strong>Gradient containers are some of the quickest and simplest ways to get started using a cloud GPU. 
In a matter of minutes you'll be training a CNN!</p><p>Of course you're free to use other Cloud GPU services: AWS, Azure, Floydhub, Google Cloud (GPUs and TPUs), Vast.AI and many others.</p><p>Here's an entire list of Cloud GPU providers - <a href="https://towardsdatascience.com/list-of-deep-learning-cloud-service-providers-579f2c769ed6" rel="noopener noreferrer" target="_blank">https://towardsdatascience.com/list-of-deep-learning-cloud-service-providers-579f2c769ed6</a></p><p><br></p><h4><strong>Using PaperSpace's GPU and Gradient Notebooks</strong></h4><p><strong>Step 1</strong></p><p>Go to <a href="http://www.paperspace.com" rel="noopener noreferrer" target="_blank">www.paperspace.com</a></p><p><strong>Step 2</strong></p><p>Sign up using your GitHub account or a regular email and password:</p><figure><img height="548" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_21-01-28-2731b9ed9c55a05ef3173c51e167b280.jpg" width="772"></figure><p><br></p><p><strong>Step 3A - Adding a Credit Card and using my Referral Code to get Free Credit</strong></p><p>Verify your email and sign in to PaperSpace; you'll be greeted by this page. It may not immediately appear, but notice the <strong>Error Alert box in pink below (see the yellow box I drew over it)</strong>.</p><figure><img height="669" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_21-20-58-1f3fa9e8e3520ee03e0b3c210236ca08.jpg" width="776"></figure><p><br></p><p><strong>STEP 3B - Go to the Billing Page</strong></p><figure><img height="664" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_21-24-18-8946a909736e325b2a3ea3d1bd664c59.jpg" width="770"></figure><p><br></p><p><strong>Step 3C</strong> - Enter Credit Card Info and Referral code </p><h4><strong>Get $5 Free Credit by using Referral Code - 2DSSNCI</strong></h4><p><strong>or click below</strong></p><p><a href="https://paperspace.io/&amp;R=2DSSNCI" rel="noopener noreferrer" target="_blank">https://paperspace.io/&amp;R=2DSSNCI</a></p><p>This gives you ~10 hours of usage on a P4000 GPU</p><p><strong>Note</strong>: you must enter your credit card information, otherwise you won't be able to launch a Gradient machine.<img height="711" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_21-25-51-fbc69be7fa153b677f79199325f8bc3a.jpg" width="813"></p><p><br></p><p><strong>Step 4A - Creating a Gradient Notebook</strong></p><p>Go back to the <strong>Home </strong>landing page and click on the Gradient Box shown below.</p><figure><img height="704" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_21-41-53-11d27f6f8a2ce11310545188c0281cfc.jpg" width="796"></figure><p><br></p><p><strong>Step 4B</strong> - Click on the box in yellow to create your first notebook</p><figure><img height="716" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_21-41-53-0c7321f58c7d5390e46b0bd13f3d1130.jpg" width="816"></figure><p><strong>Step 4C</strong> - Under the <strong>Public Containers Tab</strong> (shown by default), click on the <strong>ufoym/deepo:all-py36-jupyter</strong> container box (shown in the yellow box below).</p><p>This container contains many essential Deep Learning Libraries including Keras and TensorFlow. 
The main libraries included are:</p><ul><li><p>darknet latest (git)</p></li><li><p>python 3.6 (apt)</p></li><li><p>torch latest (git)</p></li><li><p>chainer latest (pip)</p></li><li><p>jupyter latest (pip)</p></li><li><p>mxnet latest (pip)</p></li><li><p>onnx latest (pip)</p></li><li><p>pytorch latest (pip)</p></li><li><p>tensorflow latest (pip)</p></li><li><p>theano latest (git)</p></li><li><p>keras latest (pip)</p></li><li><p>lasagne latest (git)</p></li><li><p>opencv 4.0.1 (git)</p></li><li><p>sonnet latest (pip)</p></li><li><p>caffe latest (git)</p></li><li><p>cntk latest (pip)</p></li></ul><figure><img height="466" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-17_04-19-17-440063cb83a37d599dcbc4f53114dc70.jpg" width="762"></figure><p><strong>Step 4D </strong>- Select the following instance, the <strong>P4000</strong>. Feel free to select other, more expensive GPUs, but in terms of value for money, the P4000 is excellent. </p><figure><img height="727" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_22-22-15-46c43d464f8855522ce902c70489b44b.jpg" width="767"></figure><p><strong>Step 4E</strong> - Name your Notebook (or keep the default name, it's your choice). I chose<em> "DL CV P4000"</em>.</p><p><strong>Next</strong>, you should see the Create Notebook button below in green; click that to create your notebook!</p><p><strong>Note</strong>: You can ignore the 04. Auto-Shutdown area for now, but remember to shut down your notebook after use (I'll show you how shortly) so that your credit does not run down.</p><figure><img height="690" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_21-41-53-180261afb3d050de2477dee1716e0e84.jpg" width="748"></figure><p><strong><br>Step 5 - Launching your notebook</strong></p><p>Your notebook will now show up in the Notebook section shown below.</p><p>Click the greyed out <strong>Start </strong>button (located in the Actions column) to boot it up.</p><figure><img height="509" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_22-30-22-6d05506ee36de6474990c673cde31048.jpg" width="764"></figure><p>This window will now appear; click <strong>Start Notebook </strong>below. </p><figure><img height="812" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-16_22-30-22-b6600eb8d9457dce80b1655c85ed081d.jpg" width="796"></figure><p>Status will change from <strong>Stopped </strong>to <strong>Pending </strong>-&gt; <strong>Provisioned </strong>-&gt; <strong>Running</strong></p><p>Once it's running, you'll be able to launch it by pressing <strong>Open</strong></p><figure><img height="262" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-17_04-24-00-a14237a295cc12888557706d1ec983e7.jpg" width="766"></figure>
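+ <p>Once your notebook is running, you can see the memory-bandwidth argument from the top of this lesson for yourself. Below is a minimal sketch (assuming the TensorFlow 1.x that ships in the deepo container; the 4000 x 4000 size is an arbitrary choice) that times one large matrix multiplication on the CPU and then on the GPU. Note the first GPU run includes some initialization overhead, so run it twice for a fairer number:</p><pre class="prettyprint linenums">import time
+ import tensorflow as tf
+
+ def time_matmul(device_name):
+     # Build a fresh graph with all ops pinned to the requested device
+     tf.reset_default_graph()
+     with tf.device(device_name):
+         a = tf.random_normal((4000, 4000))
+         b = tf.random_normal((4000, 4000))
+         c = tf.matmul(a, b)
+     with tf.Session() as sess:
+         start = time.time()
+         sess.run(c)
+         return time.time() - start
+
+ print("CPU seconds:", time_matmul('/cpu:0'))
+ print("GPU seconds:", time_matmul('/gpu:0'))</pre>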
28. BONUS - Use Cloud GPUs on PaperSpace/2. Train a AlexNet on PaperSpace.html ADDED
@@ -0,0 +1,100 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <h4><strong>Training AlexNet on the CIFAR10 Dataset</strong></h4><p><br></p><p>Upon clicking Start from our previous section, we'll now be greeted with this screen:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_05-55-38-8807e87ad20b67082d7a81ddc676a32c.JPG"></figure><p>It looks like a fairly harmless notebook, but this cloud system will allow us to train AlexNet almost 100X faster!</p><p>But first, let's explore our new PaperSpace Gradient Notebook.</p><p>Observe the two directories that are set up (see above) by the PaperSpace Gradient system:</p><ul><li><p><strong>datasets </strong>- contains some common datasets, some of which you'd recognize from earlier in our course; these are there as test data to check the performance of new CNNs or even to aid in transfer learning etc. (See below for a screenshot)</p></li><li><p><strong>storage </strong>- This is where you'd preferably want to keep files (ipynb notebooks and data) (see below)</p></li></ul><p><strong>Datasets </strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_05-59-00-9a7284073358f7814397a6c36cbc5b14.JPG"></figure><p><strong>Storage</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_05-59-47-7b0b7bf1bfbe814402932a98f2b1ab1a.png"></figure><p><strong>Note</strong>: data stored in the storage folder is persistent across all Gradient machines in your account.</p><p>Data saved in the main workspace area (i.e. the default directory, which is /home/paperspace) will be lost when you terminate the session - do not save important files there or they will be lost!</p><p><strong>lost+found</strong> can be ignored as it's a directory created in Unix systems where corrupted data is temporarily stored (if you do lose files, it's possible they may be stored there)</p><p><br></p><p><strong>There are two easy ways to use PaperSpace Gradients</strong></p><p>1. Go to the <strong>storage </strong>directory, click on New and....</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_06-06-40-9cfc7503e060112e2a3ac23fd1eb21a0.png"></figure><p>Start a new Python3 notebook</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_06-07-19-a39201c771b5bf3630df8ac67a936b53.JPG"></figure><p>This launches a new Python3 notebook where you can import Keras, TensorFlow etc. and work just like you do on your local machine or Virtual Machine. Except we're now taking advantage of their lightning-quick GPUs! Feel free to copy and paste your existing code into these notebooks.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_06-09-34-9d435dac8a1c8f5da9db5454054de554.JPG"></figure><h4><strong>OR</strong></h4><p>2. Upload your existing notebooks and datasets by using the Upload button. (see resources for the AlexNet CIFAR10 .ipynb)</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_06-10-04-b5db0ae7ef26b044b83e1fcf1b986c00.png"></figure><p><strong>It's that simple!</strong></p><p><br></p><h4><strong>Now let's try our AlexNet CIFAR10 CNN</strong></h4><p><br></p><p>Either upload the notebook in the attached resources, or copy and paste the code below into a new notebook. </p><p><br></p><pre class="prettyprint linenums">#Keras Imports and Loading Our CIFAR10 Dataset
2
+ from __future__ import print_function
3
+ import keras
4
+ from keras.datasets import cifar10
5
+ from keras.preprocessing.image import ImageDataGenerator
6
+ from keras.models import Sequential
7
+ from keras.layers import Dense, Dropout, Activation, Flatten
8
+ from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D
9
+ from keras.layers.normalization import BatchNormalization
10
+ from keras.regularizers import l2
11
+
12
+ # Loads the CIFAR dataset
13
+ (x_train, y_train), (x_test, y_test) = cifar10.load_data()
14
+
15
+ # Display our data shape/dimensions
16
+ print('x_train shape:', x_train.shape)
17
+ print(x_train.shape[0], 'train samples')
18
+ print(x_test.shape[0], 'test samples')
19
+
20
+ # Now we one hot encode outputs
21
+ num_classes = 10
22
+ y_train = keras.utils.to_categorical(y_train, num_classes)
23
+ y_test = keras.utils.to_categorical(y_test, num_classes)</pre><p><br></p><pre class="prettyprint linenums">#Defining our AlexNet Convolutional Neural Network
24
+ l2_reg = 0
25
+
26
+
27
+ # Initialize model
28
+ model = Sequential()
29
+
30
+ # 1st Conv Layer
31
+ model.add(Conv2D(96, (11, 11), input_shape=x_train.shape[1:],
32
+ padding='same', kernel_regularizer=l2(l2_reg)))
33
+ model.add(BatchNormalization())
34
+ model.add(Activation('relu'))
35
+ model.add(MaxPooling2D(pool_size=(2, 2)))
36
+
37
+ # 2nd Conv Layer
38
+ model.add(Conv2D(256, (5, 5), padding='same'))
39
+ model.add(BatchNormalization())
40
+ model.add(Activation('relu'))
41
+ model.add(MaxPooling2D(pool_size=(2, 2)))
42
+
43
+ # 3rd Conv Layer
44
+ model.add(ZeroPadding2D((1, 1)))
45
+ model.add(Conv2D(512, (3, 3), padding='same'))
46
+ model.add(BatchNormalization())
47
+ model.add(Activation('relu'))
48
+ model.add(MaxPooling2D(pool_size=(2, 2)))
49
+
50
+ # 4th Conv Layer
51
+ model.add(ZeroPadding2D((1, 1)))
52
+ model.add(Conv2D(1024, (3, 3), padding='same'))
53
+ model.add(BatchNormalization())
54
+ model.add(Activation('relu'))
55
+
56
+ # 5th Conv Layer
57
+ model.add(ZeroPadding2D((1, 1)))
58
+ model.add(Conv2D(1024, (3, 3), padding='same'))
59
+ model.add(BatchNormalization())
60
+ model.add(Activation('relu'))
61
+ model.add(MaxPooling2D(pool_size=(2, 2)))
62
+
63
+ # 1st FC Layer
64
+ model.add(Flatten())
65
+ model.add(Dense(3072))
66
+ model.add(BatchNormalization())
67
+ model.add(Activation('relu'))
68
+ model.add(Dropout(0.5))
69
+
70
+ # 2nd FC Layer
71
+ model.add(Dense(4096))
72
+ model.add(BatchNormalization())
73
+ model.add(Activation('relu'))
74
+ model.add(Dropout(0.5))
75
+
76
+ # 3rd FC Layer
77
+ model.add(Dense(num_classes))
78
+ model.add(BatchNormalization())
79
+ model.add(Activation('softmax'))
80
+
81
+ print(model.summary())
82
+
83
+ model.compile(loss = 'categorical_crossentropy',
84
+ optimizer = keras.optimizers.Adadelta(),
85
+ metrics = ['accuracy'])</pre><p><br></p><pre class="prettyprint linenums"># Training Parameters
86
+ batch_size = 32
87
+ epochs = 10
88
+
89
+ history = model.fit(x_train, y_train,
90
+ batch_size=batch_size,
91
+ epochs=epochs,
92
+ validation_data=(x_test, y_test),
93
+ shuffle=True)
94
+
95
+ model.save("./Models/CIFAR10_AlexNet_10_Epoch.h5")
96
+
97
+ # Evaluate the performance of our trained model
98
+ scores = model.evaluate(x_test, y_test, verbose=1)
99
+ print('Test loss:', scores[0])
100
+ print('Test accuracy:', scores[1])</pre><p><strong>Our Cloud GPU as trained almost 60-100X faster than our CPU!</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_07-36-41-b5a56c95d46e10a12f1ab13fbd1ea5bd.png"></figure><p>We've trained 10 EPOCHs are 80% Accuracy in just bout 45 minutes. A CPU only system that would have taken over well over 24 hours.</p><p><br></p><p><strong>Our Training Loss and Accuracy Graphs</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_07-44-07-a687596eb0b9344a5634548668ed7560.JPG"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-18_07-44-07-bdd067ec83d59a05da0ad2399ed246ce.JPG"></figure>
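+ <p>Graphs like the ones above can be reproduced directly from the <code>history</code> object returned by <code>model.fit()</code>. Below is a minimal plotting sketch (an illustration, not part of the original notebook); note that older Keras versions log accuracy under the keys <code>acc</code>/<code>val_acc</code>, while newer ones use <code>accuracy</code>/<code>val_accuracy</code>.</p><pre class="prettyprint linenums">import matplotlib.pyplot as plt
+
+ # Training vs. validation loss per epoch
+ plt.plot(history.history['loss'], label='Training Loss')
+ plt.plot(history.history['val_loss'], label='Validation Loss')
+ plt.xlabel('Epoch')
+ plt.ylabel('Loss')
+ plt.legend()
+ plt.show()
+
+ # Training vs. validation accuracy per epoch ('acc' on older Keras)
+ plt.plot(history.history['acc'], label='Training Accuracy')
+ plt.plot(history.history['val_acc'], label='Validation Accuracy')
+ plt.xlabel('Epoch')
+ plt.ylabel('Accuracy')
+ plt.legend()
+ plt.show()</pre>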
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/1. Install and Run Flask.html ADDED
@@ -0,0 +1,9 @@
+ <h4><strong>Introduction</strong></h4><p><br></p><p>We're about to create a <strong>Computer Vision (CV) API</strong> and <strong>Web App</strong>. But first, why would we want to do this?</p><ul><li><p>Ever wanted to share your Computer Vision App on the web so a friend can try it from their PC&nbsp;or phone?</p></li><li><p>Want to sell or commercialize your Python computer vision code, but not sure how to distribute Python?</p></li></ul><p>Well, if you want to make your CV code available to the world, you need to make it accessible from the web.&nbsp;There are three main ways you can do this:</p><p>1. Create an <strong>API</strong>, which allows any other application (whether it's an iOS, Android or any other app) to access it via the API&nbsp;protocol. APIs are Application Programming Interfaces that facilitate access to complex external applications via a simplified interface. In our example, we'll be using the RESTful&nbsp;API protocol, which allows us to run our Python CV code via the web.</p><p>2. Create a <strong>Web App</strong>. This is basically the same as an API, except we'll be allowing users to access or run our CV app via any web browser.</p><p>3. Run the CV code natively on another platform. This is the more difficult option, as it requires you to re-code your Python CV code into another language, which means learning an entirely new one (e.g. Java if making an Android app). However, local access reduces the need to send data over the web and allows real-time operation. This method won't be shown in this tutorial.</p><p><br></p><h4><strong>Introducing Flask</strong></h4><p><br></p><p>Before we begin installing Flask and setting up everything on AWS, let me first explain what exactly <strong>Flask</strong> is, and why we need it.</p><figure><img height="138" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_22-32-59-b6e6be615ec9b18d3f348b500492faca.JPG" width="340"></figure><p><strong>Flask</strong> is a micro web framework written in Python. It is classified as a microframework because it does not require particular tools or libraries. It has no database abstraction layer, form validation, or any other components where pre-existing third-party libraries provide common functions. This allows us to easily build simple APIs without much effort.</p><p><strong>Step 1: Installing Flask</strong></p><p>If using your VM, activate your environment (so that we can use OpenCV and Keras):</p><p><code>source activate cv</code> (or whatever your environment is named)</p><p><code>pip install flask</code></p><p>That's it, Flask is ready to go!</p><p>Let's run a simple Flask test.</p><p><strong>Step 2:&nbsp;</strong>Install a code editor, since we won't be using IPython Notebooks. I recommend Sublime, but VSCode, Atom,&nbsp;Spyder, and PyCharm are all great!</p><p>If using Windows or Mac, visit the Sublime website to get the download link - <a href="https://www.sublimetext.com/download" rel="noopener noreferrer" target="_blank">https://www.sublimetext.com/download</a></p><p>Ubuntu/VM users can bring up the Ubuntu Software app, search for Sublime and install it from there.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_22-53-13-6613cddd9daf61ecf9d07c4d56e2c38c.JPG"></figure><p>Open up your chosen text/code editor, copy the code below into it, and save it in an easy-to-find directory, e.g.</p><p><code>/home/deeplearningcv/MyFlaskProjects</code></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_23-00-53-bfd6feec89e093ff03e58bc52e8d3e94.JPG"></figure><p>This is the code for your first simple Flask app, the 'hello world' app.</p><p><strong>Also, see Resources to download all code required for this section.</strong></p><pre class="prettyprint linenums">from flask import Flask
+ app = Flask(__name__)
+
+ @app.route("/")
+ def hello():
+     return "Hello World!"
+
+ if __name__ == "__main__":
+     app.run()</pre><p>Once your code is saved, use Terminal or Command Prompt to navigate to the folder where you saved your helloworld.py file.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_23-26-30-dbbc9629d30892671866580522faea9f.JPG"></figure><p>Upon executing this code, you'll see the following:</p><p>This means your server is running successfully and you can now go to your web browser and view your hello world project!</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_23-32-38-e41640a083fa6900739c4eddf0143da9.JPG"></figure><p>You should see this if you enter <code>127.0.0.1:5000</code> in your browser. This means your Flask server is pushing/serving the text "Hello World!" to your localhost on port 5000.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_23-34-14-2a757fe288b275bff67d55a208fef895.JPG"></figure><p>Remember to press <strong>CTRL+C</strong> to exit (when in Terminal).</p><p>Congratulations! Now you're ready to start testing your Computer&nbsp;Vision&nbsp;API locally, which we'll do in the next chapter!</p>
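+ <p>As a quick sanity check, you can also hit the endpoint programmatically. Here is a minimal sketch using the <code>requests</code> library (<code>pip install requests</code> if you don't have it), assuming the hello world server above is still running:</p><pre class="prettyprint linenums">import requests
+
+ # Request the root route served by our local Flask server
+ response = requests.get('http://127.0.0.1:5000/')
+ print(response.status_code)  # 200
+ print(response.text)         # Hello World!</pre>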
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/1.1 Download CV API Files.html ADDED
@@ -0,0 +1 @@
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1ScD6jTBcYW5RkTP1OwCCFoWXYYtZJguJ/view?usp=sharing";</script>
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/2. Running Your Computer Vision Web App on Flask Locally.html ADDED
@@ -0,0 +1,86 @@
+ <h4><strong>Our Flask&nbsp;WebApp Template Code</strong></h4><p><br></p><p>In the previous chapter, we created a simple Hello World app. We're now going to create a simple app that does the following:</p><ul><li><p>Allows you to upload an image</p></li><li><p>Uses OpenCV and Keras to do some operations on the image</p></li><li><p>Returns the outputs of our operations to the user</p></li></ul><p>We'll be making a simple app that uses OpenCV to find the dominant color (Red vs Green vs Blue) and determines whether the animal in the picture is a Cat or a Dog.</p><p><br></p><p><strong>Our Web&nbsp;App 1:</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_23-58-35-80e8fa501f933728e7f9ed8d002e9713.JPG"></figure><p><strong>Our Web&nbsp;App 2:</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_23-59-43-e3f4f1bffc29b2e9a51a7a354e9d87ac.JPG"></figure><p><strong>Our Web&nbsp;App 3:</strong></p><figure><img height="261" src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_23-58-35-170c078ea8cbb82743a7f77bbaf78b6a.JPG" width="434"></figure><p><strong>Our Web&nbsp;App 4:</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-22_23-58-35-98e72167229d356fe6e3429e79a43e33.JPG"></figure><p><br></p><h4><strong>NOTE</strong>: Download the code in the resources section</h4><p>The code for our web app is as follows:</p><pre class="prettyprint linenums">import os
+ from flask import Flask, flash, request, redirect, url_for, jsonify
+ from werkzeug.utils import secure_filename
+ import cv2
+ import keras
+ import numpy as np
+ from keras.models import load_model
+ from keras import backend as K
+
+ UPLOAD_FOLDER = './uploads/'
+ ALLOWED_EXTENSIONS = set(['png', 'jpg', 'jpeg'])
+ DEBUG = True
+ app = Flask(__name__)
+ app.config.from_object(__name__)
+ app.config['SECRET_KEY'] = '7d441f27d441f27567d441f2b6176a'
+ app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
+
+ def allowed_file(filename):
+     return '.' in filename and \
+            filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS
+
+ @app.route('/', methods=['GET', 'POST'])
+ def upload_file():
+     if request.method == 'POST':
+         # check if the post request has the file part
+         if 'file' not in request.files:
+             flash('No file part')
+             return redirect(request.url)
+         file = request.files['file']
+         if file.filename == '':
+             flash('No selected file')
+             return redirect(request.url)
+         if file and allowed_file(file.filename):
+             filename = secure_filename(file.filename)
+             file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
+             image = cv2.imread(os.path.dirname(os.path.realpath(__file__)) + "/uploads/" + filename)
+             color_result = getDominantColor(image)
+             result = catOrDog(image)
+             redirect(url_for('upload_file', filename=filename))
+             return '''
+             &lt;!doctype html&gt;
+             &lt;title&gt;Results&lt;/title&gt;
+             &lt;h1&gt;Image contains a - '''+result+'''&lt;/h1&gt;
+             &lt;h2&gt;Dominant color is - '''+color_result+'''&lt;/h2&gt;
+             &lt;form method=post enctype=multipart/form-data&gt;
+             &lt;input type=file name=file&gt;
+             &lt;input type=submit value=Upload&gt;
+             &lt;/form&gt;
+             '''
+     return '''
+     &lt;!doctype html&gt;
+     &lt;title&gt;Upload new File&lt;/title&gt;
+     &lt;h1&gt;Upload new File&lt;/h1&gt;
+     &lt;form method=post enctype=multipart/form-data&gt;
+     &lt;input type=file name=file&gt;
+     &lt;input type=submit value=Upload&gt;
+     &lt;/form&gt;
+     '''
+
+ def catOrDog(image):
+     '''Determines if the image contains a cat or dog'''
+     classifier = load_model('./models/cats_vs_dogs_V1.h5')  # re-loaded per request for simplicity
+     image = cv2.resize(image, (150, 150), interpolation=cv2.INTER_AREA)
+     image = image.reshape(1, 150, 150, 3)
+     res = str(classifier.predict_classes(image, 1, verbose=0)[0][0])
+     print(res)
+     print(type(res))
+     if res == "0":
+         res = "Cat"
+     else:
+         res = "Dog"
+     K.clear_session()
+     return res
+
+ def getDominantColor(image):
+     '''Returns the dominant color among Blue, Green and Red in the image'''
+     B, G, R = cv2.split(image)
+     B, G, R = np.sum(B), np.sum(G), np.sum(R)
+     color_sums = [B, G, R]
+     color_values = {"0": "Blue", "1": "Green", "2": "Red"}
+     return color_values[str(np.argmax(color_sums))]
+
+ if __name__ == "__main__":
+     app.run()
+ </pre><p>Running our web app in Terminal (or Command Prompt; it should be exactly the same):</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_00-01-02-ece80015f52c109880a2dcb76dfeede8.JPG"></figure><p><br></p><p><strong>Brief Code Description:</strong></p><p><strong>Lines 1 to 8:</strong> Importing our Flask functions and the other relevant libraries we'll be using in this program, such as <em>werkzeug, keras, numpy</em> and <em>opencv</em>.</p><p><strong>Lines 10 to 16: </strong>Defining our paths and allowed files, and setting our Flask parameters. Look up the Flask documentation to better understand these.</p><p><strong>Lines 18 to 20:</strong> Our 'allowed files' function; it simply checks the extension of the selected file to ensure only images are uploaded.</p><p><strong>Line 22:</strong>&nbsp;The route()&nbsp;function of Flask. It is a decorator that tells the application which URL should call the associated function.</p><p><strong>Lines 23 to 58:</strong> Our main function; it responds to both GET&nbsp;and POST&nbsp;requests.&nbsp;These are HTTP&nbsp;methods that form the basis of data communication over the internet. GET sends data in unencrypted form to the server and is the most common method. POST is used to send HTML form data to the server; data received by the POST method is not cached by the server. We're using an unconventional method of serving the HTML to our client: typically Flask apps use templates stored in a Templates folder which contain the HTML. However, this is a simple web app and it's better for your understanding if we serve the HTML like this. Note we have two blocks of HTML: one is the default page prompting the user to upload an image, and the second sends the response to the user.</p><p><strong>Lines 60 to 73:&nbsp;</strong>This is our Cats vs Dogs function; it takes an image and outputs a string stating which animal is found in the image, either "Cat" or "Dog".</p><p><strong>Lines 75 to 81: </strong>This function sums all the Blue, Green and Red color components of an image and returns the color with the largest sum.</p><p><strong>Lines 83 to 84:</strong> Our main code that runs the Flask app by calling the app.run() function.</p><p><br></p><p><strong>Our Folder Setup:</strong></p><p><code>MyFlaskProjects/</code></p><p><code>------models/&nbsp;</code>(where our Keras cats_vs_dogs_V1.h5 model is stored)</p><p><code>------uploads/</code> (where our uploaded files are stored)</p><p><code>------webapp.py</code></p><p>(Note the lowercase folder names, matching the paths used in the code above.)</p><p><br></p><p>Feel free to experiment with different images! Let's now move on to a variation of this code that acts as a standalone API.</p>
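+ <p>You can also exercise the web app without a browser. Here is a minimal sketch using the <code>requests</code> library (the filename <code>cat.jpg</code> is just a placeholder for any test image you have locally):</p><pre class="prettyprint linenums">import requests
+
+ # Post a test image to the upload form as multipart/form-data;
+ # the field name 'file' must match the form input in webapp.py
+ with open('cat.jpg', 'rb') as f:
+     response = requests.post('http://127.0.0.1:5000/', files={'file': f})
+
+ print(response.status_code)
+ print(response.text)  # the results HTML page</pre>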
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/3. Running Your Computer Vision API.html ADDED
@@ -0,0 +1,76 @@
+ <h4><strong>Our Flask API Template Code</strong></h4><p><br></p><p>In the previous chapter, we created a Web App that's accessible via the web browser. Pretty cool! But what if we wanted to call this functionality from different apps, e.g. a native Android or iOS app?</p><p><br>Let's turn it into a RESTful API that returns simple JSON responses encapsulating the results.</p><p><strong>NOTE</strong>: A <strong>RESTful API</strong> is an application program interface (<strong>API</strong>) that uses HTTP requests to GET, PUT, POST and DELETE data.</p><p><br></p><p><strong>Step 1: </strong>Firstly, install <strong>Postman</strong> to test our API:</p><ul><li><p>Windows/Mac - <a href="https://www.getpostman.com/downloads/" rel="noopener noreferrer" target="_blank">Download and install here</a></p></li><li><p>Ubuntu users - Launch Ubuntu Software and install it from there</p></li></ul><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_06-16-17-a75ad1a8a678b5837139fcab17cd4f6f.JPG"></figure><p><strong>Step 2: </strong>Our Flask API code:</p><pre class="prettyprint linenums">import os
+ from flask import Flask, flash, request, redirect, url_for, jsonify
+ from werkzeug.utils import secure_filename
+ import cv2
+ import numpy as np
+ import keras
+ from keras.models import load_model
+ from keras import backend as K
+
+ UPLOAD_FOLDER = './uploads/'
+ ALLOWED_EXTENSIONS = set(['png', 'jpg', 'jpeg'])
+ DEBUG = True
+ app = Flask(__name__)
+ app.config.from_object(__name__)
+ app.config['SECRET_KEY'] = '7d441f27d441f27567d441f2b6176a'
+ app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
+
+ def allowed_file(filename):
+     return '.' in filename and \
+            filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS
+
+ @app.route('/', methods=['GET', 'POST'])
+ def upload_file():
+     if request.method == 'POST':
+         # check if the post request has the file part
+         if 'file' not in request.files:
+             flash('No file part')
+             return redirect(request.url)
+         file = request.files['file']
+         # if user does not select file, browser also
+         # submits an empty part without filename
+         if file.filename == '':
+             flash('No selected file')
+             return redirect(request.url)
+         if file and allowed_file(file.filename):
+             filename = secure_filename(file.filename)
+             file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
+             image = cv2.imread(os.path.dirname(os.path.realpath(__file__)) + "/uploads/" + filename)
+             color_result = getDominantColor(image)
+             dogOrCat = catOrDog(image)
+             # instead of redirecting to an HTML page, return the results as JSON
+             return jsonify({"MainColor": color_result, "catOrDog": dogOrCat})
+     return '''
+     &lt;!doctype html&gt;
+     &lt;title&gt;API&lt;/title&gt;
+     &lt;h1&gt;API Running Successfully&lt;/h1&gt;'''
+
+ def catOrDog(image):
+     '''Determines if the image contains a cat or dog'''
+     classifier = load_model('./models/cats_vs_dogs_V1.h5')
+     image = cv2.resize(image, (150, 150), interpolation=cv2.INTER_AREA)
+     image = image.reshape(1, 150, 150, 3)
+     res = str(classifier.predict_classes(image, 1, verbose=0)[0][0])
+     print(res)
+     print(type(res))
+     if res == "0":
+         res = "Cat"
+     else:
+         res = "Dog"
+     K.clear_session()
+     return res
+
+ def getDominantColor(image):
+     '''Returns the dominant color among Blue, Green and Red in the image'''
+     B, G, R = cv2.split(image)
+     B, G, R = np.sum(B), np.sum(G), np.sum(R)
+     color_sums = [B, G, R]
+     color_values = {"0": "Blue", "1": "Green", "2": "Red"}
+     return color_values[str(np.argmax(color_sums))]
+
+ if __name__ == "__main__":
+     app.run()</pre><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_00-48-49-5db46541ad63d65d835cdd569211c973.JPG"></figure><p><strong>Using Postman (see the image above and the corresponding numbered steps below):</strong></p><ol><li><p>Change the request method to POST</p></li><li><p>Enter the localhost address: http://127.0.0.1:5000/</p></li><li><p>Change the tab to Body</p></li><li><p>Select the form-data radio button</p></li><li><p>From the drop-down, select the Key type to be File</p></li><li><p>For Value, select one of our test images</p></li><li><p>Click Send to send our image to our API</p></li><li><p>Our response will be shown in the window below.</p></li></ol><p>The output is a JSON response containing:</p><pre class="prettyprint linenums">{
+     "MainColor": "Red",
+     "catOrDog": "Cat"
+ }</pre><p>The cool thing about using Postman is that we can generate the code to call this API in several different languages.</p><p>See the blue box to bring up the code box:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_00-54-19-04fa84ce24394a4b9df2347dee6ac632.jpg"></figure><p>Code Generator</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_00-55-22-1aaeea8249c6c70d464a1ece0466b273.JPG"></figure><p><strong>Now that you've got your Flask API and Web App working, let's look at deploying this on AWS using an EC2 Instance.</strong></p>
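+ <p>Alongside Postman's generated snippets, calling the API from Python is just a few lines with the <code>requests</code> library. A minimal sketch (again, <code>cat.jpg</code> is a placeholder test image):</p><pre class="prettyprint linenums">import requests
+
+ # Post an image and decode the JSON result returned by our Flask API
+ with open('cat.jpg', 'rb') as f:
+     response = requests.post('http://127.0.0.1:5000/', files={'file': f})
+
+ result = response.json()
+ print(result)              # e.g. {'MainColor': 'Red', 'catOrDog': 'Cat'}
+ print(result['catOrDog'])</pre>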
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/4. Setting Up An AWS Account.html ADDED
@@ -0,0 +1 @@
+ <h4><strong>Setting up AWS</strong></h4><p><br></p><p><strong>Step 1: </strong>Go to <a href="https://aws.amazon.com/" rel="noopener noreferrer" target="_blank">https://aws.amazon.com/</a> and click on Complete Sign Up </p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_01-06-56-916201cbff1910166f9da1db21d28155.JPG"></figure><p><strong>Step 2: </strong>Create New Account</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_01-08-34-ff0bf36b97e2c4609d84a5e41b302e7a.JPG"></figure><p><strong>Step 3:</strong> Enter your main account details </p><p><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_01-05-33-d59697bc4760ae5f6188b43575779bbf.JPG"></p><p><strong>Step 4: </strong>Enter Contact Info</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_01-10-38-f288864877ba579880e7dcf59fca8bc2.JPG"></figure><p><strong>Step 5: </strong>Enter your verification code</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_01-11-02-19874a11e64e4bb498c83e8f009f7519.JPG"></figure><p><strong>Step 6: </strong>Select the Free Basic Plan</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_01-05-33-41a548a0f53eab00e4df36218f53f6d9.JPG"></figure><p><strong>Step 7: </strong>Sign up is now complete! Sign in to your account now.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_01-05-33-cd82af8c20fc8cb5fd4382c0b02e599a.JPG"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_01-05-33-5276c92f7ea948c9ca03d7d26fbe8368.JPG"></figure><p>Your AWS Account is Complete!</p><p>Let's now create our EC2 Instance.</p>
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/5. Setting Up Your AWS EC2 Instance & Installing Keras, TensorFlow, OpenCV & Flask.html ADDED
@@ -0,0 +1,9 @@
+ <h4><strong>Launching Your EC2 Instance</strong></h4><p><br></p><p><strong>Step 1: </strong>Your landing page upon logging in should be the AWS Management Console (see below). Click on the area highlighted in yellow that says "<strong>Launch a virtual machine" with EC2.</strong></p><p><strong>What is EC2?</strong> Amazon Elastic Compute Cloud (<strong>Amazon EC2</strong>) provides scalable computing capacity in the Amazon Web Services (AWS) cloud. Using Amazon EC2 eliminates your need to invest in hardware up front, so you can develop and deploy applications faster.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-0aee9b0a53c5ff7d333b8dd78c366150.png"></figure><p><strong>Step 2: </strong>As a Basic Plan (FREE) user, you're eligible to launch and use any of the "Free tier" AMIs. The one we'll be using in this project is the <strong>Ubuntu Server 18.04 LTS (HVM), SSD Volume Type.</strong> See the image below; you'll have to scroll down a bit to find it amongst the many AMIs. We use this image as it makes it quite easy to get everything up and running.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-4d7f42fec9fa73be91dd3f3a3220528e.JPG"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-d96a0fbdc7d781c023e2757b48448c94.png"></figure><p><strong>Step 3: </strong>Choosing an Instance Type - we've already chosen the OS we want to use (Ubuntu 18.04); now we have to choose a hardware configuration. We aren't given much of a choice due to our Free Tier limitations, but the <strong>t2.micro</strong> will suffice for our application's needs for now. Click<strong> Review and Launch</strong> (highlighted in yellow).</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-ed95c62d0d19c294264eaeb85edcccca.png"></figure><p><strong>Step 4: </strong>Reviewing and launching. No need to change anything here; the defaults are fine. Click <strong>Launch</strong> (highlighted in yellow).</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-0025d77e49af0f8a63c75d9e48eaaff7.png"></figure><p><strong>Step 5A:</strong> Creating a new Key Pair. If this is your first time, you will need to create a new Key Pair. Click on the drop-down highlighted in yellow below.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-86bd4ffddae1ee4571d968a8f83eb03c.png"></figure><p><strong>Step 5B:</strong> Give your key a name, e.g. my_aws_key, and Download Key Pair (please don't lose this file, otherwise you won't be able to log in to your EC2 instance).</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-abdb2db33153383c11e9b9b2ba8d3507.png"></figure><p><strong>Step 5C:</strong> Your instance has been launched; it will take a few minutes for AWS to create it and have it up and running.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-4cb08dac37637149ccea79e3ac7c5308.JPG"></figure><p><strong>Step 6A: </strong>Clicking View Instances brings up the EC2 Dashboard.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_04-43-09-5ca178d2ec6fd40b8d12cb7cfb4be0aa.JPG"></figure><p><strong>Step 6B: </strong>To view the details of the individual instance you just launched, click on Instances in the left panel.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_04-42-05-97d85a5b7eddd3f6b1daa5f13e628bbb.JPG"></figure><p><strong>Step 7:</strong> Viewing your instance's public IP. Note your IP, as you'll need it later when connecting to your instance.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-7078b5ec1ac74f7cef875bcb375f4491.JPG"></figure><p><strong>Step 8:</strong> These steps assume you're using Ubuntu on the VM (if not, it's highly recommended, as I find using Putty and SSH keys on Windows to be sometimes problematic. Windows isn't impossible though, so you're welcome to try the guide here: <a href="https://docs.aws.amazon.com/codedeploy/latest/userguide/tutorials-windows-launch-instance.html" rel="noopener noreferrer" target="_blank">https://docs.aws.amazon.com/codedeploy/latest/userguide/tutorials-windows-launch-instance.html</a>).</p><p>Mac instructions are the same as the Ubuntu instructions.</p><p>Let's chmod your downloaded key to protect it from being overwritten. If you've never used ssh on your VM, create a .ssh folder in your home directory. The following commands also assume your key is in the Downloads directory.</p><pre class="prettyprint linenums">mkdir -p ${HOME}/.ssh
+ cp ~/Downloads/my_aws_key.pem ~/.ssh
+ chmod 400 ~/.ssh/my_aws_key.pem</pre><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-801b1770189ee4cf1c938303db5c8e1a.JPG"></figure><p><strong>Step 9: </strong>Connect to your EC2 instance via Terminal by running this (substituting the public IP you noted in Step 7):</p><pre class="prettyprint linenums">ssh -i ~/.ssh/my_aws_key.pem ubuntu@18.224.96.38</pre><p>You'll be prompted by the following; type yes and hit Enter.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_04-55-08-2c5ae7fb992ca463892c46e286737ec1.JPG"></figure><p>You'll now be greeted by this! Congratulations, you've connected to your EC2 Instance successfully!</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-2dd7621c922e4b091bff0b7ad3946f5a.JPG"></figure><p><strong>Step 10:</strong> Install the necessary packages by executing the following commands:</p><pre class="prettyprint linenums">sudo apt update
+ sudo apt install python3-pip
+ sudo pip3 install tensorflow
+ sudo pip3 install keras
+ sudo pip3 install flask</pre><p><strong>Note</strong>: As of April 2019, <code>sudo pip3 install opencv-python</code> appears to be broken; if importing cv2 fails, run <code>sudo apt install python3-opencv</code>.</p><pre class="prettyprint linenums">sudo pip3 install opencv-python
+ sudo apt install python3-opencv</pre><p>Screenshot of the TensorFlow install:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-24960fe6ebff21df1e02c50823526ade.jpg"></figure><p>Screenshot of the Keras install:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-403c9055c8dddcf04c44cff5c7a2ec17.jpg"></figure><p>Screenshot of the Flask install:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_03-06-48-cf2ee18e183e6455ff9d01957dece8af.jpg"></figure><p>We've successfully installed all libraries needed for our API and Web App on our EC2 Instance!</p><p><strong>In the next chapters, we'll:</strong></p><ul><li><p>Set up FileZilla to transfer our code to your EC2 instance</p></li><li><p>Set up our Security Group on AWS to allow TCP traffic.</p></li></ul>
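+ <p>Before moving on, it's worth confirming that everything imports cleanly on the instance. A quick sanity check you can run inside <code>python3</code> (your version numbers will differ):</p><pre class="prettyprint linenums"># Sanity check: all four libraries should import without errors
+ import tensorflow, keras, cv2, flask
+ print('TensorFlow:', tensorflow.__version__)
+ print('Keras:', keras.__version__)
+ print('OpenCV:', cv2.__version__)
+ print('Flask:', flask.__version__)</pre>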
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/6. Changing your EC2 Security Group.html ADDED
@@ -0,0 +1 @@
+ <h4><strong>Changing your EC2 Security Group to allow HTTP traffic</strong></h4><p><br></p><p><strong>Step 1:</strong> In your AWS EC2 Console, click on <strong>Instances</strong> in the left panel. Then click on the blue text under the <strong>Security Groups column</strong>; it may be named either '<strong>default</strong>' or '<strong>launch-wizard-1</strong>'. This brings up the <strong>Security Group page.</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-41-31-98f629ee997a6a8436f8283d628ae498.jpg"></figure><p><strong>Step 2A: </strong>Changing the Security Group inbound rules. Right-click on the security group shown below.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-41-31-9b235eb73431c4e7f70aeb86d4e3fca1.JPG"></figure><p><strong>Step 2B:</strong> Click on <strong>Edit Inbound Rules</strong>.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-41-31-98d9f2ee4cfa0e45ed3ea763a6644724.jpg"></figure><p><strong>Step 2C: </strong>Change the rules to those shown below.</p><p><strong>Type</strong> - All Traffic (Protocol will be automatically set to All, and Port Range to 0-65535)</p><p><strong>Source</strong> - Custom, which should autofill the box to the right with 0.0.0.0/0 (this indicates we are accepting traffic from all IPs).</p><p>Click Save and that's it; your Security Group settings are now set.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-41-31-15840abd9ac7b2c2c15daceffc695821.JPG"></figure>
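+ <p>If you'd rather script this than click through the console, the same rule can be added with the boto3 library. This is only a sketch under assumptions: your AWS credentials are configured locally, and <code>sg-xxxxxxxx</code> stands in for your actual security group id. Also note that allowing all traffic from 0.0.0.0/0 is acceptable for a demo, but far too permissive for production.</p><pre class="prettyprint linenums">import boto3
+
+ ec2 = boto3.client('ec2')
+
+ # Open all protocols/ports (IpProtocol='-1') to all IPs - demo only!
+ ec2.authorize_security_group_ingress(
+     GroupId='sg-xxxxxxxx',  # replace with your security group id
+     IpProtocol='-1',
+     CidrIp='0.0.0.0/0')</pre>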
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/7. Using FileZilla to transfer files to your EC2 Instance.html ADDED
@@ -0,0 +1,2 @@
+ <h4><strong>Using FileZilla to Connect to Your EC2 Instance and Transfer Files</strong></h4><p><br></p><p><strong>Step 1: Install FileZilla</strong></p><ul><li><p>Windows and Mac users can download and install it here - <a href="https://filezilla-project.org/" rel="noopener noreferrer" target="_blank">https://filezilla-project.org/</a></p></li><li><p>Ubuntu users can open the Terminal and install it as follows:</p></li><li><pre class="prettyprint linenums">sudo apt-get update
+ sudo apt-get install filezilla</pre></li></ul><p>After installing, you can find and launch FileZilla by simply searching for it:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-59-4136243a8af88a3f38b752a9adbeb7bb.JPG"></figure><p><strong>Step 2:</strong> Let's load our SSH key into FileZilla. Open FileZilla and click Preferences in the main toolbar (this should be the same for Windows and Mac users).</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-59-392a1cbb6dc5cd76196f692d6a2bca80.JPG"></figure><p><strong>Step 3A: </strong>In the left panel, select <strong>SFTP</strong> and then click <strong>Add key file...</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-59-b8ac3f9c8b4d1633f472204099911d28.JPG"></figure><p><strong>Step 3B: </strong>To find your ssh key, navigate to your ".ssh" directory. As it is a hidden directory in Ubuntu, you can hit CTRL+L, which changes the Location bar into an address bar; simply type ".ssh/" as shown below to find your ssh key directory.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-59-fd67f533da61e4b6e6a3f8173c7fd18f.JPG"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-59-e861aeafa0d05fdc4b1221c283cd52fd.JPG"></figure><p><strong>Step 4A: Connecting to your EC2.</strong> Now that you've added your key, we can create a new connection.</p><p>On the main toolbar again, click <strong>File</strong> and <strong>Site Manager</strong>.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-59-7094ce39622c5205fc7a199c97eabead.JPG"></figure><p><strong>Step 4B: </strong>This brings up the Site Manager window. Click<strong> New Site</strong> as shown below.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-24-3ebedc419bfba9facc83d3ca94ee6aed.JPG"></figure><p><strong>Step 4C: </strong>Enter the IP (<strong>IPv4 Public IP</strong>) of your EC2 Instance (found under Instances in your AWS Console Manager).</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-24-119a9990f34809b4d270b44a0920be88.JPG"></figure><p><strong>Step 4D: </strong>Change the <strong>Protocol</strong> to <strong>SFTP</strong>.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-24-c70e300f0e3bea0e4e019569b9aae7cb.JPG"></figure><p><strong>Step 4E:</strong> Change the<strong> Logon Type</strong> to <strong>Normal</strong>.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-24-5ec407abfc844c69d20c216b7e701bfb.JPG"></figure><p><strong>Step 4F:</strong> Set the <strong>Username</strong> to <strong>ubuntu</strong>. Leave the password field alone.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-24-272f537a13d6ca250ac3e5e12035f2cd.JPG"></figure><p><strong>Step 4G:</strong> <strong>Check</strong> the "Always trust this host, add this key to the cache" box and click <strong>OK</strong>.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-24-7deb4172c6ab15ff64fa4e6de5bcc977.JPG"></figure><p><strong>Step 5: We've connected!</strong></p><p>The numbered boxes below show the main windows we'll be using in FileZilla:</p><p>1. The connection status box, which shows the logs as the system connects to your EC2 instance.</p><p>2. These are the local files on your system.</p><p>3. These are the files on your EC2 VM. By default, both of these panes start in the respective home directories.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-17-24-382888a7caaa7e085dbed8ba415276ee.JPG"></figure><p><strong>Step 6: </strong>Navigate to the Flask python files you downloaded in the previous resource section and drag them across to your EC2 home directory (the default directory).</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-36-01-9339191f4d972516df58fda9846a758f.JPG"></figure><p><strong>Great! You're almost ready to have a fully working CV API live on the web!</strong></p><p>Next, we need to change the security group of our EC2 instance to allow HTTP inbound traffic.</p>
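+ <p>FileZilla works well, but the same transfer can also be scripted. Here is a minimal SFTP sketch using the paramiko library (<code>pip install paramiko</code>), assuming the key location from earlier and the example instance IP used in this section:</p><pre class="prettyprint linenums">import os
+ import paramiko
+
+ # SSH into the instance with the same key pair used for the terminal
+ client = paramiko.SSHClient()
+ client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
+ client.connect('18.224.96.38', username='ubuntu',
+                key_filename=os.path.expanduser('~/.ssh/my_aws_key.pem'))
+
+ # Copy our Flask app into the instance's home directory over SFTP
+ sftp = client.open_sftp()
+ sftp.put('webapp.py', '/home/ubuntu/webapp.py')
+ sftp.close()
+ client.close()</pre>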
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/8. Running your CV Web App on EC2.html ADDED
@@ -0,0 +1 @@
+ <h4><strong>Running Our CV Web App</strong></h4><p><br></p><p><strong>Step 1</strong>: Let's go back to the terminal in our VM or local OS and connect to our EC2 instance:</p><p><code>ssh -i ~/.ssh/my_aws_key.pem ubuntu@18.224.96.38</code></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-55-04-055aacf3331c034d10cb070afd33191b.JPG"></figure><p><strong>Step 2: </strong>Check that the files we transferred using FileZilla are indeed located on our EC2 instance. Running <code>ls</code> should display the following files:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-55-04-11abbea6c691e37aed4e7ee95c7559d0.JPG"></figure><p><strong>Step 3: </strong>Launch our Web App by running this line:</p><p><code>sudo python3 webapp.py</code></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-55-04-6eca288bff15051487a40c55bb77c6a1.JPG"></figure><p>If you see the above, Flask is running our CV Web App! Let's go to our web browser and access it.</p><p><strong>Step 4: </strong>Enter the EC2 IP in your web browser and you should see the following.</p><p><strong>Congratulations</strong>! You've just launched a simple Computer Vision Web App that anyone in the world can use! Let's upload some test images to see it in action!</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-55-04-467b63c0ea3e8de2a4b9cb0c33db2693.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-55-04-1d6c77e1e20a7dd571cefa84d2fd6ea7.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-55-04-47742e4675b799abf7c93f81765e6101.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_05-55-04-8d41647668253e2f92f8c4ac6de5d166.png"></figure><p>Awesome!</p><p><strong>Note: </strong></p><ol><li><p>This isn't a production-grade Web App; there are many things needed to make it ready for public consumption (mainly around security, scalability and appearance). However, as it stands it's perfect for demoing to friends, colleagues, potential investors etc.</p></li><li><p>To keep the Web App running after you disconnect, launch the Python code like this instead (or run it inside a screen session):</p></li></ol><p><code>sudo nohup python3 webapp.py &amp;</code></p><p><br></p><p><strong>Finally, let's deploy our API version of this app and test it using Postman.</strong></p>
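+ <p>One reachability note, hedged: by default Flask's <code>app.run()</code> binds to 127.0.0.1:5000, which is not reachable from outside the instance. The EC2 version of the code in the resources presumably handles this already; if your copy doesn't, the entry point of webapp.py would need something like the sketch below.</p><pre class="prettyprint linenums">if __name__ == "__main__":
+     # Listen on all interfaces, port 80, so the app is reachable at
+     # http://&lt;EC2-public-IP&gt;/ (binding to port 80 requires sudo)
+     app.run(host='0.0.0.0', port=80)</pre>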
29. BONUS - Create a Computer Vision API & Web App Using Flask and AWS/9. Running your CV API on EC2.html ADDED
@@ -0,0 +1,4 @@
+ <h4><strong>Running your CV API on EC2</strong></h4><p><br>Postman is an extremely useful app for testing APIs. We'll be using it to demonstrate our CV API.</p><p><br></p><p><strong>Step 1: Install Postman</strong></p><p>Windows/Mac users - <a href="https://www.getpostman.com/downloads/" rel="noopener noreferrer" target="_blank">https://www.getpostman.com/downloads/</a></p><p>Ubuntu - Open Ubuntu Software, search for Postman and install.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_06-06-41-8ce224de735c72f9cee7fd145abcbab4.JPG"></figure><p><br></p><p><strong>Step 2: </strong>Setting up Postman to test your EC2 API.</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_06-12-53-d966f3cbf2ec41dac34a0b3e943572bc.jpg"></figure><p><strong>Using Postman (see the image above and the corresponding numbered steps below):</strong></p><ol><li><p>Change the request method to POST</p></li><li><p>Enter the public IP address of your EC2 instance: <strong>http://18.224.96.38</strong></p></li><li><p>Change the tab to Body</p></li><li><p>Select the form-data radio button</p></li><li><p>From the drop-down, select the Key type to be File</p></li><li><p>For Value, select one of our test images</p></li><li><p>Click Send to send our image to our API</p></li><li><p>Our response will be shown in the window below.</p></li></ol><p><strong>The output is a JSON response containing:</strong></p><pre class="prettyprint linenums">{
+     "MainColor": "Red",
+     "catOrDog": "Cat"
+ }</pre><p>The cool thing about using Postman is that we can generate the code to call this API in several different languages.</p><p>See the blue box to bring up the <strong>code</strong> box:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_06-15-21-9b7592ec57e8a4769ac2d8d8d649580c.jpg"></figure><p><strong>Code Generator</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-23_00-55-22-1aaeea8249c6c70d464a1ece0466b273.JPG"></figure><p><strong>Note</strong>: To keep our API running after you disconnect, launch the Python code using this instead (or run it inside a screen session):</p><p><code>sudo nohup python3 forms.py &amp;</code></p><p><br></p><h4><strong>Conclusion</strong></h4><p><br></p><p>You've just successfully deployed a Computer Vision Web App and API that can be used by users anywhere in the world! This opens up many possibilities, from developing mobile apps to startup products and research tools. However, it should be noted there are many improvements that can be made:</p><ul><li><p>Saving our uploaded images to an S3 Bucket. We don't want to store all these images on our API server; it wasn't designed for mass storage. It's quite easy to configure an S3 bucket and store all uploaded images there. If privacy is a concern, we can skip storing them entirely and use Python to delete each image after it's been processed (see the sketch below).</p></li><li><p>Using Nginx or Gunicorn or something similar as the main server for our API.</p></li><li><p>Using a more scalable design (this is far from my expertise, but if you ever wanted to support many simultaneous users, you'll need to make some adjustments).</p></li></ul><p><br></p>
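+ <p>For that deletion idea, a minimal sketch: these two lines would go at the end of the upload handler in our Flask code, after both results have been computed (<code>os</code> is already imported there).</p><pre class="prettyprint linenums"># Delete the uploaded image once we've extracted the results from it
+ os.remove(os.path.join(app.config['UPLOAD_FOLDER'], filename))</pre>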
3. Installation Guide/3. Setting up your Deep Learning Virtual Machine (Download Code, VM & Slides here!).srt ADDED
@@ -0,0 +1,651 @@
+ 1
2
+ 00:00:00,550 --> 00:00:00,880
3
+ All right.
4
+
5
+ 2
6
+ 00:00:00,880 --> 00:00:05,500
7
+ So welcome to chapter tree where we set up a planning development machine.
8
+
9
+ 3
10
+ 00:00:05,500 --> 00:00:10,160
11
+ So now that you've gone through the intro you're now ready to get started setting up your machine.
12
+
13
+ 4
14
+ 00:00:10,240 --> 00:00:11,730
15
+ So let's get to it.
16
+
17
+ 5
18
+ 00:00:11,740 --> 00:00:17,380
19
+ So now the first step in setting up a deepening virtual machine is you need to download Virtual Box
20
+
21
+ 6
22
+ 00:00:17,440 --> 00:00:24,100
23
+ Virtual Box is a software that allows us to actually start an entirely new operating system whether
24
+
25
+ 7
26
+ 00:00:24,130 --> 00:00:30,460
27
+ it be Linux Windows or Mac OS although it's not fully supported for Mac OS but you can actually run
28
+
29
+ 8
30
+ 00:00:30,490 --> 00:00:33,500
31
+ an entire new operating system with in a virtual box.
32
+
33
+ 9
34
+ 00:00:33,510 --> 00:00:34,170
35
+ It's super cool.
36
+
37
+ 10
38
+ 00:00:34,160 --> 00:00:35,560
39
+ It's called ritualize.
40
+
41
+ 11
42
+ 00:00:35,800 --> 00:00:43,960
43
+ And it is super fast on new use didn't experience any slowdown really and it is entirely useful for
44
+
45
+ 12
46
+ 00:00:43,960 --> 00:00:49,580
47
+ us to set up our deepening machine because deepening software on Windows is not well-supported.
48
+
49
+ 13
50
+ 00:00:49,660 --> 00:00:55,450
51
+ However on Linux or Ubuntu it is a very good environment for running on our software.
52
+
53
+ 14
54
+ 00:00:55,780 --> 00:01:01,740
55
+ So what you do now is you download either to Windows OS X versions according to what you're using.
56
+
57
+ 15
58
+ 00:01:01,960 --> 00:01:03,480
59
+ So I'm not going to be Windows.
60
+
61
+ 16
62
+ 00:01:03,480 --> 00:01:05,020
63
+ I've really done with it here.
64
+
65
+ 17
66
+ 00:01:05,380 --> 00:01:13,390
67
+ So now what you do as well is you download from the link in the resources file download the deepening
68
+
69
+ 18
70
+ 00:01:13,420 --> 00:01:15,100
71
+ C-v open to that.
72
+
73
+ 19
74
+ 00:01:15,180 --> 00:01:15,950
75
+ File.
76
+
77
+ 20
78
+ 00:01:16,210 --> 00:01:21,450
79
+ This is your wish your machine that we're going to set up so that you have those two files.
80
+
81
+ 21
82
+ 00:01:21,760 --> 00:01:23,840
83
+ Here we are ready to get started.
84
+
85
+ 22
86
+ 00:01:23,890 --> 00:01:27,250
87
+ So now we have to firstly install virtual box.
88
+
89
+ 23
90
+ 00:01:27,250 --> 00:01:30,050
91
+ Now I know this file is quite huge just nine gigs.
92
+
93
+ 24
94
+ 00:01:30,220 --> 00:01:35,470
95
+ So you probably would have wanted to leave this overnight if you have a relatively slow broadband connection.
96
+
97
+ 25
98
+ 00:01:35,470 --> 00:01:38,350
99
+ However if you're a fast connection should take about an hour.
100
+
101
+ 26
102
+ 00:01:38,790 --> 00:01:40,870
103
+ Either way let's install virtual box
104
+
105
+ 27
106
+ 00:01:43,800 --> 00:01:46,150
107
+ so you get to run standard procedure.
108
+
109
+ 28
110
+ 00:01:49,150 --> 00:01:52,600
111
+ I'm not going to do the Mac install video because I don't have access to a Mac.
112
+
113
+ 29
114
+ 00:01:52,690 --> 00:01:55,590
115
+ But as soon as I do upload a video for you Mac users.
116
+
117
+ 30
118
+ 00:01:55,720 --> 00:02:00,340
119
+ However it's a standard installation procedure and all the settings are the same but generally for Mac
120
+
121
+ 31
122
+ 00:02:00,400 --> 00:02:01,370
123
+ and Windows.
124
+
125
+ 32
126
+ 00:02:01,420 --> 00:02:04,150
127
+ So this should relate to you as well.
128
+
129
+ 33
130
+ 00:02:04,810 --> 00:02:06,810
131
+ So you leave all the details here as well.
132
+
133
+ 34
134
+ 00:02:06,840 --> 00:02:08,480
135
+ It's best next.
136
+
137
+ 35
138
+ 00:02:08,500 --> 00:02:17,590
139
+ Next again you can optionally can do this if you want find this is fine press yes install.
140
+
141
+ 36
142
+ 00:02:17,610 --> 00:02:18,150
143
+ Do we go
144
+
145
+ 37
146
+ 00:02:23,490 --> 00:02:23,840
147
+ OK.
148
+
149
+ 38
150
+ 00:02:23,870 --> 00:02:27,600
151
+ So we've finished installation process and now you can start.
152
+
153
+ 39
154
+ 00:02:27,650 --> 00:02:31,130
155
+ You'll get your rocks off and I'll show you how to get everything set up.
156
+
157
+ 40
158
+ 00:02:31,130 --> 00:02:31,470
159
+ All right.
160
+
161
+ 41
162
+ 00:02:31,470 --> 00:02:37,550
163
+ So here we are on a virtual box manage your software so please ignore these two virtual machines I have
164
+
165
+ 42
166
+ 00:02:37,550 --> 00:02:37,820
167
+ a break.
168
+
169
+ 43
170
+ 00:02:37,820 --> 00:02:42,930
171
+ No yours is going to be blank right now since semanticists and you're going to be using this.
172
+
173
+ 44
174
+ 00:02:42,950 --> 00:02:46,730
175
+ So go to file and import appliance.
176
+
177
+ 45
178
+ 00:02:46,940 --> 00:02:52,470
179
+ And now go to the directory where you downloaded your virtual box start of the file.
180
+
181
+ 46
182
+ 00:02:52,500 --> 00:02:53,330
183
+ All right.
184
+
185
+ 47
186
+ 00:02:53,460 --> 00:02:57,260
187
+ So go to open and press next.
188
+
189
+ 48
190
+ 00:02:57,260 --> 00:03:03,070
191
+ Now I have it set to my virtual machine to use force use and use it to exit from.
192
+
193
+ 49
194
+ 00:03:03,170 --> 00:03:05,370
195
+ That is basically of course CPI.
196
+
197
+ 50
198
+ 00:03:05,390 --> 00:03:10,550
199
+ So I have it logical cause and I have and get 16 gig of ram in my system.
200
+
201
+ 51
202
+ 00:03:10,550 --> 00:03:17,110
203
+ So you're going to want to put it to at least half of what your host system can support.
204
+
205
+ 52
206
+ 00:03:17,120 --> 00:03:20,690
207
+ So if you have a dual call you can use two here.
208
+
209
+ 53
210
+ 00:03:20,750 --> 00:03:24,330
211
+ And if you have it is a Frahm you can use for gigs here.
212
+
213
+ 54
214
+ 00:03:24,680 --> 00:03:27,960
215
+ So we can make those adjustments and leave everything else the same.
216
+
217
+ 55
218
+ 00:03:27,960 --> 00:03:32,830
219
+ It's fine and good to import.
220
+
221
+ 56
222
+ 00:03:33,020 --> 00:03:37,840
223
+ And now this time is basically a lie it's going to take maybe about 10 20 minutes.
224
+
225
+ 57
226
+ 00:03:37,840 --> 00:03:42,820
227
+ So I'm going to pause the video here and when it's done we can start off with your machine and I'll
228
+
229
+ 58
230
+ 00:03:42,820 --> 00:03:43,780
231
+ show you how to use it.
232
+
233
+ 59
234
+ 00:03:43,780 --> 00:03:47,580
235
+ It's quite easy and we'll get done it and set up whole code.
236
+
237
+ 60
238
+ 00:03:47,710 --> 00:03:48,960
239
+ And that should be at
240
+
241
+ 61
242
+ 00:03:51,780 --> 00:03:52,140
243
+ OK.
244
+
245
+ 62
246
+ 00:03:52,160 --> 00:03:53,000
247
+ So that's it.
248
+
249
+ 63
250
+ 00:03:53,000 --> 00:03:56,430
251
+ We have just installed virtual machine.
252
+
253
+ 64
254
+ 00:03:56,460 --> 00:04:00,050
255
+ So now this is the new one here because I already have it working here.
256
+
257
+ 65
258
+ 00:04:00,050 --> 00:04:02,960
259
+ It came up as under school one knows is going to be this one.
260
+
261
+ 66
262
+ 00:04:02,960 --> 00:04:04,980
263
+ So either way it's the same machine.
264
+
265
+ 67
266
+ 00:04:05,000 --> 00:04:09,230
267
+ So let's just start this one fresh one I just installed.
268
+
269
+ 68
270
+ 00:04:10,160 --> 00:04:21,270
271
+ It takes about maybe tity seconds to boot up at least on my system.
272
+
273
+ 69
274
+ 00:04:21,430 --> 00:04:24,340
275
+ This is all normal by the way it's a pudding's screenful and Ventoux
276
+
277
+ 70
278
+ 00:04:33,510 --> 00:04:34,120
279
+ There we go.
280
+
281
+ 71
282
+ 00:04:34,320 --> 00:04:35,650
283
+ So we're loaded.
284
+
285
+ 72
286
+ 00:04:35,880 --> 00:04:36,770
287
+ Oh yes.
288
+
289
+ 73
290
+ 00:04:36,780 --> 00:04:37,670
291
+ This is pretty cool.
292
+
293
+ 74
294
+ 00:04:37,670 --> 00:04:41,470
295
+ So now you're actually running Ubuntu within Windows.
296
+
297
+ 75
298
+ 00:04:41,850 --> 00:04:45,660
299
+ And it's because it's using of which lies those features on your CD.
300
+
301
+ 76
302
+ 00:04:46,020 --> 00:04:47,430
303
+ It's going to run pretty quick.
304
+
305
+ 77
306
+ 00:04:47,430 --> 00:04:49,460
307
+ You're not going to notice much of a slowdown.
308
+
309
+ 78
310
+ 00:04:49,470 --> 00:04:52,110
311
+ So what I want to do is two things I want to show you.
312
+
313
+ 79
314
+ 00:04:52,110 --> 00:04:55,580
315
+ First of all your password for your virtual machine.
316
+
317
+ 80
318
+ 00:04:55,800 --> 00:05:01,950
319
+ Whenever you do sudo or any sort of flick system settings we change users and stuff the password for
320
+
321
+ 81
322
+ 00:05:01,950 --> 00:05:05,960
323
+ this user is 1 1 2 2 treat 3 4 4.
324
+
325
+ 82
326
+ 00:05:06,180 --> 00:05:11,820
327
+ It's going to be in the touch resources of your file and I'm going to display it on screen as well.
328
+
329
+ 83
330
+ 00:05:11,940 --> 00:05:13,260
331
+ So you don't forget it.
332
+
333
+ 84
334
+ 00:05:13,290 --> 00:05:17,130
335
+ So let's just see something here.
336
+
337
+ 85
338
+ 00:05:17,130 --> 00:05:23,090
339
+ So what I want to show you is that we have our old machine this is all file structure here.
340
+
341
+ 86
342
+ 00:05:23,160 --> 00:05:25,640
343
+ So this is a home directory right now.
344
+
345
+ 87
346
+ 00:05:25,890 --> 00:05:28,690
347
+ We have a bunch of things installed here right now.
348
+
349
+ 88
350
+ 00:05:29,190 --> 00:05:33,380
351
+ Basically this these things are a lot of deepening libraries.
352
+
353
+ 89
354
+ 00:05:33,420 --> 00:05:38,550
355
+ And I have some instructions as well you don't need to use these right now but what I want to show you
356
+
357
+ 90
358
+ 00:05:38,550 --> 00:05:42,310
359
+ is to basically set up your code and train models.
360
+
361
+ 91
362
+ 00:05:42,330 --> 00:05:42,880
363
+ OK.
364
+
365
+ 92
366
+ 00:05:43,170 --> 00:05:44,300
367
+ So stay tuned.
368
+
369
+ 93
370
+ 00:05:44,650 --> 00:05:44,950
371
+ OK.
372
+
373
+ 94
374
+ 00:05:44,970 --> 00:05:53,210
375
+ So what I'm going to need you to do is open Firefox that's a web browser on the buntu by default and
376
+
377
+ 95
378
+ 00:05:53,210 --> 00:06:01,800
379
+ what to do is when Firefox boots up I want you to go to your dummy account when you come and with the
380
+
381
+ 96
382
+ 00:06:01,840 --> 00:06:07,460
383
+ attached resources to this lesson which is also the deep linning code library.
384
+
385
+ 97
386
+ 00:06:07,540 --> 00:06:11,950
387
+ Let's just wait for it to boot up here I think my system has a lot of stuff running right now.
388
+
389
+ 98
390
+ 00:06:12,110 --> 00:06:14,820
391
+ But either way just go to your dummy dot com
392
+
393
+ 99
394
+ 00:06:17,690 --> 00:06:19,650
395
+ and is like Friday so we're on right now.
396
+
397
+ 100
398
+ 00:06:19,650 --> 00:06:20,750
399
+ Nice.
400
+
401
+ 101
402
+ 00:06:20,780 --> 00:06:24,290
403
+ So what to do is basically log into your account.
404
+
405
+ 102
406
+ 00:06:24,470 --> 00:06:25,840
407
+ However you've done it before.
408
+
409
+ 103
410
+ 00:06:25,880 --> 00:06:32,300
411
+ Whichever method you use and download the resources for this for the course that's the code and Treen
412
+
413
+ 104
414
+ 00:06:32,300 --> 00:06:38,270
415
+ model is zip file and download it and it's going to save in your Downloads folder right here.
416
+
417
+ 105
418
+ 00:06:38,270 --> 00:06:39,470
419
+ This is the same file.
420
+
421
+ 106
422
+ 00:06:40,010 --> 00:06:42,620
423
+ And this is a file once you extract.
424
+
425
+ 107
426
+ 00:06:42,740 --> 00:06:50,720
427
+ So double click and this file is going to take a bit to 10 seconds or so to open it's a big file.
428
+
429
+ 108
430
+ 00:06:50,750 --> 00:06:53,990
431
+ So it's reasonable.
432
+
433
+ 109
434
+ 00:06:54,180 --> 00:07:00,120
435
+ And this file here, I want you to extract it to your home directory.
436
+
437
+ 110
438
+ 00:07:00,660 --> 00:07:02,370
439
+ So let's go ahead and do that
440
+
441
+ 111
442
+ 00:07:05,210 --> 00:07:08,020
443
+ let's just reduce this clutter in the background.
444
+
445
+ 112
446
+ 00:07:09,010 --> 00:07:13,630
447
+ And this takes about a minute, actually much less than 10 minutes, to extract
448
+
449
+ 113
450
+ 00:07:22,690 --> 00:07:31,350
451
+ and when it's done you will see all the code that we need for this course is in a convenient location.
452
+
453
+ 114
454
+ 00:07:31,480 --> 00:07:33,130
455
+ So just quit this
456
+
457
+ 115
458
+ 00:07:35,680 --> 00:07:36,270
459
+ close
460
+
461
+ 116
462
+ 00:07:39,700 --> 00:07:40,850
463
+ and it's doing something.
464
+
465
+ 117
466
+ 00:07:40,880 --> 00:07:42,750
467
+ But that's fine.
468
+
469
+ 118
470
+ 00:07:42,760 --> 00:07:48,510
471
+ What I want you to do is go to Home, and now you see a directory, DeepLearningCV, and there is also a shortcut
472
+
473
+ 119
474
+ 00:07:48,630 --> 00:07:50,530
475
+ to the same directory.
476
+
477
+ 120
478
+ 00:07:50,850 --> 00:07:52,770
479
+ And let's go back to it here.
480
+
481
+ 121
482
+ 00:07:52,830 --> 00:07:55,620
483
+ This is it, decompressed.
484
+
485
+ 122
486
+ 00:07:55,620 --> 00:08:00,020
487
+ The contents of this are actually sortable and listed here.
488
+
489
+ 123
490
+ 00:08:00,030 --> 00:08:05,360
491
+ This is all the code and trained models you're going to need for this course, and it's conveniently extracted
492
+
493
+ 124
494
+ 00:08:05,360 --> 00:08:07,210
495
+ now within your virtual machine.
496
+
497
+ 125
498
+ 00:08:07,220 --> 00:08:09,070
499
+ So now let's go back to Python.
500
+
501
+ 126
502
+ 00:08:09,080 --> 00:08:10,460
503
+ Sorry, Ubuntu.
504
+
505
+ 127
506
+ 00:08:10,880 --> 00:08:13,150
507
+ So let's play this.
508
+
509
+ 128
510
+ 00:08:13,160 --> 00:08:16,580
511
+ So what if we wanted to test our code?
512
+
513
+ 129
514
+ 00:08:16,880 --> 00:08:26,090
515
+ So what do you do? You type source activate cv; that's the deep learning environment
516
+
517
+ 130
518
+ 00:08:26,090 --> 00:08:27,100
519
+ for computer vision.
520
+
521
+ 131
522
+ 00:08:27,440 --> 00:08:28,680
523
+ And it's instantly done.
524
+
525
+ 132
526
+ 00:08:28,730 --> 00:08:30,860
527
+ And now you type jupyter
528
+
529
+ 133
530
+ 00:08:31,420 --> 00:08:34,890
531
+ notebook, and magic happens.
532
+
533
+ 134
534
+ 00:08:34,890 --> 00:08:39,390
535
+ What this does is it starts a local server that hosts your notebook in the browser.
536
+
537
+ 135
538
+ 00:08:39,900 --> 00:08:45,240
539
+ So now this is where we run all our Python code, and I'm going to do a brief tutorial on that
540
+
541
+ 136
542
+ 00:08:45,260 --> 00:08:45,870
543
+ here.
544
+
545
+ 137
546
+ 00:08:46,130 --> 00:08:51,100
547
+ So now just go to DeepLearningCV, then just go to the chapter for getting started.
548
+
549
+ 138
550
+ 00:08:51,450 --> 00:08:54,990
551
+ And actually, let's run our Live Sketch demo
552
+
553
+ 139
554
+ 00:08:59,390 --> 00:09:03,020
555
+ because this loads much quicker and we don't need any files for this.
556
+
557
+ 140
558
+ 00:09:03,020 --> 00:09:06,870
559
+ So let's run this block of code.
560
+
561
+ 141
562
+ 00:09:07,070 --> 00:09:11,450
563
+ IPython Notebooks are basically just blocks of code we can run, and all the variables, everything, is
564
+
565
+ 142
566
+ 00:09:11,450 --> 00:09:13,460
567
+ stored inside of it.
568
+
569
+ 143
570
+ 00:09:13,910 --> 00:09:19,400
571
+ So you press Shift and Enter, or Control and Enter, to run it here.
572
+
573
+ 144
574
+ 00:09:19,430 --> 00:09:25,820
575
+ This takes about maybe five seconds or less. We print the version and see that we're using 3.4.
576
+
577
+ 145
578
+ 00:09:26,150 --> 00:09:29,440
579
+ If you press Alt and Enter, you get a new cell below.
580
+
581
+ 146
582
+ 00:09:29,480 --> 00:09:34,690
583
+ So that's sometimes convenient, sometimes not. Shift and Enter just runs it without producing this
584
+
585
+ 147
586
+ 00:09:34,700 --> 00:09:36,020
587
+ new cell here.
588
+
589
+ 148
590
+ 00:09:36,170 --> 00:09:38,890
591
+ And to delete that, just press cut.
592
+
593
+ 149
594
+ 00:09:38,960 --> 00:09:46,460
595
+ So now let's run our sketch. Now, before we do this, we're going to have to go to Devices here, Webcams,
596
+
597
+ 150
598
+ 00:09:46,490 --> 00:09:48,930
599
+ and turn on Integrated Webcam.
600
+
601
+ 151
602
+ 00:09:49,280 --> 00:09:54,490
603
+ And now let's press Shift and Enter to run this block of code.
604
+
605
+ 152
606
+ 00:09:54,530 --> 00:09:55,050
607
+ There we go.
608
+
609
+ 153
610
+ 00:09:55,130 --> 00:09:59,320
611
+ So this is our webcam, all live; this is me chatting here.
612
+
613
+ 154
614
+ 00:09:59,540 --> 00:10:00,590
615
+ This is my microphone.
616
+
617
+ 155
618
+ 00:10:00,590 --> 00:10:01,910
619
+ The door is in the back.
620
+
621
+ 156
622
+ 00:10:01,910 --> 00:10:04,450
623
+ So this is pretty cool, isn't it?
624
+
625
+ 157
626
+ 00:10:04,850 --> 00:10:08,480
627
+ So let's close this, and that is it.
628
+
629
+ 158
630
+ 00:10:08,480 --> 00:10:10,270
631
+ So we have all of the code here.
632
+
633
+ 159
634
+ 00:10:10,310 --> 00:10:16,020
635
+ Everything is preinstalled: Theano, Keras, TensorFlow, a bunch of supporting libraries.
636
+
637
+ 160
638
+ 00:10:16,080 --> 00:10:21,410
639
+ Dlib is installed, a lot of things are installed. Actually, it probably takes about an hour, even
640
+
641
+ 161
642
+ 00:10:21,600 --> 00:10:23,400
643
+ longer sometimes, to install everything.
644
+
645
+ 162
646
+ 00:10:23,750 --> 00:10:24,350
647
+ So that's it.
648
+
649
+ 163
650
+ 00:10:24,350 --> 00:10:27,930
651
+ You have your full virtual machine running here.
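
For reference, the terminal workflow this lecture describes is: run source activate cv to enter the computer vision environment, then jupyter notebook to start the local server. The Live Sketch demo run above boils down to a short OpenCV loop along these lines; this is a minimal sketch assuming OpenCV (cv2) is installed in that environment, and the sketch() helper name is illustrative rather than the course's exact code.

import cv2

def sketch(frame):
    # Grayscale, blur to suppress noise, then Canny edge extraction
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 30, 150)
    # Invert so the result looks like a pencil sketch (dark lines on white)
    return cv2.bitwise_not(edges)

cap = cv2.VideoCapture(0)  # the integrated webcam enabled under Devices > Webcams
while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('Live Sketch', sketch(frame))
    if cv2.waitKey(1) == 13:  # Enter key stops the demo
        break
cap.release()
cv2.destroyAllWindows()
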
3. Installation Guide/3.1 Download Your Deep Learning Virtual Machine HERE.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1XK_rbOwAweGBeaFX_EF9b6j6Fs8OvFB6/view?usp=sharing";</script>
3. Installation Guide/3.1 Slides.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/14ji_XyZEIR6jyQoX9CfAtMrlouF-zmmL/view?usp=sharing";</script>
3. Installation Guide/3.2 Code.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1YKJnrir2-lRoJaSs89pxG2RUsJD_Ig09/view?usp=sharing";</script>
3. Installation Guide/3.2 Slides.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/14ji_XyZEIR6jyQoX9CfAtMrlouF-zmmL/view?usp=sharing";</script>
3. Installation Guide/3.3 Code.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1YKJnrir2-lRoJaSs89pxG2RUsJD_Ig09/view?usp=sharing";</script>
3. Installation Guide/3.3 Download Your Deep Learning Virtual Machine HERE.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1XK_rbOwAweGBeaFX_EF9b6j6Fs8OvFB6/view?usp=sharing";</script>
3. Installation Guide/3.4 MIRROR - Download Your Deep Learning Virtual Machine HERE.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://1drv.ms/u/s!AkTkTuTv8A66axQZQWaLla8zKgU";</script>
3. Installation Guide/4. Optional - Troubleshooting Guide for VM Setup & for resolving some MacOS Issues.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <h4><strong>Troubleshooting guide</strong></h4><p><br></p><p>If you get any issues (they happen on various systems and configurations), please look at the guide below for possible solutions. If not, please post a screenshot of your error in the class forum and we'll try to assist.</p><p><strong>Summary of this document.</strong></p><ol><li><p>ERROR # 1: Failed to Import Appliance</p></li><li><p>ERROR #&nbsp;2: Failed to open a session for the virtual machine Deep Learning CV Ubuntu</p></li><li><p>ERROR&nbsp;#&nbsp;3 - (May appear as Warnings) Broken links to shared folder found?</p></li><li><p>ERROR&nbsp;# 4 - On Macs it's advised you install the extension pack for VirtualBox</p></li><li><p>ERROR&nbsp;#&nbsp;5 (Common) - VirtualBox fails with “Implementation of the USB 2.0 controller not found” after upgrading</p></li><li><p><br></p></li><li><p>MAC&nbsp;OS&nbsp;Guide - Installing VirtualBox Extension Pack on MacOS (required to access webcam)</p></li></ol><p><br></p><p><strong>NOTE 2019 Update:</strong></p><p>With newer versions of Windows there are more VirtualBox issues.</p><p>The following need to be turned off in Windows Features for the VM to run:</p><ul><li><p>&lt;any&gt; Guard</p></li><li><p>Containers</p></li><li><p>Hyper-V</p></li><li><p>Virtual Machine Platform</p></li><li><p>Windows Hypervisor Platform</p></li><li><p>Windows Sandbox</p></li><li><p>Windows Subsystem for Linux (WSL)</p></li></ul><h4><strong>ERROR # 1: Failed to Import Appliance</strong></h4><p><br></p><p>If you get the following error:</p><p><br></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-14_02-04-35-b413486faaa3f008838cbdb1c7a870c9.jpg"></figure><p>1. Move the .vmdk file to your second (bigger) disk. To do this, click on <strong><em>"Global tools"</em></strong>, choose the correct vmdk file and select move. It requires a few minutes.</p><p>2. Create a new virtual machine by clicking on the new button.</p><p>3. Select Linux -&gt; Ubuntu 64 bit.</p><p>4. Choose either 8192 or 4096 megabytes of RAM. Rule of thumb is half of your installed RAM.</p><p>5. On the next screen choose <strong><em>"Use an existing fixed virtual disk"</em></strong> and select the just-created .vmdk file.</p><p><br></p><p>If for some reason this doesn't work, try this:</p><p>1. Go to the properties:</p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-14_02-28-46-c332017914bd027e9c6dca365cdc0901.jpg"></figure><p>2. Click on remove to delete your virtual hard disk.</p><p>3. Do the procedure of importing the virtual machine again from the start (only once) and you will get the error again.&nbsp;</p><p>4. Then try the above procedure again.</p><p><br></p><p><strong>Thanks to student PierLuigi for this guide!</strong></p><p><br></p><h4><strong>Alternative option:</strong></h4><p><br></p><ul><li><p>Import the OVA using <a href="https://www.vmware.com/" rel="noopener noreferrer" target="_blank">VMWare</a></p></li></ul><p><br></p><h4><strong>ERROR #&nbsp;2: Failed to open a session for the virtual machine Deep Learning CV Ubuntu</strong></h4><p>Implementation of the USB 2.0 controller not found!</p><p>Because the USB 2.0 controller state is part of the saved VM state, the VM cannot be started. To fix this problem, either install the <strong>'Oracle VM VirtualBox Extension Pack'</strong> or disable USB 2.0 support in the VM settings.</p><p>Note! 
This error could also mean that an incompatible version of the <strong>'Oracle VM VirtualBox Extension Pack'</strong> is installed (VERR_NOT_FOUND).</p><p>Result Code: E_FAIL (0x80004005) Component: ConsoleWrap Interface: IConsole {872da645-4a9b-1727-bee2-5585105b9eed}</p><p><strong>SOLUTION HERE</strong>&nbsp;- <a href="https://www.virtualbox.org/ticket/8182" rel="noopener noreferrer" target="_blank">https://www.virtualbox.org/ticket/8182</a></p><p>Simply:</p><ul><li><p>Right click on the vm image &gt; USB &gt; USB 1.1</p></li></ul><p><br></p><p><br></p><h4><strong>ERROR&nbsp;#&nbsp;3 - Broken links to shared folder </strong></h4><p>Ignore those messages and, if needed, set up your own shared folder (see Lecture 9).</p><h4></h4><h4><strong>ERROR&nbsp;# 4 - On Macs it's advised you install the extension pack for VirtualBox</strong></h4><p><br></p><h4><strong>ERROR&nbsp;#&nbsp;5 (Common) - </strong><a href="https://askubuntu.com/questions/453393/virtualbox-fails-with-implementation-of-the-usb-2-0-controller-not-found-after" rel="noopener noreferrer" target="_blank"><strong>VirtualBox fails with “Implementation of the USB 2.0 controller not found” after upgrading</strong></a></h4><p><br></p><p><strong>Disable USB Controller</strong> when first loading if you get a USB&nbsp;error. Uncheck the Enable USB&nbsp;Controller box. </p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-04-27_15-14-37-2a9596158be15bdd38eb67e9348089f8.png"></figure><p><br></p><p>Then, after it's loaded up, install the Guest Additions by finding it here in the top toolbar menu.</p><p><img alt="Image result for install guest additions virtualbox" src="https://www.liberiangeek.net/wp-content/uploads/2013/01/ubuntu_guest_machine_virtualbox_thumb.png"></p><p><br></p><p><strong>ERROR 6: E_FAIL(0x8004005)</strong></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2019-08-30_16-54-20-055edba0a3aa98e4deac73d849f0e3a3.png"></figure><p>Solution: <a href="http://youtube.com/watch?reload=9&amp;v=u0AWnCr80Ws" rel="noopener noreferrer" target="_blank">youtube.com/watch?reload=9&amp;v=u0AWnCr80Ws</a></p><h4><strong>Installing VirtualBox Extension Pack on MacOS (required to access webcam)</strong></h4><p>Credit for these steps goes to one of your fellow students, Erwin. </p><p><br></p><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-21_17-47-28-9c9f231990eb2ee9e6cd3d68ec15f8c4.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-21_17-47-28-c31318cf7ab409e52d353ca5ecc176bd.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-21_17-47-28-99ac6eebfe76a5881d32f6ae31c1839a.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-21_17-47-28-36e8f6edbb68720a47b8ff8f0d4b1431.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-21_17-47-28-a3e46f9d1bd5a43346c5dc4bc0b2846c.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-21_17-47-28-e44b63802431cc75e76bbcf781cd7942.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-21_17-47-28-17670b449a0e1a61c518806c39f9a090.png"></figure><figure><img src="https://udemy-images.s3.amazonaws.com:443/redactor/raw/2018-12-21_17-47-28-b41c3f03a46225655b3fcf63478bcff7.png"></figure><p><br></p>
4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/4.1 - Handwritten Digit Classification Demo (MNIST).ipynb ADDED
@@ -0,0 +1,656 @@
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {},
6
+ "source": [
7
+ "### Let's load a Handwritten Digit classifier we'll be building very soon!"
8
+ ]
9
+ },
10
+ {
11
+ "cell_type": "code",
12
+ "metadata": {
13
+ "ExecuteTime": {
14
+ "end_time": "2025-03-13T16:02:55.674903Z",
15
+ "start_time": "2025-03-13T16:02:41.962641Z"
16
+ }
17
+ },
18
+ "source": [
19
+ "import cv2\n",
20
+ "import numpy as np\n",
21
+ "import tensorflow as tf\n",
22
+ "import tensorflow.keras\n",
23
+ "from tensorflow.keras.datasets import mnist\n",
24
+ "from tensorflow.keras.models import load_model\n",
25
+ "\n"
26
+ ],
27
+ "outputs": [
28
+ {
29
+ "name": "stderr",
30
+ "output_type": "stream",
31
+ "text": [
32
+ "2025-03-13 21:32:43.419874: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.\n",
33
+ "2025-03-13 21:32:43.428343: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.\n",
34
+ "2025-03-13 21:32:43.459714: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
35
+ "WARNING: All log messages before absl::InitializeLog() is called are written to STDERR\n",
36
+ "E0000 00:00:1741881763.520586 24229 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
37
+ "E0000 00:00:1741881763.535534 24229 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
38
+ "2025-03-13 21:32:43.581360: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
39
+ "To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
40
+ ]
41
+ }
42
+ ],
43
+ "execution_count": 1
44
+ },
45
+ {
46
+ "metadata": {
47
+ "ExecuteTime": {
48
+ "end_time": "2025-03-13T15:53:32.218716Z",
49
+ "start_time": "2025-03-13T15:53:31.940957Z"
50
+ }
51
+ },
52
+ "cell_type": "code",
53
+ "source": "classifier = load_model('mnist_simple_cnn.h5' )",
54
+ "outputs": [
55
+ {
56
+ "name": "stderr",
57
+ "output_type": "stream",
58
+ "text": [
59
+ "2025-03-13 21:23:31.971561: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:152] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)\n",
60
+ "WARNING:absl:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.\n"
61
+ ]
62
+ }
63
+ ],
64
+ "execution_count": 2
65
+ },
66
+ {
67
+ "metadata": {
68
+ "jupyter": {
69
+ "is_executing": true
70
+ },
71
+ "ExecuteTime": {
72
+ "start_time": "2025-03-13T15:53:54.412494Z"
73
+ }
74
+ },
75
+ "cell_type": "code",
76
+ "source": [
77
+ "print(classifier.summary())\n",
78
+ "# loads the MNIST dataset\n",
79
+ "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
80
+ "\n",
81
+ "def draw_test(name, pred, input_im):\n",
82
+ " BLACK = [0,0,0]\n",
83
+ " expanded_image = cv2.copyMakeBorder(input_im, 0, 0, 0, imageL.shape[0] ,cv2.BORDER_CONSTANT,value=BLACK)\n",
84
+ " expanded_image = cv2.cvtColor(expanded_image, cv2.COLOR_GRAY2BGR)\n",
85
+ " cv2.putText(expanded_image, str(pred), (152, 70) , cv2.FONT_HERSHEY_COMPLEX_SMALL,4, (0,255,0), 2)\n",
86
+ " cv2.imshow(name, expanded_image)\n",
87
+ "\n",
88
+ "for i in range(0,10):\n",
89
+ " rand = np.random.randint(0,len(x_test))\n",
90
+ " input_im = x_test[rand]\n",
91
+ "\n",
92
+ " imageL = cv2.resize(input_im, None, fx=4, fy=4, interpolation = cv2.INTER_CUBIC)\n",
93
+ " input_im = input_im.reshape(1,28,28,1)\n",
94
+ "\n",
95
+ " ## Get Prediction\n",
96
+ "    res = str(np.argmax(classifier.predict(input_im, verbose = 0), axis=1)[0])\n",
97
+ " draw_test(\"Prediction\", res, imageL)\n",
98
+ " cv2.waitKey(0)\n",
99
+ "\n",
100
+ "\n",
101
+ "cv2.destroyAllWindows()\n"
102
+ ],
103
+ "outputs": [
104
+ {
105
+ "data": {
106
+ "text/plain": [
107
+ "\u001B[1mModel: \"sequential_3\"\u001B[0m\n"
108
+ ],
109
+ "text/html": [
110
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Model: \"sequential_3\"</span>\n",
111
+ "</pre>\n"
112
+ ]
113
+ },
114
+ "metadata": {},
115
+ "output_type": "display_data"
116
+ },
117
+ {
118
+ "data": {
119
+ "text/plain": [
120
+ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
121
+ "┃\u001B[1m \u001B[0m\u001B[1mLayer (type) \u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1mOutput Shape \u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1m Param #\u001B[0m\u001B[1m \u001B[0m┃\n",
122
+ "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
123
+ "│ conv2d_2 (\u001B[38;5;33mConv2D\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m26\u001B[0m, \u001B[38;5;34m26\u001B[0m, \u001B[38;5;34m32\u001B[0m) │ \u001B[38;5;34m320\u001B[0m │\n",
124
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
125
+ "│ conv2d_3 (\u001B[38;5;33mConv2D\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m24\u001B[0m, \u001B[38;5;34m24\u001B[0m, \u001B[38;5;34m64\u001B[0m) │ \u001B[38;5;34m18,496\u001B[0m │\n",
126
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
127
+ "│ max_pooling2d_1 (\u001B[38;5;33mMaxPooling2D\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m12\u001B[0m, \u001B[38;5;34m12\u001B[0m, \u001B[38;5;34m64\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │\n",
128
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
129
+ "│ dropout_2 (\u001B[38;5;33mDropout\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m12\u001B[0m, \u001B[38;5;34m12\u001B[0m, \u001B[38;5;34m64\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │\n",
130
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
131
+ "│ flatten_1 (\u001B[38;5;33mFlatten\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m9216\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │\n",
132
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
133
+ "│ dense_2 (\u001B[38;5;33mDense\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m128\u001B[0m) │ \u001B[38;5;34m1,179,776\u001B[0m │\n",
134
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
135
+ "│ dropout_3 (\u001B[38;5;33mDropout\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m128\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │\n",
136
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
137
+ "│ dense_3 (\u001B[38;5;33mDense\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m10\u001B[0m) │ \u001B[38;5;34m1,290\u001B[0m │\n",
138
+ "└─────────────────────────────────┴────────────────────────┴───────────────┘\n"
139
+ ],
140
+ "text/html": [
141
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
142
+ "┃<span style=\"font-weight: bold\"> Layer (type) </span>┃<span style=\"font-weight: bold\"> Output Shape </span>┃<span style=\"font-weight: bold\"> Param # </span>┃\n",
143
+ "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
144
+ "│ conv2d_2 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Conv2D</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">26</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">26</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">32</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">320</span> │\n",
145
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
146
+ "│ conv2d_3 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Conv2D</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">24</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">24</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">18,496</span> │\n",
147
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
148
+ "│ max_pooling2d_1 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">MaxPooling2D</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">12</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">12</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
149
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
150
+ "│ dropout_2 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dropout</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">12</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">12</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
151
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
152
+ "│ flatten_1 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Flatten</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">9216</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
153
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
154
+ "│ dense_2 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dense</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">128</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">1,179,776</span> │\n",
155
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
156
+ "│ dropout_3 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dropout</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">128</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
157
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
158
+ "│ dense_3 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dense</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">10</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">1,290</span> │\n",
159
+ "└─────────────────────────────────┴────────────────────────┴───────────────┘\n",
160
+ "</pre>\n"
161
+ ]
162
+ },
163
+ "metadata": {},
164
+ "output_type": "display_data"
165
+ },
166
+ {
167
+ "data": {
168
+ "text/plain": [
169
+ "\u001B[1m Total params: \u001B[0m\u001B[38;5;34m1,199,884\u001B[0m (4.58 MB)\n"
170
+ ],
171
+ "text/html": [
172
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Total params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">1,199,884</span> (4.58 MB)\n",
173
+ "</pre>\n"
174
+ ]
175
+ },
176
+ "metadata": {},
177
+ "output_type": "display_data"
178
+ },
179
+ {
180
+ "data": {
181
+ "text/plain": [
182
+ "\u001B[1m Trainable params: \u001B[0m\u001B[38;5;34m1,199,882\u001B[0m (4.58 MB)\n"
183
+ ],
184
+ "text/html": [
185
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">1,199,882</span> (4.58 MB)\n",
186
+ "</pre>\n"
187
+ ]
188
+ },
189
+ "metadata": {},
190
+ "output_type": "display_data"
191
+ },
192
+ {
193
+ "data": {
194
+ "text/plain": [
195
+ "\u001B[1m Non-trainable params: \u001B[0m\u001B[38;5;34m0\u001B[0m (0.00 B)\n"
196
+ ],
197
+ "text/html": [
198
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Non-trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n",
199
+ "</pre>\n"
200
+ ]
201
+ },
202
+ "metadata": {},
203
+ "output_type": "display_data"
204
+ },
205
+ {
206
+ "data": {
207
+ "text/plain": [
208
+ "\u001B[1m Optimizer params: \u001B[0m\u001B[38;5;34m2\u001B[0m (12.00 B)\n"
209
+ ],
210
+ "text/html": [
211
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Optimizer params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">2</span> (12.00 B)\n",
212
+ "</pre>\n"
213
+ ]
214
+ },
215
+ "metadata": {},
216
+ "output_type": "display_data"
217
+ },
218
+ {
219
+ "name": "stdout",
220
+ "output_type": "stream",
221
+ "text": [
222
+ "None\n"
223
+ ]
224
+ }
225
+ ],
226
+ "execution_count": null
227
+ },
228
+ {
229
+ "metadata": {
230
+ "ExecuteTime": {
231
+ "end_time": "2025-03-13T16:05:45.557030Z",
232
+ "start_time": "2025-03-13T16:05:45.541940Z"
233
+ }
234
+ },
235
+ "cell_type": "code",
236
+ "source": "import numpy",
237
+ "outputs": [],
238
+ "execution_count": 2
239
+ },
240
+ {
241
+ "cell_type": "markdown",
242
+ "metadata": {},
243
+ "source": [
244
+ "### Testing our classifier on a real image"
245
+ ]
246
+ },
247
+ {
248
+ "cell_type": "code",
249
+ "metadata": {
250
+ "jupyter": {
251
+ "is_executing": true
252
+ },
253
+ "ExecuteTime": {
254
+ "start_time": "2025-03-13T16:05:47.832541Z"
255
+ }
256
+ },
257
+ "source": [
258
+ "import numpy as np\n",
259
+ "import cv2\n",
260
+ "from preprocessors import x_cord_contour, makeSquare, resize_to_pixel\n",
261
+ " \n",
262
+ "image = cv2.imread('images/numbers.jpg')\n",
263
+ "gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)\n",
264
+ "cv2.imshow(\"image\", image)\n",
265
+ "cv2.waitKey(0)\n",
266
+ "\n",
267
+ "# Blur image then find edges using Canny \n",
268
+ "blurred = cv2.GaussianBlur(gray, (5, 5), 0)\n",
269
+ "#cv2.imshow(\"blurred\", blurred)\n",
270
+ "#cv2.waitKey(0)\n",
271
+ "\n",
272
+ "edged = cv2.Canny(blurred, 30, 150)\n",
273
+ "#cv2.imshow(\"edged\", edged)\n",
274
+ "#cv2.waitKey(0)\n",
275
+ "\n",
276
+ "# Find Contours\n",
277
+ "contours, _ = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)\n",
278
+ "\n",
279
+ "# Sort contours left to right by using their x coordinates\n",
280
+ "contours = sorted(contours, key = x_cord_contour, reverse = False)\n",
281
+ "\n",
282
+ "# Create empty array to store entire number\n",
283
+ "full_number = []\n",
284
+ "\n",
285
+ "# loop over the contours\n",
286
+ "for c in contours:\n",
287
+ "    # compute the bounding box for the contour\n",
288
+ " (x, y, w, h) = cv2.boundingRect(c) \n",
289
+ "\n",
290
+ " if w >= 5 and h >= 25:\n",
291
+ " roi = blurred[y:y + h, x:x + w]\n",
292
+ " ret, roi = cv2.threshold(roi, 127, 255,cv2.THRESH_BINARY_INV)\n",
293
+ " roi = makeSquare(roi)\n",
294
+ " roi = resize_to_pixel(28, roi)\n",
295
+ " cv2.imshow(\"ROI\", roi)\n",
296
+ " roi = roi / 255.0 \n",
297
+ " roi = roi.reshape(1,28,28,1) \n",
298
+ "\n",
299
+ " ## Get Prediction\n",
300
+ "        res = str(np.argmax(classifier.predict(roi, verbose = 0), axis=1)[0])\n",
301
+ " full_number.append(res)\n",
302
+ " cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)\n",
303
+ " cv2.putText(image, res, (x , y + 155), cv2.FONT_HERSHEY_COMPLEX, 2, (255, 0, 0), 2)\n",
304
+ " cv2.imshow(\"image\", image)\n",
305
+ " cv2.waitKey(0) \n",
306
+ " \n",
307
+ "cv2.destroyAllWindows()\n",
308
+ "print (\"The number is: \" + ''.join(full_number))"
309
+ ],
310
+ "outputs": [],
311
+ "execution_count": null
312
+ },
313
+ {
314
+ "cell_type": "markdown",
315
+ "metadata": {},
316
+ "source": [
317
+ "### Training this Model"
318
+ ]
319
+ },
320
+ {
321
+ "cell_type": "code",
322
+ "metadata": {
323
+ "ExecuteTime": {
324
+ "end_time": "2025-03-13T13:12:31.511869Z",
325
+ "start_time": "2025-03-13T12:29:08.572403Z"
326
+ }
327
+ },
328
+ "source": [
329
+ "from tensorflow.keras.datasets import mnist\n",
330
+ "from tensorflow.keras.utils import to_categorical\n",
331
+ "from tensorflow.keras.optimizers import Adadelta\n",
332
+ "from tensorflow.keras.datasets import mnist\n",
333
+ "from tensorflow.keras.models import Sequential\n",
334
+ "from tensorflow.keras.layers import Dense, Dropout, Flatten\n",
335
+ "from tensorflow.keras.layers import Conv2D, MaxPooling2D ,Input\n",
336
+ "from tensorflow.keras import backend as K\n",
337
+ "\n",
338
+ "# Training Parameters\n",
339
+ "batch_size = 128\n",
340
+ "epochs = 20\n",
341
+ "\n",
342
+ "# loads the MNIST dataset\n",
343
+ "(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
344
+ "\n",
345
+ "# Let's store the number of rows and columns\n",
346
+ "img_rows = x_train[0].shape[0]\n",
347
+ "img_cols = x_train[0].shape[1]\n",
348
+ "\n",
349
+ "# Getting our data in the right 'shape' needed for Keras\n",
350
+ "# We need to add a 4th dimension to our data, thereby changing\n",
351
+ "# our original image shape of (60000,28,28) to (60000,28,28,1)\n",
352
+ "x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n",
353
+ "x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n",
354
+ "\n",
355
+ "# store the shape of a single image \n",
356
+ "input_shape = (img_rows, img_cols, 1)\n",
357
+ "\n",
358
+ "# change our image type to float32 data type\n",
359
+ "x_train = x_train.astype('float32')\n",
360
+ "x_test = x_test.astype('float32')\n",
361
+ "\n",
362
+ "# Normalize our data by changing the range from (0 to 255) to (0 to 1)\n",
363
+ "x_train /= 255\n",
364
+ "x_test /= 255\n",
365
+ "\n",
366
+ "print('x_train shape:', x_train.shape)\n",
367
+ "print(x_train.shape[0], 'train samples')\n",
368
+ "print(x_test.shape[0], 'test samples')\n",
369
+ "\n",
370
+ "# Now we one hot encode outputs\n",
371
+ "y_train = to_categorical(y_train)\n",
372
+ "y_test = to_categorical(y_test)\n",
373
+ "\n",
374
+ "# Let's count the number of columns in our one-hot encoded matrix\n",
375
+ "print (\"Number of Classes: \" + str(y_test.shape[1]))\n",
376
+ "\n",
377
+ "num_classes = y_test.shape[1]\n",
378
+ "num_pixels = x_train.shape[1] * x_train.shape[2]\n",
379
+ "\n",
380
+ "# create model\n",
381
+ "model = Sequential()\n",
382
+ "\n",
383
+ "model.add(Input(shape=(28,28,1)))\n",
384
+ "model.add(Conv2D(32, kernel_size=(3, 3),\n",
385
+ " activation='relu',\n",
386
+ " input_shape=input_shape))\n",
387
+ "model.add(Conv2D(64, (3, 3), activation='relu'))\n",
388
+ "model.add(MaxPooling2D(pool_size=(2, 2)))\n",
389
+ "model.add(Dropout(0.25))\n",
390
+ "model.add(Flatten())\n",
391
+ "model.add(Dense(128, activation='relu'))\n",
392
+ "model.add(Dropout(0.5))\n",
393
+ "model.add(Dense(num_classes, activation='softmax'))\n",
394
+ "\n",
395
+ "model.compile(loss = 'categorical_crossentropy',\n",
396
+ " optimizer = Adadelta(),\n",
397
+ " metrics = ['accuracy'])\n",
398
+ "\n",
399
+ "print(model.summary())\n",
400
+ "\n",
401
+ "history = model.fit(x_train, y_train,\n",
402
+ " batch_size=batch_size,\n",
403
+ " epochs=epochs,\n",
404
+ " verbose=1,\n",
405
+ " validation_data=(x_test, y_test))\n",
406
+ "\n",
407
+ "score = model.evaluate(x_test, y_test, verbose=0)\n",
408
+ "print('Test loss:', score[0])\n",
409
+ "print('Test accuracy:', score[1])"
410
+ ],
411
+ "outputs": [
412
+ {
413
+ "name": "stdout",
414
+ "output_type": "stream",
415
+ "text": [
416
+ "x_train shape: (60000, 28, 28, 1)\n",
417
+ "60000 train samples\n",
418
+ "10000 test samples\n",
419
+ "Number of Classes: 10\n"
420
+ ]
421
+ },
422
+ {
423
+ "data": {
424
+ "text/plain": [
425
+ "\u001B[1mModel: \"sequential_3\"\u001B[0m\n"
426
+ ],
427
+ "text/html": [
428
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">Model: \"sequential_3\"</span>\n",
429
+ "</pre>\n"
430
+ ]
431
+ },
432
+ "metadata": {},
433
+ "output_type": "display_data"
434
+ },
435
+ {
436
+ "data": {
437
+ "text/plain": [
438
+ "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
439
+ "┃\u001B[1m \u001B[0m\u001B[1mLayer (type) \u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1mOutput Shape \u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1m Param #\u001B[0m\u001B[1m \u001B[0m┃\n",
440
+ "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
441
+ "│ conv2d_2 (\u001B[38;5;33mConv2D\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m26\u001B[0m, \u001B[38;5;34m26\u001B[0m, \u001B[38;5;34m32\u001B[0m) │ \u001B[38;5;34m320\u001B[0m │\n",
442
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
443
+ "│ conv2d_3 (\u001B[38;5;33mConv2D\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m24\u001B[0m, \u001B[38;5;34m24\u001B[0m, \u001B[38;5;34m64\u001B[0m) │ \u001B[38;5;34m18,496\u001B[0m │\n",
444
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
445
+ "│ max_pooling2d_1 (\u001B[38;5;33mMaxPooling2D\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m12\u001B[0m, \u001B[38;5;34m12\u001B[0m, \u001B[38;5;34m64\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │\n",
446
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
447
+ "│ dropout_2 (\u001B[38;5;33mDropout\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m12\u001B[0m, \u001B[38;5;34m12\u001B[0m, \u001B[38;5;34m64\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │\n",
448
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
449
+ "│ flatten_1 (\u001B[38;5;33mFlatten\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m9216\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │\n",
450
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
451
+ "│ dense_2 (\u001B[38;5;33mDense\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m128\u001B[0m) │ \u001B[38;5;34m1,179,776\u001B[0m │\n",
452
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
453
+ "│ dropout_3 (\u001B[38;5;33mDropout\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m128\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │\n",
454
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
455
+ "│ dense_3 (\u001B[38;5;33mDense\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m10\u001B[0m) │ \u001B[38;5;34m1,290\u001B[0m │\n",
456
+ "└─────────────────────────────────┴────────────────────────┴───────────────┘\n"
457
+ ],
458
+ "text/html": [
459
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓\n",
460
+ "┃<span style=\"font-weight: bold\"> Layer (type) </span>┃<span style=\"font-weight: bold\"> Output Shape </span>┃<span style=\"font-weight: bold\"> Param # </span>┃\n",
461
+ "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩\n",
462
+ "│ conv2d_2 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Conv2D</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">26</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">26</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">32</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">320</span> │\n",
463
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
464
+ "│ conv2d_3 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Conv2D</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">24</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">24</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">18,496</span> │\n",
465
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
466
+ "│ max_pooling2d_1 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">MaxPooling2D</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">12</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">12</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
467
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
468
+ "│ dropout_2 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dropout</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">12</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">12</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">64</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
469
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
470
+ "│ flatten_1 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Flatten</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">9216</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
471
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
472
+ "│ dense_2 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dense</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">128</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">1,179,776</span> │\n",
473
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
474
+ "│ dropout_3 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dropout</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">128</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> │\n",
475
+ "├─────────────────────────────────┼────────────────────────┼───────────────┤\n",
476
+ "│ dense_3 (<span style=\"color: #0087ff; text-decoration-color: #0087ff\">Dense</span>) │ (<span style=\"color: #00d7ff; text-decoration-color: #00d7ff\">None</span>, <span style=\"color: #00af00; text-decoration-color: #00af00\">10</span>) │ <span style=\"color: #00af00; text-decoration-color: #00af00\">1,290</span> │\n",
477
+ "└─────────────────────────────────┴────────────────────────┴───────────────┘\n",
478
+ "</pre>\n"
479
+ ]
480
+ },
481
+ "metadata": {},
482
+ "output_type": "display_data"
483
+ },
484
+ {
485
+ "data": {
486
+ "text/plain": [
487
+ "\u001B[1m Total params: \u001B[0m\u001B[38;5;34m1,199,882\u001B[0m (4.58 MB)\n"
488
+ ],
489
+ "text/html": [
490
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Total params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">1,199,882</span> (4.58 MB)\n",
491
+ "</pre>\n"
492
+ ]
493
+ },
494
+ "metadata": {},
495
+ "output_type": "display_data"
496
+ },
497
+ {
498
+ "data": {
499
+ "text/plain": [
500
+ "\u001B[1m Trainable params: \u001B[0m\u001B[38;5;34m1,199,882\u001B[0m (4.58 MB)\n"
501
+ ],
502
+ "text/html": [
503
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">1,199,882</span> (4.58 MB)\n",
504
+ "</pre>\n"
505
+ ]
506
+ },
507
+ "metadata": {},
508
+ "output_type": "display_data"
509
+ },
510
+ {
511
+ "data": {
512
+ "text/plain": [
513
+ "\u001B[1m Non-trainable params: \u001B[0m\u001B[38;5;34m0\u001B[0m (0.00 B)\n"
514
+ ],
515
+ "text/html": [
516
+ "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\"> Non-trainable params: </span><span style=\"color: #00af00; text-decoration-color: #00af00\">0</span> (0.00 B)\n",
517
+ "</pre>\n"
518
+ ]
519
+ },
520
+ "metadata": {},
521
+ "output_type": "display_data"
522
+ },
523
+ {
524
+ "name": "stdout",
525
+ "output_type": "stream",
526
+ "text": [
527
+ "None\n",
528
+ "Epoch 1/20\n"
529
+ ]
530
+ },
531
+ {
532
+ "name": "stderr",
533
+ "output_type": "stream",
534
+ "text": [
535
+ "2025-03-13 17:59:09.869544: W external/local_xla/xla/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 188160000 exceeds 10% of free system memory.\n"
536
+ ]
537
+ },
538
+ {
539
+ "name": "stdout",
540
+ "output_type": "stream",
541
+ "text": [
542
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m110s\u001B[0m 229ms/step - accuracy: 0.1052 - loss: 2.3000 - val_accuracy: 0.2693 - val_loss: 2.2488\n",
543
+ "Epoch 2/20\n",
544
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m162s\u001B[0m 271ms/step - accuracy: 0.2265 - loss: 2.2399 - val_accuracy: 0.4505 - val_loss: 2.1755\n",
545
+ "Epoch 3/20\n",
546
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m101s\u001B[0m 215ms/step - accuracy: 0.3401 - loss: 2.1674 - val_accuracy: 0.5460 - val_loss: 2.0756\n",
547
+ "Epoch 4/20\n",
548
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m98s\u001B[0m 208ms/step - accuracy: 0.4168 - loss: 2.0707 - val_accuracy: 0.6309 - val_loss: 1.9400\n",
549
+ "Epoch 5/20\n",
550
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m101s\u001B[0m 216ms/step - accuracy: 0.4859 - loss: 1.9388 - val_accuracy: 0.7004 - val_loss: 1.7650\n",
551
+ "Epoch 6/20\n",
552
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m148s\u001B[0m 228ms/step - accuracy: 0.5514 - loss: 1.7740 - val_accuracy: 0.7517 - val_loss: 1.5543\n",
553
+ "Epoch 7/20\n",
554
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m106s\u001B[0m 225ms/step - accuracy: 0.5989 - loss: 1.5901 - val_accuracy: 0.7860 - val_loss: 1.3342\n",
555
+ "Epoch 8/20\n",
556
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m141s\u001B[0m 224ms/step - accuracy: 0.6395 - loss: 1.4047 - val_accuracy: 0.8053 - val_loss: 1.1347\n",
557
+ "Epoch 9/20\n",
558
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m106s\u001B[0m 226ms/step - accuracy: 0.6621 - loss: 1.2518 - val_accuracy: 0.8213 - val_loss: 0.9769\n",
559
+ "Epoch 10/20\n",
560
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m118s\u001B[0m 252ms/step - accuracy: 0.6848 - loss: 1.1299 - val_accuracy: 0.8310 - val_loss: 0.8569\n",
561
+ "Epoch 11/20\n",
562
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m140s\u001B[0m 298ms/step - accuracy: 0.7070 - loss: 1.0295 - val_accuracy: 0.8393 - val_loss: 0.7659\n",
563
+ "Epoch 12/20\n",
564
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m139s\u001B[0m 297ms/step - accuracy: 0.7191 - loss: 0.9594 - val_accuracy: 0.8467 - val_loss: 0.6961\n",
565
+ "Epoch 13/20\n",
566
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m139s\u001B[0m 297ms/step - accuracy: 0.7353 - loss: 0.8954 - val_accuracy: 0.8541 - val_loss: 0.6418\n",
567
+ "Epoch 14/20\n",
568
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m141s\u001B[0m 301ms/step - accuracy: 0.7501 - loss: 0.8407 - val_accuracy: 0.8598 - val_loss: 0.5981\n",
569
+ "Epoch 15/20\n",
570
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m145s\u001B[0m 309ms/step - accuracy: 0.7608 - loss: 0.8010 - val_accuracy: 0.8637 - val_loss: 0.5631\n",
571
+ "Epoch 16/20\n",
572
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m140s\u001B[0m 299ms/step - accuracy: 0.7708 - loss: 0.7630 - val_accuracy: 0.8675 - val_loss: 0.5345\n",
573
+ "Epoch 17/20\n",
574
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m140s\u001B[0m 294ms/step - accuracy: 0.7818 - loss: 0.7294 - val_accuracy: 0.8720 - val_loss: 0.5091\n",
575
+ "Epoch 18/20\n",
576
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m136s\u001B[0m 290ms/step - accuracy: 0.7874 - loss: 0.7058 - val_accuracy: 0.8745 - val_loss: 0.4890\n",
577
+ "Epoch 19/20\n",
578
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m141s\u001B[0m 301ms/step - accuracy: 0.7911 - loss: 0.6918 - val_accuracy: 0.8777 - val_loss: 0.4711\n",
579
+ "Epoch 20/20\n",
580
+ "\u001B[1m469/469\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m144s\u001B[0m 306ms/step - accuracy: 0.7922 - loss: 0.6763 - val_accuracy: 0.8803 - val_loss: 0.4562\n",
581
+ "Test loss: 0.4561978578567505\n",
582
+ "Test accuracy: 0.880299985408783\n"
583
+ ]
584
+ }
585
+ ],
586
+ "execution_count": 19
587
+ },
588
+ {
589
+ "metadata": {},
590
+ "cell_type": "code",
591
+ "outputs": [],
592
+ "execution_count": null,
593
+ "source": ""
594
+ },
595
+ {
596
+ "metadata": {
597
+ "ExecuteTime": {
598
+ "end_time": "2025-03-13T13:19:19.771353Z",
599
+ "start_time": "2025-03-13T13:19:19.412004Z"
600
+ }
601
+ },
602
+ "cell_type": "code",
603
+ "source": "model.save(\"mnist_simple_cnn.h5\")",
604
+ "outputs": [
605
+ {
606
+ "name": "stderr",
607
+ "output_type": "stream",
608
+ "text": [
609
+ "WARNING:absl:You are saving your model as an HDF5 file via `model.save()` or `keras.saving.save_model(model)`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')` or `keras.saving.save_model(model, 'my_model.keras')`. \n"
610
+ ]
611
+ }
612
+ ],
613
+ "execution_count": 20
614
+ },
615
+ {
616
+ "metadata": {
617
+ "ExecuteTime": {
618
+ "end_time": "2025-03-13T13:20:22.752001Z",
619
+ "start_time": "2025-03-13T13:20:14.810410Z"
620
+ }
621
+ },
622
+ "cell_type": "code",
623
+ "source": "score = model.evaluate(x_test, y_test, verbose=0)\n",
624
+ "outputs": [],
625
+ "execution_count": 22
626
+ },
627
+ {
628
+ "metadata": {},
629
+ "cell_type": "code",
630
+ "outputs": [],
631
+ "execution_count": null,
632
+ "source": ""
633
+ }
634
+ ],
635
+ "metadata": {
636
+ "kernelspec": {
637
+ "display_name": "Python 3 (ipykernel)",
638
+ "language": "python",
639
+ "name": "python3"
640
+ },
641
+ "language_info": {
642
+ "codemirror_mode": {
643
+ "name": "ipython",
644
+ "version": 3
645
+ },
646
+ "file_extension": ".py",
647
+ "mimetype": "text/x-python",
648
+ "name": "python",
649
+ "nbconvert_exporter": "python",
650
+ "pygments_lexer": "ipython3",
651
+ "version": "3.10.11"
652
+ }
653
+ },
654
+ "nbformat": 4,
655
+ "nbformat_minor": 2
656
+ }
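
Note that the "Testing our classifier on a real image" cell above imports x_cord_contour, makeSquare and resize_to_pixel from a local preprocessors module that is not part of this commit. Below is a minimal sketch of what such helpers could look like, inferred only from how the notebook calls them (sorting contours by x position, padding a digit ROI to a square, resizing it to 28x28); the course's actual module may differ.

import cv2

def x_cord_contour(contour):
    # Sort key: x coordinate of the contour's centroid
    M = cv2.moments(contour)
    if M['m00'] == 0:
        return 0
    return int(M['m10'] / M['m00'])

def makeSquare(image):
    # Pad the shorter side with black pixels so the ROI becomes square
    h, w = image.shape[:2]
    if h == w:
        return image
    diff = abs(h - w)
    if h > w:
        return cv2.copyMakeBorder(image, 0, 0, diff // 2, diff - diff // 2,
                                  cv2.BORDER_CONSTANT, value=0)
    return cv2.copyMakeBorder(image, diff // 2, diff - diff // 2, 0, 0,
                              cv2.BORDER_CONSTANT, value=0)

def resize_to_pixel(dimensions, image):
    # Resize the square ROI to dimensions x dimensions (28 for MNIST input)
    return cv2.resize(image, (dimensions, dimensions), interpolation=cv2.INTER_AREA)
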
4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/4.2 - Image Classifier - CIFAR10.ipynb ADDED
@@ -0,0 +1,167 @@
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {},
6
+ "source": [
7
+ "### Let's run a simple image classifier, using the CIFAR 10 dataset of 10 image categories\n",
8
+ "* CIFAR's 10 categories:\n",
9
+ " * airplane\n",
10
+ " * automobile\n",
11
+ " * bird\n",
12
+ " * cat\n",
13
+ " * deer\n",
14
+ " * dog\n",
15
+ " * frog\n",
16
+ " * horse\n",
17
+ " * ship\n",
18
+ " * truck"
19
+ ]
20
+ },
21
+ {
22
+ "cell_type": "code",
23
+ "metadata": {
24
+ "ExecuteTime": {
25
+ "end_time": "2025-03-13T16:11:52.999533Z",
26
+ "start_time": "2025-03-13T16:11:47.171043Z"
27
+ }
28
+ },
29
+ "source": [
30
+ "import cv2\n",
31
+ "import numpy as np\n",
32
+ "from tensorflow.keras.models import load_model\n",
33
+ "from tensorflow.keras.datasets import cifar10 \n",
34
+ "\n",
35
+ "img_row, img_height, img_depth = 32,32,3\n",
36
+ "\n",
37
+ "classifier = load_model('cifar_simple_cnn.h5')\n",
38
+ "\n",
39
+ "# Loads the CIFAR dataset\n",
40
+ "(x_train, y_train), (x_test, y_test) = cifar10.load_data()\n",
41
+ "color = True \n",
42
+ "scale = 8\n",
43
+ "\n",
44
+ "def draw_test(name, res, input_im, scale, img_row, img_height):\n",
45
+ " BLACK = [0,0,0]\n",
46
+ " res = int(res)\n",
47
+ " if res == 0:\n",
48
+ " pred = \"airplane\"\n",
49
+ " if res == 1:\n",
50
+ " pred = \"automobile\"\n",
51
+ " if res == 2:\n",
52
+ " pred = \"bird\"\n",
53
+ " if res == 3:\n",
54
+ " pred = \"cat\"\n",
55
+ " if res == 4:\n",
56
+ " pred = \"deer\"\n",
57
+ " if res == 5:\n",
58
+ " pred = \"dog\"\n",
59
+ " if res == 6:\n",
60
+ " pred = \"frog\"\n",
61
+ " if res == 7:\n",
62
+ " pred = \"horse\"\n",
63
+ " if res == 8:\n",
64
+ " pred = \"ship\"\n",
65
+ " if res == 9:\n",
66
+ " pred = \"truck\"\n",
67
+ " \n",
68
+ " expanded_image = cv2.copyMakeBorder(input_im, 0, 0, 0, imageL.shape[0]*2 ,cv2.BORDER_CONSTANT,value=BLACK)\n",
69
+ " if color == False:\n",
70
+ " expanded_image = cv2.cvtColor(expanded_image, cv2.COLOR_GRAY2BGR)\n",
71
+ " cv2.putText(expanded_image, str(pred), (300, 80) , cv2.FONT_HERSHEY_COMPLEX_SMALL,4, (0,255,0), 2)\n",
72
+ " cv2.imshow(name, expanded_image)\n",
73
+ "\n",
74
+ "\n",
75
+ "for i in range(0,10):\n",
76
+ " rand = np.random.randint(0,len(x_test))\n",
77
+ " input_im = x_test[rand]\n",
78
+ " imageL = cv2.resize(input_im, None, fx=scale, fy=scale, interpolation = cv2.INTER_CUBIC) \n",
79
+ " input_im = input_im.reshape(1,img_row, img_height, img_depth) \n",
80
+ " \n",
81
+ " ## Get Prediction\n",
82
+ "    res = str(np.argmax(classifier.predict(input_im, verbose = 0), axis=1)[0])\n",
83
+ " \n",
84
+ " draw_test(\"Prediction\", res, imageL, scale, img_row, img_height) \n",
85
+ " cv2.waitKey(0)\n",
86
+ "\n",
87
+ "cv2.destroyAllWindows()"
88
+ ],
89
+ "outputs": [
90
+ {
91
+ "name": "stderr",
92
+ "output_type": "stream",
93
+ "text": [
94
+ "2025-03-13 21:41:47.742478: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.\n",
95
+ "2025-03-13 21:41:47.746738: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.\n",
96
+ "2025-03-13 21:41:47.759288: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
97
+ "WARNING: All log messages before absl::InitializeLog() is called are written to STDERR\n",
98
+ "E0000 00:00:1741882307.783990 24522 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
99
+ "E0000 00:00:1741882307.790137 24522 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
100
+ "2025-03-13 21:41:47.814207: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
101
+ "To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
102
+ "/home/newton/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.\n",
103
+ " super().__init__(activity_regularizer=activity_regularizer, **kwargs)\n"
104
+ ]
105
+ },
106
+ {
107
+ "ename": "ValueError",
108
+ "evalue": "Kernel shape must have the same length as input, but received kernel of shape (3, 3, 3, 32) and input of shape (None, None, 32, 32, 3).",
109
+ "output_type": "error",
110
+ "traceback": [
111
+ "\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
112
+ "\u001B[0;31mValueError\u001B[0m Traceback (most recent call last)",
113
+ "Cell \u001B[0;32mIn[1], line 8\u001B[0m\n\u001B[1;32m 4\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m \u001B[38;5;21;01mtensorflow\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mkeras\u001B[39;00m\u001B[38;5;21;01m.\u001B[39;00m\u001B[38;5;21;01mdatasets\u001B[39;00m \u001B[38;5;28;01mimport\u001B[39;00m cifar10 \n\u001B[1;32m 6\u001B[0m img_row, img_height, img_depth \u001B[38;5;241m=\u001B[39m \u001B[38;5;241m32\u001B[39m,\u001B[38;5;241m32\u001B[39m,\u001B[38;5;241m3\u001B[39m\n\u001B[0;32m----> 8\u001B[0m classifier \u001B[38;5;241m=\u001B[39m load_model(\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mcifar_simple_cnn.h5\u001B[39m\u001B[38;5;124m'\u001B[39m)\n\u001B[1;32m 10\u001B[0m \u001B[38;5;66;03m# Loads the CIFAR dataset\u001B[39;00m\n\u001B[1;32m 11\u001B[0m (x_train, y_train), (x_test, y_test) \u001B[38;5;241m=\u001B[39m cifar10\u001B[38;5;241m.\u001B[39mload_data()\n",
114
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/saving/saving_api.py:196\u001B[0m, in \u001B[0;36mload_model\u001B[0;34m(filepath, custom_objects, compile, safe_mode)\u001B[0m\n\u001B[1;32m 189\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m saving_lib\u001B[38;5;241m.\u001B[39mload_model(\n\u001B[1;32m 190\u001B[0m filepath,\n\u001B[1;32m 191\u001B[0m custom_objects\u001B[38;5;241m=\u001B[39mcustom_objects,\n\u001B[1;32m 192\u001B[0m \u001B[38;5;28mcompile\u001B[39m\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mcompile\u001B[39m,\n\u001B[1;32m 193\u001B[0m safe_mode\u001B[38;5;241m=\u001B[39msafe_mode,\n\u001B[1;32m 194\u001B[0m )\n\u001B[1;32m 195\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28mstr\u001B[39m(filepath)\u001B[38;5;241m.\u001B[39mendswith((\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124m.h5\u001B[39m\u001B[38;5;124m\"\u001B[39m, \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124m.hdf5\u001B[39m\u001B[38;5;124m\"\u001B[39m)):\n\u001B[0;32m--> 196\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m legacy_h5_format\u001B[38;5;241m.\u001B[39mload_model_from_hdf5(\n\u001B[1;32m 197\u001B[0m filepath, custom_objects\u001B[38;5;241m=\u001B[39mcustom_objects, \u001B[38;5;28mcompile\u001B[39m\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mcompile\u001B[39m\n\u001B[1;32m 198\u001B[0m )\n\u001B[1;32m 199\u001B[0m \u001B[38;5;28;01melif\u001B[39;00m \u001B[38;5;28mstr\u001B[39m(filepath)\u001B[38;5;241m.\u001B[39mendswith(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124m.keras\u001B[39m\u001B[38;5;124m\"\u001B[39m):\n\u001B[1;32m 200\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m \u001B[38;5;167;01mValueError\u001B[39;00m(\n\u001B[1;32m 201\u001B[0m \u001B[38;5;124mf\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mFile not found: filepath=\u001B[39m\u001B[38;5;132;01m{\u001B[39;00mfilepath\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124m. \u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 202\u001B[0m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mPlease ensure the file is an accessible `.keras` \u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 203\u001B[0m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mzip file.\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 204\u001B[0m )\n",
115
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/legacy/saving/legacy_h5_format.py:133\u001B[0m, in \u001B[0;36mload_model_from_hdf5\u001B[0;34m(filepath, custom_objects, compile)\u001B[0m\n\u001B[1;32m 130\u001B[0m model_config \u001B[38;5;241m=\u001B[39m json_utils\u001B[38;5;241m.\u001B[39mdecode(model_config)\n\u001B[1;32m 132\u001B[0m \u001B[38;5;28;01mwith\u001B[39;00m saving_options\u001B[38;5;241m.\u001B[39mkeras_option_scope(use_legacy_config\u001B[38;5;241m=\u001B[39m\u001B[38;5;28;01mTrue\u001B[39;00m):\n\u001B[0;32m--> 133\u001B[0m model \u001B[38;5;241m=\u001B[39m saving_utils\u001B[38;5;241m.\u001B[39mmodel_from_config(\n\u001B[1;32m 134\u001B[0m model_config, custom_objects\u001B[38;5;241m=\u001B[39mcustom_objects\n\u001B[1;32m 135\u001B[0m )\n\u001B[1;32m 137\u001B[0m \u001B[38;5;66;03m# set weights\u001B[39;00m\n\u001B[1;32m 138\u001B[0m load_weights_from_hdf5_group(f[\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mmodel_weights\u001B[39m\u001B[38;5;124m\"\u001B[39m], model)\n",
116
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/legacy/saving/saving_utils.py:85\u001B[0m, in \u001B[0;36mmodel_from_config\u001B[0;34m(config, custom_objects)\u001B[0m\n\u001B[1;32m 81\u001B[0m \u001B[38;5;66;03m# TODO(nkovela): Swap find and replace args during Keras 3.0 release\u001B[39;00m\n\u001B[1;32m 82\u001B[0m \u001B[38;5;66;03m# Replace keras refs with keras\u001B[39;00m\n\u001B[1;32m 83\u001B[0m config \u001B[38;5;241m=\u001B[39m _find_replace_nested_dict(config, \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mkeras.\u001B[39m\u001B[38;5;124m\"\u001B[39m, \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mkeras.\u001B[39m\u001B[38;5;124m\"\u001B[39m)\n\u001B[0;32m---> 85\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m serialization\u001B[38;5;241m.\u001B[39mdeserialize_keras_object(\n\u001B[1;32m 86\u001B[0m config,\n\u001B[1;32m 87\u001B[0m module_objects\u001B[38;5;241m=\u001B[39mMODULE_OBJECTS\u001B[38;5;241m.\u001B[39mALL_OBJECTS,\n\u001B[1;32m 88\u001B[0m custom_objects\u001B[38;5;241m=\u001B[39mcustom_objects,\n\u001B[1;32m 89\u001B[0m printable_module_name\u001B[38;5;241m=\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mlayer\u001B[39m\u001B[38;5;124m\"\u001B[39m,\n\u001B[1;32m 90\u001B[0m )\n",
117
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/legacy/saving/serialization.py:495\u001B[0m, in \u001B[0;36mdeserialize_keras_object\u001B[0;34m(identifier, module_objects, custom_objects, printable_module_name)\u001B[0m\n\u001B[1;32m 490\u001B[0m cls_config \u001B[38;5;241m=\u001B[39m _find_replace_nested_dict(\n\u001B[1;32m 491\u001B[0m cls_config, \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mkeras.\u001B[39m\u001B[38;5;124m\"\u001B[39m, \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mkeras.\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 492\u001B[0m )\n\u001B[1;32m 494\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mcustom_objects\u001B[39m\u001B[38;5;124m\"\u001B[39m \u001B[38;5;129;01min\u001B[39;00m arg_spec\u001B[38;5;241m.\u001B[39margs:\n\u001B[0;32m--> 495\u001B[0m deserialized_obj \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mcls\u001B[39m\u001B[38;5;241m.\u001B[39mfrom_config(\n\u001B[1;32m 496\u001B[0m cls_config,\n\u001B[1;32m 497\u001B[0m custom_objects\u001B[38;5;241m=\u001B[39m{\n\u001B[1;32m 498\u001B[0m \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mobject_registration\u001B[38;5;241m.\u001B[39mGLOBAL_CUSTOM_OBJECTS,\n\u001B[1;32m 499\u001B[0m \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mcustom_objects,\n\u001B[1;32m 500\u001B[0m },\n\u001B[1;32m 501\u001B[0m )\n\u001B[1;32m 502\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m:\n\u001B[1;32m 503\u001B[0m \u001B[38;5;28;01mwith\u001B[39;00m object_registration\u001B[38;5;241m.\u001B[39mCustomObjectScope(custom_objects):\n",
118
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/models/sequential.py:359\u001B[0m, in \u001B[0;36mSequential.from_config\u001B[0;34m(cls, config, custom_objects)\u001B[0m\n\u001B[1;32m 354\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m:\n\u001B[1;32m 355\u001B[0m layer \u001B[38;5;241m=\u001B[39m serialization_lib\u001B[38;5;241m.\u001B[39mdeserialize_keras_object(\n\u001B[1;32m 356\u001B[0m layer_config,\n\u001B[1;32m 357\u001B[0m custom_objects\u001B[38;5;241m=\u001B[39mcustom_objects,\n\u001B[1;32m 358\u001B[0m )\n\u001B[0;32m--> 359\u001B[0m model\u001B[38;5;241m.\u001B[39madd(layer)\n\u001B[1;32m 360\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m (\n\u001B[1;32m 361\u001B[0m \u001B[38;5;129;01mnot\u001B[39;00m model\u001B[38;5;241m.\u001B[39m_functional\n\u001B[1;32m 362\u001B[0m \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mbuild_input_shape\u001B[39m\u001B[38;5;124m\"\u001B[39m \u001B[38;5;129;01min\u001B[39;00m \u001B[38;5;28mlocals\u001B[39m()\n\u001B[1;32m 363\u001B[0m \u001B[38;5;129;01mand\u001B[39;00m build_input_shape\n\u001B[1;32m 364\u001B[0m \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;28misinstance\u001B[39m(build_input_shape, (\u001B[38;5;28mtuple\u001B[39m, \u001B[38;5;28mlist\u001B[39m))\n\u001B[1;32m 365\u001B[0m ):\n\u001B[1;32m 366\u001B[0m model\u001B[38;5;241m.\u001B[39mbuild(build_input_shape)\n",
119
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/models/sequential.py:122\u001B[0m, in \u001B[0;36mSequential.add\u001B[0;34m(self, layer, rebuild)\u001B[0m\n\u001B[1;32m 120\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_layers\u001B[38;5;241m.\u001B[39mappend(layer)\n\u001B[1;32m 121\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m rebuild:\n\u001B[0;32m--> 122\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_maybe_rebuild()\n\u001B[1;32m 123\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m:\n\u001B[1;32m 124\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mbuilt \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;01mFalse\u001B[39;00m\n",
120
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/models/sequential.py:141\u001B[0m, in \u001B[0;36mSequential._maybe_rebuild\u001B[0;34m(self)\u001B[0m\n\u001B[1;32m 139\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28misinstance\u001B[39m(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_layers[\u001B[38;5;241m0\u001B[39m], InputLayer) \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;28mlen\u001B[39m(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_layers) \u001B[38;5;241m>\u001B[39m \u001B[38;5;241m1\u001B[39m:\n\u001B[1;32m 140\u001B[0m input_shape \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_layers[\u001B[38;5;241m0\u001B[39m]\u001B[38;5;241m.\u001B[39mbatch_shape\n\u001B[0;32m--> 141\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mbuild(input_shape)\n\u001B[1;32m 142\u001B[0m \u001B[38;5;28;01melif\u001B[39;00m \u001B[38;5;28mhasattr\u001B[39m(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_layers[\u001B[38;5;241m0\u001B[39m], \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124minput_shape\u001B[39m\u001B[38;5;124m\"\u001B[39m) \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;28mlen\u001B[39m(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_layers) \u001B[38;5;241m>\u001B[39m \u001B[38;5;241m1\u001B[39m:\n\u001B[1;32m 143\u001B[0m \u001B[38;5;66;03m# We can build the Sequential model if the first layer has the\u001B[39;00m\n\u001B[1;32m 144\u001B[0m \u001B[38;5;66;03m# `input_shape` property. This is most commonly found in Functional\u001B[39;00m\n\u001B[1;32m 145\u001B[0m \u001B[38;5;66;03m# model.\u001B[39;00m\n\u001B[1;32m 146\u001B[0m input_shape \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_layers[\u001B[38;5;241m0\u001B[39m]\u001B[38;5;241m.\u001B[39minput_shape\n",
121
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/layers/layer.py:228\u001B[0m, in \u001B[0;36mLayer.__new__.<locals>.build_wrapper\u001B[0;34m(*args, **kwargs)\u001B[0m\n\u001B[1;32m 226\u001B[0m \u001B[38;5;28;01mwith\u001B[39;00m obj\u001B[38;5;241m.\u001B[39m_open_name_scope():\n\u001B[1;32m 227\u001B[0m obj\u001B[38;5;241m.\u001B[39m_path \u001B[38;5;241m=\u001B[39m current_path()\n\u001B[0;32m--> 228\u001B[0m original_build_method(\u001B[38;5;241m*\u001B[39margs, \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mkwargs)\n\u001B[1;32m 229\u001B[0m \u001B[38;5;66;03m# Record build config.\u001B[39;00m\n\u001B[1;32m 230\u001B[0m signature \u001B[38;5;241m=\u001B[39m inspect\u001B[38;5;241m.\u001B[39msignature(original_build_method)\n",
122
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/models/sequential.py:187\u001B[0m, in \u001B[0;36mSequential.build\u001B[0;34m(self, input_shape)\u001B[0m\n\u001B[1;32m 185\u001B[0m \u001B[38;5;28;01mfor\u001B[39;00m layer \u001B[38;5;129;01min\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_layers[\u001B[38;5;241m1\u001B[39m:]:\n\u001B[1;32m 186\u001B[0m \u001B[38;5;28;01mtry\u001B[39;00m:\n\u001B[0;32m--> 187\u001B[0m x \u001B[38;5;241m=\u001B[39m layer(x)\n\u001B[1;32m 188\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m \u001B[38;5;167;01mNotImplementedError\u001B[39;00m:\n\u001B[1;32m 189\u001B[0m \u001B[38;5;66;03m# Can happen if shape inference is not implemented.\u001B[39;00m\n\u001B[1;32m 190\u001B[0m \u001B[38;5;66;03m# TODO: consider reverting inbound nodes on layers processed.\u001B[39;00m\n\u001B[1;32m 191\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m\n",
123
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/utils/traceback_utils.py:122\u001B[0m, in \u001B[0;36mfilter_traceback.<locals>.error_handler\u001B[0;34m(*args, **kwargs)\u001B[0m\n\u001B[1;32m 119\u001B[0m filtered_tb \u001B[38;5;241m=\u001B[39m _process_traceback_frames(e\u001B[38;5;241m.\u001B[39m__traceback__)\n\u001B[1;32m 120\u001B[0m \u001B[38;5;66;03m# To get the full stack trace, call:\u001B[39;00m\n\u001B[1;32m 121\u001B[0m \u001B[38;5;66;03m# `keras.config.disable_traceback_filtering()`\u001B[39;00m\n\u001B[0;32m--> 122\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m e\u001B[38;5;241m.\u001B[39mwith_traceback(filtered_tb) \u001B[38;5;28;01mfrom\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m\n\u001B[1;32m 123\u001B[0m \u001B[38;5;28;01mfinally\u001B[39;00m:\n\u001B[1;32m 124\u001B[0m \u001B[38;5;28;01mdel\u001B[39;00m filtered_tb\n",
124
+ "File \u001B[0;32m~/miniconda3/envs/dl/lib/python3.12/site-packages/keras/src/ops/operation_utils.py:184\u001B[0m, in \u001B[0;36mcompute_conv_output_shape\u001B[0;34m(input_shape, filters, kernel_size, strides, padding, data_format, dilation_rate)\u001B[0m\n\u001B[1;32m 182\u001B[0m kernel_shape \u001B[38;5;241m=\u001B[39m kernel_size \u001B[38;5;241m+\u001B[39m (input_shape[\u001B[38;5;241m1\u001B[39m], filters)\n\u001B[1;32m 183\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28mlen\u001B[39m(kernel_shape) \u001B[38;5;241m!=\u001B[39m \u001B[38;5;28mlen\u001B[39m(input_shape):\n\u001B[0;32m--> 184\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m \u001B[38;5;167;01mValueError\u001B[39;00m(\n\u001B[1;32m 185\u001B[0m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mKernel shape must have the same length as input, but received \u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 186\u001B[0m \u001B[38;5;124mf\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mkernel of shape \u001B[39m\u001B[38;5;132;01m{\u001B[39;00mkernel_shape\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124m and \u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 187\u001B[0m \u001B[38;5;124mf\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124minput of shape \u001B[39m\u001B[38;5;132;01m{\u001B[39;00minput_shape\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124m.\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 188\u001B[0m )\n\u001B[1;32m 189\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28misinstance\u001B[39m(dilation_rate, \u001B[38;5;28mint\u001B[39m):\n\u001B[1;32m 190\u001B[0m dilation_rate \u001B[38;5;241m=\u001B[39m (dilation_rate,) \u001B[38;5;241m*\u001B[39m \u001B[38;5;28mlen\u001B[39m(spatial_shape)\n",
125
+ "\u001B[0;31mValueError\u001B[0m: Kernel shape must have the same length as input, but received kernel of shape (3, 3, 3, 32) and input of shape (None, None, 32, 32, 3)."
126
+ ]
127
+ }
128
+ ],
129
+ "execution_count": 1
130
+ },
131
+ {
132
+ "cell_type": "code",
133
+ "execution_count": null,
134
+ "metadata": {},
135
+ "outputs": [],
136
+ "source": []
137
+ },
138
+ {
139
+ "cell_type": "code",
140
+ "execution_count": null,
141
+ "metadata": {},
142
+ "outputs": [],
143
+ "source": []
144
+ }
145
+ ],
146
+ "metadata": {
147
+ "kernelspec": {
148
+ "display_name": "Python 3",
149
+ "language": "python",
150
+ "name": "python3"
151
+ },
152
+ "language_info": {
153
+ "codemirror_mode": {
154
+ "name": "ipython",
155
+ "version": 3
156
+ },
157
+ "file_extension": ".py",
158
+ "mimetype": "text/x-python",
159
+ "name": "python",
160
+ "nbconvert_exporter": "python",
161
+ "pygments_lexer": "ipython3",
162
+ "version": "3.7.4"
163
+ }
164
+ },
165
+ "nbformat": 4,
166
+ "nbformat_minor": 2
167
+ }
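A note on the failures captured in the outputs above: predict_classes() was removed from tf.keras (around TF 2.6), and .h5 files saved with Keras 2 can fail to deserialize under Keras 3, which is what the ValueError traceback shows. A minimal sketch of the modern prediction path, assuming the saved cifar_simple_cnn.h5 can be loaded at all (if it cannot, the legacy tf-keras package is one possible workaround, not shown here):

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.datasets import cifar10

# compile=False skips restoring optimizer state, which legacy H5 files often trip on
classifier = load_model('cifar_simple_cnn.h5', compile=False)
(_, _), (x_test, _) = cifar10.load_data()

sample = x_test[0].reshape(1, 32, 32, 3)        # one image, with a batch dimension
probs = classifier.predict(sample, verbose=0)   # shape (1, 10) class scores
pred_class = int(np.argmax(probs, axis=1)[0])   # replaces the removed predict_classes()
print(pred_class)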
4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/4.3. Live Sketching.ipynb ADDED
@@ -0,0 +1,126 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "metadata": {
6
+ "ExecuteTime": {
7
+ "end_time": "2025-03-13T12:18:54.614963Z",
8
+ "start_time": "2025-03-13T12:18:52.126661Z"
9
+ }
10
+ },
11
+ "source": [
12
+ "#import keras\n",
13
+ "import cv2\n",
14
+ "import numpy as np\n",
15
+ "import matplotlib\n",
16
+ "print (cv2.__version__)"
17
+ ],
18
+ "outputs": [
19
+ {
20
+ "name": "stdout",
21
+ "output_type": "stream",
22
+ "text": [
23
+ "4.11.0\n"
24
+ ]
25
+ }
26
+ ],
27
+ "execution_count": 1
28
+ },
29
+ {
30
+ "cell_type": "code",
31
+ "metadata": {
32
+ "ExecuteTime": {
33
+ "end_time": "2025-03-13T12:25:16.378140Z",
34
+ "start_time": "2025-03-13T12:25:15.608514Z"
35
+ }
36
+ },
37
+ "source": [
38
+ "import cv2\n",
39
+ "import numpy as np\n",
40
+ "\n",
41
+ "# Our sketch generating function\n",
42
+ "def sketch(image):\n",
43
+ " # Convert image to grayscale\n",
44
+ " img_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)\n",
45
+ " \n",
46
+ " # Clean up image using Guassian Blur\n",
47
+ " img_gray_blur = cv2.GaussianBlur(img_gray, (5,5), 0)\n",
48
+ " \n",
49
+ " # Extract edges\n",
50
+ " canny_edges = cv2.Canny(img_gray_blur, 20, 50)\n",
51
+ " \n",
52
+ " # Do an invert binarize the image \n",
53
+ " ret, mask = cv2.threshold(canny_edges, 70, 255, cv2.THRESH_BINARY_INV)\n",
54
+ " return mask\n",
55
+ "\n",
56
+ "\n",
57
+ "# Initialize webcam, cap is the object provided by VideoCapture\n",
58
+ "# It contains a boolean indicating if it was sucessful (ret)\n",
59
+ "# It also contains the images collected from the webcam (frame)\n",
60
+ "cap = cv2.VideoCapture(2)\n",
61
+ "\n",
62
+ "while True:\n",
63
+ " ret, frame = cap.read()\n",
64
+ " cv2.imshow('Our Live Sketcher', sketch(frame))\n",
65
+ " if cv2.waitKey(1) == 13 or cv2.waitKey(1) == 27: #13 is the Enter Key\n",
66
+ " break\n",
67
+ " \n",
68
+ "# Release camera and close windows\n",
69
+ "cap.release()\n",
70
+ "cv2.destroyAllWindows() "
71
+ ],
72
+ "outputs": [
73
+ {
74
+ "name": "stderr",
75
+ "output_type": "stream",
76
+ "text": [
77
+ "[ WARN:0@382.189] global cap_v4l.cpp:913 open VIDEOIO(V4L2:/dev/video2): can't open camera by index\n",
78
+ "[ERROR:0@382.862] global obsensor_uvc_stream_channel.cpp:158 getStreamChannelGroup Camera index out of range\n",
79
+ "[ WARN:0@382.866] global cap_v4l.cpp:803 requestBuffers VIDEOIO(V4L2:/dev/video2): failed VIDIOC_REQBUFS: errno=19 (No such device)\n"
80
+ ]
81
+ },
82
+ {
83
+ "ename": "error",
84
+ "evalue": "OpenCV(4.11.0) /io/opencv/modules/imgproc/src/color.cpp:199: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'\n",
85
+ "output_type": "error",
86
+ "traceback": [
87
+ "\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
88
+ "\u001B[0;31merror\u001B[0m Traceback (most recent call last)",
89
+ "Cell \u001B[0;32mIn[6], line 27\u001B[0m\n\u001B[1;32m 25\u001B[0m \u001B[38;5;28;01mwhile\u001B[39;00m \u001B[38;5;28;01mTrue\u001B[39;00m:\n\u001B[1;32m 26\u001B[0m ret, frame \u001B[38;5;241m=\u001B[39m cap\u001B[38;5;241m.\u001B[39mread()\n\u001B[0;32m---> 27\u001B[0m cv2\u001B[38;5;241m.\u001B[39mimshow(\u001B[38;5;124m'\u001B[39m\u001B[38;5;124mOur Live Sketcher\u001B[39m\u001B[38;5;124m'\u001B[39m, sketch(frame))\n\u001B[1;32m 28\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m cv2\u001B[38;5;241m.\u001B[39mwaitKey(\u001B[38;5;241m1\u001B[39m) \u001B[38;5;241m==\u001B[39m \u001B[38;5;241m13\u001B[39m \u001B[38;5;129;01mor\u001B[39;00m cv2\u001B[38;5;241m.\u001B[39mwaitKey(\u001B[38;5;241m1\u001B[39m) \u001B[38;5;241m==\u001B[39m \u001B[38;5;241m27\u001B[39m: \u001B[38;5;66;03m#13 is the Enter Key\u001B[39;00m\n\u001B[1;32m 29\u001B[0m \u001B[38;5;28;01mbreak\u001B[39;00m\n",
90
+ "Cell \u001B[0;32mIn[6], line 7\u001B[0m, in \u001B[0;36msketch\u001B[0;34m(image)\u001B[0m\n\u001B[1;32m 5\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21msketch\u001B[39m(image):\n\u001B[1;32m 6\u001B[0m \u001B[38;5;66;03m# Convert image to grayscale\u001B[39;00m\n\u001B[0;32m----> 7\u001B[0m img_gray \u001B[38;5;241m=\u001B[39m cv2\u001B[38;5;241m.\u001B[39mcvtColor(image, cv2\u001B[38;5;241m.\u001B[39mCOLOR_BGR2GRAY)\n\u001B[1;32m 9\u001B[0m \u001B[38;5;66;03m# Clean up image using Guassian Blur\u001B[39;00m\n\u001B[1;32m 10\u001B[0m img_gray_blur \u001B[38;5;241m=\u001B[39m cv2\u001B[38;5;241m.\u001B[39mGaussianBlur(img_gray, (\u001B[38;5;241m5\u001B[39m,\u001B[38;5;241m5\u001B[39m), \u001B[38;5;241m0\u001B[39m)\n",
91
+ "\u001B[0;31merror\u001B[0m: OpenCV(4.11.0) /io/opencv/modules/imgproc/src/color.cpp:199: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'\n"
92
+ ]
93
+ }
94
+ ],
95
+ "execution_count": 6
96
+ },
97
+ {
98
+ "cell_type": "code",
99
+ "execution_count": null,
100
+ "metadata": {},
101
+ "outputs": [],
102
+ "source": []
103
+ }
104
+ ],
105
+ "metadata": {
106
+ "kernelspec": {
107
+ "display_name": "Python 3",
108
+ "language": "python",
109
+ "name": "python3"
110
+ },
111
+ "language_info": {
112
+ "codemirror_mode": {
113
+ "name": "ipython",
114
+ "version": 3
115
+ },
116
+ "file_extension": ".py",
117
+ "mimetype": "text/x-python",
118
+ "name": "python",
119
+ "nbconvert_exporter": "python",
120
+ "pygments_lexer": "ipython3",
121
+ "version": "3.7.4"
122
+ }
123
+ },
124
+ "nbformat": 4,
125
+ "nbformat_minor": 2
126
+ }
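The stderr output above shows the usual failure mode on this machine: VideoCapture(2) could not open a device, cap.read() then returned (False, None), and cvtColor crashed on the empty frame. A defensive sketch of the same loop, assuming the camera sits somewhere on indices 0-3 (adjust for your machine):

import cv2

cap = None
for idx in range(4):                      # probe a few device indices
    candidate = cv2.VideoCapture(idx)
    if candidate.isOpened():
        cap = candidate
        break
    candidate.release()

if cap is None:
    raise RuntimeError("No webcam found on indices 0-3")

while True:
    ret, frame = cap.read()
    if not ret:                           # no frame delivered; never pass None to cvtColor
        continue
    cv2.imshow('Preview', cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    if cv2.waitKey(1) in (13, 27):        # Enter or Esc
        break

cap.release()
cv2.destroyAllWindows()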
4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/Test - Imports Keras, OpenCV and tests webcam.ipynb ADDED
@@ -0,0 +1,122 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "metadata": {
6
+ "ExecuteTime": {
7
+ "end_time": "2025-03-13T12:26:00.769321Z",
8
+ "start_time": "2025-03-13T12:25:51.814601Z"
9
+ }
10
+ },
11
+ "source": [
12
+ "import cv2\n",
13
+ "import numpy as np\n",
14
+ "\n",
15
+ "# Our sketch generating function\n",
16
+ "def sketch(image):\n",
17
+ " # Convert image to grayscale\n",
18
+ " img_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)\n",
19
+ " \n",
20
+ " # Clean up image using Gaussian Blur\n",
21
+ " img_gray_blur = cv2.GaussianBlur(img_gray, (5,5), 0)\n",
22
+ " \n",
23
+ " # Extract edges\n",
24
+ " canny_edges = cv2.Canny(img_gray_blur, 10, 70)\n",
25
+ " \n",
26
+ " # Do an invert binarize the image \n",
27
+ " ret, mask = cv2.threshold(canny_edges, 70, 255, cv2.THRESH_BINARY_INV)\n",
28
+ " return mask\n",
29
+ "\n",
30
+ "\n",
31
+ "# Initialize webcam, cap is the object provided by VideoCapture\n",
32
+ "# It contains a boolean indicating if it was sucessful (ret)\n",
33
+ "# It also contains the images collected from the webcam (frame)\n",
34
+ "cap = cv2.VideoCapture(3)\n",
35
+ "\n",
36
+ "while True:\n",
37
+ " ret, frame = cap.read()\n",
38
+ " cv2.imshow('Our Live Sketcher', sketch(frame))\n",
39
+ " if cv2.waitKey(1) == 13: #13 is the Enter Key\n",
40
+ " break\n",
41
+ " \n",
42
+ "# Release camera and close windows\n",
43
+ "cap.release()\n",
44
+ "cv2.destroyAllWindows() "
45
+ ],
46
+ "outputs": [],
47
+ "execution_count": 1
48
+ },
49
+ {
50
+ "cell_type": "code",
51
+ "metadata": {
52
+ "ExecuteTime": {
53
+ "end_time": "2025-02-24T14:10:10.316586Z",
54
+ "start_time": "2025-02-24T14:10:10.209728Z"
55
+ }
56
+ },
57
+ "source": "cap = cv2.VideoCapture(0)",
58
+ "outputs": [
59
+ {
60
+ "name": "stderr",
61
+ "output_type": "stream",
62
+ "text": [
63
+ "[ WARN:0@1028.418] global cap_v4l.cpp:913 open VIDEOIO(V4L2:/dev/video0): can't open camera by index\n",
64
+ "[ERROR:0@1028.522] global obsensor_uvc_stream_channel.cpp:158 getStreamChannelGroup Camera index out of range\n"
65
+ ]
66
+ }
67
+ ],
68
+ "execution_count": 24
69
+ },
70
+ {
71
+ "metadata": {
72
+ "ExecuteTime": {
73
+ "end_time": "2025-02-24T14:10:11.179764Z",
74
+ "start_time": "2025-02-24T14:10:11.162834Z"
75
+ }
76
+ },
77
+ "cell_type": "code",
78
+ "source": "cap.read()\n",
79
+ "outputs": [
80
+ {
81
+ "data": {
82
+ "text/plain": [
83
+ "(False, None)"
84
+ ]
85
+ },
86
+ "execution_count": 25,
87
+ "metadata": {},
88
+ "output_type": "execute_result"
89
+ }
90
+ ],
91
+ "execution_count": 25
92
+ },
93
+ {
94
+ "metadata": {},
95
+ "cell_type": "code",
96
+ "outputs": [],
97
+ "execution_count": null,
98
+ "source": "cv2.imshow('Our Live Sketcher', sketch(frame))"
99
+ }
100
+ ],
101
+ "metadata": {
102
+ "kernelspec": {
103
+ "display_name": "Python 3",
104
+ "language": "python",
105
+ "name": "python3"
106
+ },
107
+ "language_info": {
108
+ "codemirror_mode": {
109
+ "name": "ipython",
110
+ "version": 3
111
+ },
112
+ "file_extension": ".py",
113
+ "mimetype": "text/x-python",
114
+ "name": "python",
115
+ "nbconvert_exporter": "python",
116
+ "pygments_lexer": "ipython3",
117
+ "version": "3.7.4"
118
+ }
119
+ },
120
+ "nbformat": 4,
121
+ "nbformat_minor": 2
122
+ }
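The cells above probe camera indices by hand until cap.read() stops returning (False, None). A small sketch that enumerates which device indices can actually be opened (the probed range is an assumption):

import cv2

available = []
for idx in range(6):              # probe indices 0-5; adjust as needed
    cap = cv2.VideoCapture(idx)
    if cap.isOpened():
        available.append(idx)
    cap.release()
print("Openable camera indices:", available)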
4. Get Started! Handwritting Recognition, Simple Object Classification & OpenCV Demo/preprocessors.py ADDED
@@ -0,0 +1,65 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import numpy as np
2
+ import cv2
3
+
4
+ def x_cord_contour(contour):
5
+ # This function takes a contour from findContours
6
+ # and outputs its centroid's x coordinate
7
+ M = cv2.moments(contour)
8
+ return int(M['m10']/M['m00']) if M['m00'] else 0
9
+
10
+
11
+ def makeSquare(not_square):
12
+ # This function takes an image and makes the dimensions square
13
+ # It adds black pixels as the padding where needed
14
+
15
+ BLACK = [0,0,0]
16
+ img_dim = not_square.shape
17
+ height = img_dim[0]
18
+ width = img_dim[1]
19
+ #print("Height = ", height, "Width = ", width)
20
+ if (height == width):
21
+ square = not_square
22
+ return square
23
+ else:
24
+ doublesize = cv2.resize(not_square,(2*width, 2*height), interpolation = cv2.INTER_CUBIC)
25
+ height = height * 2
26
+ width = width * 2
27
+ #print("New Height = ", height, "New Width = ", width)
28
+ if (height > width):
29
+ pad = int((height - width)/2)
30
+ #print("Padding = ", pad)
31
+ doublesize_square = cv2.copyMakeBorder(doublesize,0,0,pad,pad,cv2.BORDER_CONSTANT,value=BLACK)
32
+ else:
33
+ pad = int((width - height)/2)
34
+ #print("Padding = ", pad)
35
+ doublesize_square = cv2.copyMakeBorder(doublesize,pad,pad,0,0,\
36
+ cv2.BORDER_CONSTANT,value=BLACK)
37
+ doublesize_square_dim = doublesize_square.shape
38
+ #print("Sq Height = ", doublesize_square_dim[0], "Sq Width = ", doublesize_square_dim[1])
39
+ return doublesize_square
40
+
41
+
42
+ def resize_to_pixel(dimensions, image):
43
+ # This function resizes an image to the specified dimensions
44
+
45
+ buffer_pix = 4
46
+ dimensions = dimensions - buffer_pix
47
+ squared = image
48
+ r = float(dimensions) / squared.shape[1]
49
+ dim = (dimensions, int(squared.shape[0] * r))
50
+ resized = cv2.resize(image, dim, interpolation = cv2.INTER_AREA)
51
+ img_dim2 = resized.shape
52
+ height_r = img_dim2[0]
53
+ width_r = img_dim2[1]
54
+ BLACK = [0,0,0]
55
+ if (height_r > width_r):
56
+ resized = cv2.copyMakeBorder(resized,0,0,0,1,cv2.BORDER_CONSTANT,value=BLACK)
57
+ if (height_r < width_r):
58
+ resized = cv2.copyMakeBorder(resized,1,0,0,0,cv2.BORDER_CONSTANT,value=BLACK)
59
+ p = 2
60
+ ReSizedImg = cv2.copyMakeBorder(resized,p,p,p,p,cv2.BORDER_CONSTANT,value=BLACK)
61
+ img_dim = ReSizedImg.shape
62
+ height = img_dim[0]
63
+ width = img_dim[1]
64
+ #print("Padded Height = ", height, "Width = ", width)
65
+ return ReSizedImg
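For context, a minimal sketch of how these three helpers chain together in the handwriting demo: sort the digit contours left to right, square each crop, and shrink it to an MNIST-style input. The input filename and the 28-pixel target size are assumptions:

import cv2
from preprocessors import x_cord_contour, makeSquare, resize_to_pixel

# Assumed input: a thresholded image with white digits on a black background
binary = cv2.imread('digits_binary.png', cv2.IMREAD_GRAYSCALE)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Order digits left to right by centroid x, then normalize each crop
digits = []
for c in sorted(contours, key=x_cord_contour):
    x, y, w, h = cv2.boundingRect(c)
    roi = binary[y:y + h, x:x + w]
    digits.append(resize_to_pixel(28, makeSquare(roi)))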
4. Handwriting Recognition/1. Get Started! Handwriting Recognition, Simple Object Classification OpenCV Demo.srt ADDED
@@ -0,0 +1,35 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ 1
2
+ 00:00:00,570 --> 00:00:06,360
3
+ Hi and welcome to chapter 4 where basically the goal of this chapter is to get you up and running quickly
4
+
5
+ 2
6
+ 00:00:06,420 --> 00:00:08,130
7
+ using a development machine.
8
+
9
+ 3
10
+ 00:00:08,130 --> 00:00:14,140
11
+ So you get to actually test some already-trained classifier models and actually play with OpenCV.
12
+
13
+ 4
14
+ 00:00:14,240 --> 00:00:21,990
15
+ The section is broken down into three chapters, or subchapters I should say. The first one we do,
16
+
17
+ 5
18
+ 00:00:22,020 --> 00:00:24,380
19
+ We play with our handwriting classifier.
20
+
21
+ 6
22
+ 00:00:24,420 --> 00:00:30,600
23
+ Second one we use an image classifier, and the third one we use OpenCV to basically turn what your webcam
24
+
25
+ 7
26
+ 00:00:30,600 --> 00:00:34,060
27
+ sees, which is going to be you, into a live sketch.
28
+
29
+ 8
30
+ 00:00:34,080 --> 00:00:36,130
31
+ So stay tuned and let's get started.
32
+
33
+ 9
34
+ 00:00:42,120 --> 00:00:43,960
35
+ So.
4. Handwriting Recognition/2. Experiment with a Handwriting Classifier.srt ADDED
@@ -0,0 +1,363 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ 1
2
+ 00:00:00,600 --> 00:00:00,970
3
+ OK.
4
+
5
+ 2
6
+ 00:00:01,040 --> 00:00:06,930
7
+ So as I said in chapter four point one we're going to use an already trained model, and you'll understand what
8
+
9
+ 3
10
+ 00:00:06,930 --> 00:00:08,660
11
+ training is later on.
12
+
13
+ 4
14
+ 00:00:09,030 --> 00:00:13,110
15
+ But you're going to use something that we actually create in this course in the next couple of chapters
16
+
17
+ 5
18
+ 00:00:13,630 --> 00:00:15,700
19
+ to classify handwritten digits.
20
+
21
+ 6
22
+ 00:00:15,810 --> 00:00:16,870
23
+ So it's going to be pretty cool.
24
+
25
+ 7
26
+ 00:00:16,890 --> 00:00:17,960
27
+ So let's get started.
28
+
29
+ 8
30
+ 00:00:20,430 --> 00:00:20,770
31
+ OK.
32
+
33
+ 9
34
+ 00:00:20,820 --> 00:00:29,930
35
+ So let's bring up our development environment, and that's source activate cv.
36
+
37
+ 10
38
+ 00:00:30,040 --> 00:00:31,930
39
+ And now let's bring up an IPython Notebook.
40
+
41
+ 11
42
+ 00:00:31,960 --> 00:00:40,410
43
+ Just type jupyter notebook, hit Enter to launch, and we should have basically our home full
44
+
45
+ 12
46
+ 00:00:40,420 --> 00:00:41,410
47
+ of information here.
48
+
49
+ 13
50
+ 00:00:41,490 --> 00:00:48,190
51
+ So let's go to DeepLearning, which is where all our projects, or 90 percent of our projects, are located.
52
+
53
+ 14
54
+ 00:00:48,620 --> 00:00:50,770
55
+ And let's go to chapter 4, this chapter.
56
+
57
+ 15
58
+ 00:00:50,780 --> 00:00:52,240
59
+ You can see all the chapters are here.
60
+
61
+ 16
62
+ 00:00:54,050 --> 00:00:58,580
63
+ And let's open up this one and it should bring up this.
64
+
65
+ 17
66
+ 00:00:58,590 --> 00:01:03,790
67
+ This is called an IPython notebook, and I've left some simple notes here, some comments here.
68
+
69
+ 18
70
+ 00:01:04,180 --> 00:01:09,810
71
+ This one I'm not going to actually teach much from in this course. Here is the code here, sorry.
72
+
73
+ 19
74
+ 00:01:09,820 --> 00:01:11,470
75
+ I'm just going to actually execute it.
76
+
77
+ 20
78
+ 00:01:11,800 --> 00:01:13,660
79
+ And we get to see what it does.
80
+
81
+ 21
82
+ 00:01:13,930 --> 00:01:19,740
83
+ So basically before I begin let me just briefly discuss the code, these imports here.
84
+
85
+ 22
86
+ 00:01:19,750 --> 00:01:26,920
87
+ We're importing OpenCV, NumPy, which is an array processing library, and Keras, which I'll teach you about
88
+
89
+ 23
90
+ 00:01:26,920 --> 00:01:27,550
91
+ later on.
92
+
93
+ 24
94
+ 00:01:27,610 --> 00:01:33,550
95
+ This is basically the framework, the deep learning framework we use to basically control and operate Tensor
96
+
97
+ 25
98
+ 00:01:33,550 --> 00:01:35,570
99
+ Flow, which runs in the background.
100
+
101
+ 26
102
+ 00:01:35,920 --> 00:01:38,440
103
+ And that does all the deep learning stuff.
104
+
105
+ 27
106
+ 00:01:38,560 --> 00:01:43,240
107
+ So we are loading the MNIST dataset here, and that's how we are importing this from Keras
108
+
109
+ 28
110
+ 00:01:43,240 --> 00:01:45,330
111
+ here, loading our model.
112
+
113
+ 29
114
+ 00:01:45,700 --> 00:01:51,790
115
+ And then this is a function we use to basically draw the prediction our model makes over the actual
116
+
117
+ 30
118
+ 00:01:51,790 --> 00:01:52,440
119
+ digits.
120
+
121
+ 31
122
+ 00:01:52,480 --> 00:01:55,800
123
+ And you know what let's just actually run this and see how it works.
124
+
125
+ 32
126
+ 00:01:57,860 --> 00:02:02,240
127
+ It's loading the digits here, which means it's loading.
128
+
129
+ 33
130
+ 00:02:02,330 --> 00:02:03,270
131
+ There we go.
132
+
133
+ 34
134
+ 00:02:03,290 --> 00:02:10,200
135
+ So now we have, basically what this does is it brings up a number from a test dataset. A test set is
136
+
137
+ 35
138
+ 00:02:10,220 --> 00:02:12,690
139
+ basically images that the machine learning,
140
+
141
+ 36
142
+ 00:02:12,700 --> 00:02:18,940
143
+ sorry, the deep learning algorithm has not seen, and basically it tells you that it never saw this digit before in
144
+
145
+ 37
146
+ 00:02:18,940 --> 00:02:22,790
147
+ its life but it tells you it's able to figure out that this is a 4.
148
+
149
+ 38
150
+ 00:02:23,230 --> 00:02:27,890
151
+ So now if you press Enter it goes to the next image, and it basically does this for up to 10 images.
152
+
153
+ 39
154
+ 00:02:27,920 --> 00:02:35,950
155
+ You can go up to 50, 100, 2000 if you want, although I don't recommend going up to 2000. This
156
+
157
+ 40
158
+ 00:02:35,950 --> 00:02:38,240
159
+ is a two, actually that's pretty good.
160
+
161
+ 41
162
+ 00:02:38,240 --> 00:02:38,930
163
+ It's an 8.
164
+
165
+ 42
166
+ 00:02:38,960 --> 00:02:44,330
167
+ Good: 1 1 6 8 three 5 6.
168
+
169
+ 43
170
+ 00:02:44,480 --> 00:02:45,100
171
+ Perfect.
172
+
173
+ 44
174
+ 00:02:45,170 --> 00:02:52,250
175
+ So it appears this model I trained here, called the simple CNN, is actually quite good at reading
176
+
177
+ 45
178
+ 00:02:53,030 --> 00:02:54,190
179
+ handwritten numbers.
180
+
181
+ 46
182
+ 00:02:54,190 --> 00:02:59,210
183
+ Sorry. So can this work on real numbers now?
184
+
185
+ 47
186
+ 00:02:59,550 --> 00:03:00,730
187
+ What do you mean by real numbers.
188
+
189
+ 48
190
+ 00:03:00,750 --> 00:03:02,220
191
+ Well, let's find out.
192
+
193
+ 49
194
+ 00:03:02,220 --> 00:03:04,520
195
+ Let's run this.
196
+
197
+ 50
198
+ 00:03:04,620 --> 00:03:11,300
199
+ If I didn't mention it to you before, Enter and Shift on the keyboard together basically executes the code
200
+
201
+ 51
202
+ 00:03:11,430 --> 00:03:14,360
203
+ in the cell block. I should have mentioned that at the beginning.
204
+
205
+ 52
206
+ 00:03:14,400 --> 00:03:15,830
207
+ But you know you know.
208
+
209
+ 53
210
+ 00:03:16,170 --> 00:03:16,430
211
+ OK.
212
+
213
+ 54
214
+ 00:03:16,440 --> 00:03:17,330
215
+ So these are numbers
216
+
217
+ 55
218
+ 00:03:17,400 --> 00:03:23,550
219
+ I actually wrote and took a picture of with my phone, and basically here's what the code does here.
220
+
221
+ 56
222
+ 00:03:23,550 --> 00:03:30,660
223
+ It basically uses OpenCV to extract the digits here, and then it preprocesses them to basically make
224
+
225
+ 57
226
+ 00:03:30,660 --> 00:03:35,390
227
+ them look as similar as possible to the training data for the MNIST dataset.
228
+
229
+ 58
230
+ 00:03:36,480 --> 00:03:38,290
231
+ And let's see.
232
+
233
+ 59
234
+ 00:03:38,290 --> 00:03:40,680
235
+ So basically what it did it took out this one.
236
+
237
+ 60
238
+ 00:03:41,030 --> 00:03:47,130
239
+ And you previously saw how the MNIST dataset images were basically grayscale, white on a black background.
240
+
241
+ 61
242
+ 00:03:47,450 --> 00:03:50,150
243
+ So it picked this out and extracted it here.
244
+
245
+ 62
246
+ 00:03:50,410 --> 00:03:56,730
247
+ This is a contour we drew over it here, and immediately it classifies this as one and it puts one here.
248
+
249
+ 63
250
+ 00:03:56,870 --> 00:03:57,630
251
+ So that's pretty good.
252
+
253
+ 64
254
+ 00:03:57,710 --> 00:04:00,100
255
+ Let's see if it gets the three correct.
256
+
257
+ 65
258
+ 00:04:00,200 --> 00:04:00,780
259
+ Bingo.
260
+
261
+ 66
262
+ 00:04:00,780 --> 00:04:02,890
263
+ Got the three exactly right.
264
+
265
+ 67
266
+ 00:04:03,160 --> 00:04:03,730
267
+ The five.
268
+
269
+ 68
270
+ 00:04:03,740 --> 00:04:06,330
271
+ Exactly right 4 0.
272
+
273
+ 69
274
+ 00:04:06,650 --> 00:04:08,090
275
+ Perfect.
276
+
277
+ 70
278
+ 00:04:08,130 --> 00:04:11,460
279
+ So all of this is my OpenCV code.
280
+
281
+ 71
282
+ 00:04:11,870 --> 00:04:18,410
283
+ And if you want to understand how this code works, I'm not going to explain it thoroughly here, but this should
284
+
285
+ 72
286
+ 00:04:18,410 --> 00:04:19,440
287
+ be fine.
288
+
289
+ 73
290
+ 00:04:20,120 --> 00:04:24,970
291
+ But I'm actually going to tell you that if you do want to learn this, look at my OpenCV tutorial.
292
+
293
+ 74
294
+ 00:04:25,130 --> 00:04:31,220
295
+ It'll teach you all these things: Gaussian blur, Canny edges, contours, how these things work, how to
296
+
297
+ 75
298
+ 00:04:31,220 --> 00:04:36,680
299
+ use functions. And we're sorting contours here, and how they actually sort contours and bring them up
300
+
301
+ 76
302
+ 00:04:36,680 --> 00:04:42,090
303
+ and extract them and then feed them to a classifier, which is what we're doing here with the CNN classifier
304
+
305
+ 77
306
+ 00:04:42,170 --> 00:04:43,230
307
+ we used up here.
308
+
309
+ 78
310
+ 00:04:44,150 --> 00:04:46,790
311
+ So we're pretty good.
312
+
313
+ 79
314
+ 00:04:47,030 --> 00:04:53,400
315
+ You just actually used a handwritten digit classifier that I built, and you're going to build it soon.
316
+
317
+ 80
318
+ 00:04:53,570 --> 00:04:56,150
319
+ Actually perhaps you can build it right now.
320
+
321
+ 81
322
+ 00:04:56,150 --> 00:04:58,530
323
+ This is code that I'm not going to go through in detail,
324
+
325
+ 82
326
+ 00:04:58,550 --> 00:05:05,420
327
+ as I said, but this is the code that actually lets you train the classifier. It is not as complicated as
328
+
329
+ 83
330
+ 00:05:05,420 --> 00:05:06,220
331
+ it looks here.
332
+
333
+ 84
334
+ 00:05:06,740 --> 00:05:11,400
335
+ If you're unfamiliar with it I expect you to be a little confused: what the hell is batch size, what are
336
+
337
+ 85
338
+ 00:05:11,450 --> 00:05:15,120
339
+ epochs, what is going on here, blah blah blah.
340
+
341
+ 86
342
+ 00:05:15,440 --> 00:05:19,000
343
+ But don't worry all of these things will be made very simple soon.
344
+
345
+ 87
346
+ 00:05:19,400 --> 00:05:22,970
347
+ And basically you'll understand what these lines mean here.
348
+
349
+ 88
350
+ 00:05:23,150 --> 00:05:26,210
351
+ What this means here is actually the accuracy of our model.
352
+
353
+ 89
354
+ 00:05:26,530 --> 00:05:29,350
355
+ Ninety-nine point zero something, which is pretty good.
356
+
357
+ 90
358
+ 00:05:29,350 --> 00:05:32,120
359
+ Not top of the leaderboards, but pretty good.
360
+
361
+ 91
362
+ 00:05:32,480 --> 00:05:37,790
363
+ And stay tuned you're going to learn how to do this in the upcoming chapters.
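The demo walked through in this transcript boils down to a few lines. A hedged sketch, assuming a trained MNIST model saved as mnist_simple_cnn.h5 (hypothetical filename) that expects 28x28 inputs scaled to [0, 1]:

import cv2
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import load_model

model = load_model('mnist_simple_cnn.h5', compile=False)
(_, _), (x_test, y_test) = mnist.load_data()

for _ in range(10):
    i = np.random.randint(0, len(x_test))
    digit = x_test[i]
    # scale to [0, 1] and add batch/channel dims before predicting
    probs = model.predict(digit.reshape(1, 28, 28, 1) / 255.0, verbose=0)
    pred = int(np.argmax(probs))
    big = cv2.resize(digit, None, fx=8, fy=8, interpolation=cv2.INTER_CUBIC)
    cv2.putText(big, str(pred), (10, 40), cv2.FONT_HERSHEY_SIMPLEX, 1.5, 255, 2)
    cv2.imshow('Prediction', big)
    cv2.waitKey(0)

cv2.destroyAllWindows()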
4. Handwriting Recognition/3. Experiment with a Image Classifier.srt ADDED
@@ -0,0 +1,195 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ 1
2
+ 00:00:01,380 --> 00:00:07,690
3
+ So in Section Four point two we're actually going to use probably a little more complicated image classifier,
4
+
5
+ 2
6
+ 00:00:08,190 --> 00:00:12,380
7
+ one that classifies 10 different objects and you'll see those objects very soon.
8
+
9
+ 3
10
+ 00:00:12,450 --> 00:00:13,630
11
+ So let's get started.
12
+
13
+ 4
14
+ 00:00:16,130 --> 00:00:23,900
15
+ OK so now let's open section 4.2, which I called Image Classifier CIFAR-10; probably couldn't come up with
16
+
17
+ 5
18
+ 00:00:23,900 --> 00:00:24,870
19
+ a cool name.
20
+
21
+ 6
22
+ 00:00:25,150 --> 00:00:25,460
23
+ OK.
24
+
25
+ 7
26
+ 00:00:25,490 --> 00:00:28,180
27
+ So I mentioned we're going to classify 10 objects.
28
+
29
+ 8
30
+ 00:00:28,220 --> 00:00:35,240
31
+ What this model's dataset, CIFAR-10, is: it's a pretty large dataset that has tons of pictures of airplanes
32
+
33
+ 9
34
+ 00:00:35,240 --> 00:00:39,920
35
+ automobiles, birds, cats, deer, dogs, frogs, horses, ships and trucks.
36
+
37
+ 10
38
+ 00:00:39,920 --> 00:00:45,140
39
+ Not entirely sure how they chose these categories of images but maybe it was just for purely research
40
+
41
+ 11
42
+ 00:00:45,150 --> 00:00:53,590
43
+ purposes, to build image classifiers and test how well they can pick out these images from the variety of
44
+
45
+ 12
46
+ 00:00:53,690 --> 00:00:55,580
47
+ images they receive.
48
+
49
+ 13
50
+ 00:00:55,580 --> 00:01:02,360
51
+ So again this is a model I trained and I don't believe this model is actually super accurate but let's
52
+
53
+ 14
54
+ 00:01:02,360 --> 00:01:04,570
55
+ run this model and let's find out.
56
+
57
+ 15
58
+ 00:01:07,200 --> 00:01:11,510
59
+ I've seen this error here before, but let's find out if it
60
+
61
+ 16
62
+ 00:01:15,200 --> 00:01:16,240
63
+ didn't happen again.
64
+
65
+ 17
66
+ 00:01:16,490 --> 00:01:17,090
67
+ Get it.
68
+
69
+ 18
70
+ 00:01:17,330 --> 00:01:21,010
71
+ So now it says this is a ship, and this is clearly a car.
72
+
73
+ 19
74
+ 00:01:21,050 --> 00:01:24,840
75
+ So we immediately know this model isn't great but it got the idea right.
76
+
77
+ 20
78
+ 00:01:24,860 --> 00:01:31,090
79
+ At least it thinks they're cars. Truck right, truck again, there's the frog right.
80
+
81
+ 21
82
+ 00:01:31,310 --> 00:01:32,170
83
+ Looks like a frog.
84
+
85
+ 22
86
+ 00:01:32,180 --> 00:01:37,260
87
+ But it was a ship. It called that automobile a bird.
88
+
89
+ 23
90
+ 00:01:37,380 --> 00:01:40,050
91
+ Definitely not a truck looks like a cat.
92
+
93
+ 24
94
+ 00:01:40,050 --> 00:01:41,530
95
+ This is a truck.
96
+
97
+ 25
98
+ 00:01:41,530 --> 00:01:43,070
99
+ So let's run it one more time.
100
+
101
+ 26
102
+ 00:01:47,900 --> 00:01:50,440
103
+ Remember, to run it, press Shift and Enter together.
104
+
105
+ 27
106
+ 00:01:50,870 --> 00:01:55,580
107
+ And it usually pops up in the window here; on Windows it pops up on top of the screen, which is quite
108
+
109
+ 28
110
+ 00:01:55,580 --> 00:01:55,980
111
+ nice.
112
+
113
+ 29
114
+ 00:01:56,000 --> 00:02:00,470
115
+ On Ubuntu it comes up in the background, so you kind of have to look for it there, but it kind of does this
116
+
117
+ 30
118
+ 00:02:00,470 --> 00:02:01,610
119
+ shimmy shake thing here.
120
+
121
+ 31
122
+ 00:02:01,700 --> 00:02:03,740
123
+ So if you're paying attention you should have seen it.
124
+
125
+ 32
126
+ 00:02:03,950 --> 00:02:05,380
127
+ So, got the ship right.
128
+
129
+ 33
130
+ 00:02:05,600 --> 00:02:07,140
131
+ It's not a truck it's a frog.
132
+
133
+ 34
134
+ 00:02:07,380 --> 00:02:08,860
135
+ It's not a dog it's a cat.
136
+
137
+ 35
138
+ 00:02:08,960 --> 00:02:10,160
139
+ It's not a ship it's a bird.
140
+
141
+ 36
142
+ 00:02:10,160 --> 00:02:11,630
143
+ This is a dog.
144
+
145
+ 37
146
+ 00:02:11,630 --> 00:02:12,470
147
+ This is a deer.
148
+
149
+ 38
150
+ 00:02:12,470 --> 00:02:13,580
151
+ This is an airplane.
152
+
153
+ 39
154
+ 00:02:13,580 --> 00:02:15,300
155
+ This is called a truck.
156
+
157
+ 40
158
+ 00:02:15,300 --> 00:02:19,430
159
+ And I'm not sure entirely what they would have labeled it as. Truck? Sure, truck.
160
+
161
+ 41
162
+ 00:02:19,430 --> 00:02:20,300
163
+ Sure.
164
+
165
+ 42
166
+ 00:02:20,300 --> 00:02:24,280
167
+ So as I said earlier, it isn't a very good model.
168
+
169
+ 43
170
+ 00:02:24,290 --> 00:02:30,110
171
+ I think the accuracy I got on this one was 60 or 70 percent actually could have gone a lot higher if
172
+
173
+ 44
174
+ 00:02:30,110 --> 00:02:33,800
175
+ I left it training for longer; I probably didn't want to.
176
+
177
+ 45
178
+ 00:02:34,430 --> 00:02:39,830
179
+ But this gives you a good idea of the limitations of training a model for a short period of time, which was
180
+
181
+ 46
182
+ 00:02:39,830 --> 00:02:41,650
183
+ the case here, and this one was called CIFAR simple.
184
+
185
+ 47
186
+ 00:02:41,770 --> 00:02:44,380
187
+ I think I used a very simple CNN for this one.
188
+
189
+ 48
190
+ 00:02:44,780 --> 00:02:48,750
191
+ And later on in the course you're actually going to use far more complicated models.
192
+
193
+ 49
194
+ 00:02:48,750 --> 00:02:52,400
195
+ We're going to train for a longer time to get even higher accuracies on these.
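The 60-70 percent accuracy quoted above is easy to reproduce for any saved model. A hedged sketch, assuming the CIFAR-10 model file from earlier loads and that images are fed raw, as in the notebook:

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.datasets import cifar10

model = load_model('cifar_simple_cnn.h5', compile=False)
(_, _), (x_test, y_test) = cifar10.load_data()

# compare predicted class indices against the ground-truth labels
probs = model.predict(x_test, verbose=0)
accuracy = float(np.mean(np.argmax(probs, axis=1) == y_test.flatten()))
print(f"Test accuracy: {accuracy:.2%}")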
4. Handwriting Recognition/4. OpenCV Demo – Live Sketch with Webcam.srt ADDED
@@ -0,0 +1,251 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ 1
2
+ 00:00:00,480 --> 00:00:06,330
3
+ OK and now welcome back to Chapter Four point three, where we're now going to use OpenCV a bit more.
4
+
5
+ 2
6
+ 00:00:06,420 --> 00:00:11,800
7
+ Now I didn't mention to you, maybe I did, but in the last two chapters we did use OpenCV slightly.
8
+
9
+ 3
10
+ 00:00:11,880 --> 00:00:16,830
11
+ We used OpenCV to basically draw or put text on the images themselves.
12
+
13
+ 4
14
+ 00:00:16,830 --> 00:00:20,530
15
+ So when it said the prediction was this, I put a number on them.
16
+
17
+ 5
18
+ 00:00:20,820 --> 00:00:26,610
19
+ And then this image, basically, was done using OpenCV.
20
+
21
+ 6
22
+ 00:00:26,970 --> 00:00:30,780
23
+ So let's actually use OpenCV for something a little more cool, which is giving you a live
24
+
25
+ 7
26
+ 00:00:30,780 --> 00:00:35,300
27
+ sketch through your webcam.
28
+
29
+ 8
30
+ 00:00:35,300 --> 00:00:37,380
31
+ OK so now let's bring up this program.
32
+
33
+ 9
34
+ 00:00:37,430 --> 00:00:43,460
35
+ This IPython notebook is called Live Sketching, and it comes up here.
36
+
37
+ 10
38
+ 00:00:43,490 --> 00:00:46,960
39
+ So now, you don't actually need this code here.
40
+
41
+ 11
42
+ 00:00:46,960 --> 00:00:48,430
43
+ It's just there, actually,
44
+
45
+ 12
46
+ 00:00:48,440 --> 00:00:50,610
47
+ so you can test your imports.
48
+
49
+ 13
50
+ 00:00:50,630 --> 00:00:52,300
51
+ It's not really necessary.
52
+
53
+ 14
54
+ 00:00:52,580 --> 00:00:57,020
55
+ And this is the OpenCV code here that actually will bring up our sketch.
56
+
57
+ 15
58
+ 00:00:57,330 --> 00:01:02,060
59
+ Now firstly I'm going to run this for the first time and I'm going to show you why it's not going to
60
+
61
+ 16
62
+ 00:01:02,060 --> 00:01:02,700
63
+ work.
64
+
65
+ 17
66
+ 00:01:03,680 --> 00:01:04,430
67
+ There we go.
68
+
69
+ 18
70
+ 00:01:04,790 --> 00:01:11,600
71
+ And the error it says here: basically the C++ code is reporting that it was supposed to get something
72
+
73
+ 19
74
+ 00:01:12,020 --> 00:01:14,450
75
+ but it was empty in this function.
76
+
77
+ 20
78
+ 00:01:14,900 --> 00:01:21,690
79
+ And basically, as you can see from the traceback in Python here, this function cvtColor, it basically
80
+
81
+ 21
82
+ 00:01:21,690 --> 00:01:24,010
83
+ it didn't get an image because it was empty.
84
+
85
+ 22
86
+ 00:01:24,060 --> 00:01:24,630
87
+ All right.
88
+
89
+ 23
90
+ 00:01:24,890 --> 00:01:30,590
91
+ And the reason for that failure is because in the Ubuntu virtual machine we actually have to initialize
92
+
93
+ 24
94
+ 00:01:30,660 --> 00:01:32,200
95
+ the webcam first.
96
+
97
+ 25
98
+ 00:01:32,210 --> 00:01:34,010
99
+ Now that's basically a one click process.
100
+
101
+ 26
102
+ 00:01:34,010 --> 00:01:40,950
103
+ You just go to Devices right here and you see something called Webcams here, Integrated Webcam; basically
104
+
105
+ 27
106
+ 00:01:40,950 --> 00:01:42,620
107
+ clicking it turns it on.
108
+
109
+ 28
110
+ 00:01:42,780 --> 00:01:44,690
111
+ And now we run this code again.
112
+
113
+ 29
114
+ 00:01:46,450 --> 00:01:46,950
115
+ There we go.
116
+
117
+ 30
118
+ 00:01:46,990 --> 00:01:49,260
119
+ So this is me here slouching on my chair.
120
+
121
+ 31
122
+ 00:01:49,270 --> 00:01:50,270
123
+ Sorry about that.
124
+
125
+ 32
126
+ 00:01:50,680 --> 00:01:54,690
127
+ And this is me here in a sketch waving and stuff to you.
128
+
129
+ 33
130
+ 00:01:54,930 --> 00:01:57,190
131
+ You can actually play with the thresholds if you want,
132
+
133
+ 34
134
+ 00:01:57,190 --> 00:01:59,160
135
+ after. But it is pretty cool.
136
+
137
+ 35
138
+ 00:01:59,170 --> 00:02:01,560
139
+ Kind of does a sketchy outline of me.
140
+
141
+ 36
142
+ 00:02:01,930 --> 00:02:10,840
143
+ If I move my laptop around, this is my whole room right here, this is my bedroom in the back.
144
+
145
+ 37
146
+ 00:02:10,840 --> 00:02:12,800
147
+ All right let's close this here.
148
+
149
+ 38
150
+ 00:02:13,300 --> 00:02:16,380
151
+ And as I said, you can actually play with the OpenCV functions.
152
+
153
+ 39
154
+ 00:02:16,540 --> 00:02:19,240
155
+ So here's the function that does the edges.
156
+
157
+ 40
158
+ 00:02:19,240 --> 00:02:21,460
159
+ It's this one here, Canny edges.
160
+
161
+ 41
162
+ 00:02:21,450 --> 00:02:25,080
163
+ So I forget exactly what this 10 and 70 do.
164
+
165
+ 42
166
+ 00:02:25,270 --> 00:02:31,260
167
+ But if I do remember correctly it was manipulating basically the threshold sensitivity.
168
+
169
+ 43
170
+ 00:02:31,270 --> 00:02:35,430
171
+ So let's try it; this is a range, from 10 to 70.
172
+
173
+ 44
174
+ 00:02:35,440 --> 00:02:37,400
175
+ So let's try from 10 to 70.
176
+
177
+ 45
178
+ 00:02:37,610 --> 00:02:45,820
179
+ Let's try 5 to 10, and I'm pretty sure you're going to get more lines in our sketch. Let's find out.
180
+
181
+ 46
182
+ 00:02:46,200 --> 00:02:50,070
183
+ No, actually we got less lines now; we got the opposite.
184
+
185
+ 47
186
+ 00:02:50,130 --> 00:02:51,990
187
+ So now let's undo that.
188
+
189
+ 48
190
+ 00:02:53,920 --> 00:02:54,790
191
+ Back to 10 and 70.
192
+
193
+ 49
194
+ 00:02:54,800 --> 00:02:58,530
195
+ And let's try 20 to 50, and let's put it that way.
196
+
197
+ 50
198
+ 00:02:59,310 --> 00:03:02,390
199
+ See what happens.
200
+
201
+ 51
202
+ 00:03:02,650 --> 00:03:03,550
203
+ There we go.
204
+
205
+ 52
206
+ 00:03:03,820 --> 00:03:08,710
207
+ A lot more lines in our sketched image; you can see the actual lines in my hand.
208
+
209
+ 53
210
+ 00:03:08,710 --> 00:03:10,020
211
+ This is actually pretty cool.
212
+
213
+ 54
214
+ 00:03:10,360 --> 00:03:14,020
215
+ So that was our OpenCV demo here.
216
+
217
+ 55
218
+ 00:03:14,250 --> 00:03:19,580
219
+ There are a lot of different OpenCV functions going on here: Gaussian blur, changing the image from gray to
220
+
221
+ 56
222
+ 00:03:19,650 --> 00:03:20,220
223
+ color.
224
+
225
+ 57
226
+ 00:03:21,100 --> 00:03:22,190
227
+ I mean, color to gray.
228
+
229
+ 58
230
+ 00:03:22,200 --> 00:03:23,150
231
+ Sorry.
232
+
233
+ 59
234
+ 00:03:23,380 --> 00:03:28,960
235
+ Some thresholding. If you really do want to learn what's going on here, which is, again, not necessarily
236
+
237
+ 60
238
+ 00:03:29,570 --> 00:03:37,510
239
+ part of the deep learning part of the course, please feel free to do my OpenCV course; it's included free
240
+
241
+ 61
242
+ 00:03:37,620 --> 00:03:39,860
243
+ in this course; try it, it's free.
244
+
245
+ 62
246
+ 00:03:40,180 --> 00:03:43,100
247
+ So go ahead and enjoy it if you want to.
248
+
249
+ 63
250
+ 00:03:43,240 --> 00:03:46,180
251
+ If not let's get started with learning about neural nets.
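The threshold experiments described above come down to Canny's two hysteresis thresholds: lower values keep weaker gradients, so more edge lines survive. A quick comparison sketch (the image path is an assumption):

import cv2

gray = cv2.imread('face.jpg', cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(gray, (5, 5), 0)

loose = cv2.Canny(blur, 10, 70)      # low thresholds -> many edges
strict = cv2.Canny(blur, 100, 200)   # high thresholds -> fewer, stronger edges

cv2.imshow('loose', loose)
cv2.imshow('strict', strict)
cv2.waitKey(0)
cv2.destroyAllWindows()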
5. OpenCV Tutorial - Learn Classic Computer Vision & Face Detection (OPTIONAL)/1.1 Code Download.html ADDED
@@ -0,0 +1 @@
 
 
1
+ <script type="text/javascript">window.location = "https://drive.google.com/file/d/1qMOZHzl29ib3oJcnd5a4uY0f5TsQd7Y2/view?usp=sharing";</script>
5. OpenCV Tutorial - Learn Classic Computer Vision & Face Detection (OPTIONAL)/10. Transformations, Affine And Non-Affine - The Many Ways We Can Change Images.srt ADDED
@@ -0,0 +1,123 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ 1
2
+ 00:00:01,340 --> 00:00:07,350
3
+ Hi and welcome to Section 3 of this course where we take a look at Image manipulations so image manipulations
4
+
5
+ 2
6
+ 00:00:07,350 --> 00:00:09,740
7
+ cover a wide range of topics as you can see here.
8
+
9
+ 3
10
+ 00:00:09,780 --> 00:00:13,930
11
+ However, they're all very important in understanding the foundations of OpenCV.
12
+
13
+ 4
14
+ 00:00:14,190 --> 00:00:19,500
15
+ So we're going to take a look at some simple transforms then do some simple operations which includes
16
+
17
+ 5
18
+ 00:00:19,590 --> 00:00:25,590
19
+ the important thresholding and edge detection, which actually forms the foundation of the first mini project
20
+
21
+ 6
22
+ 00:00:25,590 --> 00:00:28,800
23
+ you're going to complete which is making a Live sketch of yourself.
24
+
25
+ 7
26
+ 00:00:31,030 --> 00:00:33,000
27
+ So let's talk a bit about transformations.
28
+
29
+ 8
30
+ 00:00:33,070 --> 00:00:35,280
31
+ So what exactly are transformations?
32
+
33
+ 9
34
+ 00:00:35,440 --> 00:00:40,660
35
+ Simply put they're geometric distortions in acted upon an image and all these distortions can be symbolic
36
+
37
+ 10
38
+ 00:00:40,690 --> 00:00:43,130
39
+ can be things like resizing rotation.
40
+
41
+ 11
42
+ 00:00:43,120 --> 00:00:48,460
43
+ Translation However the more complicated distortions such as perspective issues arising from different
44
+
45
+ 12
46
+ 00:00:48,460 --> 00:00:50,750
47
+ views that image was captured from.
48
+
49
+ 13
50
+ 00:00:51,160 --> 00:00:56,400
51
+ So we would classify transmission's into two main categories and computer vision division fine and none
52
+
53
+ 14
54
+ 00:00:56,550 --> 00:00:58,430
55
+ of.
56
+
57
+ 15
58
+ 00:00:58,460 --> 00:01:04,220
59
+ So let's talk a bit about fine and I find transmission's now I find transmission's include things such
60
+
61
+ 16
62
+ 00:01:04,220 --> 00:01:09,320
63
+ as Skilling rotation and translation and a key point to notice is that lines that were Parlow in the
64
+
65
+ 17
66
+ 00:01:09,320 --> 00:01:13,220
67
+ original images are also Pardoe and these transformed images.
68
+
69
+ 18
70
+ 00:01:13,220 --> 00:01:18,230
71
+ So where do you skill that image routed or translate parallelism between lines.
72
+
73
+ 19
74
+ 00:01:18,230 --> 00:01:24,890
75
+ All of this Benty and not so in French transmissions which are also known as projective and demography
76
+
77
+ 20
78
+ 00:01:24,890 --> 00:01:32,950
79
+ transforms as you can see here these lines are no longer parallel in the strands from dementia.
80
+
81
+ 21
82
+ 00:01:33,160 --> 00:01:40,200
83
+ However in nunna I find transmissions culinarily an instance is minty and what is called In the early.
84
+
85
+ 22
86
+ 00:01:40,420 --> 00:01:46,540
87
+ What it means is that if do all the points on display here remain still on this line and this line and
88
+
89
+ 23
90
+ 00:01:46,540 --> 00:01:52,830
91
+ vice versa it's also important to note that none of French transmission's are very common in computer
92
+
93
+ 24
94
+ 00:01:52,830 --> 00:01:56,490
95
+ vision and they actually originate from different camera angles.
96
+
97
+ 25
98
+ 00:01:56,490 --> 00:01:59,420
99
+ So imagine you were facing the square from a top down view.
100
+
101
+ 26
102
+ 00:01:59,760 --> 00:02:05,800
103
+ However you begin to move the camera this way you can imagine your viewpoint with no excuse to you've
104
+
105
+ 27
106
+ 00:02:05,850 --> 00:02:07,760
107
+ actually be seeing these two points closer together.
108
+
109
+ 28
110
+ 00:02:07,800 --> 00:02:08,710
111
+ Well these two fit.
112
+
113
+ 29
114
+ 00:02:08,730 --> 00:02:14,730
115
+ But that is how a non alpha transmissions come about and they always and can be division that we correct
116
+
117
+ 30
118
+ 00:02:14,730 --> 00:02:16,640
119
+ these which we will come to shortly.
120
+
121
+ 31
122
+ 00:02:18,180 --> 00:02:21,840
123
+ So let's No actually implement some of these transmissions in open sea be called.
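As a rough illustration of the two families described above, here is a minimal sketch using standard OpenCV calls. The filename and the point coordinates are illustrative assumptions; an affine warp needs 3 point correspondences, while a projective (homography) warp needs 4:

```python
import cv2
import numpy as np

image = cv2.imread("input.jpg")  # hypothetical filename
rows, cols = image.shape[:2]

# Affine transform: 3 point correspondences. Lines that were parallel
# in the source remain parallel in the result.
src_tri = np.float32([[50, 50], [200, 50], [50, 200]])
dst_tri = np.float32([[10, 100], [200, 50], [100, 250]])
M_affine = cv2.getAffineTransform(src_tri, dst_tri)
affine_img = cv2.warpAffine(image, M_affine, (cols, rows))

# Non-affine (projective/homography) transform: 4 correspondences.
# Parallelism is not preserved, but collinearity and incidence are.
src_quad = np.float32([[0, 0], [cols - 1, 0],
                       [0, rows - 1], [cols - 1, rows - 1]])
dst_quad = np.float32([[60, 40], [cols - 40, 20],
                       [20, rows - 20], [cols - 20, rows - 60]])
M_persp = cv2.getPerspectiveTransform(src_quad, dst_quad)
persp_img = cv2.warpPerspective(image, M_persp, (cols, rows))
```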
5. OpenCV Tutorial - Learn Classic Computer Vision & Face Detection (OPTIONAL)/11. Image Translations - Moving Images Up, Down. Left And Right.srt ADDED
@@ -0,0 +1,163 @@
+ 1
+ 00:00:00,780 --> 00:00:02,500
+ So let's talk a bit about translations.
+
+ 2
+ 00:00:02,550 --> 00:00:07,080
+ Translations are actually very simple; it's basically moving an image in one direction, which can be left,
+
+ 3
+ 00:00:07,080 --> 00:00:09,400
+ right, up, down, or even diagonally,
+
+ 4
+ 00:00:09,400 --> 00:00:13,020
+ if you implement an x and y translation at the same time.
+
+ 5
+ 00:00:13,380 --> 00:00:19,950
+ So to perform a translation we actually use OpenCV's cv2.warpAffine function, but that function
+
+ 6
+ 00:00:19,950 --> 00:00:23,250
+ requires what we call a translation matrix.
+
+ 7
+ 00:00:23,250 --> 00:00:29,280
+ So without getting into too much high school geometry, a translation matrix is basically in this form
+
+ 8
+ 00:00:29,780 --> 00:00:33,020
+ and takes an x and y value as these elements here.
+
+ 9
+ 00:00:33,450 --> 00:00:39,570
+ Now what Tx represents here is a shift along the x axis, horizontally, and Ty is a shift along the y axis,
+
+ 10
+ 00:00:39,570 --> 00:00:40,370
+ vertically.
+
+ 11
+ 00:00:40,590 --> 00:00:43,180
+ And these are the directions each shift takes place in.
+
+ 12
+ 00:00:43,360 --> 00:00:50,130
+ So image coordinates all start at the top left corner, and we can translate in any direction using the
+
+ 13
+ 00:00:50,130 --> 00:00:51,680
+ translation matrix here.
+
+ 14
+ 00:00:51,750 --> 00:00:53,660
+ So let's implement this quite quickly.
+
+ 15
+ 00:00:55,760 --> 00:00:58,750
+ So let's implement translations using OpenCV now.
+
+ 16
+ 00:00:58,760 --> 00:01:00,230
+ So we're going to go through it line by line.
+
+ 17
+ 00:01:00,290 --> 00:01:02,580
+ I'll quickly show you what's being done here.
+
+ 18
+ 00:01:02,600 --> 00:01:06,940
+ The important thing to note is that we're using the cv2.warpAffine function.
+
+ 19
+ 00:01:07,020 --> 00:01:12,920
+ And that's been implemented down here. So quickly going through it: as we've done before, we extract
+
+ 20
+ 00:01:12,920 --> 00:01:17,390
+ the height and the width of the image using NumPy's shape attribute, taking only the first two elements
+
+ 21
+ 00:01:17,540 --> 00:01:21,050
+ of the shape tuple that it returns.
+
+ 22
+ 00:01:21,290 --> 00:01:25,340
+ Next I have a line here where we extract a quarter of the height and width of the image.
+
+ 23
+ 00:01:25,340 --> 00:01:29,660
+ That's going to be the Tx and Ty value in our translation matrix;
+
+ 24
+ 00:01:29,710 --> 00:01:35,400
+ that's the direction, or sorry, the amount of pixels we're going to shift the image.
+
+ 25
+ 00:01:35,990 --> 00:01:42,140
+ And we actually use np.float32, which actually defines the required data type for our translation
+
+ 26
+ 00:01:42,290 --> 00:01:43,680
+ matrix.
+
+ 27
+ 00:01:44,060 --> 00:01:48,660
+ And by using some square brackets here we actually create our T matrix here.
+
+ 28
+ 00:01:48,770 --> 00:01:53,600
+ It may or may not be important for you to understand this, but just take note of the form of this T matrix
+
+ 29
+ 00:01:53,600 --> 00:01:54,260
+ here.
+
+ 30
+ 00:01:54,770 --> 00:01:57,270
+ So the warpAffine function takes our image,
+
+ 31
+ 00:01:57,270 --> 00:02:04,370
+ the T matrix that we created, and our width and height as a tuple, and it actually
+
+ 32
+ 00:02:04,820 --> 00:02:06,590
+ returns the translated image.
+
+ 33
+ 00:02:06,650 --> 00:02:09,230
+ So let's actually run this and see what it looks like.
+
+ 34
+ 00:02:11,040 --> 00:02:20,650
+ Voila, this is our translated image here. As you can see, it's shifted the image in the x direction by a quarter of
+
+ 35
+ 00:02:20,650 --> 00:02:22,840
+ the initial dimensions,
+
+ 36
+ 00:02:23,050 --> 00:02:27,740
+ and similarly for the y direction, as we've just done.
+
+ 37
+ 00:02:27,760 --> 00:02:32,110
+ Now it's important that we just take a look at the T matrix, to give you an understanding
+
+ 38
+ 00:02:32,110 --> 00:02:33,620
+ of what we have done here.
+
+ 39
+ 00:02:34,150 --> 00:02:36,870
+ So this is exactly what we wanted in our T matrix.
+
+ 40
+ 00:02:36,890 --> 00:02:44,800
+ It's 1, 0, and then our Tx, and 0, 1, and then our Ty, these being a quarter of the width and a quarter of the height
+
+ 41
+ 00:02:45,100 --> 00:02:45,720
+ respectively.
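A minimal sketch of the translation walkthrough above, assuming a hypothetical input filename; the quarter-of-the-dimensions shift and the 2x3 matrix form match what the lecture describes:

```python
import cv2
import numpy as np

image = cv2.imread("input.jpg")  # hypothetical filename
height, width = image.shape[:2]  # first two elements of the shape tuple

# Shift by a quarter of the image dimensions.
quarter_width, quarter_height = width / 4, height / 4

# Translation matrix T = | 1 0 Tx |
#                        | 0 1 Ty |
T = np.float32([[1, 0, quarter_width],
                [0, 1, quarter_height]])

# warpAffine takes the image, the 2x3 matrix, and the output size as a
# (width, height) tuple, and returns the translated image.
translated = cv2.warpAffine(image, T, (width, height))

cv2.imshow("Translated", translated)
cv2.waitKey(0)
cv2.destroyAllWindows()
```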
5. OpenCV Tutorial - Learn Classic Computer Vision & Face Detection (OPTIONAL)/12. Rotations - How To Spin Your Image Around And Do Horizontal Flipping.srt ADDED
@@ -0,0 +1,183 @@
+ 1
+ 00:00:02,190 --> 00:00:05,050
+ So let's talk a bit about rotations now. Rotations
+
+ 2
+ 00:00:05,220 --> 00:00:06,180
+ are pretty simple as well.
+
+ 3
+ 00:00:06,180 --> 00:00:09,590
+ There are also some more tricks that I'll teach you quickly.
+
+ 4
+ 00:00:09,720 --> 00:00:15,980
+ So using a favorite politician again, Donald Trump, I'll show you exactly what a rotation is.
+
+ 5
+ 00:00:15,990 --> 00:00:20,660
+ So as you can see, it rotates about a point, and this point here in this image is the center.
+
+ 6
+ 00:00:20,970 --> 00:00:26,010
+ And it rotates the image around that point, so imagine that point as a pivot.
+
+ 7
+ 00:00:26,010 --> 00:00:29,710
+ So, similarly to our translation transformation,
+
+ 8
+ 00:00:30,160 --> 00:00:31,270
+ we had a matrix.
+
+ 9
+ 00:00:31,290 --> 00:00:36,650
+ Now we have an M matrix, which is the shorthand used for the rotation matrix.
+
+ 10
+ 00:00:36,660 --> 00:00:38,390
+ Theta here is the angle of rotation.
+
+ 11
+ 00:00:38,430 --> 00:00:42,920
+ As you can see, it's an anti-clockwise angle measured from this point here.
+
+ 12
+ 00:00:43,650 --> 00:00:50,520
+ And one important thing to note is that the OpenCV function, which is getRotationMatrix2D, actually
+
+ 13
+ 00:00:50,520 --> 00:00:55,580
+ does scaling and rotation at the same time, which you will see in the code.
+
+ 14
+ 00:00:56,010 --> 00:01:01,460
+ So let's implement some rotations using OpenCV. So rotations are very similar to translations in
+
+ 15
+ 00:01:01,470 --> 00:01:04,290
+ that we actually have to generate a transformation matrix,
+
+ 16
+ 00:01:04,290 --> 00:01:09,420
+ what we call the rotation matrix here, and to do that we actually use an OpenCV function called
+
+ 17
+ 00:01:09,420 --> 00:01:11,450
+ getRotationMatrix2D.
+
+ 18
+ 00:01:11,450 --> 00:01:13,840
+ Now this function is fairly simple as well.
+
+ 19
+ 00:01:13,890 --> 00:01:21,900
+ What you give it as this argument is the x and y center, which corresponds to the width and height; that's
+
+ 20
+ 00:01:21,900 --> 00:01:23,120
+ the center point of the image
+
+ 21
+ 00:01:23,130 --> 00:01:28,950
+ in most cases. However, you can, and you are free to, rotate around any point in the image for whatever
+
+ 22
+ 00:01:28,950 --> 00:01:32,780
+ reason that may be. This is theta, which is the angle of rotation.
+
+ 23
+ 00:01:32,790 --> 00:01:38,420
+ Remember, it's anti-clockwise, and this is a scaling factor, which we'll bring up shortly.
+
+ 24
+ 00:01:38,820 --> 00:01:44,030
+ And we use this rotation matrix again in the cv2.warpAffine function.
+
+ 25
+ 00:01:44,060 --> 00:01:46,810
+ So let's run this code and see what it looks like.
+
+ 26
+ 00:01:46,820 --> 00:01:47,250
+ There we go.
+
+ 27
+ 00:01:47,260 --> 00:01:52,610
+ So that's the image rotated 90 degrees anti-clockwise, and you may have noticed that the top and bottom
+
+ 28
+ 00:01:52,610 --> 00:01:57,370
+ parts of the image are cropped, mainly because the canvas dimensions remain the same.
+
+ 29
+ 00:01:57,440 --> 00:01:59,660
+ So there are two ways we can fix that. One:
+
+ 30
+ 00:02:00,110 --> 00:02:01,430
+ let's adjust the scale here.
+
+ 31
+ 00:02:01,440 --> 00:02:05,610
+ So let's set this to 0.5 and we'll go.
+
+ 32
+ 00:02:05,700 --> 00:02:08,520
+ However, there is a lot of black space around this image.
+
+ 33
+ 00:02:08,910 --> 00:02:15,450
+ Now, to avoid that, you would actually have to put in the specific width and height of the new rotated
+
+ 34
+ 00:02:15,450 --> 00:02:16,210
+ image.
+
+ 35
+ 00:02:16,320 --> 00:02:19,880
+ So whatever you anticipate it to be, you can program it here.
+
+ 36
+ 00:02:20,490 --> 00:02:23,190
+ However, that's often not what is done.
+
+ 37
+ 00:02:23,190 --> 00:02:27,510
+ It's a bit tedious to do sometimes, unless you have a special case for it.
+
+ 38
+ 00:02:27,510 --> 00:02:33,420
+ There's another way we can do it, and that's by using the transpose function here. Now, with the transpose function,
+
+ 39
+ 00:02:33,640 --> 00:02:34,990
+ let's run this code and see what happens.
+
+ 40
+ 00:02:35,020 --> 00:02:39,640
+ So this actually transposes the image.
+
+ 41
+ 00:02:39,660 --> 00:02:44,530
+ Now, you may not see the bottom because it's cut off by the screen casting here.
+
+ 42
+ 00:02:44,970 --> 00:02:52,340
+ However, that image is fully turned here, anti-clockwise, and there's no black space around it.
+
+ 43
+ 00:02:55,230 --> 00:02:58,960
+ So this is just another easy and convenient way of rotating images.
+
+ 44
+ 00:02:58,980 --> 00:03:04,470
+ However, it only does it in 90-degree increments at a time. So you can play around with this and actually
+
+ 45
+ 00:03:04,470 --> 00:03:10,230
+ do some cool rotations without having to actually program it with a new width and height of the new
+
+ 46
+ 00:03:10,230 --> 00:03:10,860
+ canvas.
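A minimal sketch of the rotation steps described above (getRotationMatrix2D with a scale factor, warpAffine, and the transpose shortcut); the filename is a hypothetical assumption:

```python
import cv2

image = cv2.imread("input.jpg")  # hypothetical filename
height, width = image.shape[:2]

# Rotation matrix about the image center: 90 degrees anti-clockwise,
# scale factor 1 (no resize). The corners get cropped because the
# canvas dimensions remain the same.
M = cv2.getRotationMatrix2D((width / 2, height / 2), 90, 1)
rotated = cv2.warpAffine(image, M, (width, height))

# Fix 1: drop the scale to 0.5 so the whole image fits, at the cost
# of black space around it.
M_half = cv2.getRotationMatrix2D((width / 2, height / 2), 90, 0.5)
rotated_half = cv2.warpAffine(image, M_half, (width, height))

# Fix 2: cv2.transpose flips the image across its diagonal, a quick
# 90-degree turn with no black space, but only in 90-degree increments.
transposed = cv2.transpose(image)
```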