Upload folder using huggingface_hub

Browse files

Files changed (2) hide show

.ipynb_checkpoints/quick_start_pytorch-checkpoint.ipynb +991 -0
quick_start_pytorch.ipynb +991 -0

.ipynb_checkpoints/quick_start_pytorch-checkpoint.ipynb ADDED Viewed

	@@ -0,0 +1,991 @@

+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "a4090294-3349-4815-96f4-98010b657359",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "# Paperspace Gradient: PyTorch Quick Start\n",
+    "Last modified: Sep 27th 2022"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "4936c59a-8535-43cf-a527-e9323b2b658e",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Purpose and intended audience\n",
+    "\n",
+    "This Quick Start tutorial demonstrates PyTorch usage in a Gradient Notebook. It is aimed at users who are relatviely new to PyTorch, although you will need to be familiar with Python to understand PyTorch code.\n",
+    "\n",
+    "We use PyTorch to\n",
+    "\n",
+    "- Build a neural network that classifies FashionMNIST images\n",
+    "- Train and evaluate the network\n",
+    "- Save the model\n",
+    "- Perform predictions\n",
+    "\n",
+    "followed by some next steps that you can take to proceed with using Gradient.\n",
+    "\n",
+    "The material is based on the original [PyTorch Quick Start](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html) at the time of writing this notebook.\n",
+    "\n",
+    "See the end of the notebook for the original copyright notice."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "a55c3131-9437-483d-9c19-a165fbf8b6d4",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Check that you are on a GPU machine\n",
+    "\n",
+    "The notebook is designed to run on a Gradient GPU machine (as opposed to a CPU-only machine). The machine type, e.g., A4000, can be seen by clicking on the Machine icon on the left-hand navigation bar in the Gradient Notebook interface. It will say if it is CPU or GPU.\n",
+    "\n",
+    "![quick_start_pytorch_images/example_instance_type.png](quick_start_pytorch_images/example_instance_type.png)\n",
+    "\n",
+    "The *Creating models* section below also determines whether or not a GPU is available for us to use.\n",
+    "\n",
+    "If the machine type is CPU, you can change it by clicking *Stop Machine*, then the machine type displayed to get a drop-down list. Select a GPU machine and start up the Notebook again.\n",
+    "\n",
+    "For help with machines, see the Gradient documentation on [machine types](https://docs.paperspace.com/gradient/machines/) or [starting a Gradient Notebook](https://docs.paperspace.com/gradient/explore-train-deploy/notebooks)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "28402a66-a8c4-4672-9592-cc530b58d439",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Working with data\n",
+    "\n",
+    "PyTorch has two [primitives to work with data](https://pytorch.org/docs/stable/data.html):\n",
+    "``torch.utils.data.DataLoader`` and ``torch.utils.data.Dataset``.\n",
+    "``Dataset`` stores the samples and their corresponding labels, and ``DataLoader`` wraps an iterable around\n",
+    "the ``Dataset``."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:04.965047Z",
+     "iopub.status.busy": "2022-09-27T20:36:04.964421Z",
+     "iopub.status.idle": "2022-09-27T20:36:06.330541Z",
+     "shell.execute_reply": "2022-09-27T20:36:06.329333Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:04.965047Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 2,
+     "id": "2bab3caa-e156-4635-bc21-53031ebea60d",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import torch\n",
+    "from torch import nn\n",
+    "from torch.utils.data import DataLoader\n",
+    "from torchvision import datasets\n",
+    "from torchvision.transforms import ToTensor, Lambda, Compose"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "0dfb0116-56cd-4795-bc5e-79baad627726",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "PyTorch offers domain-specific libraries such as [TorchText](https://pytorch.org/text/stable/index.html),\n",
+    "[TorchVision](https://pytorch.org/vision/stable/index.html), and [TorchAudio](https://pytorch.org/audio/stable/index.html),\n",
+    "all of which include datasets. For this tutorial, we will be using a TorchVision dataset.\n",
+    "\n",
+    "The ``torchvision.datasets`` module contains ``Dataset`` objects for many real-world vision data like\n",
+    "CIFAR, COCO ([full list here](https://pytorch.org/vision/stable/datasets.html)). In this tutorial, we\n",
+    "use the FashionMNIST dataset. Every TorchVision ``Dataset`` includes two arguments: ``transform`` and\n",
+    "``target_transform`` to modify the samples and labels respectively."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:06.332087Z",
+     "iopub.status.busy": "2022-09-27T20:36:06.331786Z",
+     "iopub.status.idle": "2022-09-27T20:36:33.429172Z",
+     "shell.execute_reply": "2022-09-27T20:36:33.428023Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:06.332087Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 3,
+     "id": "631deddf-30f0-45f1-84ab-e5f4c510c500",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "30b312514a6a4edcb608e92ecda0a385",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/26421880 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw\n",
+      "\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "4e8ad8c0dd9d4eae8d6d3a59677e7a99",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/29515 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw\n",
+      "\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "2465426f4de748bf955849e5bfeb5384",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/4422102 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw\n",
+      "\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "4f090bef77ea43f49f503faf722b1e67",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/5148 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Download training data from open datasets\n",
+    "training_data = datasets.FashionMNIST(\n",
+    "    root=\"data\",\n",
+    "    train=True,\n",
+    "    download=True,\n",
+    "    transform=ToTensor(),\n",
+    ")\n",
+    "\n",
+    "# Download test data from open datasets\n",
+    "test_data = datasets.FashionMNIST(\n",
+    "    root=\"data\",\n",
+    "    train=False,\n",
+    "    download=True,\n",
+    "    transform=ToTensor(),\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "0ace6ebf-b493-4b75-9bfa-dc48bc676b21",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "We pass the ``Dataset`` as an argument to ``DataLoader``. This wraps an iterable over our dataset, and supports\n",
+    "automatic batching, sampling, shuffling and multiprocess data loading. Here we define a batch size of 64, i.e., each element\n",
+    "in the dataloader iterable will return a batch of 64 features and labels."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:33.430736Z",
+     "iopub.status.busy": "2022-09-27T20:36:33.430441Z",
+     "iopub.status.idle": "2022-09-27T20:36:33.449430Z",
+     "shell.execute_reply": "2022-09-27T20:36:33.448119Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:33.430708Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 4,
+     "id": "8e65f970-dce8-460c-b5f2-9cbee0c14900",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Shape of X [N, C, H, W]:  torch.Size([64, 1, 28, 28])\n",
+      "Shape of y:  torch.Size([64]) torch.int64\n"
+     ]
+    }
+   ],
+   "source": [
+    "batch_size = 64\n",
+    "\n",
+    "# Create data loaders\n",
+    "train_dataloader = DataLoader(training_data, batch_size=batch_size)\n",
+    "test_dataloader = DataLoader(test_data, batch_size=batch_size)\n",
+    "\n",
+    "for X, y in test_dataloader:\n",
+    "    print(\"Shape of X [N, C, H, W]: \", X.shape)\n",
+    "    print(\"Shape of y: \", y.shape, y.dtype)\n",
+    "    break"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "f9d1b1f7-0850-4676-93b6-902f78be237d",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "Read more about [loading data in PyTorch](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "d9cc95fe-194b-4a6f-b01d-91510dfcfb00",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Creating models, including GPU\n",
+    "\n",
+    "To define a neural network in PyTorch, we create a class that inherits\n",
+    "from [nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html). We define the layers of the network\n",
+    "in the ``__init__`` function and specify how data will pass through the network in the ``forward`` function. To accelerate\n",
+    "operations in the neural network, we move it to the GPU if available."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:33.453700Z",
+     "iopub.status.busy": "2022-09-27T20:36:33.453070Z",
+     "iopub.status.idle": "2022-09-27T20:36:35.334541Z",
+     "shell.execute_reply": "2022-09-27T20:36:35.329047Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:33.453700Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 5,
+     "id": "d58d5484-8ca0-4400-91c5-d0e71cf89c12",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Using cuda device\n",
+      "NeuralNetwork(\n",
+      "  (flatten): Flatten(start_dim=1, end_dim=-1)\n",
+      "  (linear_relu_stack): Sequential(\n",
+      "    (0): Linear(in_features=784, out_features=512, bias=True)\n",
+      "    (1): ReLU()\n",
+      "    (2): Linear(in_features=512, out_features=512, bias=True)\n",
+      "    (3): ReLU()\n",
+      "    (4): Linear(in_features=512, out_features=10, bias=True)\n",
+      "  )\n",
+      ")\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Get cpu or gpu device for training\n",
+    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
+    "print(\"Using {} device\".format(device))\n",
+    "\n",
+    "# Define model\n",
+    "class NeuralNetwork(nn.Module):\n",
+    "    def __init__(self):\n",
+    "        super(NeuralNetwork, self).__init__()\n",
+    "        self.flatten = nn.Flatten()\n",
+    "        self.linear_relu_stack = nn.Sequential(\n",
+    "            nn.Linear(28*28, 512),\n",
+    "            nn.ReLU(),\n",
+    "            nn.Linear(512, 512),\n",
+    "            nn.ReLU(),\n",
+    "            nn.Linear(512, 10)\n",
+    "        )\n",
+    "\n",
+    "    def forward(self, x):\n",
+    "        x = self.flatten(x)\n",
+    "        logits = self.linear_relu_stack(x)\n",
+    "        return logits\n",
+    "\n",
+    "model = NeuralNetwork().to(device)\n",
+    "print(model)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "7ee591d8-e529-481b-8107-e84454893bd2",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "Read more about [building neural networks in PyTorch](https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "b6db5b4f-80b9-4f9e-8feb-76d0ef1e346f",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Optimizing the model parameters\n",
+    "\n",
+    "To train a model, we need a [loss function](https://pytorch.org/docs/stable/nn.html#loss-functions)\n",
+    "and an [optimizer](https://pytorch.org/docs/stable/optim.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:35.340252Z",
+     "iopub.status.busy": "2022-09-27T20:36:35.339874Z",
+     "iopub.status.idle": "2022-09-27T20:36:35.345985Z",
+     "shell.execute_reply": "2022-09-27T20:36:35.344793Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:35.340209Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 6,
+     "id": "8c22a532-16e0-440d-888e-d879e5f53c7c",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "loss_fn = nn.CrossEntropyLoss()\n",
+    "optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "5efe3473-ecf7-411c-a13b-ba54f5c257a6",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "In a single training loop, the model makes predictions on the training dataset (fed to it in batches), and\n",
+    "backpropagates the prediction error to adjust the model's parameters."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:35.350028Z",
+     "iopub.status.busy": "2022-09-27T20:36:35.349717Z",
+     "iopub.status.idle": "2022-09-27T20:36:35.357590Z",
+     "shell.execute_reply": "2022-09-27T20:36:35.356224Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:35.350001Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 7,
+     "id": "3d1af6c1-299b-4572-902a-c5e52ce0a7d2",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "def train(dataloader, model, loss_fn, optimizer):\n",
+    "    size = len(dataloader.dataset)\n",
+    "    model.train()\n",
+    "    for batch, (X, y) in enumerate(dataloader):\n",
+    "        X, y = X.to(device), y.to(device)\n",
+    "\n",
+    "        # Compute prediction error\n",
+    "        pred = model(X)\n",
+    "        loss = loss_fn(pred, y)\n",
+    "\n",
+    "        # Backpropagation\n",
+    "        optimizer.zero_grad()\n",
+    "        loss.backward()\n",
+    "        optimizer.step()\n",
+    "\n",
+    "        if batch % 100 == 0:\n",
+    "            loss, current = loss.item(), batch * len(X)\n",
+    "            print(f\"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "f86e28f0-bb94-4443-a673-f6d3461d4e94",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "We also check the model's performance against the test dataset to ensure it is learning."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:35.362383Z",
+     "iopub.status.busy": "2022-09-27T20:36:35.362293Z",
+     "iopub.status.idle": "2022-09-27T20:36:35.370320Z",
+     "shell.execute_reply": "2022-09-27T20:36:35.369013Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:35.362345Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 8,
+     "id": "112d81e3-cdf8-4b1e-afca-6344be54f5e5",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "def test(dataloader, model, loss_fn):\n",
+    "    size = len(dataloader.dataset)\n",
+    "    num_batches = len(dataloader)\n",
+    "    model.eval()\n",
+    "    test_loss, correct = 0, 0\n",
+    "    with torch.no_grad():\n",
+    "        for X, y in dataloader:\n",
+    "            X, y = X.to(device), y.to(device)\n",
+    "            pred = model(X)\n",
+    "            test_loss += loss_fn(pred, y).item()\n",
+    "            correct += (pred.argmax(1) == y).type(torch.float).sum().item()\n",
+    "    test_loss /= num_batches\n",
+    "    correct /= size\n",
+    "    print(f\"Test Error: \\n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \\n\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "4e366ecc-735f-42dd-b04e-a94816b94fd8",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "The training process is conducted over several iterations (*epochs*). During each epoch, the model learns\n",
+    "parameters to make better predictions. We print the model's accuracy and loss at each epoch; we'd like to see the\n",
+    "accuracy increase and the loss decrease with every epoch."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:35.374528Z",
+     "iopub.status.busy": "2022-09-27T20:36:35.374285Z",
+     "iopub.status.idle": "2022-09-27T20:37:29.296376Z",
+     "shell.execute_reply": "2022-09-27T20:37:29.295164Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:35.374502Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 9,
+     "id": "50bf09d9-1318-43ef-92aa-6ee308fcafa1",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch 1\n",
+      "-------------------------------\n",
+      "loss: 2.304299  [    0/60000]\n",
+      "loss: 2.290307  [ 6400/60000]\n",
+      "loss: 2.268486  [12800/60000]\n",
+      "loss: 2.256835  [19200/60000]\n",
+      "loss: 2.248106  [25600/60000]\n",
+      "loss: 2.217304  [32000/60000]\n",
+      "loss: 2.215746  [38400/60000]\n",
+      "loss: 2.182278  [44800/60000]\n",
+      "loss: 2.179303  [51200/60000]\n",
+      "loss: 2.150798  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 55.6%, Avg loss: 2.143109 \n",
+      "\n",
+      "Epoch 2\n",
+      "-------------------------------\n",
+      "loss: 2.155640  [    0/60000]\n",
+      "loss: 2.144754  [ 6400/60000]\n",
+      "loss: 2.083586  [12800/60000]\n",
+      "loss: 2.091499  [19200/60000]\n",
+      "loss: 2.045041  [25600/60000]\n",
+      "loss: 1.986636  [32000/60000]\n",
+      "loss: 2.002200  [38400/60000]\n",
+      "loss: 1.927214  [44800/60000]\n",
+      "loss: 1.931510  [51200/60000]\n",
+      "loss: 1.847673  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 59.5%, Avg loss: 1.857198 \n",
+      "\n",
+      "Epoch 3\n",
+      "-------------------------------\n",
+      "loss: 1.893984  [    0/60000]\n",
+      "loss: 1.863075  [ 6400/60000]\n",
+      "loss: 1.748540  [12800/60000]\n",
+      "loss: 1.779858  [19200/60000]\n",
+      "loss: 1.666921  [25600/60000]\n",
+      "loss: 1.633243  [32000/60000]\n",
+      "loss: 1.639619  [38400/60000]\n",
+      "loss: 1.551572  [44800/60000]\n",
+      "loss: 1.578183  [51200/60000]\n",
+      "loss: 1.462901  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 61.7%, Avg loss: 1.489910 \n",
+      "\n",
+      "Epoch 4\n",
+      "-------------------------------\n",
+      "loss: 1.560461  [    0/60000]\n",
+      "loss: 1.525511  [ 6400/60000]\n",
+      "loss: 1.381848  [12800/60000]\n",
+      "loss: 1.445225  [19200/60000]\n",
+      "loss: 1.320462  [25600/60000]\n",
+      "loss: 1.335552  [32000/60000]\n",
+      "loss: 1.336702  [38400/60000]\n",
+      "loss: 1.266305  [44800/60000]\n",
+      "loss: 1.303894  [51200/60000]\n",
+      "loss: 1.202768  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 63.3%, Avg loss: 1.229126 \n",
+      "\n",
+      "Epoch 5\n",
+      "-------------------------------\n",
+      "loss: 1.309631  [    0/60000]\n",
+      "loss: 1.289756  [ 6400/60000]\n",
+      "loss: 1.129725  [12800/60000]\n",
+      "loss: 1.231920  [19200/60000]\n",
+      "loss: 1.100483  [25600/60000]\n",
+      "loss: 1.141074  [32000/60000]\n",
+      "loss: 1.153783  [38400/60000]\n",
+      "loss: 1.090403  [44800/60000]\n",
+      "loss: 1.133582  [51200/60000]\n",
+      "loss: 1.050682  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 64.3%, Avg loss: 1.069880 \n",
+      "\n",
+      "Done!\n"
+     ]
+    }
+   ],
+   "source": [
+    "epochs = 5\n",
+    "for t in range(epochs):\n",
+    "    print(f\"Epoch {t+1}\\n-------------------------------\")\n",
+    "    train(train_dataloader, model, loss_fn, optimizer)\n",
+    "    test(test_dataloader, model, loss_fn)\n",
+    "print(\"Done!\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Read more about [Training your model](https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "88e2d48b-f1c2-43b0-956d-673d31e777cc",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Saving models\n",
+    "\n",
+    "A common way to save a model is to serialize the internal state dictionary (containing the model parameters)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:37:29.304919Z",
+     "iopub.status.busy": "2022-09-27T20:37:29.304520Z",
+     "iopub.status.idle": "2022-09-27T20:37:51.042987Z",
+     "shell.execute_reply": "2022-09-27T20:37:51.041902Z",
+     "shell.execute_reply.started": "2022-09-27T20:37:29.304889Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 10,
+     "id": "5674fda2-6f1d-447c-ac05-d21934c7fe6f",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Saved PyTorch Model State to model.pth\n"
+     ]
+    }
+   ],
+   "source": [
+    "torch.save(model.state_dict(), \"model.pth\")\n",
+    "print(\"Saved PyTorch Model State to model.pth\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "b1e15431-85cf-4788-aa7f-5c12d77f4ac3",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Loading models\n",
+    "\n",
+    "The process for loading a model includes re-creating the model structure and loading\n",
+    "the state dictionary into it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:37:51.047242Z",
+     "iopub.status.busy": "2022-09-27T20:37:51.046988Z",
+     "iopub.status.idle": "2022-09-27T20:37:51.073115Z",
+     "shell.execute_reply": "2022-09-27T20:37:51.072175Z",
+     "shell.execute_reply.started": "2022-09-27T20:37:51.047216Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 11,
+     "id": "ee2271cf-5092-43ad-afed-b64d2e6aea2c",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<All keys matched successfully>"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "model = NeuralNetwork()\n",
+    "model.load_state_dict(torch.load(\"model.pth\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "83cc12b8-fca2-4ea0-91f6-cdd8065d6164",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "This model can now be used to make predictions.\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:37:51.076687Z",
+     "iopub.status.busy": "2022-09-27T20:37:51.076449Z",
+     "iopub.status.idle": "2022-09-27T20:37:51.108217Z",
+     "shell.execute_reply": "2022-09-27T20:37:51.107255Z",
+     "shell.execute_reply.started": "2022-09-27T20:37:51.076661Z"
+    },
+    "gradient": {
+     "editing": true,
+     "execution_count": 12,
+     "id": "efed4977-824f-4816-91c0-05f4e10d8b54",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Predicted: \"Ankle boot\", Actual: \"Ankle boot\"\n"
+     ]
+    }
+   ],
+   "source": [
+    "classes = [\n",
+    "    \"T-shirt/top\",\n",
+    "    \"Trouser\",\n",
+    "    \"Pullover\",\n",
+    "    \"Dress\",\n",
+    "    \"Coat\",\n",
+    "    \"Sandal\",\n",
+    "    \"Shirt\",\n",
+    "    \"Sneaker\",\n",
+    "    \"Bag\",\n",
+    "    \"Ankle boot\",\n",
+    "]\n",
+    "\n",
+    "model.eval()\n",
+    "x, y = test_data[0][0], test_data[0][1]\n",
+    "with torch.no_grad():\n",
+    "    pred = model(x)\n",
+    "    predicted, actual = classes[pred[0].argmax(0)], classes[y]\n",
+    "    print(f'Predicted: \"{predicted}\", Actual: \"{actual}\"')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "0b064ce8-bacb-45c2-8ef3-3a45ff7ecd5a",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "Read more about [Saving & Loading your model](https://pytorch.org/tutorials/beginner/basics/saveloadrun_tutorial.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "379b3389-034a-4c17-a742-dd7c6a8281ce",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Next steps\n",
+    "\n",
+    "To proceed with PyTorch in Gradient, you can:\n",
+    "    \n",
+    " - Look at other Gradient material, such as our [tutorials](https://docs.paperspace.com/gradient/tutorials/) and [blog](https://blog.paperspace.com)\n",
+    " - Try out further [PyTorch tutorials](https://pytorch.org/tutorials/beginner/basics/intro.html)\n",
+    " - Start writing your own projects, using our [documentation](https://docs.paperspace.com/gradient) when needed\n",
+    " \n",
+    "If you get stuck or need help, [contact support](https://support.paperspace.com), and we will be happy to assist.\n",
+    "\n",
+    "Good luck!"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "a4d2e55f-6c65-48fe-a9e7-165931791ff2",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Original PyTorch copyright notice\n",
+    "\n",
+    "© Copyright 2021, PyTorch."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.13"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}

quick_start_pytorch.ipynb ADDED Viewed

	@@ -0,0 +1,991 @@

+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "a4090294-3349-4815-96f4-98010b657359",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "# Paperspace Gradient: PyTorch Quick Start\n",
+    "Last modified: Sep 27th 2022"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "4936c59a-8535-43cf-a527-e9323b2b658e",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Purpose and intended audience\n",
+    "\n",
+    "This Quick Start tutorial demonstrates PyTorch usage in a Gradient Notebook. It is aimed at users who are relatviely new to PyTorch, although you will need to be familiar with Python to understand PyTorch code.\n",
+    "\n",
+    "We use PyTorch to\n",
+    "\n",
+    "- Build a neural network that classifies FashionMNIST images\n",
+    "- Train and evaluate the network\n",
+    "- Save the model\n",
+    "- Perform predictions\n",
+    "\n",
+    "followed by some next steps that you can take to proceed with using Gradient.\n",
+    "\n",
+    "The material is based on the original [PyTorch Quick Start](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html) at the time of writing this notebook.\n",
+    "\n",
+    "See the end of the notebook for the original copyright notice."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "a55c3131-9437-483d-9c19-a165fbf8b6d4",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Check that you are on a GPU machine\n",
+    "\n",
+    "The notebook is designed to run on a Gradient GPU machine (as opposed to a CPU-only machine). The machine type, e.g., A4000, can be seen by clicking on the Machine icon on the left-hand navigation bar in the Gradient Notebook interface. It will say if it is CPU or GPU.\n",
+    "\n",
+    "![quick_start_pytorch_images/example_instance_type.png](quick_start_pytorch_images/example_instance_type.png)\n",
+    "\n",
+    "The *Creating models* section below also determines whether or not a GPU is available for us to use.\n",
+    "\n",
+    "If the machine type is CPU, you can change it by clicking *Stop Machine*, then the machine type displayed to get a drop-down list. Select a GPU machine and start up the Notebook again.\n",
+    "\n",
+    "For help with machines, see the Gradient documentation on [machine types](https://docs.paperspace.com/gradient/machines/) or [starting a Gradient Notebook](https://docs.paperspace.com/gradient/explore-train-deploy/notebooks)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "28402a66-a8c4-4672-9592-cc530b58d439",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Working with data\n",
+    "\n",
+    "PyTorch has two [primitives to work with data](https://pytorch.org/docs/stable/data.html):\n",
+    "``torch.utils.data.DataLoader`` and ``torch.utils.data.Dataset``.\n",
+    "``Dataset`` stores the samples and their corresponding labels, and ``DataLoader`` wraps an iterable around\n",
+    "the ``Dataset``."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:04.965047Z",
+     "iopub.status.busy": "2022-09-27T20:36:04.964421Z",
+     "iopub.status.idle": "2022-09-27T20:36:06.330541Z",
+     "shell.execute_reply": "2022-09-27T20:36:06.329333Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:04.965047Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 2,
+     "id": "2bab3caa-e156-4635-bc21-53031ebea60d",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import torch\n",
+    "from torch import nn\n",
+    "from torch.utils.data import DataLoader\n",
+    "from torchvision import datasets\n",
+    "from torchvision.transforms import ToTensor, Lambda, Compose"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "0dfb0116-56cd-4795-bc5e-79baad627726",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "PyTorch offers domain-specific libraries such as [TorchText](https://pytorch.org/text/stable/index.html),\n",
+    "[TorchVision](https://pytorch.org/vision/stable/index.html), and [TorchAudio](https://pytorch.org/audio/stable/index.html),\n",
+    "all of which include datasets. For this tutorial, we will be using a TorchVision dataset.\n",
+    "\n",
+    "The ``torchvision.datasets`` module contains ``Dataset`` objects for many real-world vision data like\n",
+    "CIFAR, COCO ([full list here](https://pytorch.org/vision/stable/datasets.html)). In this tutorial, we\n",
+    "use the FashionMNIST dataset. Every TorchVision ``Dataset`` includes two arguments: ``transform`` and\n",
+    "``target_transform`` to modify the samples and labels respectively."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:06.332087Z",
+     "iopub.status.busy": "2022-09-27T20:36:06.331786Z",
+     "iopub.status.idle": "2022-09-27T20:36:33.429172Z",
+     "shell.execute_reply": "2022-09-27T20:36:33.428023Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:06.332087Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 3,
+     "id": "631deddf-30f0-45f1-84ab-e5f4c510c500",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "30b312514a6a4edcb608e92ecda0a385",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/26421880 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw\n",
+      "\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "4e8ad8c0dd9d4eae8d6d3a59677e7a99",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/29515 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw\n",
+      "\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "2465426f4de748bf955849e5bfeb5384",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/4422102 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw\n",
+      "\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz\n",
+      "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "4f090bef77ea43f49f503faf722b1e67",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/5148 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Download training data from open datasets\n",
+    "training_data = datasets.FashionMNIST(\n",
+    "    root=\"data\",\n",
+    "    train=True,\n",
+    "    download=True,\n",
+    "    transform=ToTensor(),\n",
+    ")\n",
+    "\n",
+    "# Download test data from open datasets\n",
+    "test_data = datasets.FashionMNIST(\n",
+    "    root=\"data\",\n",
+    "    train=False,\n",
+    "    download=True,\n",
+    "    transform=ToTensor(),\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "0ace6ebf-b493-4b75-9bfa-dc48bc676b21",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "We pass the ``Dataset`` as an argument to ``DataLoader``. This wraps an iterable over our dataset, and supports\n",
+    "automatic batching, sampling, shuffling and multiprocess data loading. Here we define a batch size of 64, i.e., each element\n",
+    "in the dataloader iterable will return a batch of 64 features and labels."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:33.430736Z",
+     "iopub.status.busy": "2022-09-27T20:36:33.430441Z",
+     "iopub.status.idle": "2022-09-27T20:36:33.449430Z",
+     "shell.execute_reply": "2022-09-27T20:36:33.448119Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:33.430708Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 4,
+     "id": "8e65f970-dce8-460c-b5f2-9cbee0c14900",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Shape of X [N, C, H, W]:  torch.Size([64, 1, 28, 28])\n",
+      "Shape of y:  torch.Size([64]) torch.int64\n"
+     ]
+    }
+   ],
+   "source": [
+    "batch_size = 64\n",
+    "\n",
+    "# Create data loaders\n",
+    "train_dataloader = DataLoader(training_data, batch_size=batch_size)\n",
+    "test_dataloader = DataLoader(test_data, batch_size=batch_size)\n",
+    "\n",
+    "for X, y in test_dataloader:\n",
+    "    print(\"Shape of X [N, C, H, W]: \", X.shape)\n",
+    "    print(\"Shape of y: \", y.shape, y.dtype)\n",
+    "    break"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "f9d1b1f7-0850-4676-93b6-902f78be237d",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "Read more about [loading data in PyTorch](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "d9cc95fe-194b-4a6f-b01d-91510dfcfb00",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Creating models, including GPU\n",
+    "\n",
+    "To define a neural network in PyTorch, we create a class that inherits\n",
+    "from [nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html). We define the layers of the network\n",
+    "in the ``__init__`` function and specify how data will pass through the network in the ``forward`` function. To accelerate\n",
+    "operations in the neural network, we move it to the GPU if available."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:33.453700Z",
+     "iopub.status.busy": "2022-09-27T20:36:33.453070Z",
+     "iopub.status.idle": "2022-09-27T20:36:35.334541Z",
+     "shell.execute_reply": "2022-09-27T20:36:35.329047Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:33.453700Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 5,
+     "id": "d58d5484-8ca0-4400-91c5-d0e71cf89c12",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Using cuda device\n",
+      "NeuralNetwork(\n",
+      "  (flatten): Flatten(start_dim=1, end_dim=-1)\n",
+      "  (linear_relu_stack): Sequential(\n",
+      "    (0): Linear(in_features=784, out_features=512, bias=True)\n",
+      "    (1): ReLU()\n",
+      "    (2): Linear(in_features=512, out_features=512, bias=True)\n",
+      "    (3): ReLU()\n",
+      "    (4): Linear(in_features=512, out_features=10, bias=True)\n",
+      "  )\n",
+      ")\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Get cpu or gpu device for training\n",
+    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
+    "print(\"Using {} device\".format(device))\n",
+    "\n",
+    "# Define model\n",
+    "class NeuralNetwork(nn.Module):\n",
+    "    def __init__(self):\n",
+    "        super(NeuralNetwork, self).__init__()\n",
+    "        self.flatten = nn.Flatten()\n",
+    "        self.linear_relu_stack = nn.Sequential(\n",
+    "            nn.Linear(28*28, 512),\n",
+    "            nn.ReLU(),\n",
+    "            nn.Linear(512, 512),\n",
+    "            nn.ReLU(),\n",
+    "            nn.Linear(512, 10)\n",
+    "        )\n",
+    "\n",
+    "    def forward(self, x):\n",
+    "        x = self.flatten(x)\n",
+    "        logits = self.linear_relu_stack(x)\n",
+    "        return logits\n",
+    "\n",
+    "model = NeuralNetwork().to(device)\n",
+    "print(model)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "7ee591d8-e529-481b-8107-e84454893bd2",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "Read more about [building neural networks in PyTorch](https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "b6db5b4f-80b9-4f9e-8feb-76d0ef1e346f",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Optimizing the model parameters\n",
+    "\n",
+    "To train a model, we need a [loss function](https://pytorch.org/docs/stable/nn.html#loss-functions)\n",
+    "and an [optimizer](https://pytorch.org/docs/stable/optim.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:35.340252Z",
+     "iopub.status.busy": "2022-09-27T20:36:35.339874Z",
+     "iopub.status.idle": "2022-09-27T20:36:35.345985Z",
+     "shell.execute_reply": "2022-09-27T20:36:35.344793Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:35.340209Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 6,
+     "id": "8c22a532-16e0-440d-888e-d879e5f53c7c",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "loss_fn = nn.CrossEntropyLoss()\n",
+    "optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "5efe3473-ecf7-411c-a13b-ba54f5c257a6",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "In a single training loop, the model makes predictions on the training dataset (fed to it in batches), and\n",
+    "backpropagates the prediction error to adjust the model's parameters."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:35.350028Z",
+     "iopub.status.busy": "2022-09-27T20:36:35.349717Z",
+     "iopub.status.idle": "2022-09-27T20:36:35.357590Z",
+     "shell.execute_reply": "2022-09-27T20:36:35.356224Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:35.350001Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 7,
+     "id": "3d1af6c1-299b-4572-902a-c5e52ce0a7d2",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "def train(dataloader, model, loss_fn, optimizer):\n",
+    "    size = len(dataloader.dataset)\n",
+    "    model.train()\n",
+    "    for batch, (X, y) in enumerate(dataloader):\n",
+    "        X, y = X.to(device), y.to(device)\n",
+    "\n",
+    "        # Compute prediction error\n",
+    "        pred = model(X)\n",
+    "        loss = loss_fn(pred, y)\n",
+    "\n",
+    "        # Backpropagation\n",
+    "        optimizer.zero_grad()\n",
+    "        loss.backward()\n",
+    "        optimizer.step()\n",
+    "\n",
+    "        if batch % 100 == 0:\n",
+    "            loss, current = loss.item(), batch * len(X)\n",
+    "            print(f\"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "f86e28f0-bb94-4443-a673-f6d3461d4e94",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "We also check the model's performance against the test dataset to ensure it is learning."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:35.362383Z",
+     "iopub.status.busy": "2022-09-27T20:36:35.362293Z",
+     "iopub.status.idle": "2022-09-27T20:36:35.370320Z",
+     "shell.execute_reply": "2022-09-27T20:36:35.369013Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:35.362345Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 8,
+     "id": "112d81e3-cdf8-4b1e-afca-6344be54f5e5",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
+   "source": [
+    "def test(dataloader, model, loss_fn):\n",
+    "    size = len(dataloader.dataset)\n",
+    "    num_batches = len(dataloader)\n",
+    "    model.eval()\n",
+    "    test_loss, correct = 0, 0\n",
+    "    with torch.no_grad():\n",
+    "        for X, y in dataloader:\n",
+    "            X, y = X.to(device), y.to(device)\n",
+    "            pred = model(X)\n",
+    "            test_loss += loss_fn(pred, y).item()\n",
+    "            correct += (pred.argmax(1) == y).type(torch.float).sum().item()\n",
+    "    test_loss /= num_batches\n",
+    "    correct /= size\n",
+    "    print(f\"Test Error: \\n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \\n\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "4e366ecc-735f-42dd-b04e-a94816b94fd8",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "The training process is conducted over several iterations (*epochs*). During each epoch, the model learns\n",
+    "parameters to make better predictions. We print the model's accuracy and loss at each epoch; we'd like to see the\n",
+    "accuracy increase and the loss decrease with every epoch."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:36:35.374528Z",
+     "iopub.status.busy": "2022-09-27T20:36:35.374285Z",
+     "iopub.status.idle": "2022-09-27T20:37:29.296376Z",
+     "shell.execute_reply": "2022-09-27T20:37:29.295164Z",
+     "shell.execute_reply.started": "2022-09-27T20:36:35.374502Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 9,
+     "id": "50bf09d9-1318-43ef-92aa-6ee308fcafa1",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch 1\n",
+      "-------------------------------\n",
+      "loss: 2.304299  [    0/60000]\n",
+      "loss: 2.290307  [ 6400/60000]\n",
+      "loss: 2.268486  [12800/60000]\n",
+      "loss: 2.256835  [19200/60000]\n",
+      "loss: 2.248106  [25600/60000]\n",
+      "loss: 2.217304  [32000/60000]\n",
+      "loss: 2.215746  [38400/60000]\n",
+      "loss: 2.182278  [44800/60000]\n",
+      "loss: 2.179303  [51200/60000]\n",
+      "loss: 2.150798  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 55.6%, Avg loss: 2.143109 \n",
+      "\n",
+      "Epoch 2\n",
+      "-------------------------------\n",
+      "loss: 2.155640  [    0/60000]\n",
+      "loss: 2.144754  [ 6400/60000]\n",
+      "loss: 2.083586  [12800/60000]\n",
+      "loss: 2.091499  [19200/60000]\n",
+      "loss: 2.045041  [25600/60000]\n",
+      "loss: 1.986636  [32000/60000]\n",
+      "loss: 2.002200  [38400/60000]\n",
+      "loss: 1.927214  [44800/60000]\n",
+      "loss: 1.931510  [51200/60000]\n",
+      "loss: 1.847673  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 59.5%, Avg loss: 1.857198 \n",
+      "\n",
+      "Epoch 3\n",
+      "-------------------------------\n",
+      "loss: 1.893984  [    0/60000]\n",
+      "loss: 1.863075  [ 6400/60000]\n",
+      "loss: 1.748540  [12800/60000]\n",
+      "loss: 1.779858  [19200/60000]\n",
+      "loss: 1.666921  [25600/60000]\n",
+      "loss: 1.633243  [32000/60000]\n",
+      "loss: 1.639619  [38400/60000]\n",
+      "loss: 1.551572  [44800/60000]\n",
+      "loss: 1.578183  [51200/60000]\n",
+      "loss: 1.462901  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 61.7%, Avg loss: 1.489910 \n",
+      "\n",
+      "Epoch 4\n",
+      "-------------------------------\n",
+      "loss: 1.560461  [    0/60000]\n",
+      "loss: 1.525511  [ 6400/60000]\n",
+      "loss: 1.381848  [12800/60000]\n",
+      "loss: 1.445225  [19200/60000]\n",
+      "loss: 1.320462  [25600/60000]\n",
+      "loss: 1.335552  [32000/60000]\n",
+      "loss: 1.336702  [38400/60000]\n",
+      "loss: 1.266305  [44800/60000]\n",
+      "loss: 1.303894  [51200/60000]\n",
+      "loss: 1.202768  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 63.3%, Avg loss: 1.229126 \n",
+      "\n",
+      "Epoch 5\n",
+      "-------------------------------\n",
+      "loss: 1.309631  [    0/60000]\n",
+      "loss: 1.289756  [ 6400/60000]\n",
+      "loss: 1.129725  [12800/60000]\n",
+      "loss: 1.231920  [19200/60000]\n",
+      "loss: 1.100483  [25600/60000]\n",
+      "loss: 1.141074  [32000/60000]\n",
+      "loss: 1.153783  [38400/60000]\n",
+      "loss: 1.090403  [44800/60000]\n",
+      "loss: 1.133582  [51200/60000]\n",
+      "loss: 1.050682  [57600/60000]\n",
+      "Test Error: \n",
+      " Accuracy: 64.3%, Avg loss: 1.069880 \n",
+      "\n",
+      "Done!\n"
+     ]
+    }
+   ],
+   "source": [
+    "epochs = 5\n",
+    "for t in range(epochs):\n",
+    "    print(f\"Epoch {t+1}\\n-------------------------------\")\n",
+    "    train(train_dataloader, model, loss_fn, optimizer)\n",
+    "    test(test_dataloader, model, loss_fn)\n",
+    "print(\"Done!\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Read more about [Training your model](https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "88e2d48b-f1c2-43b0-956d-673d31e777cc",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Saving models\n",
+    "\n",
+    "A common way to save a model is to serialize the internal state dictionary (containing the model parameters)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:37:29.304919Z",
+     "iopub.status.busy": "2022-09-27T20:37:29.304520Z",
+     "iopub.status.idle": "2022-09-27T20:37:51.042987Z",
+     "shell.execute_reply": "2022-09-27T20:37:51.041902Z",
+     "shell.execute_reply.started": "2022-09-27T20:37:29.304889Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 10,
+     "id": "5674fda2-6f1d-447c-ac05-d21934c7fe6f",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Saved PyTorch Model State to model.pth\n"
+     ]
+    }
+   ],
+   "source": [
+    "torch.save(model.state_dict(), \"model.pth\")\n",
+    "print(\"Saved PyTorch Model State to model.pth\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "b1e15431-85cf-4788-aa7f-5c12d77f4ac3",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Loading models\n",
+    "\n",
+    "The process for loading a model includes re-creating the model structure and loading\n",
+    "the state dictionary into it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:37:51.047242Z",
+     "iopub.status.busy": "2022-09-27T20:37:51.046988Z",
+     "iopub.status.idle": "2022-09-27T20:37:51.073115Z",
+     "shell.execute_reply": "2022-09-27T20:37:51.072175Z",
+     "shell.execute_reply.started": "2022-09-27T20:37:51.047216Z"
+    },
+    "gradient": {
+     "editing": false,
+     "execution_count": 11,
+     "id": "ee2271cf-5092-43ad-afed-b64d2e6aea2c",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<All keys matched successfully>"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "model = NeuralNetwork()\n",
+    "model.load_state_dict(torch.load(\"model.pth\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "83cc12b8-fca2-4ea0-91f6-cdd8065d6164",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "This model can now be used to make predictions.\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "collapsed": false,
+    "execution": {
+     "iopub.execute_input": "2022-09-27T20:37:51.076687Z",
+     "iopub.status.busy": "2022-09-27T20:37:51.076449Z",
+     "iopub.status.idle": "2022-09-27T20:37:51.108217Z",
+     "shell.execute_reply": "2022-09-27T20:37:51.107255Z",
+     "shell.execute_reply.started": "2022-09-27T20:37:51.076661Z"
+    },
+    "gradient": {
+     "editing": true,
+     "execution_count": 12,
+     "id": "efed4977-824f-4816-91c0-05f4e10d8b54",
+     "kernelId": ""
+    },
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Predicted: \"Ankle boot\", Actual: \"Ankle boot\"\n"
+     ]
+    }
+   ],
+   "source": [
+    "classes = [\n",
+    "    \"T-shirt/top\",\n",
+    "    \"Trouser\",\n",
+    "    \"Pullover\",\n",
+    "    \"Dress\",\n",
+    "    \"Coat\",\n",
+    "    \"Sandal\",\n",
+    "    \"Shirt\",\n",
+    "    \"Sneaker\",\n",
+    "    \"Bag\",\n",
+    "    \"Ankle boot\",\n",
+    "]\n",
+    "\n",
+    "model.eval()\n",
+    "x, y = test_data[0][0], test_data[0][1]\n",
+    "with torch.no_grad():\n",
+    "    pred = model(x)\n",
+    "    predicted, actual = classes[pred[0].argmax(0)], classes[y]\n",
+    "    print(f'Predicted: \"{predicted}\", Actual: \"{actual}\"')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "0b064ce8-bacb-45c2-8ef3-3a45ff7ecd5a",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "Read more about [Saving & Loading your model](https://pytorch.org/tutorials/beginner/basics/saveloadrun_tutorial.html)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "379b3389-034a-4c17-a742-dd7c6a8281ce",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Next steps\n",
+    "\n",
+    "To proceed with PyTorch in Gradient, you can:\n",
+    "    \n",
+    " - Look at other Gradient material, such as our [tutorials](https://docs.paperspace.com/gradient/tutorials/) and [blog](https://blog.paperspace.com)\n",
+    " - Try out further [PyTorch tutorials](https://pytorch.org/tutorials/beginner/basics/intro.html)\n",
+    " - Start writing your own projects, using our [documentation](https://docs.paperspace.com/gradient) when needed\n",
+    " \n",
+    "If you get stuck or need help, [contact support](https://support.paperspace.com), and we will be happy to assist.\n",
+    "\n",
+    "Good luck!"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "gradient": {
+     "editing": false,
+     "id": "a4d2e55f-6c65-48fe-a9e7-165931791ff2",
+     "kernelId": ""
+    }
+   },
+   "source": [
+    "## Original PyTorch copyright notice\n",
+    "\n",
+    "© Copyright 2021, PyTorch."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.16"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}