03-31-2020

Following the tutorial at this link: https://towardsdatascience.com/understanding-pytorch-with-an-example-a-step-by-step-tutorial-81fc5f8c4e8e

Let's take an example of a simple linear regression model, y = wx + b + e

x is the only input feature in this simple example, and y, in Machine Learning parlance, is the ground truth or the label

w and b are the two parameters of this model, and e is some Gaussian noise that we've added.
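A minimal sketch of generating this synthetic data with NumPy; the "true" values (w = 2, b = 1), the noise scale, and the sample count of 100 are assumptions:

```python
import numpy as np

np.random.seed(42)

# Assumed "true" parameters and noise scale for the synthetic data
true_b, true_w = 1, 2
N = 100

x = np.random.rand(N, 1)                  # single input feature
epsilon = 0.1 * np.random.randn(N, 1)     # Gaussian noise e
y = true_w * x + true_b + epsilon         # ground truth / labels
```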

Now, let's visualize the training set and the validation set
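A sketch of an 80/20 train/validation split followed by a quick scatter plot of each set (the split ratio and variable names are assumptions):

```python
import matplotlib.pyplot as plt

# Shuffle the indices, then use the first 80 points for training
# and the remaining 20 for validation
idx = np.arange(N)
np.random.shuffle(idx)
train_idx, val_idx = idx[:80], idx[80:]

x_train, y_train = x[train_idx], y[train_idx]
x_val, y_val = x[val_idx], y[val_idx]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(x_train, y_train)
ax1.set_title("Training set")
ax2.scatter(x_val, y_val)
ax2.set_title("Validation set")
plt.show()
```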

Train a model using PyTorch

Let's create tensors for our synthetic data
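A sketch of turning the NumPy training arrays into PyTorch tensors; sending them to a device (GPU if available) is an extra convenience step:

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Data tensors: just inputs and labels, no gradients needed
x_train_tensor = torch.from_numpy(x_train).float().to(device)
y_train_tensor = torch.from_numpy(y_train).float().to(device)
```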

Next, let's also create tensors for the parameters (w and b) that we wish to optimize during training. These tensors require computation of their gradients so that we can update their values. That's what the 'requires_grad=True' argument does: it tells PyTorch to compute gradients for them.

For the data tensors we created above, there is no need for gradients as the data stays the same and we won't be updating them
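A sketch of creating the parameter tensors directly on the device with gradient tracking turned on; the random initialization and the seed value are assumptions:

```python
torch.manual_seed(42)

# Parameters: created on the device, with requires_grad=True
b = torch.randn(1, requires_grad=True, dtype=torch.float, device=device)
w = torch.randn(1, requires_grad=True, dtype=torch.float, device=device)
```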

Autograd in PyTorch

Autograd is PyTorch's automatic differentiation package: there's no need to worry about partial derivatives or the chain rule.

We tell PyTorch to compute all the gradients by calling the 'backward()' method on the loss.

The actual gradients can then be accessed through the 'grad' attribute of each parameter.
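A sketch of a training loop that uses autograd for the gradients but still updates the parameters by hand; the learning rate and epoch count are assumed values:

```python
lr = 0.1
n_epochs = 1000

for epoch in range(n_epochs):
    # Forward pass: predictions and MSE loss, written out manually
    yhat = b + w * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()

    # Autograd computes d(loss)/db and d(loss)/dw for us
    loss.backward()

    # Manual update: must happen outside of gradient tracking
    with torch.no_grad():
        b -= lr * b.grad
        w -= lr * w.grad

    # Gradients accumulate, so zero them before the next iteration
    b.grad.zero_()
    w.grad.zero_()
```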

In this example thus far, we've manually updated the parameters, which is fine for two parameters. But when there are many parameters, this gets out of hand quickly. This is where PyTorch's optimizers come in handy.

Optimizer
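A sketch of the same loop with torch.optim.SGD handling the parameter update and the zeroing of gradients; the loss is still computed by hand (lr and n_epochs carry over from above):

```python
from torch import optim

# The optimizer is told which parameters it is responsible for
optimizer = optim.SGD([b, w], lr=lr)

for epoch in range(n_epochs):
    yhat = b + w * x_train_tensor
    error = y_train_tensor - yhat
    loss = (error ** 2).mean()

    loss.backward()

    # One call updates all registered parameters, another resets their gradients
    optimizer.step()
    optimizer.zero_grad()
```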

In the code above, although we used PyTorch's optimizer, we still defined the loss function manually. Next, let's use one of PyTorch's built-in loss functions as well.

Loss Computation
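A sketch of swapping the manual MSE computation for PyTorch's built-in nn.MSELoss (reusing the optimizer from above):

```python
from torch import nn

loss_fn = nn.MSELoss(reduction='mean')

for epoch in range(n_epochs):
    yhat = b + w * x_train_tensor
    # Built-in loss: note the (prediction, label) argument order
    loss = loss_fn(yhat, y_train_tensor)

    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```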

At this point, there’s only one piece of code left to change: the predictions.

In the '__init__' method of our model class (a regular Python class that inherits from nn.Module), we define our two parameters, w and b.
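A sketch of that model class, with w and b wrapped in nn.Parameter inside '__init__' so PyTorch can track and optimize them; the class name is an assumption:

```python
class ManualLinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        # Wrapping tensors in nn.Parameter registers them with the module
        self.b = nn.Parameter(torch.randn(1, dtype=torch.float))
        self.w = nn.Parameter(torch.randn(1, dtype=torch.float))

    def forward(self, x):
        # Predictions: the same b + w * x as before
        return self.b + self.w * x
```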

Now our manual linear regression code looks as follows:
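Putting the pieces together, a sketch of the full version using the model class, the built-in loss, and the optimizer:

```python
torch.manual_seed(42)

model = ManualLinearRegression().to(device)
loss_fn = nn.MSELoss(reduction='mean')
optimizer = optim.SGD(model.parameters(), lr=lr)

for epoch in range(n_epochs):
    model.train()                       # put the module in training mode

    yhat = model(x_train_tensor)        # forward pass through the model
    loss = loss_fn(yhat, y_train_tensor)

    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

print(model.state_dict())               # learned values of b and w
```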