In our class we will be using Jupyter notebooks and Python for most labs and assignments, so it is important to be confident with both ahead of time. We will additionally be using PyTorch, a matrix (tensor) manipulation library similar to NumPy. We will learn during the class why PyTorch is much more than this, but for the purposes of this tutorial we will just use it as a convenient tool for array/matrix/tensor manipulation.
import torch # This imports the torch library.
import matplotlib.pyplot as plt # This is Python's popular plotting library.
# This is to ensure matplotlib plots inline and does not try to open a new window.
%matplotlib inline
The main data structure you have to get familiar with during this course is the tensor, or, put simply, a multidimensional array (we will not go into the formal mathematical definition here). We will create a few tensors here, manipulate them, and display them. Indexing operations on a tensor in PyTorch are similar to indexing in NumPy.
myTensor = torch.FloatTensor(7, 7) # Create an uninitialized 7 x 7 tensor of floats.
myTensor[:, :] = 0 # Assign zeros everywhere in the matrix.
myTensor[3, 3] = 1 # Assign a one at position (3, 3).
myTensor[:2, :] = 1 # Assign ones on the top 2 rows.
myTensor[-2:, :] = 1 # Assign ones on the bottom 2 rows.
# Show the tensor.
def showTensor(aTensor):
    plt.figure()
    plt.imshow(aTensor.numpy())
    plt.colorbar()
    plt.show()
showTensor(myTensor)
If you are familiar with NumPy indexing, then everything above should look familiar. We can access and modify a tensor by indexing single elements, myTensor[index1, index2], or ranges of indices, myTensor[index1_1:index1_2, index2_1:index2_2]. Notice that the indices can be positive or negative, where negative values count from the end backwards. Also notice that we can easily convert a PyTorch tensor to a NumPy array using the .numpy() method.
Exercise: Use indexing operations to generate the following patterns:
I also strongly recommend that you look over the definitions of the index(), index_add_(), index_fill(), index_fill_(), index_copy_(), index_select(), scatter_(), and gather() operations, as you will see them in other people's code, and you may find them useful for your own projects.
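For instance, here is a minimal sketch of index_select() and gather() (the example values here are my own, not part of any assignment):
t = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# index_select(dim, indices) picks whole rows (dim 0) or columns (dim 1).
print(t.index_select(0, torch.LongTensor([0, 2]))) # Rows 0 and 2.
# gather(dim, indices) picks one element per position along dim:
# here column 0 from row 0, column 2 from row 1, and column 1 from row 2.
print(t.gather(1, torch.LongTensor([[0], [2], [1]])))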
We will also often need to extract subsections of a tensor or reshape the tensor in different ways. Here are examples of a couple of useful operations.
# This creates a one-dimensional tensor (array).
myTensor = torch.Tensor([1, 2, 3, 4])
print(myTensor)
print(myTensor.size())
# This stacks 5 copies of the one-dimensional tensor along a new first dimension, giving a 5 x 4 tensor.
extendedTensor = myTensor.repeat(5, 1)
print(extendedTensor)
print(extendedTensor.size())
slicedTensor = extendedTensor[3, :]
slicedTensor[:] = 5
print(slicedTensor)
print(slicedTensor.size())
print(extendedTensor)
Notice how in the above example slicedTensor still points to the original content of extendedTensor: modifying its values also modifies the contents of extendedTensor. If you would rather create a copy of the values in the slice, you can always use the .clone() method.
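For instance, continuing the example above, a cloned slice no longer aliases the original tensor:
clonedTensor = extendedTensor[3, :].clone()
clonedTensor[:] = 9
print(clonedTensor) # All nines.
print(extendedTensor) # Row 3 keeps its previous values.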
You can review other tensor operations in PyTorch's documentation here: http://pytorch.org/docs/master/tensors.html
Other functions I often find useful, besides repeat() and clone(), are view(), squeeze(), unsqueeze(), transpose(), and permute().
Exercise: Run some examples to see what view(), squeeze(), unsqueeze(), transpose(), and permute() can do for you.
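If you want a starting point, here is a quick sketch that only prints the resulting sizes (the shapes here are arbitrary):
t = torch.Tensor(2, 3, 4).fill_(1)
print(t.view(6, 4).size()) # Reinterpret the same 24 elements as 6 x 4.
print(t.unsqueeze(0).size()) # Add a dimension of size 1: 1 x 2 x 3 x 4.
print(t.unsqueeze(0).squeeze().size()) # Remove all size-1 dimensions: back to 2 x 3 x 4.
print(t.transpose(0, 2).size()) # Swap dimensions 0 and 2: 4 x 3 x 2.
print(t.permute(2, 0, 1).size()) # Reorder all dimensions: 4 x 2 x 3.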
Basic tensor operations include multiplication and addition by scalars. They also include element-wise tensor-tensor operations, and other operations that may be specific to 2D tensors (matrices), such as matrix-matrix multiplication. Two things to notice: 1) basic operators have been overloaded, so it is sometimes not necessary to explicitly call a torch function; 2) many torch operations have an in-place version that operates in the same storage as the input tensor, as opposed to returning a new one. We show some examples below:
# Adding a scalar to a tensor:
inputTensor = torch.Tensor([[1, 2], [3, 4]])
print(inputTensor.add(1)) # add() returns a new tensor with 1 added to every element.
print(inputTensor) # The original tensor is unchanged.
# Adding a scalar to a tensor (in-place):
inputTensor = torch.Tensor([[1, 2], [3, 4]])
print(inputTensor.add_(1)) # In-place operations are usually suffixed with an underscore _
print(inputTensor) # The tensor was modified in place.
The above code should show the difference between in-place and out-of-place operations; there are similar mul() and div() methods, each with an in-place variant mul_() and div_().
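For instance, a quick sketch of mul() and div_() (with my own example values):
inputTensor = torch.Tensor([[1, 2], [3, 4]])
print(inputTensor.mul(2)) # Returns a new tensor; inputTensor is unchanged.
print(inputTensor.div_(2)) # Divides in place; inputTensor is now modified.
print(inputTensor)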
The following code shows tensor-tensor operations.
myTensor1 = torch.Tensor(3, 2, 2).fill_(2)
myTensor2 = torch.Tensor(3, 2, 2).fill_(3)
print(myTensor1 + myTensor2)
print(myTensor1 * myTensor2)
# .addcmul(c, a, b) performs c + a .* b (where .* denotes element-wise multiplication).
print(torch.addcmul(myTensor1, myTensor1, myTensor2)) # Every element is 2 + 2 * 3 = 8.
Other useful tensor-tensor operations are addcmul_() (the in-place version of addcmul()), matmul() (general tensor-tensor multiplication), mm() (matrix-matrix multiplication), and mv() (matrix-vector multiplication).
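Here is a minimal sketch of the last three (the values are arbitrary):
A = torch.Tensor([[1, 2], [3, 4]])
B = torch.Tensor([[5, 6], [7, 8]])
v = torch.Tensor([1, 1])
print(A.mm(B)) # 2 x 2 matrix-matrix product.
print(A.mv(v)) # Matrix-vector product: a tensor of size 2.
print(A.matmul(B)) # Same as mm() for 2D tensors; matmul() also handles batched inputs.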