Introduction to Numpy and Arrays

Introduction
Requirements
- Knowledge
- Modules
Theory
Summary and Outlook
Literature
Licenses

Introduction

This notebook summarizes important techniques for working efficiently with arrays in Python using numpy. You will learn how to initialize arrays, how to access their elements, how to manipulate them, and how Python operators behave when applied to arrays. Array broadcasting is covered towards the end of the notebook. Working efficiently with arrays in numpy not only shortens code and avoids explicit loops, it can also reduce the computation time of your algorithms by a large factor. That factor can make the difference between one minute and a few hours of computation time, which is a great benefit when working with big data sets in machine learning.

Requirements

Knowledge

Since this is a basic Python numpy notebook, no special knowledge is required except some fundamental Python syntax.

Python Modules

Since this is a numpy-only notebook, we only need the numpy library.

import numpy as np

Basics

Initialization and Metainformation

A simple 1D Array of certain length can be initialized by

Test1D = np.array([1,2,3,4])
print(Test1D)

For 1D arrays numpy doesn't distinguish between column and row 'vectors'. To make that distinction we have to define a 2D array, which we will consider later.

Arrays with only zeros or ones can be initialized by

length = 10
Test1D_Ones = np.ones(length)
Test1D_Zeros = np.zeros(length)
print(Test1D_Ones)
print(Test1D_Zeros)

The dtype is assigned according to the element with the highest dtype in the hierarchy. The position of this element doesn't matter for the dtype assignment. The hierarchy is given by

int8,…,uint8,…,float16, float32, float64, complex64, complex128.

The default for integers is np.int32 or np.int64, depending on the operating system. In numpy versions before 2.0, 64-bit Windows also defaults to np.int32.

The default for floats is always np.float64. Arrays created with zeros and ones have float64 as dtype.

print(Test1D_Ones.dtype)

Note the difference between the Python type and the numpy dtype.

type(Test1D_Ones)

The dtype can also be set using the additional argument, e.g. dtype=np.int32.

Test1D = np.array([1,2,3,4])
print(Test1D.dtype)
Test1D = np.array([1,2,3.3,4])
print(Test1D.dtype)
Test1D = np.array([1,2,3,4],dtype=np.int64)
print(Test1D.dtype)
Test1D = np.array([1,2.2,3,4],dtype=np.float16)
print(Test1D.dtype)
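The promotion rule described above can also be checked directly. The following is a small sketch; the variable names are made up for illustration, and np.result_type is used to ask numpy which dtype it would choose when two dtypes are combined.

```python
import numpy as np

# Mixed Python numbers: the "highest" kind that occurs decides the dtype.
ints_only = np.array([1, 2, 3, 4])
with_float = np.array([1, 2, 3.3, 4])
with_complex = np.array([1, 2.5, 3 + 0j])

print(ints_only.dtype, with_float.dtype, with_complex.dtype)

# np.result_type reports the dtype numpy would pick for a combination.
print(np.result_type(np.int8, np.float16))       # float16
print(np.result_type(np.float32, np.complex64))  # complex64
```

Note that the integer default printed first depends on the platform and numpy version, as discussed above.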

A simple 2D Array (= matrix) can be initialized by

Test2D = np.array([[1,2,3],[4,5,6],[7,8,9],[14,25,16]])
print(Test2D)
print()
print("dtype: ",Test2D.dtype)

With ndim we get the number of dimensions of the numpy array:

Test2D.ndim

Basically, a 2D numpy array is an array of arrays: the outer bracket represents an array which contains further arrays as elements instead of just numbers. Each inner bracket [] then represents a row of the matrix. This is analogous to a matrix in math, which can be thought of as a collection of row or column vectors. Since we have 2 dimensions, which one is which?

The first dimension (axis 0 in numpy) runs over the inner arrays (row index), since we have an array of arrays. The second dimension (axis 1 in numpy) runs over the elements within the inner arrays (column index). I.e., the index order is [row, column], analogous to matrix indices. This will become clearer later when we talk about how to access elements in arrays.

All arguments about dtypes in 1D arrays are also valid here.

The shape will tell us how many elements we have along the first and second dimension.

print(Test2D.shape)

Since the array Test2D contains 4 arrays with 3 elements the shape is (4,3). For a 1D array shape will basically give us the length/the number of elements.

print(Test1D.shape)
print(len(Test1D))

Note that the length len of an array is the same as the first shape value, i.e.:

print(len(Test2D), "=" , Test2D.shape[0])
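The same metainformation is available for arrays with more than two dimensions. A minimal sketch, assuming a hypothetical 3D array Test3D that is not part of the notebook:

```python
import numpy as np

# Hypothetical example: 2 blocks, each with 3 rows and 4 columns.
Test3D = np.zeros((2, 3, 4))

print(Test3D.ndim)    # number of dimensions
print(Test3D.shape)   # number of elements along each dimension
print(Test3D.size)    # total number of elements
print(len(Test3D))    # same as Test3D.shape[0]
```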

As for 1D arrays, it is also possible to initialize 2D arrays of zeros and ones. The only difference is that instead of a length we pass the shape as a tuple.

Test2D_Ones = np.ones((3,4))
print(Test2D_Ones)
print()
Test2D_Zeros = np.zeros((4,3))
print(Test2D_Zeros)

Indexing

Each element in an array can be accessed by its position using index notation. To address exactly one element in a multidimensional array we need one index per dimension, e.g.

for a 1D array it is only one index,
for a 2D array we can use up to 2,
and so on.

Important!

Python and numpy always start counting at 0, so the first element has position 0, the second element position 1, and so on.

Test1D = np.array([1,2,3,4,4,5,6,7,3,4])
print(Test1D)

If we want to have the third element from our 1D Array, which is 3, we write

print(Test1D[2])

So [...] attached to our 1D array will run over all elements we have:

print(Test1D[...]) # this seems to be silly. But with multidimensional arrays this will make more sense (see below).

The same principle can be used for 2D arrays but here we attach [dim0,dim1] where we have 2 indices running over the 2 dimensions. The first runs over the first dimension and the second over the second dimension. If we want to have the second element from the third row of our 2D matrix we write

print(Test2D)
print(Test2D[2,1])

As said above, for a 2D array we can use UP TO 2 indices, meaning that we are also allowed to index only the first dimension by using a single index. So if we want, for example, the second row we write

print(Test2D[1])

Besides the intuitive positive indices, Python also allows negative indices. Positive indices count from the 'left' (forward) starting with 0, and negative indices count from the 'right' (backward) starting with -1.

So for a 1D array of length 4 the positions can be labeled by

[0,1,2,3] or [-4,-3,-2,-1].

print(Test1D)
print(Test1D[2])
print(Test1D[-8])

Besides positive and negative indices there is also the symbol '...' (Ellipsis), which means 'take all' elements along the dimensions that are not indexed explicitly. This allows us, for example, to get a certain row or column from a 2D array. If we want the third element of every row, i.e. the third column of the matrix, we write [...,2].

print(Test2D)
print()
print(Test2D[...,2])

To get the second row we can either write just [1] or [1,...].

print(Test2D)
print()
print(Test2D[1,...])
print()
print(Test2D[1])

All arguments above can also be extended for arbitrary multidimensional arrays.
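A short sketch of what this looks like for a 3D array (Test3D and the variable names are made up for illustration). It shows that '...' covers all leading dimensions at once, whereas a single ':' covers exactly one dimension.

```python
import numpy as np

Test3D = np.arange(24).reshape(2, 3, 4)

first_a = Test3D[..., 0]     # '...' stands for both leading dimensions
first_b = Test3D[:, :, 0]    # the same selection written with two ':'
one_colon = Test3D[:, 0]     # a single ':' covers only the first dimension

print(first_a.shape, first_b.shape, one_colon.shape)
```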

Slicing

An extremely useful technique to access elements in arrays is called slicing. Basically, slicing is indexing using a start, stop and step value, equivalent to an explicit Python loop running over an array but much faster. The notation is [start index : stop index : step size] for each dimension, where the element at the stop index itself is excluded. If we leave out the start or the stop index, the slice starts at the first element and, respectively, runs up to and including the last element. The default step size is 1.

So if we want to pick every second element we write:

print(Test1D)
print(Test1D[1:10:2]) # from index 1 to 10, every 2nd element

# or

print(Test1D[1::2]) # from index 1 to end (default), every 2nd element
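To make the correspondence between a slice and a loop explicit, here is a small sketch. The names start, stop, step, sliced and looped are only used for this illustration.

```python
import numpy as np

Test1D = np.array([1, 2, 3, 4, 4, 5, 6, 7, 3, 4])
start, stop, step = 1, 10, 2

sliced = Test1D[start:stop:step]
# The explicit Python loop that the slice replaces.
looped = np.array([Test1D[i] for i in range(start, stop, step)])

print(sliced)         # [2 4 5 7 4]
print(looped)         # [2 4 5 7 4]
print(Test1D[::-1])   # a negative step walks the array backwards
```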

If we want to split an array into two halves we write

print(Test1D)

# integer division, because "/2" would result in a float
half_len = len(Test1D) // 2

print(Test1D[0:half_len:]) # from 0 to half_len, stepsize 1 (default)
print(Test1D[:half_len:]) # from 0 (default) to half_len, stepsize 1 (default)
print(Test1D[half_len::]) # from half_len to end (default), stepsize 1 (default)

# Note: there is also a built-in function for splitting an array
np.split(Test1D, 2)

print(Test2D)
print()
# get the second to third column of the matrix:
print(Test2D[...,1:3])

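Note: For a 2D array, a leading : therefore selects the same elements as ... does. In general, ... expands to as many : as are needed to cover all remaining dimensions.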

print(Test2D[:,2])
print(Test2D[...,2])

We can also manipulate a 2D array quite fast without any loops. Consider for example the following matrix:

np.random.seed(42)

# look up in the built-in help (shift + tab) what's going on here!
Test2D = np.random.rand(4,4).round(1)
print(Test2D)

To set every second row to zero we write

Test2D[1::2] = 0
print(Test2D)

To set every second column to zero we write

Test2D = np.random.rand(4,4).round(1)
print(Test2D)
print()
Test2D[:,1::2] = 0
print(Test2D)

So now let's see how much faster, and also shorter, slicing is compared to loops. To do this we generate a fairly big matrix, set every second row (the rows with an even index) to zero and measure the execution time.

import time as t
np.random.seed(43)
size = 10000

Test2D = np.random.rand(size,size).round(1)

start = t.time()
Test2D[::2] = 0
end = t.time()
time1 = end - start
print('Slicing :',time1,'s')
#print(Test2D)

Test2D = np.random.rand(size,size).round(1)

start = t.time()
for i in range(size):
    for j in range(size):
        if i%2==0:
            Test2D[i,j] = 0
end = t.time()
time2 = end - start
print('Loop :',time2,'s')
#print(Test2D)
print('-------------------------')
print('Difference factor :',time2/time1)

The result is quite impressive: slicing takes only one line of code and is faster by a considerable margin. Depending on the hardware, slicing should be approximately 200-400 times faster than the equivalent code using loops. So for machine learning code slicing is the way to go, since it is extremely efficient and saves a lot of computation time.

So we have the following rule: Avoid python loops!
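A single measurement with time.time() can be noisy. As a hedged alternative, the sketch below uses the standard library module timeit and first checks that both variants really produce the same result. The helper names with_slicing and with_loops and the small size are assumptions made for this illustration; the measured factor will differ from machine to machine.

```python
import timeit
import numpy as np

size = 300  # kept small here so that the loop version finishes quickly
rng = np.random.default_rng(43)
base = rng.random((size, size))

def with_slicing(arr):
    arr[::2] = 0
    return arr

def with_loops(arr):
    for i in range(0, arr.shape[0], 2):
        for j in range(arr.shape[1]):
            arr[i, j] = 0
    return arr

# Both variants must produce the same result before we compare timings.
same = np.array_equal(with_slicing(base.copy()), with_loops(base.copy()))
print("same result:", same)

# Take the best of a few repetitions to reduce the influence of noise.
t_slice = min(timeit.repeat(lambda: with_slicing(base.copy()), number=10, repeat=3))
t_loop = min(timeit.repeat(lambda: with_loops(base.copy()), number=10, repeat=3))
print("slicing:", t_slice, "s   loop:", t_loop, "s")
```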

Fancy Indexing

Fancy indexing replaces the scalar indices seen before with index arrays. So instead of using a scalar to get one particular element, we use an array of indices to get multiple elements at a time.

Consider the following matrix.

Test2D = np.random.rand(8,8).round(2)*100
print(Test2D)

If we want to get the diagonal we can use fancy indexing the following way

IndexArray = np.arange(8)
print (IndexArray)
Dia = Test2D[IndexArray, IndexArray]
print()
print(Dia)

# Note: We can also get the diagonal by the built-in method:
print(Test2D.diagonal())

It is also possible to use 2D index arrays (matrices of indices), which can be used to pick selected elements out of a matrix. If we want to get the elements with indices (2,1),(3,4),(5,3) and (6,7) we can write

IndexArray2 = [2,3,5,6]
IndexArray3 = np.array([[1,4,3,7]],dtype=int)
print(Test2D)
Sec = Test2D[IndexArray2,IndexArray3]
print()
print(Sec)
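Because the matrix above is random, the selected values are hard to verify by eye. The following sketch repeats the selection on a deterministic matrix M (a name introduced only for this example), where the entry at (i, j) equals 8*i + j. It also uses two 1D index arrays, in which case the result is a 1D array.

```python
import numpy as np

# Deterministic 8x8 matrix: the entry at (i, j) is 8*i + j.
M = np.arange(64).reshape(8, 8)

rows = np.array([2, 3, 5, 6])
cols = np.array([1, 4, 3, 7])

picked = M[rows, cols]   # elements (2,1), (3,4), (5,3), (6,7)
print(picked)            # [17 28 43 55]
print(picked.shape)      # (4,)
```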

Boolean Indexing

Boolean indexing is a technique which allows you to use boolean values as an index, which filters the elements of your array that fulfill a given expression. Consider the following matrix.

np.random.seed(43)

Test2D = 10*np.random.rand(5,5).round(1)
print(Test2D)

If we now use a boolean expression on an array, the expression is evaluated elementwise and the output will be an array of the same shape with dtype bool, which will then be used as a mask.

Test2DBool = Test2D>3
print(Test2DBool)
print()
print('dtype:',Test2DBool.dtype)

Now we use this mask as an index for our array and as output we will get an array (1D array) with all entries for which the mask is true.

print(Test2D[Test2DBool])

Boolean indexing supports all standard comparison operators <, >, ==, !=, <=, >= in a boolean expression, plus the bitwise logic operators & (and), | (or), ^ (xor) and ~ (not).

print(Test2D[(Test2D>2) & (Test2D<5)])
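The remaining operators work the same way. Below is a small sketch on a deterministic array (values is a name chosen for this example). The parentheses around each comparison are needed because &, | and ^ bind more tightly than the comparison operators in Python.

```python
import numpy as np

values = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

outside = values[(values < 2) | (values > 6)]        # or
not_even = values[~(values % 2 == 0)]                # not
exactly_one = values[(values > 2) ^ (values < 6)]    # xor: exactly one condition holds

print(outside)       # [0 1 7 8]
print(not_even)      # [1 3 5 7]
print(exactly_one)   # [0 1 2 6 7 8]
```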

This can be used e.g. to set all elements of Test2D larger than 3 to the value 3:

print(Test2D)
print()

# make a true copy
Test2D_ = Test2D.copy()
Test2D_[Test2D_>3]=3
print(Test2D_)
# Note: Here we can generate the same effect with the built-in function or method clip(..)
Test2D.clip(max=3)

Copies and references

Note that we copied the original array with the method copy(). Otherwise Test2D_ would point to the same memory as Test2D, and manipulating Test2D_ would also manipulate Test2D.

The same holds if we slice.

print (Test2D_)

# this is called a view:
second_row = Test2D_[1] # points to the same memory  

second_row[:] = 4 # this also manipulates the second row of Test2D_
print()
print(second_row)
print()
print (Test2D_)
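Whether two arrays share memory can be checked with np.shares_memory. The sketch below (all variable names are made up for this example) contrasts plain assignment, a view created by basic indexing, an explicit copy, and fancy indexing, which returns a copy.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)

alias = A                 # plain assignment: the very same object
row_view = A[1]           # basic indexing or slicing: a view
row_copy = A[1].copy()    # explicit copy
fancy = A[[1]]            # fancy indexing: a copy

print(alias is A)
print(np.shares_memory(A, row_view))
print(np.shares_memory(A, row_copy))
print(np.shares_memory(A, fancy))

row_view[:] = -1          # writing through the view also changes A
print(A)
```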

Reshape

In the chapter Initialization and Metainformation we saw that each array comes with a shape. When preparing data it is sometimes useful to change the shape, which can easily be done using the array method reshape(). reshape() takes one number per dimension as argument, specifying how many elements that dimension will have in the reshaped array.

We can, for example, initialize an array of length 9 and then reshape it into a 3x3 matrix, or vice versa.

Test1D = np.arange(9)
print(Test1D)
Test2D = Test1D.reshape(3,3)
print(Test2D)

Test1D = Test2D.reshape(1,9)
print(Test1D)
Test1D = Test2D.reshape(9)
print(Test1D)

Caution: If we want to reshape a matrix into a 1D array, reshape(1,9) will output a matrix (2D array) with 1 row and 9 columns, whereas reshape(9) will output a real 1D array, as seen above.

It is also important that the new shape holds exactly as many elements as the old array, so reshaping an array of length 9 into a 3x4 matrix won't work, since we would need 12 elements to fill a 3x4 matrix. The same argument holds for reshaping a bigger array into a smaller one.

Test1D = np.arange(9)
try:
    Test2D = Test1D.reshape(3,4)
except Exception as e:
    print("An exception occurred: ", e)

Test1D = np.arange(9)
try:
    Test2D = Test1D.reshape(2,2)
except Exception as e:
    print("An exception occurred: ", e)

If the argument -1 is used, the size of that dimension is computed automatically. That can be used, e.g., to 'delete' a dimension and turn a 2D array into a 1D array.

print(np.arange(1,24,2)) # 1 to 24, stepsize 2
print()
Test2D = np.arange(1,24,2).reshape(-1,4)
print(Test2D)
print()
print(Test2D.reshape(-1, 2))
print()

print(Test2D.reshape(-1))# nearly the same as
print()
print(Test2D.flatten())

But:

flatten() always returns a copy, and
reshape() returns a view on the same memory with a different shape whenever that is possible (otherwise it also returns a copy).

print(Test2D)

print()
Test2D_ = Test2D.flatten() # the content is copied.
Test2D_[3:7] = -1 # Test2D is not changed!! 
print(Test2D)

print()
Test2D_ = Test2D.reshape(-1) # points to the same memory as Test2D
Test2D_[3:7] = -1
print(Test2D)
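reshape() can only return a view if the requested shape can be expressed on the existing memory layout. The sketch below (names chosen for this example) uses np.shares_memory to show one case where reshape() shares memory and one case, a transposed matrix, where it has to copy.

```python
import numpy as np

M = np.arange(6).reshape(2, 3)

flat_view = M.reshape(-1)      # contiguous data: a view is possible
flat_copy = M.flatten()        # always a copy
flat_of_T = M.T.reshape(-1)    # transposed data is not contiguous: reshape copies

print(np.shares_memory(M, flat_view))
print(np.shares_memory(M, flat_copy))
print(np.shares_memory(M, flat_of_T))
```

If it matters whether you get a view or a copy, checking with np.shares_memory as above is a simple safeguard.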

Concatenate

Sometimes you may have a few data sets, each given by a matrix or array, and each set has to run through the same calculations. Instead of running each calculation separately, it is possible to concatenate all given data sets into one using the general function concatenate() or the more specific functions vstack() and hstack(). concatenate() takes two arguments: a sequence with all arrays you want to concatenate and the axis (dimension) along which you want to concatenate them.

Concatenating three matrices along the first dimension (axis=0) looks like the following

Test2D1 = np.arange(9).reshape(3,3)
print(Test2D1, '\n')

Test2D2 = 2*np.arange(9).reshape(3,3)
print(Test2D2, '\n')

Test2D3 = 3*np.arange(9).reshape(3,3)
print(Test2D3, '\n')

Test2D = np.concatenate((Test2D1,Test2D2,Test2D3),axis=0)
print(Test2D)

Concatenating three matrices along the second dimension (axis=1) looks like

Test2D = np.concatenate((Test2D1,Test2D2,Test2D3),axis=1)
print(Test2D)

Concatenating three 1D-arrays (vectors) looks like

Test1D1 = np.arange(4)
print(Test1D1, '\n')

Test1D2 = 2*np.arange(4)
Test1D3 = 3*np.arange(4)
Test2D = np.concatenate((Test1D1,Test1D2,Test1D3),axis=0)
print(Test2D)

To put three arrays row-wise into a matrix we first have to reshape these arrays into 1x4 matrices

Test1D1 = np.arange(4)
print(Test1D1.shape, '\n')

Test1D1 = Test1D1.reshape(1,4)
print(Test1D1.shape, '\n')


Test1D2 = Test1D2.reshape(1,4)
Test1D3 = Test1D3.reshape(1,4)
Test2D = np.concatenate((Test1D1,Test1D2,Test1D3),axis=0)
print(Test2D)
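As an alternative to reshaping each vector first, np.stack joins arrays along a new axis. This is a sketch with made-up names v1, v2, v3, not the approach used in the cells above.

```python
import numpy as np

v1 = np.arange(4)
v2 = 2 * np.arange(4)
v3 = 3 * np.arange(4)

rows = np.stack((v1, v2, v3), axis=0)   # each vector becomes a row
cols = np.stack((v1, v2, v3), axis=1)   # each vector becomes a column

print(rows.shape)   # (3, 4)
print(cols.shape)   # (4, 3)
```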

If we are lazy and don't want to write the axis explicitly we can also use vstack(), which concatenates vertically (axis=0), and hstack(), which concatenates horizontally (axis=1 for 2D arrays).

Test2D = np.vstack((Test2D1,Test2D2,Test2D3))
print(Test2D)

Test2D = np.hstack((Test2D1,Test2D2,Test2D3))
print(Test2D)

Note that concatenate, hstack and vstack copy the content to a new memory location, because the result is stored as one new array in its own block of memory.

print(Test2D1, '\n')
Test2D1[:] = 0
print(Test2D1, '\n')

print(Test2D)

Operators

Before looking at how python deals with array (matrix) operations let's see how mathematics does it. As far as the maths section is concerned, let's say the terms 2D array and matrix are interchangeable.

In math a matrix is represented as $ \textbf{A} =\begin{bmatrix} 0 & 1 & 2 \\ 3 & 4 & 5 \\ 6 & 7 & 9 \\ 10 & 11 & 12 \\ \end{bmatrix} $

with elements $ A_{i,j} $.

For example, a row vector $ \textbf{A}_{0,:} = \textbf{r} = \begin{bmatrix} 0 & 1 & 2 \\ \end{bmatrix} $

with elements $ r_{i} = A_{0,i} $, and a column vector $ \textbf{A}_{:,1} = \textbf{c} = \begin{bmatrix} 1 \\ 4 \\ 7 \\ 11 \\ \end{bmatrix} $

with elements $ c_{j} = A_{j,1} $.

The basic operations are defined as follows

Addition and Subtraction

$ (\textbf{A} + \textbf{B} )_{i,j} = A_{i,j} + B_{i,j} \\ (\textbf{A} - \textbf{B} )_{i,j} = A_{i,j} - B_{i,j} $

Here $ \textbf{A} $ and $ \textbf{B} $ have to have the same shape since both operations are performed elementwise.

Scalar multiplication

Let $ s $ be a scalar (number). $ (s \textbf{A})_{i,j} = s\cdot A_{i,j} $

Here each element is multiplied by the number s.

Transpose

The transpose of a matrix basically switches rows with columns and vice versa. $ (\textbf{A}^{T})_{i,j} = A_{j,i} $

For row and column vectors the transpose operation does the same, it simply turns row vectors into column vectors and column vectors into row vectors.

Matrix multiplication (dot product)

$ (\textbf{A} \circ \textbf{B})_{i,j} = \sum_{l=1}^{k} A_{i,l} \cdot B_{l,j} $

Matrix multiplication is not commutative and it is not an elementwise multiplication. In order to be defined, the number of columns of the left matrix has to be the same as the number of rows of the right matrix; in general $ shape(m,k) \circ shape(k,n) $ gives $ shape(m,n) $ for all combinations of m and n.
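The sum in the definition can be written out with explicit loops. The function matmul_by_definition below is a hypothetical helper written only to illustrate the formula (with indices starting at 0 instead of 1); it is far slower than numpy's own routine and is compared against np.dot as a check.

```python
import numpy as np

def matmul_by_definition(A, B):
    """Multiply a (m, k) matrix with a (k, n) matrix using the sum formula."""
    m, k = A.shape
    k2, n = B.shape
    if k != k2:
        raise ValueError("columns of A must match rows of B")
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for l in range(k):
                C[i, j] += A[i, l] * B[l, j]
    return C

A = np.arange(12).reshape(4, 3)
B = np.arange(6).reshape(3, 2)
C = matmul_by_definition(A, B)

print(C.shape)                        # (4, 2)
print(np.allclose(C, np.dot(A, B)))   # True
```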

Matrix-Vector product

The matrix-vector product follows the same rules as the dot product, where a row vector has $ shape(1,m) $ and a column vector $ shape(n,1) $.

Operators in Numpy

Let's see how these operations are implemented in numpy.

Matrix1 = np.arange(12).reshape(4,3)
Matrix2 = np.random.rand(4,3).round(1)
print(Matrix1)
print()
print(Matrix2)

Addition and Subtraction

Matrix3 = Matrix1 + Matrix2
Matrix4 = Matrix1 - Matrix2
print(Matrix3)
print()
print(Matrix4)

Scalar multiplication

s = 5.0
Matrix3 = s * Matrix1 
print(Matrix3)

Multiplication

Numpy performs a 'normal' multiplication between two arrays with the * operator. If their shapes are equal numpy will use element-wise multiplication. If their shapes are not equal numpy will try to broadcast them (for broadcasting see the next chapter).

Matrix1 = np.arange(12).reshape(4,3)
print(Matrix1)
print()
print(Matrix1*Matrix1)

Transpose

For the transpose of a matrix numpy has a built-in function called transpose() or the shorthand .T .

print(Matrix1)
print()
Matrix3 = Matrix1.transpose()
print(Matrix3)
print()
Matrix4 = Matrix2.T
print(Matrix4)
print()

Row =np.array([[1,2,3,4]])
print(Row)
Column = Row.T
print()
print(Column)

Matrix multiplication (dot product)

As for the transpose numpy also has a function called dot() taking two matrices as arguments

print(Matrix1)
print()
print(Matrix2.T)

Matrix3 = np.dot(Matrix1,Matrix2.T)
print(Matrix3)

Matrix4 = Matrix2.T.dot(Matrix1)
print(Matrix4)

print(Row)
print()
print(Column)
print()
print(Matrix1)
print()
print('Row o Matrix:',np.dot(Row,Matrix1))
print('Row o Column (scalar product):',np.dot(Row,Column))

So we see that if we perform operations on arrays with the 'right' shapes numpy does exactly what we would expect. But numpy can and will do much more which we will explore in the next chapter about broadcasting.
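For 2D arrays the same product can also be written with Python's @ operator. The sketch below rebuilds Matrix1 and Row so that it runs on its own; via_dot and via_operator are names introduced for this example.

```python
import numpy as np

Matrix1 = np.arange(12).reshape(4, 3)
Row = np.array([[1, 2, 3, 4]])

via_dot = np.dot(Row, Matrix1)
via_operator = Row @ Matrix1     # the same product written with the @ operator

print(via_operator)              # [[60 70 80]]
print((Row @ Row.T).shape)       # (1, 1): a 1x1 matrix, not a plain number
```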

Other types of vector operations

a = np.array([1, 4, 0], float)
b = np.array([2, 2, 1], float)
print(np.outer(a, b)) # 3x1 dot 1x3 results in 3x3 shape
print(np.inner(a, b)) # same as dot()
print(np.cross(a, b)) # result of crossproduct is perpendicular to a,b
print(np.cross(b, a))

Broadcasting

Broadcasting is numpy's internal algorithm to perform element-wise operations on arrays with different shapes. Above, we saw that if we use the 'right' shapes for our arrays all numpy operations will do what we expect from math.

If the shapes are not the same, numpy compares them dimension by dimension, starting from the trailing dimension, and applies the following rules. Two dimensions can be broadcast together if

they are equal, or

at least one of them is 1.

Otherwise numpy will throw an exception.
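The two rules can be sketched as a small function. can_broadcast is a hypothetical helper written for this illustration, not part of numpy; it treats missing leading dimensions as 1. For comparison, np.broadcast_shapes (available in recent numpy versions) returns the resulting shape.

```python
import numpy as np
from itertools import zip_longest

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: apply the two rules from right to left."""
    # Missing leading dimensions are treated as having size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True

print(can_broadcast((3, 4), (4,)))           # True
print(can_broadcast((3, 4), (3,)))           # False
print(can_broadcast((3, 1), (1, 4)))         # True
print(np.broadcast_shapes((3, 1), (1, 4)))   # (3, 4)
```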

Now let's consider a simple example, a number plus a matrix (2D array), and see what numpy does.

Number = 5.
Matrix = np.arange(12).reshape(3,4)
print(Number)
print(Matrix)
print()
print(Number + Matrix)

So numpy actually adds 5 to every element in the matrix, which is equivalent to adding a 3x4 matrix filled with the value '5' to the matrix.

Five = np.ones((3,4))*5.
print(Five)
print()
print(Five + Matrix)

So numpy compares the shapes of the number and the matrix and hits the second rule: a scalar has no dimensions of its own, and missing dimensions are treated as having size 1. Conceptually, numpy then stretches the scalar 5 into a 3x4 matrix and adds the two.

Important: Numpy won't make any copies of a broadcast array in the background. So broadcasting in general saves valuable memory.
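One way to make this 'virtual' stretching visible is np.broadcast_to, which returns a read-only view with the requested shape. This is an illustration of the idea, not a description of how numpy implements arithmetic internally; the name stretched is chosen for this example.

```python
import numpy as np

stretched = np.broadcast_to(np.float64(5.0), (3, 4))

print(stretched)
print(stretched.shape)             # (3, 4)
print(stretched.strides)           # (0, 0): every entry refers to the same value in memory
print(stretched.flags.writeable)   # False: the view is read-only
```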

Now let's add an array to a matrix.

Array = np.arange(1,5)
print(Array,' with', Array.shape)
print()
print(Matrix,' with', Matrix.shape)

print(Matrix+Array)

Here numpy will hit rule 1, since the length of the array is equal to the size of the second dimension of the matrix. So numpy will broadcast the array [1 2 3 4] to a matrix of shape (3,4) and add them.

Since numpy performs all its basic operations (+,-,*,/) element-wise this broadcasting principle will hold for all of them.
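The rules also explain what happens when the shapes do not fit. In the sketch below, per_row is a made-up vector with one value per row of the matrix. Added directly it fails, because the trailing dimensions 4 and 3 are neither equal nor 1; reshaped to a column of shape (3,1) it broadcasts as intended.

```python
import numpy as np

Matrix = np.arange(12).reshape(3, 4)
per_row = np.array([10, 20, 30])          # one value per row, shape (3,)

try:
    Matrix + per_row                      # shapes (3,4) and (3,): 4 vs 3 fails
except ValueError as e:
    print("broadcast error:", e)

result = Matrix + per_row.reshape(3, 1)   # shapes (3,4) and (3,1) are compatible
print(result)
```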

Misc

Special arrays

print(np.zeros(7, dtype=int))
print(np.ones(8))
print(np.zeros_like(Matrix))
print(np.ones_like(Matrix))
print(np.identity(4, dtype=float)) # same as
print(np.eye(4, dtype=float))
print(np.eye(4,k=1)) # eye allows shifting of the diagonal

print(np.array(list(range(1,20,2)))) # don't do that
a = np.arange(1, 20, 2, dtype=int) # directly !
print(a)

# test if an element occurs somewhere in the array
print(8 in a) # python way
print (np.any((a == 8))) # numpy way

any and all

x = np.arange(0,2,0.5)
y = 2*x

# test if there is an element <=0.6
print("Is there an element in y with y[i] <=0.6: ", np.any(y<=0.6) )
print("Are all elements in y <=0.6: ", np.all(y<=0.6) )

print("the boolean array of y<=0.6 is ", y<=0.6)

# test if an element is in the array
a = np.arange(1, 20, 2, dtype=int) 

print(8 in a) # the python way also works
print (np.any((a == 8))) #  with any

arange, linspace and logspace

a = np.arange(0, 2*np.pi, .1) # from 0 to 2*pi, STEPSIZE 0.1
print(a)
print(a.shape)

a = np.linspace(0, 2 * np.pi , 20) # from 0 to 2*pi, 20 ELEMENTS
print(a)
print(a.shape)

import matplotlib.pyplot as plt


x = np.linspace(0.1,10,100)
print('x: \n', x)

plt.plot(x, np.log(x), label='natural log (base e)')
plt.plot(x, np.log10(x), label='log (base 10)')
plt.plot(x, np.log2(x), label='log (base 2)')
plt.plot(x, np.log(x) / np.log(0.5), label='log (base 0.5)')

plt.plot(x, np.zeros_like(x), ':', color='black')

plt.legend()

print(np.log10(100), '\n')
a = np.logspace(0, np.log10(100), 20) 
#In linear space, the sequence starts at ``base ** start``
# (`base` to the power of `start`) and ends with ``base ** stop``
# default of base=10.0
print(a)
print(a.shape)

plt.plot(a, label='logspace')
plt.legend()

Adding new axis

Test = np.ones((8,2))
print(Test, '\n Shape: ', Test.shape, '\n')

Test_ = Test[np.newaxis,:, np.newaxis]
print(Test_, '\n Shape: ', Test_.shape, '\n')
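A typical use of np.newaxis is to prepare arrays for broadcasting. The following sketch (x, col, row and diff are names chosen for this example) turns a 1D array into a column and a row and lets broadcasting compute all pairwise differences.

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])

col = x[:, np.newaxis]    # shape (3, 1)
row = x[np.newaxis, :]    # shape (1, 3)

diff = col - row          # broadcast to shape (3, 3): diff[i, j] = x[i] - x[j]
print(col.shape, row.shape, diff.shape)
print(diff)
```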

Application of functions

Here we define the function$ f(x) = x^2 + 10 $:

# application of a function on all elements
x = np.arange(1, 10, 2)
print(x)
f = lambda x: x ** 2 + 10
f(x)

### built-in mathematical functions

# abs, sign, sqrt, log, log10, exp, sin, cos, tan, arcsin, arccos,
# arctan, sinh, cosh, tanh, arcsinh, arccosh, arctanh
a = np.array([1, 4, 9], float) 
print(a, '\n')
print(np.sqrt(a))

# application of a function along an axis
c = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]], int)
print(c)
print()

#we want all columns which contain a number % 6 = 0 ("modulo 6")
ind = np.apply_along_axis(lambda x: np.any(x % 6 == 0), axis=0, arr=c)

print(ind)
print()

print(c[..., ind])
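For this particular test, the same column mask can also be obtained without apply_along_axis by evaluating the condition elementwise and reducing with any along axis 0. This is an alternative sketch, not a replacement for the cell above.

```python
import numpy as np

c = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

ind = (c % 6 == 0).any(axis=0)   # one boolean per column
print(ind)                       # [False  True False  True]
print(c[:, ind])
```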

a = np.array([1.1, 1.5, 1.9], float)

print(np.floor(a)) # round downwards
print(np.ceil(a)) # round upwards
print(np.rint(a)) # round to the nearest integer (ties such as 1.5 go to the nearest even integer)

Iterations over arrays

a = np.array([[1, 2], [3, 4], [5, 6]], float)
for x in a:
    print(x)
    
### just for showing iteration functionality
### DON'T do this when you want to multiply elementwise the columns!!!
for (x, y) in a:
    print(x, ' * ' , y , ' = ' ,  x * y)

### this would be the way to do it
print(a[:,0] * a[:,1])

Built-in array operations

a = np.array([2, 4, 3], float)

print(a.sum()) # 2+4+3
print(a.prod()) # 2*4*3
print() 

print(a.mean())
print(a.var(ddof=1))
print(a.std(ddof=1.5))
print()

print(a.min())
print(a.max())

# application along an axis
a = np.array([[0, 2], [3, -1], [3, 5]], float)
print(a)
print(a.mean(axis=0))
print(a.mean(axis=1))

# get the index positions
print(a)
print(a.argmin()) # flattened index position
print(np.unravel_index(a.argmax(), a.shape))

a = np.array([6, 2, 5, -1, 0], float)
print(np.sort(a)) # returns a sorted copy as a numpy array

a = np.array([1, 1, 4, 5, 5, 5, 7], float)
print(np.unique(a))

a = np.array([[1, 2], [3, 4]], float)
print(a.diagonal()) # diagonal elements

Boolean arrays

a = np.array([1, 3, 0], float)
b = np.array([0, 3, 2], float)
a > b

a = np.array([1, 3, 0, 2], float)
print(np.logical_and(a > 0, a < 3)) # boolean array

# Remember other methods doing slightly different things:

print(a[(a > 0) & (a < 3)]) # return elements which hold conditions
print(np.where((a > 0) & (a < 3))) # return array with indexes which hold condition

b = np.array([True, False, False], bool)
print(np.logical_not(b))
print(~b) # same

c = np.array([False, True, False], bool)
print(np.logical_or(b, c))

a = np.arange(10)
print(a)
print(np.where(a%2==0 , a, 1000*a))

a = np.array([1, 3, 0], float)
# broadcasting for false array
np.where(a > 0, a, 42)

a = np.array([[0, 1], [3, 0]], float)

print(a.nonzero()) # array of indexes where nonzero
print()
print(a[a.nonzero()]) # apply indexes where nonzero on same array

# NaN: Not a Number
# Inf: Infinity
a = np.array([1, np.nan, np.inf, 4, 7], float)
print(a)
print(np.isnan(a))
print(~np.isnan(a))
print(np.isfinite(a))

# handling of missing values
# assume a was read in from a file 
a = np.array([2, np.nan, 4, 6], float)
print(a)
print(a[~np.isnan(a)])
print(a.mean())
print(a[~np.isnan(a)].mean()) # e.g. to ignore nans
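numpy also provides NaN-aware reductions, which avoid building the mask by hand. A short sketch using np.nanmean and np.nansum:

```python
import numpy as np

a = np.array([2, np.nan, 4, 6], float)

print(np.nanmean(a))        # 4.0, the mean over the non-NaN entries
print(np.nansum(a))         # 12.0
print(np.isnan(a).sum())    # 1, the number of missing values
```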

# take multiple indices from an array at once
a = np.array([[0, 1], [2, 3]], float)
b = np.array([0, 0, 1], int)
a.take(b, axis=0)  # results in row0, row0, row1

# analogously: put
a = np.array([0, 1, 2, 3, 4, 5], float)
a.put([0, 3], 5)
a

More from linear algebra

a = np.array([[4, 2, 0], [9, 3, 7], [1, 2, 1]], float) 
np.linalg.det(a) # determinant

# Eigenvalues and eigenvectors
vals, vecs = np.linalg.eig(a)
vals, vecs

b = np.linalg.inv(a) # inverse
b
a.dot(b) # note the numerical issues
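Because of these floating point effects, the product should be compared with the identity matrix using a tolerance rather than exact equality. A minimal sketch using np.allclose:

```python
import numpy as np

a = np.array([[4, 2, 0], [9, 3, 7], [1, 2, 1]], float)
b = np.linalg.inv(a)

product = a.dot(b)
print(product == np.eye(3))              # exact comparison, may contain False entries
print(np.allclose(product, np.eye(3)))   # comparison with a tolerance
```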

Consider the linear equations:

$ x + 3y + 4z = 4 \\ 2x + 3y + 5z = 4 \\ 5x + 7y + 9z = 4 $

as matrix (read as dot product: "1x + 3y + 4z = 4 ...."):

$ \begin{bmatrix} 1 & 3 & 4 \\ 2 & 3 & 5 \\ 5 & 7 & 9 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \\ 4 \end{bmatrix} $

# Solve a system of linear equations

list_matrix = [[1, 3, 4], [2, 3, 5], [5, 7, 9]]
A = np.array(list_matrix)
b = np.array([4, 4, 4])
# Solve
xyz = np.linalg.solve(A, b)
print(xyz)
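A computed solution can be checked by inserting it back into the equations. The sketch below repeats the setup so that it runs on its own and compares with a tolerance, since the result is a floating point approximation.

```python
import numpy as np

A = np.array([[1, 3, 4], [2, 3, 5], [5, 7, 9]])
b = np.array([4, 4, 4])

xyz = np.linalg.solve(A, b)

# Inserting the solution into the left-hand side should reproduce b.
print(np.allclose(A @ xyz, b))

# Multiplying with the inverse gives the same result here.
print(np.allclose(np.linalg.inv(A) @ b, xyz))
```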

Literature

Broadcasting

Numpy's documentation:

Licenses

Notebook License (CC-BY-SA 4.0)

The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).

NumpyAndArrays by Oliver Fischer is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://gitlab.com/deep.TEACHING.

Code License (MIT)

The following license only applies to code cells of the notebook.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
