In [1]:
import numpy as np

Making numpy arrays in a loop

Sometimes, in your calculations, you want to run some code in a loop and add each calculated value to a numpy array. As is always the case in coding, there are a couple of different ways to do this. We show some of them here.

Using a list then convert to a numpy array

One way to do this is to use a list to collect your values and then convert it at the end to a numpy array. If you like lists, this is quick and handy.

In [4]:
# Make an empty list
y = []

# Grow it by appending values
for x in range(10):
    temp = x**2 # this would usually be a more complicated calculation
    y.append(temp)

# Take a look at it
print("y as a list:")
print(type(y))
print(y)
print()

# Then convert it to an array
y = np.array(y)
print("y as an array:")
print(type(y))
print(y)
y as a list:
<class 'list'>
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

y as an array:
<class 'numpy.ndarray'>
[ 0  1  4  9 16 25 36 49 64 81]
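
When the calculation inside the loop fits in a single expression, the same collect-then-convert idea can be written more compactly with a list comprehension. The following is only a sketch of that variant; the name `y_list` is chosen here for illustration.

```python
import numpy as np

# Build the whole list in one expression (same values as the loop above)
y_list = [x**2 for x in range(10)]

# Convert once, at the end
y = np.array(y_list)

print(type(y))
print(y)
```

The conversion still happens only once, so the cost is similar to the explicit loop with `append`.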

Using np.append (not recommended...)

Numpy also has an append function, but it does not append to the given array in place. Instead, it creates a new array containing the old values plus the appended ones. Because every call copies the whole array, using it in a loop wastes both memory and CPU time. You should only use np.append if you really want a second copy of the array.

For completeness, here is how you would use np.append in a loop:

In [6]:
y = np.array([]) # An empty array

for x in range(10):
    temp = x**2 
    # overwrite y each time with the new array that np.append returns
    y = np.append(y,temp)

# y is already an array (of floats, because np.array([]) defaults to float64).
# For big calculations the above is not recommended, as it uses a lot of extra CPU time and memory.

print("y is:")
print(type(y))
print(y)
y is:
<class 'numpy.ndarray'>
[ 0.  1.  4.  9. 16. 25. 36. 49. 64. 81.]
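
To see that np.append really returns a new array rather than changing the one you pass in, a small check like the following can help. The names `a` and `b` are just for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])

# np.append returns a new array; a itself is left untouched
b = np.append(a, 4.0)

print(a)  # still three entries
print(b)  # four entries

# The two arrays do not share any memory: b is a separate copy
print(np.shares_memory(a, b))
```

This is why the loop above has to write `y = np.append(y, temp)`: without the assignment, the result would simply be thrown away.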

Using pre-allocation

This is the most computationally efficient way to fill an array in a loop, but it only works if you know in advance how big the final array will be! If you don't know how big it will be in the end, the first example with lists is probably your best bet.

In [8]:
N = 10

# Pre-allocate an array of the right size
y = np.zeros(N) 

# Now fill the array
for i in range(N):
    x = i
    y[i] = x**2

print("y is:")
print(type(y))
print(y)
y is:
<class 'numpy.ndarray'>
[ 0.  1.  4.  9. 16. 25. 36. 49. 64. 81.]
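
The decimal points in the output above appear because np.zeros creates an array of floats by default. If you want integers, as in the list example, you can pass a `dtype` when pre-allocating. A minimal sketch:

```python
import numpy as np

N = 10

# Pre-allocate an integer array instead of the default float array
y = np.zeros(N, dtype=int)

for i in range(N):
    y[i] = i**2

print(y.dtype)
print(y)
```

Choosing the dtype up front matters because a value stored in an existing array is converted to that array's dtype.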

Pre-allocation is much faster than growing an array with np.append, because the computer only has to write each value into its slot in memory and never has to create an entirely new array object. The gain over appending to a list is smaller, since Python lists are designed to grow cheaply.

However, filling a pre-allocated array in a loop will almost always be slower than "vectorising", which is shown below.

Vectorisation

If you can think of a way to "vectorise" your calculation, then you don't need a for loop at all. The above example is actually very easy to "vectorise" in numpy:

In [9]:
x = np.arange(10)
y = x**2

print("y is:")
print(type(y))
print(y)
y is:
<class 'numpy.ndarray'>
[ 0  1  4  9 16 25 36 49 64 81]

If you can wrap your brain around slicing, you can also easily vectorise things like a (discrete) derivative, that is, the difference between neighbouring entries:

In [10]:
x = np.arange(10)

# A derivative with slicing. Say we want y[n] = x[n+1] - x[n].
# The first slice gives entries 1 to 9, the second one gives entries 0 to 8.
# Subtracting them element by element does the job (y is one entry shorter than x).
y = (x[1:] - x[:-1])

print("y is:")
print(type(y))
print(y)
y is:
<class 'numpy.ndarray'>
[1 1 1 1 1 1 1 1 1]
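
Numpy also has a built-in function, np.diff, that computes these differences between neighbouring entries. The short check below is a sketch confirming that it gives the same result as the slicing above; the names `y_slices` and `y_diff` are chosen here for illustration.

```python
import numpy as np

x = np.arange(10)

# Difference of neighbouring entries, written with slices
y_slices = x[1:] - x[:-1]

# The same thing using numpy's built-in function
y_diff = np.diff(x)

print(y_slices)
print(y_diff)
print(np.array_equal(y_slices, y_diff))
```

Writing the slices out by hand is still worth understanding, since the same trick works for other combinations of neighbouring entries, such as averages.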

Not only does vectorisation result in much more concise (though not always as transparent) code, it can also be much faster (see the example in Notebook 4 of the review material to see this in action).
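
If you want to check the speed difference on your own machine, a rough comparison with the standard-library timeit module could look like the sketch below. The function names `with_loop` and `vectorised` and the array size `N` are chosen here for illustration, and the actual timings will depend on your computer.

```python
import timeit
import numpy as np

N = 100_000

def with_loop():
    # Pre-allocate, then fill one element at a time
    y = np.zeros(N)
    for i in range(N):
        y[i] = i**2
    return y

def vectorised():
    # Compute all the squares in a single numpy operation
    x = np.arange(N, dtype=float)
    return x**2

# Both approaches give the same values
print(np.array_equal(with_loop(), vectorised()))

# Total time for 5 runs of each version
t_loop = timeit.timeit(with_loop, number=5)
t_vec = timeit.timeit(vectorised, number=5)

print(f"loop:       {t_loop:.4f} s")
print(f"vectorised: {t_vec:.4f} s")
```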