Linear algebra

In this lecture, you will implement linear algebra techniques in Python. You will see that for solving physics problems using linear algebra, the most complicated part is to translate the physical problem into a matrix-vector equation. When put in a standardized form, many linear algebra tools are at your disposal in the numpy package.

Learning objectives: After finishing this lecture, you should be able to:

  1. Solve a set of linear equations
  2. Know when a set of linear equations can be solved
  3. Determine eigenvalues and eigenvectors of a matrix
In [ ]:
# Initialisation code for the notebook
import numpy as np
import matplotlib.pyplot as plt
import sys
sys.path.append('resource/asnlib/public/')
sys.path.append('../answer_checking/') 
from validate_answers import *

Some basic vector operations

As we have seen earlier, numpy can do basic algebra with matrices and vectors, although we have to take care to correctly treat arrays as vectors, i.e. use them with the correct dimensions.

Exercise 1

Write a script for calculating the angle between two vectors in degrees. Start with vectors a and b as input and then find the angle. Evaluation of the angle must be done with a single line. Note that numpy does not care about the dimensions (column or row) as calculates the inner product for arrays. Howeve, take care that the script does not crash in case of faulty input (two vectors of different length for example). Output the angle to variable theta. Test your code with vectors for which you already know the outcome.

In [ ]:
a=np.array([1, 0, 0])
b=np.array([0, 1, 0])

###
### Your code here
###

Solving sets of linear equations with Gaussian elimination

To understand how to solve a complicated physics problem in matrix algebra we are going to make solve for the currents in a complex electrical network. You will see how a long and complex code can be substituted with a single line of code from a Python package.

Consider the following circuit of resistors:

All the resistors have the same resistance $R$. The power rail at the top is at voltage $V_+$=5V. What are the other four voltages, $V_1$ to $V_4$?

To answer this question we use Ohm's law and the Kirchhoff current law, which says that the total net current flow out of (or into) any junction in a circuit must be zero. Thus for the junction at voltage $V_1$, for instance, we have \begin{equation} {V_1-V_2\over R} + {V_1-V_3\over R} + {V_1-V_4\over R} + {V_1-V_+\over R} = 0, \end{equation} or equivalently \begin{equation} 4V_1 - V_2 - V_3 - V_4 = V_+. \end{equation}

Exercise 2

Write with pen and paper similar equations for the other three junctions with unknown voltages. Define the matrix equation relating the four unknown voltages and the known parameters.

The most straightforward way to solve a set of equations is by means of Gaussian elimination. We can multiply any equation by a constant and subtract equations from each other while still obtaining the correct solution. This is utilized to make the matrix describing the linear set of equations upper triangular. The solution is then easily obtained by back-substitution from the bottom equation to the top. The program below can solve the four resulting equations using Gaussian elimination and find the four voltages. Use your result obtained above to set-up the correct A matrix and v vector for your problem. Go through it and notice how Python efficiently uses the matrix $A$ storage and works on entire rows by slicing.

In [ ]:
from numpy import array,empty
###
### Your code here
###


N = len(v)

# Gaussian elimination
for m in range(N):
    # Divide by the diagonal element
    div = A[m,m]
    A[m,:] /= div
    v[m] /= div

    # Now subtract from the lower rows
    for i in range(m+1,N):
        mult = A[i,m]
        A[i,:] -= mult*A[m,:]
        v[i] -= mult*v[m]
print('The matrix after Gaussian elimiation is')
print(A)
        
# Backsubstitution
x = empty(N,float)
for m in range(N-1,-1,-1):
    x[m] = v[m]
    for i in range(m+1,N):
        x[m] -= A[m,i]*x[i]

print('The solution is')
print(x)

Solving sets of linear equations

Solving sets of linear equations is not always trivial. For example, the matrix of coefficients can be non-square (more equations than unknowns) or have elements with zeros in it (leading to division by zero). In that case standard Gaussian elimination will not work. Alternatively, the equations may not be linear independent. Here we will investigate these issues by solving various sets of linear equations.

Consider the linear set of equations $Ax=b$.

\begin{equation} \left[\begin{array}{cccc} 0 & 0 & 2 & 3\\4 & 2 & 3 & 1\\2 & 5 & 1 & 2\\1 & 0 & 0 & 1\end{array}\right]\left[\begin{array}{c}x_1\\x_2\\x_3\\x_4\end{array}\right]=\left[\begin{array}{c}5\\1\\7\\2\end{array}\right] \end{equation}

This set of equation is solvable as $x=A^{-1}b$ when the matrix $A$ is invertible. There are many conditions that a matrix meets/does not meet when it is invertible/non-invertible. The most important ones for a square matrix to be invertible are

  1. The matrix is full rank, i.e., the rank is equal to the number of rows/columns. This means that all rows/columns are linearly independent
  2. The determinant of the matrix is non-zero, e.g., for a two-by-two matrix $A$ the inverse is proportional to (det$(A))^{-1}$ and therefore det$(A)$ needs to be non-zero in order for the inverse to exist.

If a matrix is non-invertible, the above mentioned conditions are not met, these matrices are also called singular matrices.

Exercise 3

Calculate the following matrix properties:

  • Inverse of the matrix $A$ using the inv function from the numpy linear algebra package. Name the inverse invA.
  • Rank of the matrix A with the numpy function matrix_rank and save it to parameter rankA
  • Matrix determinant with the numpy function det and store it to variable detA.

Look in the help for more details on the functions.

In [ ]:
###
### Your code here
###

answer_6_03_1 = np.copy(invA)
answer_6_03_2 = np.copy(rankA)
answer_6_03_3 = np.copy(detA)
In [ ]:
question = "answer_6_03"
num = 3

to_check = [question + "_%d" % (n+1) for n in range(num)]
feedback = ""
passed = True
for var in to_check:
    res, msg = check_answer(eval(var), var)
    passed = passed and res
    print(msg)

assert passed == True

Now see what happens if you lower the rank of the matrix by making last two rows identical. Print the matrix rank, matrix determinant to the command line. Can you calculate the inverse? And what if the rows are copies of each other differing only by a small number close to the machine precision, say 1e-16 or 1e-17. Play around and try it for yourself.

In [ ]:
###
### Your code here
###

As you have seen a set of linear equations can be solved using the matrix inverse inv for square matrices even when they have zero entries on the diagonal. Calculating the inverse with inv can be used to solve $Ax=b$ through multiplication as $x=A^{-1}b$. However, in many cases you want solve $Ax=y$ for different $b$ but with the same $A$. A more efficient way to do this is by using LU-decomposition and the function solve. Solve does not calculate the full inverse, but instead factorizes it to the set of equations much faster, by how much, we shall investigate now.

Exercise 4

Write a script to evaluate the differences in computational time between the inverse method and the solve method. Perform the following steps

  • Generate random matrices $A$ of size $N\times N$ and random (column) vectors $b$ of size $N\times 1$ using the rand command from the numpy random library.
  • Compute the solution $x$ to $Ax = b$ using the inv method and the solve method
  • Compute the time it takes for the computer to finish the calculation for both methods and store the calculation time. Look up lecture 2c for timing the computation time.
  • Make a loop over a range of values for $N$, perform the calculation for every matrix size $N$ and store the computation times in an array. Take a range of values for $N$ between 100 and 1000 in ten steps on a logarithmic scale (e.g. use np.geomspace and round off to integer value).
  • Make a plot of both calculation times as a function of $N$. Use a double logarithmic plot with the command loglog and plt.grid(which='minor').
  • Increase $N$ to 10000 and repeat your script a few times to check to what extent you get the same results each time you run the script.
  • Finally, formulate a hypothesis for the dependence of computational time on $N$, i.e. propose a formula for this dependence. Read from the graph, by what factor is solve faster than inv.

In a log-log plot a dependence of $t \propto N^{p}$ corresponds to a linear line log $t=p $log $N$. Hence, the slope of the linear dependence is the exponent $p$. To determine power $p$ divide the number of units on the vertical scale with the number of units on the horizontal scale.

In [ ]:
###
### Your code here
###

Self check

  • Do you observe an approximately linear dependence (in log-log) of the computation time as a function of the matrix size?
  • Do the linear curves for inverse and solve have an offset?
  • Did you add a legend to your graph?

Eigenvalues and eigenvectors

Eigenvalues and eigenvectors play an important role in physics. They are the solution to the eigenvalue equation

\begin{equation} A \vec{x} = \lambda \vec{x} \, , \end{equation}

where $A$ is a symmetric $N \times N$ matrix. The eigenvalue equation essentially means that the matrix $A$ operating on vector $\vec{x}$ changes the length of the vector, but not its' direction. For an $N \times N$ matrix there are $N$ eigenvalues and the eigenvectors have the property that they are orthogonal.

Here we are going to show that the calculation of eigenvalues and eigenvectors can be very useful to study the dynamics of mechanical systems. For this we use material from "Dynamics and Relativity" of McComb of your "mecharela" course.

Consider two masses on a stretched string, page 87 McComb. The equations of motions of the two masses are \begin{eqnarray} m \frac{\text{d}^2 x_1}{\text{d} t^2} &=& -T \frac{x_1}{a} + T\frac{x_2 - x_1}{2b}\\ m \frac{\text{d}^2 x_2}{\text{d} t^2} &=& -T \frac{x_2}{a} - T\frac{x_2 - x_1}{2b} \, .\\ \end{eqnarray} With the trial solutions, i.e., assuming harmonic motion, \begin{eqnarray} x_1 &=& A \text{cos}(\omega t + \phi) \\ x_2 &=& B \text{cos}(\omega t + \phi) \, ,\\ \end{eqnarray} we can derive a system of equations.

Exercise 5 Use pen and paper to write this system of equation as an eigenvalue problem.

Define in Python the matrix A that describes this problem as an eigenvalue problem. Fill it with the correct numbers given $T=1$, $m=2$, and $a=b=0.25$. Determine the eigenvalues eigA and eigenvectors eigvA with the eigh function of numpy for the system with $T=1$, $m=2$, and $a=b=0.25$.

In [ ]:
# define parameters
T=1
m=2
a=0.25
b=0.25

###
### Your code here
###

answer_6_05_1 = np.copy(A)
answer_6_05_2 = np.copy(eigA)
answer_6_05_3 = np.copy(eigvA)
In [ ]:
question = "answer_6_05"
num = 3

to_check = [question + "_%d" % (n+1) for n in range(num)]
feedback = ""
passed = True
for var in to_check:
    res, msg = check_answer(eval(var), var)
    passed = passed and res
    print(msg)

assert passed == True

Compare your results to the analytical result shown in McComb. Note that there is an error in McComb: the lowest frequency mode has $A=B$ and the high frequency mode has $A=-B$. Why are the eigenvectors from numpy quantitative, whereas the eigenvectors of McComb are relative $[A, B]$ and $[A, -B]$? Plot the positions of the particles and the string at $t=0$ and $\phi=0$, i.e., create something like Fig. 4.11 of McComb. Use np.concatenate to add the fixed side points to the bead positions for plotting the string.

In [ ]:
###
### Your code here
###

Although in this case the solution can be obtained both analytically and with the computer we now have a framework to set up much more complicated systems of equations and solve them. Consider a string with $N$ particles spaced by distance $a$. The $x$-positions of the $i$th particle is given by $x_i=i \cdot a$. The system of motion of the $i$th particle is given by

\begin{eqnarray} m \frac{\text{d}^2 x_i}{\text{d} t^2} &=& T \frac{x_{i-1} - x_i}{a} + T \frac{x_{i+1} - x_i}{a} \\ \end{eqnarray}

Note that the string is fixed on the wall, hence, $x_0=x_{N+1}=0$.

Exercise 6

Set up a system of 50 particles on a string at distance $a=1$ relative to each other. Hint : use the numpy function eye to quickly fill your matrix with diagonal and off-diagonal elements. Find the eigenvalues and eigenvectors. Plot the positions of the particles for the first 4 eigenmodes at $t=0$ and $\phi=0$ in a for loop. Automatically make a legend indicating the mode number.

In [ ]:
###
### Your code here
###

Self-check

  • Do you observe a wave pattern in the amplitude of the particles?
  • Why do we use the function eigh for solving this eigenvalue problem and not the eig function? Does it give the same results for this problem?