These notes are mainly a record of what we discussed and are not a substitute for attending the lectures and reading books! If anything is unclear/wrong, let me know and I will update the notes.
Code
include("preamble.jl")
✓ file included!
using: Plots, LaTeXStrings, Polynomials, PrettyTables, LinearAlgebra
functions included:
Ch2: simple_iteration, Newton, orderOfConvergence,
Ch3: ChebyshevNodes, ChebyshevInterpolant, Lagrange,
Use @doc <<function>> for help
Warm-up
Use LU decomposition combined with forward and backward substitution to solve the linear system of equations
The first row of U is u_1 = (2, 0, 4, 3) and so the first column of L is \ell_1 = (1, -1, 1/2, -2)^\intercal (the first column of A divided by (u_1)_1). Therefore, we have
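(writing the update generically, since the remaining numerical entries are not reproduced here)
$$
A^{(2)} = A - \ell_1 u_1^\intercal .
$$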
Now we have a problem! In the algorithm we learnt last time, we would set u_2 = (0, 0, 6, -10) and then let \ell_2 be the second column of A^{(2)} divided by (u_2)_2, but (u_2)_2 is zero! Instead, we will use a pivoting strategy. This is what the built-in function does:
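As a minimal sketch of how to call it (the matrix below is a placeholder, not the warm-up matrix; it is chosen so that the (2,2) pivot also vanishes after the first elimination step):

```julia
using LinearAlgebra

# Placeholder matrix (not the warm-up matrix): after eliminating the first
# column, the (2,2) entry of A⁽²⁾ is zero, so the naive algorithm would break down.
A = [2.0 1.0 1.0 0.0;
     4.0 2.0 3.0 1.0;
     6.0 1.0 5.0 2.0;
     2.0 3.0 0.0 1.0]

F = lu(A)               # LU factorisation with partial (row) pivoting
F.L, F.U, F.p           # the two triangular factors and the permutation vector

F.L * F.U ≈ A[F.p, :]   # the factorisation satisfies L*U = (row-permuted A)
```

The permutation vector F.p records which row was used as the pivot at each step; the strategy behind this choice is described next.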
Before, we chose u_k to be the first nonzero row of A^{(k)}. Now we instead choose u_k to be row p(k) of A^{(k)}, where p(k) is chosen to maximise the absolute value of [A^{(k)}]_{p(k),k}. Let’s consider an example:
First, we choose p(1) so that |A_{p(1),1}| is maximised: that is, p(1) = 3. Then, we let u_1 be row p(1) of A, that is, u_1 = (2, 0, 3), and we let \ell_1 be the first column of A divided by (u_1)_1 = 2, that is, \ell_1 = (1/2, 0, 1)^\intercal. At this stage we have
We choose p(2) to maximise |A^{(2)}_{p(2),2}| and so we set p(2) = 2 and u_2 = (0, 2, 1). Then we set \ell_2 to be the second column of A^{(2)} divided by (u_2)_2 = 2 (that is, \ell_2 = (0, 1, 0)^\intercal) and we have
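(again writing the update generically)
$$
A^{(3)} = A^{(2)} - \ell_2 u_2^\intercal .
$$
The procedure is short enough to sketch in code. The following is only an illustration of the algorithm described above, not a replacement for the built-in lu; note that the (1,3) entry of the test matrix is made up (it is not given above), so only the other entries match the worked example.

```julia
using LinearAlgebra

# Rough sketch of LU with partial (row) pivoting in the outer-product form
# used above: at step k, pick the row p(k) maximising |A⁽ᵏ⁾[p(k), k]|.
function lu_partial_pivot(A)
    n  = size(A, 1)
    Ak = float.(copy(A))                   # working matrix A⁽ᵏ⁾
    L  = zeros(n, n)
    U  = zeros(n, n)
    p  = zeros(Int, n)
    for k in 1:n
        p[k]    = argmax(abs.(Ak[:, k]))   # pivot row with the largest |entry|
        U[k, :] = Ak[p[k], :]              # u_k = chosen row of A⁽ᵏ⁾
        L[:, k] = Ak[:, k] / U[k, k]       # ℓ_k = k-th column divided by the pivot
        Ak     -= L[:, k] * U[k, :]'       # A⁽ᵏ⁺¹⁾ = A⁽ᵏ⁾ − ℓ_k u_kᵀ
    end
    return L, U, p
end

# The 3×3 example above, with a made-up (1,3) entry (set to 1 here)
A = [1.0 0.0 1.0;
     0.0 2.0 1.0;
     2.0 0.0 3.0]
L, U, p = lu_partial_pivot(A)
p            # pivot rows chosen: [3, 2, 1]
L * U ≈ A    # the sum of the outer products ℓ_k u_kᵀ reproduces A
```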
Answer: We get better conditioning by choosing the largest pivot. Later, we introduce the condition number of a matrix, written \kappa(A), which will make the following discussion rigorous.
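In the meantime, here is a small numerical illustration (a toy 2×2 matrix with a tiny pivot, not taken from the notes): without pivoting the triangular factors contain the huge multiplier 1/ε and are very badly conditioned, whereas the pivoted factors remain tame.

```julia
using LinearAlgebra

ε = 1e-8
A = [ε   1.0;
     1.0 1.0]                      # toy example: the (1,1) pivot is tiny

# LU without pivoting, worked by hand
L_nopiv = [1.0  0.0; 1/ε  1.0]
U_nopiv = [ε  1.0; 0.0  1.0 - 1/ε]

# LU with partial pivoting via the built-in routine
F = lu(A)

cond(L_nopiv) * cond(U_nopiv)      # astronomically large
cond(F.L) * cond(F.U)              # of moderate size
```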
17.3 Condition numbers of linear systems of equations
In order to fully understand the previous section, we need to make sense of the condition number \kappa(A). Recall that when we were considering functions f : \mathbb R \to \mathbb R, the condition number was (approximately) the constant of proportionality between the relative error in the input and the relative error in the output. That is, if x \approx \tilde{x} then we have
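(in the standard form; the explicit formula for \kappa_f may have been written slightly differently earlier in the course)
$$
\frac{|f(x) - f(\tilde{x})|}{|f(x)|} \;\approx\; \kappa_f(x)\,\frac{|x - \tilde{x}|}{|x|},
\qquad
\kappa_f(x) = \left|\frac{x\,f'(x)}{f(x)}\right| .
$$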
We now take the same idea, but the “input” is the vector b and the “output” is the solution x of the linear system A x = b. Again, the condition number measures how sensitive the solution x is to errors in the data b. Roughly speaking, if the condition number is 10^\rho, we expect to lose \rho digits of accuracy when we go from b to x. This made perfect sense on \mathbb R (with the absolute value), but for vectors we have many choices of norm. We shall fix the Euclidean norm for the remainder of the course:
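That is (these are the standard definitions; the matrix norm is the operator norm induced by the Euclidean vector norm):
$$
|x| = \Big(\sum_{i=1}^{n} x_i^2\Big)^{1/2},
\qquad
\|A\| = \max_{x \neq 0} \frac{|Ax|}{|x|} .
$$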
Notice that we have |Ax| \leq \|A\| |x| for all x. We are now ready to define the condition number of the matrix A (or of the corresponding linear system).
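The standard definition, consistent with the estimate derived below, is
$$
\kappa(A) = \|A\|\,\|A^{-1}\| .
$$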
Suppose we have the exact data b and exact solution x (solving Ax=b) and we have the approximate data \tilde{b} (for example, the floating point representation of b) and corresponding solution \tilde{x} solving A\tilde{x} = \tilde{b}. First, notice that |b| = |Ax| \leq \|A\| |x| and that A(x - \tilde{x}) = b - \tilde{b} and so
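(filling in the standard estimate)
$$
|x - \tilde{x}| = |A^{-1}(b - \tilde{b})| \leq \|A^{-1}\|\,|b - \tilde{b}| .
$$
Dividing by |x| and using |x| \geq |b| / \|A\| then gives
$$
\frac{|x - \tilde{x}|}{|x|} \;\leq\; \|A\|\,\|A^{-1}\|\,\frac{|b - \tilde{b}|}{|b|} \;=\; \kappa(A)\,\frac{|b - \tilde{b}|}{|b|} ,
$$
that is, the relative error in x is at most \kappa(A) times the relative error in the data b.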
Show that \|AB\| \leq \|A\|\|B\| and hence that \|A^k\| \leq \|A\|^k.
What is \|D\| for a diagonal matrix D?
A is normal (i.e. A^TA = AA^T) if and only if there exists an orthogonal P (i.e. P^{-1} = P^T) and a diagonal \Lambda such that A = P\Lambda P^T. Show that the columns of P are eigenvectors of A.
Show that, for normal matrices A, the operator norm \|A\| is equal to \max_i |\lambda_i| where (\lambda_i) are the eigenvalues of A.
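For the second and last parts, a quick numerical sanity check (not a proof; the matrices are arbitrary examples):

```julia
using LinearAlgebra

# A diagonal matrix: compare the operator norm with the diagonal entries
D = Matrix(Diagonal([3.0, -7.0, 2.0]))
opnorm(D)                              # operator (2-)norm of D

# An arbitrary symmetric (hence normal) matrix
B = randn(5, 5)
A = B + B'
opnorm(A) ≈ maximum(abs, eigvals(A))   # should be true (up to rounding)
```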