using LinearAlgebra, Plots, LaTeXStrings

function adr_1d(n, p, q, f; order=2)
h = 1/n
x = h:h:(1-h)
# Finite Difference Coefficients
# -u'' ≈ (-u_{i-1} + 2u_i - u_{i+1}) / h^2
# use central difference (order of accuracy 2)
# pu' ≈ p * (u_{i+1} - u_{i-1}) / (2h)
# or forward difference (order of accuracy 1)
# pu' ≈ p * (u_{i+1} - u_{i}) / h
if (order == 1)
# use the forward difference D^+ = (u_{j+1} - u_j)/h
d = (2/h^2 + q - p/h) * ones(n-1)
d₋ = (-1/h^2 ) * ones(n-2)
d₊ = (-1/h^2 + p/h) * ones(n-2)
else
# use a central difference D^0 = (u_{j+1} - u_{j-1})/(2h)
d = (2/h^2 + q) * ones(n-1)
d₋ = (-1/h^2 - p/(2h)) * ones(n-2)
d₊ = (-1/h^2 + p/(2h)) * ones(n-2)
end
A = Tridiagonal(d₋, d, d₊)
b = f.(x)
u = A \ b
# Dirichlet boundary conditions
u = [0.0, u..., 0.0]
x = [0.0, x..., 1.0]
return x, u
end
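As a quick sanity check of the scheme (a standalone sketch that rebuilds the p = q = 0 central scheme directly rather than calling `adr_1d`, so it runs on its own): pure diffusion -u'' = 1 with homogeneous Dirichlet conditions has exact solution u(x) = x(1-x)/2, and the three-point second difference is exact on quadratics, so the discrete solution should match to rounding error on any mesh.

```julia
using LinearAlgebra

# Pure diffusion: -u'' = 1 on (0,1), u(0) = u(1) = 0, exact u(x) = x(1-x)/2.
n = 50
h = 1/n
x = h:h:(1-h)                       # interior mesh points
A = Tridiagonal(fill(-1/h^2, n-2),  # subdiagonal
                fill( 2/h^2, n-1),  # diagonal
                fill(-1/h^2, n-2))  # superdiagonal
u = A \ ones(n-1)
err = maximum(abs.(u .- x .* (1 .- x) ./ 2))  # ~ machine precision
```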
# u'' = diffusion
# p = advection
# q = reaction
# f = source
f(x) = 1.0
# Advection
p, q = 50., 0.
x, u = adr_1d(100, p, q, f)
P1 = plot(x, u, title=L"Advection: $-u′′ + %$p u′ + %$q u = 1$",
xlabel=L"Position $x$", ylabel=L"Concentration $u$",
label=L"u(x)", lw=2, m=1)
u_e(x) = (x - (exp(50*x)-1)/(exp(50)-1))/50
plot!(P1, u_e, linestyle=:dot, label="exact solution")
display(P1)
hh = [2.0^(-n) for n=10:-1:2]
errs1 = zeros(length(hh))
errs2 = zeros(length(hh))
for (i,h) ∈ enumerate(hh)
x, u1 = adr_1d(round(Int, 1/h), p, q, f; order=1)
x, u2 = adr_1d(round(Int, 1/h), p, q, f; order=2)
errs1[i] = norm( u1 - u_e.(x) , Inf )
errs2[i] = norm( u2 - u_e.(x) , Inf )
end
P2 = plot( [hh hh], [errs1 errs2];
color=[:red :blue], label=["using forward difference" "using central difference"],
xaxis=:log, yaxis=:log, title=L"errors in $||\cdot||_\infty$", m=3 )
plot!( hh, hh/4;
xlabel=L"h", linestyle=:dash, label=L"O(h)", color=:red)
plot!( hh, hh.^2;
linestyle=:dash, label=L"O(h^2)", color=:blue)
13 Convergence of Finite Differences
Lecture 13
I encourage you to play around with the Jupyter notebook for this lecture - you can copy the code using the "this notebook" button on the side of this page.
These notes are mainly a record of what we discussed and are not a substitute for attending the lectures and reading books! If anything is unclear/wrong, let me know and I will update the notes.
13.1 Recap
We are considering boundary value problems of the following form: Suppose we have a differential operator \mathcal L, a “nice” domain \Omega \subset \mathbb R^d, and data f. Seek u : \Omega \to \mathbb R such that
\begin{align} \mathcal L[u] &= f &&\text{in } \Omega \nonumber\\ u &= 0 &&\text{on } \partial \Omega . \tag{BVP} \end{align}
This is a problem with homogeneous Dirichlet boundary conditions (the values of the solution are prescribed on the boundary (Dirichlet) and are zero (homogeneous)).
Example 13.1 (Linear Advection-Diffusion-Reaction Equations) Consider
\begin{align} \mathcal L[u] := -\Delta u + p \cdot \nabla u + q u \end{align}
In the context of semiconductor physics, u(x) represents the concentration of charge carriers at a position x \in \Omega. Each of these terms relates to the physics in question:
- -\Delta u : diffusion; electrons move from areas of high concentration to areas of low concentration,
- p \cdot \nabla u : advection (drift); the solution is driven with velocity p (due to e.g. an external electric field),
- q u : reaction; electrons and electron holes are being created or destroyed.
The same equation also models, e.g., pollutant transport in a river, drug delivery in a blood vessel, asset prices in finance, heat transfer in fluids, …
Last time, we solved this equation on [0,1] numerically using finite differences: we defined the mesh \{x_j\}_{j=0}^n with x_j = \frac{j}{n} and mesh size h= \frac{1}{n} and approximated u_j \approx u(x_j).
Exercise 13.1 Write down the linear system of equations that results from using finite differences - check with the person next to you.
By approximating the continuous problem with finite differences, we obtain a discrete problem: find \bm U \in \mathbb R^{n+1} (with index starting at 0; equivalently, U: \Omega_h \to \mathbb R where \Omega_h := \{x_j\}_{j=0}^n, with U_j := U(x_j)) such that
\begin{align} \begin{split} &\mathcal L_h [\bm U] = \bm f \\ &U_0 = U_n = 0 \end{split}\tag{$\text{BVP}_h$} \end{align}
where \mathcal L_h depends on the choice of finite difference scheme.
Example 13.2 In the example we saw on Monday, we have
\begin{align} \mathcal L_h &= -L + p D^0 + qI \nonumber\\ &= -\frac{1}{h^2} \begin{pmatrix} -2 & 1 & & & \\ 1 & -2 & 1 & & \\ & 1 & -2 & \ddots & \\ & & \ddots & \ddots & 1\\ & & & 1 & -2 \end{pmatrix} + \frac{p}{2h} \begin{pmatrix} 0 & 1 & & & \\ -1 & 0 & 1 & & \\ & -1 & 0 & \ddots & \\ & & \ddots & \ddots & 1\\ & & & -1 & 0 \end{pmatrix} + q I, \nonumber \end{align}
a (n-1)\times(n-1) matrix (acting on the interior points of the grid).
Recall, we say that a numerical approximation is consistent if the residual tends to zero when you substitute the exact solution of the problem into the scheme. In this particular setting, we have:
Definition 13.1 (Consistency) Suppose that u : \Omega \to \mathbb R is the exact solution to \text{(BVP)} and let \bm u = \{ u(x_j) \}_{j=0}^n be the exact solution evaluated on the mesh. We say the numerical approximation (\text{BVP}_h) is consistent if \| \mathcal L_h[\bm u] - \bm f\|_{\infty} \to 0 as h \to 0 where f_j = f(x_j).
We say the numerical approximation is order-p accurate (or has order of accuracy p) if \| \mathcal L_h[\bm u] - \bm f\|_{\infty} = O(h^p) as h \to 0.
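For instance (a standard Taylor-expansion sketch, assuming u is four times continuously differentiable), expanding u(x_j \pm h) about x_j gives

\begin{align} \frac{u(x_j+h)-u(x_j-h)}{2h} &= u'(x_j) + \frac{h^2}{6}u'''(x_j) + O(h^4), \nonumber\\ \frac{u(x_j+h)-2u(x_j)+u(x_j-h)}{h^2} &= u''(x_j) + \frac{h^2}{12}u''''(x_j) + O(h^4), \nonumber \end{align}

so the central differences D^0 and L are both order-2 accurate, whereas the forward difference satisfies D^+[u](x_j) = u'(x_j) + \frac{h}{2}u''(x_j) + O(h^2) and is only order-1 accurate.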
Definition 13.2 (Convergence) Suppose that u : \Omega \to \mathbb R is the exact solution to \text{(BVP)} and let \bm u be the exact solution restricted to the mesh. Let \bm U_h be the numerical solution to (\text{BVP}_h). We say the numerical scheme is convergent if \| \bm U_h - \bm u \|_{\infty} \to 0 as h \to 0.
We say the numerical approximation is convergent of order-p if \| \bm U_h - \bm u \|_{\infty} = O(h^p) as h \to 0.
Definition 13.3 (Stability) We say that (\text{BVP}_h) is stable if there exists C>0 such that
\begin{align} \| \bm U_h - \bm W_h \|_{\infty} \leq C \| \mathcal L_h[\bm U_h] - \mathcal L_h[\bm W_h] \|_{\infty} \end{align}
for all \bm U_h, \bm W_h.
Again, stability means that small perturbations of the data lead to small perturbations of the numerical solution: if we have data \bm f and \bm f + \bm \delta (for example, the data and its floating-point representation) with corresponding numerical solutions \bm U_h and \bm W_h (resp.), then \bm W_h is close to \bm U_h.
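Since \mathcal L_h here is linear, stability is exactly a uniform bound on the inverse: writing \bm \delta := \mathcal L_h[\bm U_h] - \mathcal L_h[\bm W_h],

\begin{align} \| \bm U_h - \bm W_h \|_\infty = \| \mathcal L_h^{-1} \bm \delta \|_\infty \leq \| \mathcal L_h^{-1} \|_\infty \, \| \bm \delta \|_\infty, \nonumber \end{align}

so the scheme is stable precisely when \| \mathcal L_h^{-1} \|_\infty \leq C uniformly in h.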
These notions all depend on the choice of norm. We choose \| \cdot \|_{\infty} for simplicity.
We have the theorem that you should be expecting:
Theorem 13.1 Suppose that \text{(BVP)} and (\text{BVP}_h) have unique solutions u and \bm U_h. If (\text{BVP}_h) is consistent (with order p) and stable, then the method is convergent (with order at least p).
Proof.
Let \bm u be the exact solution to the continuous problem restricted to the mesh and suppose that (\text{BVP}_h) is order p accurate. Then, by stability, we have
\begin{align} \| \bm U_h - \bm u \|_\infty &\leq C \| \mathcal L_h[\bm U_h] - \mathcal L_h[\bm u] \|_\infty \nonumber\\ % &= C \| \bm f - \mathcal L_h[\bm u] \|_\infty \nonumber\\ % &= O(h^p) \end{align}
as h\to0. (The final equality is the consistency estimate).
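The consistency half of this argument can also be checked numerically. The sketch below is standalone, with the illustrative choice p = 5, q = 0 (whose exact solution follows the same formula as the p = 50 example above): it applies \mathcal L_h to the exact solution and measures the residual \| \mathcal L_h[\bm u] - \bm f \|_\infty, which for an order-2 scheme should shrink by a factor of about 4 when h is halved.

```julia
using LinearAlgebra

# Exact solution of -u'' + 5u' = 1 on (0,1) with u(0) = u(1) = 0.
p = 5.0
u_e(x) = (x - (exp(p*x) - 1)/(exp(p) - 1))/p

# Residual of the central scheme applied to the exact solution on an n-point mesh.
function residual(n)
    h = 1/n
    x = h:h:(1-h)
    d  = (2/h^2) * ones(n-1)
    d₋ = (-1/h^2 - p/(2h)) * ones(n-2)
    d₊ = (-1/h^2 + p/(2h)) * ones(n-2)
    Lh = Tridiagonal(d₋, d, d₊)
    maximum(abs.(Lh * u_e.(x) .- 1.0))  # ‖𝓛_h[u] - f‖_∞ (boundary terms vanish since u(0)=u(1)=0)
end

r1, r2 = residual(256), residual(512)
ratio = r1/r2   # ≈ 4 for an order-2 scheme
```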
13.2 Convergence of finite differences for ADR
Here, we return to the linear Advection-Diffusion-Reaction equation in 1d:
\begin{align} &\mathcal L[u] := -u'' + p u' + q u = f \nonumber\\ &u(0) = u(1) = 0, \nonumber \end{align}
and the numerical scheme
\begin{align} &\mathcal L_h[\bm U] := (-L + p D^0 + q I) \bm U = \bm f \nonumber\\ &U_0 = U_n = 0. \end{align}
In the following, we show that the method is consistent and (under some assumptions on the parameters p,q) stable. We can then use the theorem to conclude that the method is convergent.
Exercise 13.2 (Consistency) What is the order of accuracy of the approximate scheme?
We now move on to stability. We will use the following notion:
Definition 13.4 (Strictly Diagonally Dominant (SDD)) A matrix A is SDD if
\begin{align} |A_{ii}| > \sum_{j \not= i} |A_{ij}| \end{align}
for all i.
For what parameters p,q is the matrix
\begin{align} \mathcal L_h = \frac{1}{h^2} \begin{pmatrix} 2 + h^2q & -1+\frac{p}{2} h & & & \\ -1-\frac{p}{2}h & 2 + h^2q & -1+\frac{p}{2}h & & \\ & -1-\frac{p}{2}h & 2 + h^2q & \ddots & \\ & & \ddots & \ddots & -1+\frac{p}{2}h\\ & & & -1-\frac{p}{2}h & 2 + h^2q \end{pmatrix} \end{align}
SDD for all sufficiently small h?
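One way to experiment with this question (a sketch; `is_sdd` is our own helper, not part of the code above): the common factor 1/h^2 scales every entry and can be dropped, and the first and last rows have only one off-diagonal entry, so it suffices to test a single interior row.

```julia
# Strict diagonal dominance of an interior row of h² 𝓛_h:
# |2 + h²q| > |-1 - ph/2| + |-1 + ph/2|
is_sdd(p, q, h) = abs(2 + h^2*q) > abs(-1 - p*h/2) + abs(-1 + p*h/2)

is_sdd(100.0, 100.0, 1/100)  # → true   (h < 2/|p| and q > 0)
is_sdd(100.0,   0.0, 1/10)   # → false  (h > 2/|p|: advection dominates)
```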
Lemma 13.1 Suppose A is an SDD matrix. Then, A is invertible and there exists C such that
\begin{align} \|x\|_\infty \leq C \| Ax\|_{\infty} . \end{align}
Proof.
Take i such that |x_i| = \|x\|_\infty and notice that, if A x = b, then b_i = A_{ii} x_i + \sum_{j\not= i} A_{ij} x_j and so
\begin{align} |A_{ii}||x_i| &\leq |b_i| + \sum_{j\not= i} |A_{ij}| |x_j| \nonumber\\ % &\leq |b_i| + \sum_{j\not= i} |A_{ij}| |x_i|. \end{align}
Since |x_i| = \|x\|_\infty, we have
\begin{align} \Bigg( |A_{ii}| - \sum_{j\not= i} |A_{ij}|\Bigg) \|x\|_{\infty} &\leq |b_i| \nonumber \end{align}
and, since A is SDD, we have
\begin{align} \|x\|_\infty &\leq \Bigg( |A_{ii}| - \sum_{j\not= i} |A_{ij}|\Bigg)^{-1} |b_i| % \nonumber\\ % &\leq C \|b\|_\infty = C \|Ax\|_\infty \end{align}
where C = \max_i \big( |A_{ii}| - \sum_{j\not= i} |A_{ij}|\big)^{-1}.
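A quick numerical illustration of the lemma (a sketch with an arbitrary small SDD matrix of our choosing): `opnorm(inv(A), Inf)` equals \max_{x \neq 0} \|x\|_\infty / \|Ax\|_\infty, i.e. the sharpest possible stability constant, and it should sit below the constant C from the proof.

```julia
using LinearAlgebra

A = [4.0 1.0 0.5;
     1.0 5.0 2.0;
     0.0 3.0 6.0]          # strictly diagonally dominant by inspection

# Constant from the proof: C = max_i ( |A_ii| - Σ_{j≠i} |A_ij| )⁻¹
C = maximum(1/(abs(A[i,i]) - sum(abs(A[i,j]) for j in 1:3 if j != i)) for i in 1:3)

# Sharpest possible constant: ‖A⁻¹‖_∞ = max over x of ‖x‖_∞ / ‖Ax‖_∞
best = opnorm(inv(A), Inf)
# best ≤ C, with C = 0.5 for this matrix
```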
Exercise 13.3 Show that, for p \in\mathbb R and q > 0, the approximation scheme (\text{BVP}_h) is stable for all h < \frac{2}{|p|}.
We apply our numerical scheme to this problem and plot the error between the exact solution and the approximate solution. (Hopefully) we will see that our method converges with rate O(h^2) for the choice of p,q mentioned above. We also plot the error when using a forward difference of order of accuracy 1 instead of the central difference for approximating p u' (what do you expect to see in the error plots?):
# Reaction (decay)
p, q = 0., 100.
x, u = adr_1d(100, p, q, f)
P3 = plot(x, u, title=L"Reaction: $-u′′ + %$p u′ + %$q u = 1$",
xlabel=L"Position $x$", ylabel=L"Concentration $u$",
label=L"u(x)", lw=2, m=1)
u_e(x) = ( 1 - (sinh(10x)+sinh(10(1-x)))/(sinh(10)) )/100
plot!(P3, u_e, linestyle=:dot, label="exact solution")
display(P3)
hh = [2.0^(-n) for n=10:-1:2]
errs = zeros(length(hh))
for (i,h) ∈ enumerate(hh)
x, u = adr_1d(round(Int, 1/h), p, q, f)
errs[i] = norm( u - u_e.(x) , Inf )
end
P2 = plot( hh, errs;
primary=false,
xaxis=:log, yaxis=:log, title=L"errors in $||\cdot||_\infty$", m=3, color=:red )
plot!( hh, hh.^2/100;
xlabel=L"h", linestyle=:dash, label=L"O(h^2)")
# Advection + Reaction
p, q = 100., 100.
x, u = adr_1d(100, p, q, f)
P4 = plot(x, u, title=L"$-u′′ + %$p u′ + %$q u = 1$",
xlabel=L"Position $x$", ylabel=L"Concentration $u$",
label=L"u(x)", lw=2, m=1)
r1 = 100.99
r2 = -.99
u_e(x) = .01*( 1 - exp(r1*x) * (1-exp(r2))/(exp(r1)-exp(r2)) - exp(r2*x) * (exp(r1)-1)/(exp(r1)-exp(r2)) )
plot!(P4, u_e, linestyle=:dot, label="exact solution")
display(P4)
hh = [2.0^(-n) for n=10:-1:2]
errs = zeros(length(hh))
for (i,h) ∈ enumerate(hh)
x, u = adr_1d(round(Int, 1/h), p, q, f)
errs[i] = norm( u - u_e.(x) , Inf )
end
P2 = plot( hh, errs;
primary=false,
xaxis=:log, yaxis=:log, title=L"errors in $||\cdot||_\infty$", m=3, color=:red )
plot!( hh, hh.^2;
xlabel=L"h", linestyle=:dash, label=L"O(h^2)")
Exercise 13.4 1. What is the stability constant C that you get by following the proof of the lemma (A SDD implies stability \|x\|_\infty \leq C \|Ax\|_\infty) when applied to \mathcal L_h?
2. Is this constant what you observe numerically? If C is larger, how does this change the error curve?
Next:
- (L^\infty) Stability of -L \bm U = \bm F (here, L is not SDD!)
- More generally, stability of finite difference schemes which give rise to irreducible diagonally dominant (IDD) matrices,
- L^2 stability
- Take a quick look at the suggestions for Presentations/Posters/Papers,
- (Sign-up to office hours if you would like)
Chapter 1 reading:
- Burden, Faires, and Burden (2015) sections 5.1 - 5.6,
- Driscoll and Braun (2018): zero stability
Chapter 2 reading:
- Recap matrices: sections 7.1 - 7.2 of Burden, Faires, and Burden (2015)
- Jacobi & Gauss-Seidel: section 7.3 Burden, Faires, and Burden (2015)
- CG: section 7.6 Burden, Faires, and Burden (2015)
Chapter 3 reading:
- Sections 11.1 - 11.2 of Burden, Faires, and Burden (2015): shooting methods