A Brief Introduction to Fréchet Derivative
The Fréchet derivative is a generalisation of the ordinary derivative. In general we work in Banach spaces, of which $\mathbb{R}$ is a special case. In particular, the spaces discussed are not required to be finite-dimensional. We use $\mathbf{E}$ and $\mathbf{F}$ to denote Banach spaces.
Recall
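A real-valued function $f(t)$ of a real variable, defined on some neighbourhood of $0$, is said to be $o(t)$ if
$$
\lim_{t \to 0}\frac{f(t)}{t}=0.
$$
The derivative of a function $f$ at a point $a$ is defined by
$$
f'(a)=\lim_{t \to 0}\frac{f(a+t)-f(a)}{t}.
$$
We also have this equivalent equation:
$$
f(a+t)=f(a)+f'(a)t+o(t).
$$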
Now suppose $f:U \subset \mathbb{R}^n \to \mathbb{R}^m$ where $U$ is an open set. The function $f$ is differentiable at $x_0 \in U$ if the following two conditions are satisfied.
All partial derivatives of $f$, i.e. $\frac{\partial f_i}{\partial x_j}$, exist at $x_0$ for all $i=1,\cdots,m$ and $j = 1,\cdots,n$. (This ensures that the Jacobian matrix exists and is well-defined.)
The Jacobian matrix $J(x_0)\in\mathbb{R}^{m\times n}$ satisfies
$$
\lim_{x \to x_0}\frac{\lVert f(x)-f(x_0)-J(x_0)(x-x_0)\rVert}{\lVert x-x_0 \rVert}=0.
$$
In fact the Jacobian matrix can be considered as the derivative of $f$ at $x_0$, although it is a matrix rather than a number. Conversely, a number can be regarded as a $1 \times 1$ matrix in the general case. In the following definition of the Fréchet derivative, you will see that both are best treated as linear maps.
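The Jacobian condition can be checked numerically. The following is a minimal sketch, not a proof: the map `f` and its hand-computed Jacobian are hypothetical examples chosen for illustration, and the script only observes that the ratio $\lVert f(x_0+h)-f(x_0)-J(x_0)h\rVert / \lVert h \rVert$ shrinks as $h$ does.

```python
import math

def f(x, y):
    # Hypothetical example map f: R^2 -> R^2 (an assumption, not from the text).
    return (x * x * y, math.sin(x) + y)

def jacobian(x, y):
    # Hand-computed Jacobian matrix of the example map at (x, y).
    return ((2 * x * y, x * x),
            (math.cos(x), 1.0))

def remainder_ratio(x0, y0, hx, hy):
    """Return ||f(x0 + h) - f(x0) - J(x0) h|| / ||h|| in the Euclidean norm."""
    J = jacobian(x0, y0)
    fx = f(x0 + hx, y0 + hy)
    f0 = f(x0, y0)
    # Linear part J(x0) h, written out as a matrix-vector product.
    lin = (J[0][0] * hx + J[0][1] * hy,
           J[1][0] * hx + J[1][1] * hy)
    r = (fx[0] - f0[0] - lin[0], fx[1] - f0[1] - lin[1])
    return math.hypot(r[0], r[1]) / math.hypot(hx, hy)

# Shrink h = (t, -t) along a fixed direction and watch the ratio decrease.
ratios = [remainder_ratio(1.0, 2.0, 10.0 ** -k, -(10.0 ** -k)) for k in range(1, 6)]
for k, r in enumerate(ratios, start=1):
    print(f"t = 1e-{k}: ratio = {r:.3e}")
```

For a differentiable map the printed ratios should decrease roughly in proportion to $t$, which is what the limit condition predicts.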
Definition
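Let $f:U\to\mathbf{F}$ be a function where $U$ is an open subset of $\mathbf{E}$. We say $f$ is Fréchet differentiable at $x \in U$ if there is a bounded linear operator $\lambda:\mathbf{E} \to \mathbf{F}$ such that
$$
\lim_{h \to 0}\frac{\lVert f(x+h)-f(x)-\lambda h \rVert_{\mathbf{F}}}{\lVert h \rVert_{\mathbf{E}}}=0.
$$
We say that $\lambda$ is the derivative of $f$ at $x$, which will be denoted by $Df(x)$ or $f'(x)$. Notice that $\lambda \in L(\mathbf{E},\mathbf{F})$, the space of bounded linear operators from $\mathbf{E}$ to $\mathbf{F}$. If $f$ is differentiable at every point of $U$, then $f'$ is a map given by
$$
f':U \to L(\mathbf{E},\mathbf{F}), \quad x \mapsto f'(x).
$$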
The definition above stays close to the case of real functions defined on the real axis. Now assume that both $\mathbf{E}$ and $\mathbf{F}$ are merely topological vector spaces. We can still define a (generalised) Fréchet derivative.
Let $\varphi$ be a mapping of a neighbourhood of $0$ of $\mathbf{E}$ into $\mathbf{F}$. We say that $\varphi$ is tangent to $0$ if given a neighbourhood $W$ of $0$ in $\mathbf{F}$, there exists a neighbourhood $V$ of $0$ in $\mathbf{E}$ such that
$$
\varphi(tV) \subset o(t)W
$$
for some real function $o(t)$. For example, if both $\mathbf{E}$ and $\mathbf{F}$ are normed (not necessarily Banach), then we get the usual condition
$$
\lVert \varphi(x) \rVert \leq \lVert x \rVert \psi(x)
$$
where $\lim_{\lVert x \rVert \to 0}\psi(x)=0$.
We still assume that $\mathbf{E}$ and $\mathbf{F}$ are topological vector spaces. Let $U$ be an open subset of $\mathbf{E}$ and $f:U \to \mathbf{F}$ be a continuous map. We say that $f$ is differentiable at a point $x \in U$ if there exists some $\lambda \in L(\mathbf{E},\mathbf{F})$ such that for small $y$ we have
$$
f(x+y)=f(x)+\lambda y+\varphi(y)
$$
where $\varphi$ is tangent to $0$. Notice that $\lambda$ is uniquely determined. This definition can be easily tested on the real line.
Basic concepts
You are certainly familiar with these properties of the derivative. Here we establish them again in Banach spaces.
Chain rule
If $f: U \to V$ is differentiable at $x_0$, and $g:V \to W$ is differentiable at $f(x_0)$, then $g \circ f$ is differentiable at $x_0$, and
$$
(g \circ f)'(x_0)=g'(f(x_0)) \circ f'(x_0).
$$
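Proof. We prove this in the setting of topological vector spaces. By definition, there are linear operators $\lambda$ and $\mu$ such that
$$
\begin{aligned}
f(x_0+y)&=f(x_0)+\lambda y+\varphi(y), \\
g(f(x_0)+z)&=g(f(x_0))+\mu z+\psi(z),
\end{aligned}
$$
where $\varphi$ and $\psi$ are tangent to $0$. Further, we have
$$
f'(x_0)=\lambda, \quad g'(f(x_0))=\mu.
$$
To evaluate $g(f(x_0+y))$, notice that
$$
\begin{aligned}
g(f(x_0+y))&=g(f(x_0)+\lambda y+\varphi(y)) \\
&=g(f(x_0))+\mu(\lambda y+\varphi(y))+\psi(\lambda y+\varphi(y)) \\
&=g(f(x_0))+\mu\circ\lambda y+\mu\circ\varphi(y)+\psi(\lambda y+\varphi(y)).
\end{aligned}
$$
It is clear that $\mu\circ\varphi(y)+\psi(\lambda{y}+\varphi(y))$ is tangent to $0$, and $\mu\circ\lambda$ is the linear map we are looking for. That is,
$$
(g \circ f)'(x_0)=\mu\circ\lambda=g'(f(x_0)) \circ f'(x_0).
$$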
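In finite dimensions the chain rule says that the Jacobian of $g \circ f$ is the product of the Jacobians of $g$ and $f$. Here is a small numerical sanity check of that statement. The maps `f` and `g` and their hand-computed derivatives are hypothetical examples (assumptions made for illustration), and the comparison against central differences is only approximate.

```python
import math

def f(x, y):
    # Hypothetical inner map f: R^2 -> R^2.
    return (x * y, x + math.sin(y))

def Df(x, y):
    # Hand-computed Jacobian of f (plays the role of lambda).
    return ((y, x),
            (1.0, math.cos(y)))

def g(u, v):
    # Hypothetical outer map g: R^2 -> R.
    return u * u + math.exp(v)

def Dg(u, v):
    # Hand-computed derivative of g as a 1x2 row (plays the role of mu).
    return (2 * u, math.exp(v))

def chain_rule_derivative(x, y):
    """Compose mu with lambda: the row Dg(f(x, y)) times the matrix Df(x, y)."""
    mu = Dg(*f(x, y))
    lam = Df(x, y)
    return (mu[0] * lam[0][0] + mu[1] * lam[1][0],
            mu[0] * lam[0][1] + mu[1] * lam[1][1])

def finite_difference(x, y, h=1e-6):
    """Approximate the derivative of g∘f by central differences."""
    comp = lambda a, b: g(*f(a, b))
    return ((comp(x + h, y) - comp(x - h, y)) / (2 * h),
            (comp(x, y + h) - comp(x, y - h)) / (2 * h))

exact = chain_rule_derivative(0.5, 1.2)
approx = finite_difference(0.5, 1.2)
print("chain rule:        ", exact)
print("finite differences:", approx)
```

The two printed pairs should agree up to the discretisation error of the finite differences.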
Derivative of higher orders
From now on, we are dealing with Banach spaces. Let $U$ be an open subset of $\mathbf{E}$, and $f:U \to \mathbf{F}$ be differentiable at each point of $U$. If $f'$ is continuous, then we say that $f$ is of class $C^1$. Maps of class $C^p$, where $p \geq 1$, are defined inductively. The $p$-th derivative $D^pf$ is defined as $D(D^{p-1}f)$ and is itself a map of $U$ into $L(\mathbf{E},L(\mathbf{E},\cdots,L(\mathbf{E},\mathbf{F})\cdots))$, which is isomorphic to $L^p(\mathbf{E},\mathbf{F})$, the space of continuous $p$-multilinear maps of $\mathbf{E}$ into $\mathbf{F}$. A map $f$ is said to be of class $C^p$ if its $k$-th derivative $D^kf$ exists and is continuous for $1 \leq k \leq p$. With the help of the chain rule, and the fact that the composition of two continuous functions is continuous, we get:
Let $U,V$ be open subsets of some Banach spaces. If $f:U \to V$ and $g: V \to \mathbf{F}$ are of class $C^p$, then so is $g \circ f$.
Open subsets of Banach spaces as a category
We in fact get a category $\{(U,f_U)\}$ whose objects are the open subsets $U$ of Banach spaces and whose morphisms $f_U$ are the maps of class $C^p$ from $U$ into another open set. To verify this, one only has to observe that the identity map is of class $C^p$ and that the composition of two maps of class $C^p$ is still of class $C^p$ (as stated above).
We say that $f$ is of class $C^\infty$ if $f$ is of class $C^p$ for all integers $p \geq 1$. Meanwhile $C^0$ maps are the continuous maps.
An example
We are going to evaluate the Fréchet derivative of a nonlinear functional. It is the derivative of a functional mapping an infinite dimensional space into $\mathbb{R}$ (instead of $\mathbb{R}$ to $\mathbb{R}$).
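Consider the functional given by
$$
\Gamma:C[0,1] \to \mathbb{R}, \quad \Gamma(u)=\int_{0}^{1}u^2(x)\sin{x}\,dx,
$$
where the norm is defined by
$$
\lVert u \rVert=\sup_{x \in [0,1]}|u(x)|.
$$
For $u\in C[0,1]$, we are going to find a linear operator $\lambda$ such that
$$
\Gamma(u+\eta)=\Gamma(u)+\lambda\eta+\varphi(\eta),
$$
where $\varphi(\eta)$ is tangent to $0$.
Solution. By evaluating $\Gamma(u+\eta)$, we get
$$
\begin{aligned}
\Gamma(u+\eta)&=\int_{0}^{1}(u+\eta)^2\sin{x}\,dx \\
&=\Gamma(u)+2\int_{0}^{1}u\eta\sin{x}\,dx+\int_{0}^{1}\eta^2\sin{x}\,dx.
\end{aligned}
$$
To prove that $\int_{0}^{1}\eta^2\sin{x}\,dx$ is the $\varphi(\eta)$ desired, notice that
$$
\left|\int_{0}^{1}\eta^2\sin{x}\,dx\right| \leq \lVert \eta \rVert^2\int_{0}^{1}\sin{x}\,dx=(1-\cos{1})\lVert \eta \rVert^2 \leq \lVert \eta \rVert^2.
$$
Therefore we have
$$
\lim_{\lVert \eta \rVert \to 0}\frac{|\varphi(\eta)|}{\lVert \eta \rVert}=0
$$
as desired. The Fréchet derivative of $\Gamma$ at $u$ is given by
$$
\Gamma'(u)\eta=2\int_{0}^{1}u(x)\eta(x)\sin{x}\,dx.
$$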
It may be surprising, but the derivative here is neither a number nor a matrix: it is a linear operator. In essence, a real or complex number, or a matrix, can and should be regarded as a linear operator as well.
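The example can also be explored numerically. The sketch below assumes the functional is $\Gamma(u)=\int_0^1 u^2(x)\sin x\,dx$ on $C[0,1]$ with the sup norm, which is what the remainder term $\int_0^1\eta^2\sin x\,dx$ suggests, and takes the candidate derivative to be $\eta \mapsto 2\int_0^1 u\eta\sin x\,dx$. The midpoint-rule quadrature, the choice $u(x)=e^x$ and the perturbation $\eta(x)=t\cos 3x$ are all hypothetical choices made for illustration.

```python
import math

N = 2000                                  # number of midpoint-rule cells on [0, 1]
XS = [(i + 0.5) / N for i in range(N)]    # midpoints of the cells

def integrate(values):
    # Midpoint rule on [0, 1]: average of the sampled integrand.
    return sum(values) / N

def Gamma(u):
    # Assumed functional: integral of u(x)^2 * sin(x) over [0, 1].
    return integrate([u(x) ** 2 * math.sin(x) for x in XS])

def dGamma(u, eta):
    # Candidate Frechet derivative at u applied to eta.
    return integrate([2 * u(x) * eta(x) * math.sin(x) for x in XS])

def sup_norm(eta):
    # Sup norm approximated on the sample points.
    return max(abs(eta(x)) for x in XS)

u = math.exp                              # hypothetical base point u(x) = e^x

def ratio(t):
    """Return |Gamma(u + eta) - Gamma(u) - dGamma(u, eta)| / ||eta|| for eta = t cos(3x)."""
    eta = lambda x: t * math.cos(3 * x)
    remainder = Gamma(lambda x: u(x) + eta(x)) - Gamma(u) - dGamma(u, eta)
    return abs(remainder) / sup_norm(eta)

ratios = [ratio(10.0 ** -k) for k in range(1, 5)]
for k, r in enumerate(ratios, start=1):
    print(f"t = 1e-{k}: ratio = {r:.3e}")
```

Under these assumptions the ratio should shrink roughly linearly in $t$, in line with the bound $|\varphi(\eta)| \leq \lVert\eta\rVert^2$.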