A Brief Introduction to Fréchet Derivative
The Fréchet derivative generalises the ordinary derivative. In general we work in Banach spaces, of which $\mathbb{R}$ is a special case; the spaces in question are not even required to be finite-dimensional. We use $\mathbf{E}$ and $\mathbf{F}$ to denote Banach spaces.
Recall
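A real-valued function $f(t)$ of a real variable, defined on some neighborhood of $0$, is said to be $o(t)$ if
$$\lim_{t \to 0}\frac{f(t)}{t}=0.$$
The derivative of a real function $f$ at a point $a$ is defined by
$$f'(a)=\lim_{t \to 0}\frac{f(a+t)-f(a)}{t}.$$
Equivalently, we have
$$f(a+t)=f(a)+f'(a)t+o(t).$$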
Now suppose $f:U \subset \mathbb{R}^n \to \mathbb{R}^m$, where $U$ is an open set. The function $f$ is differentiable at $x_0 \in U$ if the following two conditions hold.
All partial derivatives of $f$, i.e. $\frac{\partial f_i}{\partial x_j}$, exist at $x_0$ for all $i=1,\cdots,m$ and $j = 1,\cdots,n$ (which ensures that the Jacobian matrix is well-defined).
The Jacobian matrix $J(x_0)\in\mathbb{R}^{m\times n}$ satisfies
$$\lim_{x \to x_0}\frac{\lVert f(x)-f(x_0)-J(x_0)(x-x_0)\rVert}{\lVert x-x_0 \rVert}=0.$$
In fact, the Jacobian matrix is the derivative of $f$ at $x_0$, although it is a matrix rather than a number. In the general case, a number should itself be regarded as a $1\times 1$ matrix. In the following definition of the Fréchet derivative, you will see that the derivative is treated as a linear operator.
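The limit condition on the Jacobian can be checked numerically. The following is a minimal sketch in plain Python; the map `f`, the base point `x0` and the direction are hypothetical choices made only for illustration. It evaluates the ratio $\lVert f(x_0+h)-f(x_0)-J(x_0)h\rVert/\lVert h\rVert$ for shrinking $h$, which should tend to $0$.

```python
import math

def f(x):
    """Sample map f: R^2 -> R^2 (a hypothetical choice for illustration)."""
    x1, x2 = x
    return [x1 * x1 * x2, math.sin(x1) + x2]

def jacobian(x):
    """Jacobian matrix of f at x, computed by hand from the partial derivatives."""
    x1, x2 = x
    return [[2.0 * x1 * x2, x1 * x1],
            [math.cos(x1), 1.0]]

def matvec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in v))

def remainder_ratio(x0, h):
    """Return |f(x0 + h) - f(x0) - J(x0) h| / |h|."""
    fx = f(x0)
    fxh = f([a + b for a, b in zip(x0, h)])
    jh = matvec(jacobian(x0), h)
    remainder = [p - q - s for p, q, s in zip(fxh, fx, jh)]
    return norm(remainder) / norm(h)

x0 = [1.0, 2.0]
direction = [0.6, -0.8]  # unit vector, so |h| = t below
steps = (1e-1, 1e-2, 1e-3)
ratios = [remainder_ratio(x0, [t * d for d in direction]) for t in steps]

for t, r in zip(steps, ratios):
    print(f"|h| = {t:g}, ratio = {r:.3e}")
```

Since this `f` is smooth, the ratio shrinks roughly in proportion to $\lVert h\rVert$.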
Definition
Let $f:U\to\mathbf{F}$ be a function, where $U$ is an open subset of $\mathbf{E}$. We say $f$ is Fréchet differentiable at $x \in U$ if there is a bounded linear operator $\lambda:\mathbf{E} \to \mathbf{F}$ such that
$$\lim_{y \to 0}\frac{\lVert f(x+y)-f(x)-\lambda y\rVert_{\mathbf{F}}}{\lVert y \rVert_{\mathbf{E}}}=0.$$
We say that $\lambda$ is the derivative of $f$ at $x$, denoted by $Df(x)$ or $f'(x)$. Notice that $\lambda \in L(\mathbf{E},\mathbf{F})$. If $f$ is differentiable at every point of $U$, then $f'$ is a map given by
$$f':U \to L(\mathbf{E},\mathbf{F}),\quad x \mapsto f'(x).$$
The definition above is not far from the case of real functions on the real axis. Now assume that both $\mathbf{E}$ and $\mathbf{F}$ are merely topological vector spaces; we can still define a (generalized) Fréchet derivative.
Let $\varphi$ be a mapping of a neighborhood of $0$ of $\mathbf{E}$ into $\mathbf{F}$. We say that $\varphi$ is tangent to $0$ if, given a neighborhood $W$ of $0$ in $\mathbf{F}$, there exists a neighborhood $V$ of $0$ in $\mathbf{E}$ such that
$$\varphi(tV) \subset o(t)W$$
for some function $o(t)$. For example, if both $\mathbf{E}$ and $\mathbf{F}$ are normed (not necessarily Banach), then this amounts to the usual condition
$$\lVert \varphi(x) \rVert \leq \lVert x \rVert\psi(x),$$
where $\lim_{\lVert x \rVert \to 0}\psi(x)=0$.
We still assume that $\mathbf{E}$ and $\mathbf{F}$ are topological vector spaces. Let $U$ be an open subset of $\mathbf{E}$ and let $f:U \to \mathbf{F}$ be a continuous map. We say that $f$ is differentiable at a point $x \in U$ if there exists some $\lambda \in L(\mathbf{E},\mathbf{F})$ such that for small $y$ we have
$$f(x+y)=f(x)+\lambda y+\varphi(y),$$
where $\varphi$ is tangent to $0$. Notice that $\lambda$ is uniquely determined.
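Two standard cases illustrate this definition. The choices of $A$ and of $f(x)=x^2$ below are ours, picked only as simple examples. If $A \in L(\mathbf{E},\mathbf{F})$ and $f(x)=Ax$, then
$$f(x+y)=Ax+Ay=f(x)+Ay+0,$$
so $\lambda=A$ and $\varphi=0$: a continuous linear map is its own derivative at every point. If instead $f:\mathbb{R}\to\mathbb{R}$ is given by $f(x)=x^2$, then
$$f(x+y)=x^2+2xy+y^2=f(x)+\lambda y+\varphi(y),\qquad \lambda y=2xy,\quad \varphi(y)=y^2.$$
Here $|\varphi(y)| = |y|\,\psi(y)$ with $\psi(y)=|y| \to 0$, so $\varphi$ is tangent to $0$ and $\lambda$ is multiplication by $2x$, as expected.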
Propositions
You are certainly familiar with some properties of the derivative; here we re-establish them in Banach spaces.
Chain rule
If $f: U \to V$ is differentiable at $x_0$, and $g:V \to W$ is differentiable at $f(x_0)$, then $g \circ f$ is differentiable at $x_0$, and
$$(g \circ f)'(x_0)=g'(f(x_0))\circ f'(x_0).$$
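Proof. We prove this for topological vector spaces. By definition, there are continuous linear operators $\lambda$ and $\mu$ such that
$$f(x_0+y)=f(x_0)+\lambda y+\varphi(y),\qquad g(f(x_0)+z)=g(f(x_0))+\mu z+\psi(z),$$
where $\varphi$ and $\psi$ are tangent to $0$. Further, we have
$$\lambda=f'(x_0),\qquad \mu=g'(f(x_0)).$$
To evaluate $g(f(x_0+y))$, notice that
$$\begin{aligned}
g(f(x_0+y)) &= g(f(x_0)+\lambda y+\varphi(y)) \\
&= g(f(x_0))+\mu(\lambda y+\varphi(y))+\psi(\lambda y+\varphi(y)) \\
&= g(f(x_0))+\mu\circ\lambda y+\mu\circ\varphi(y)+\psi(\lambda y+\varphi(y)).
\end{aligned}$$
It is clear that $\mu\circ\varphi(y)+\psi(\lambda{y}+\varphi(y))$ is tangent to $0$, and $\mu\circ\lambda$ is the linear map we are looking for. That is,
$$(g \circ f)'(x_0)=\mu\circ\lambda=g'(f(x_0))\circ f'(x_0).$$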
Derivative of higher orders
From now on, we work in Banach spaces. Let $U$ be an open subset of $\mathbf{E}$, and let $f:U \to \mathbf{F}$ be differentiable at each point of $U$. If $f'$ is continuous, then we say that $f$ is of class $C^1$. Maps of class $C^p$, where $p \geq 1$, are defined inductively. The $p$-th derivative $D^pf$ is defined as $D(D^{p-1}f)$ and is itself a map of $U$ into $L(\mathbf{E},L(\mathbf{E},\cdots,L(\mathbf{E},\mathbf{F})\cdots))$, which is isomorphic to $L^p(\mathbf{E},\mathbf{F})$, the space of continuous $p$-multilinear maps of $\mathbf{E}^p$ into $\mathbf{F}$. A map $f$ is said to be of class $C^p$ if its $k$-th derivative $D^kf$ exists and is continuous for $1 \leq k \leq p$. With the help of the chain rule, and the fact that the composition of two continuous functions is continuous, we get the following.
Let $U,V$ be open subsets of some Banach spaces. If $f:U \to V$ and $g: V \to \mathbf{F}$ are of class $C^p$, then so is $g \circ f$.
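In finite dimensions the chain rule behind this statement becomes a product of Jacobian matrices, which is easy to test numerically. The sketch below is only a sanity check under assumed data: the maps `f` and `g`, the point `x0` and the step size `eps` are hypothetical choices. It compares $g'(f(x_0))\circ f'(x_0)$ with a central-difference Jacobian of $g\circ f$.

```python
import math

def f(x):
    """Hypothetical inner map f: R^2 -> R^2."""
    x1, x2 = x
    return [x1 * x2, x1 + math.exp(x2)]

def df(x):
    """Jacobian of f, computed by hand."""
    x1, x2 = x
    return [[x2, x1],
            [1.0, math.exp(x2)]]

def g(y):
    """Hypothetical outer map g: R^2 -> R^1 (returned as a 1-element list)."""
    y1, y2 = y
    return [math.sin(y1) + y2 * y2]

def dg(y):
    """Jacobian of g, computed by hand (a 1 x 2 matrix)."""
    y1, y2 = y
    return [[math.cos(y1), 2.0 * y2]]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def numerical_jacobian(func, x, eps=1e-6):
    """Approximate the Jacobian of func at x by central differences."""
    columns = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = func(xp), func(xm)
        columns.append([(p - m) / (2.0 * eps) for p, m in zip(fp, fm)])
    # columns[j][i] holds d func_i / d x_j; transpose into rows.
    return [[columns[j][i] for j in range(len(x))]
            for i in range(len(columns[0]))]

x0 = [0.5, -0.3]
chain = matmul(dg(f(x0)), df(x0))                     # g'(f(x0)) o f'(x0)
numeric = numerical_jacobian(lambda x: g(f(x)), x0)   # (g o f)'(x0), approximated
max_gap = max(abs(c - n)
              for row_c, row_n in zip(chain, numeric)
              for c, n in zip(row_c, row_n))
print("chain rule:", chain)
print("numerical: ", numeric)
print("max gap:   ", max_gap)
```

The two matrices should agree up to the finite-difference error, which is small for these smooth maps.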
Open subsets of Banach spaces as a category
In fact we get a category $\{(U,f_U)\}$ whose objects are the open subsets $U$ of Banach spaces and whose morphisms $f_U$ are the maps of class $C^p$ from $U$ into another open set. To verify this, one only has to observe that the composition of two maps of class $C^p$ is again of class $C^p$ (as stated above), and that identity maps are of class $C^p$.
We say that $f$ is of class $C^\infty$ if $f$ is of class $C^p$ for all integers $p \geq 1$. Meanwhile $C^0$ maps are the continuous maps.
An example
We are going to evaluate the Fréchet derivative of a nonlinear functional. It is the derivative of a functional mapping an infinite dimensional space into $\mathbb{R}$ (instead of $\mathbb{R}$ to $\mathbb{R}$).
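Consider the functional $\Gamma:C[0,1]\to\mathbb{R}$ given by
$$\Gamma(u)=\int_{0}^{1}u^2(x)\sin{x}\,dx,$$
where the norm on $C[0,1]$ is defined by
$$\lVert u \rVert=\max_{x\in[0,1]}|u(x)|.$$
For $u\in C[0,1]$, we are going to find a bounded linear operator $\lambda$ such that
$$\Gamma(u+\eta)=\Gamma(u)+\lambda\eta+\varphi(\eta),$$
where $\varphi(\eta)$ is tangent to $0$.
Solution. By evaluating $\Gamma(u+\eta)$, we get
$$\begin{aligned}
\Gamma(u+\eta) &= \int_{0}^{1}(u+\eta)^2\sin{x}\,dx \\
&= \Gamma(u)+2\int_{0}^{1}u\eta\sin{x}\,dx+\int_{0}^{1}\eta^2\sin{x}\,dx.
\end{aligned}$$
The middle term is linear in $\eta$ and bounded by $2(1-\cos{1})\lVert u \rVert\lVert \eta \rVert$, so it is our candidate for $\lambda\eta$. To prove that $\int_{0}^{1}\eta^2\sin{x}\,dx$ is the $\varphi(\eta)$ desired, notice that
$$\left|\int_{0}^{1}\eta^2\sin{x}\,dx\right| \leq \lVert \eta \rVert^2\int_{0}^{1}\sin{x}\,dx=(1-\cos{1})\lVert \eta \rVert^2.$$
Therefore we have
$$\lim_{\lVert \eta \rVert \to 0}\frac{|\varphi(\eta)|}{\lVert \eta \rVert}=0$$
as desired. The Fréchet derivative of $\Gamma$ at $u$ is given by
$$\Gamma'(u)\eta=2\int_{0}^{1}u(x)\eta(x)\sin{x}\,dx.$$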
It may be surprising, but the derivative here is neither a number nor a matrix: it is a linear operator. Conversely, a real or complex number, or a matrix, can naturally be regarded as a linear operator.
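The example can also be checked numerically. The sketch below assumes the functional is $\Gamma(u)=\int_0^1 u^2(x)\sin{x}\,dx$ with the sup norm on $C[0,1]$, which is consistent with the remainder term $\int_0^1\eta^2\sin{x}\,dx$ above, and takes $\lambda\eta=2\int_0^1 u\eta\sin{x}\,dx$ as the candidate derivative. The trapezoidal quadrature, the grid size `N` and the sample functions `u` and `base` are hypothetical choices made for illustration.

```python
import math

N = 2000  # number of subintervals for the quadrature (an arbitrary choice)
XS = [i / N for i in range(N + 1)]

def integrate(values):
    """Composite trapezoidal rule on [0, 1] for samples taken at XS."""
    h = 1.0 / N
    return h * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])

def gamma(u):
    """Assumed functional: integral of u(x)^2 sin(x) over [0, 1]."""
    return integrate([u(x) ** 2 * math.sin(x) for x in XS])

def derivative(u, eta):
    """Candidate derivative applied to eta: 2 * integral of u eta sin(x)."""
    return 2.0 * integrate([u(x) * eta(x) * math.sin(x) for x in XS])

def sup_norm(eta):
    """Sup norm, approximated by the maximum over the grid."""
    return max(abs(eta(x)) for x in XS)

def u(x):
    return math.exp(x)       # sample base point in C[0, 1]

def base(x):
    return math.cos(3.0 * x)  # sample direction, scaled by t below

def ratio(t):
    """Return |Gamma(u + eta) - Gamma(u) - lambda(eta)| / ||eta|| for eta = t * base."""
    eta = lambda x: t * base(x)
    remainder = gamma(lambda x: u(x) + eta(x)) - gamma(u) - derivative(u, eta)
    return abs(remainder) / sup_norm(eta)

bound = 1.0 - math.cos(1.0)  # constant from the estimate on the remainder
for t in (1e-1, 1e-2, 1e-3):
    print(f"||eta|| = {t:g}: ratio = {ratio(t):.3e}, bound = {bound * t:.3e}")
```

The printed ratio should stay below $(1-\cos{1})\lVert\eta\rVert$ and shrink with $\lVert\eta\rVert$, in line with the estimate in the solution.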