# A brief introduction to Fréchet derivative

Fréchet derivative is a generalisation to the ordinary derivative. Generally we are talking about Banach space, where $\mathbb{R}$​ is a special case. Indeed, the space discussed is not even required to be of finite dimension.

## Recall

A real-valued function $f(t)$ of a real variable, defined on some neighborhood of $0$, is said to be of $o(t)$ if $\lim_{t \to 0} \frac{f(t)}{t}=0.$ And its derivative at some point $a$ is defined by $f'(a)=\lim_{h \to 0}\frac{f(a+h)-f(a)}{h}.$ We also have this equivalent equation: $f(a+h)=f(a)+f'(a)h+o(h).$ Now suppose $f:U \subset \mathbb{R}^n \to \mathbb{R}^m$ where $U$ is an open set. The function $f$ is differentiable at $x_0 \in U$ if satisfying the following conditions.

1. All partial derivatives of $f$, i.e. $\frac{\partial f_i}{\partial x_j}$ exists for all $i=1,\cdots,m$ and $j = 1,\cdots,n$ at $f$. (Which ensures that the Jacobian matrix exists and is well-defined).

2. The Jacobian matrix $J(x_0)\in\mathbb{R}^{m\times n}$ satisfies $\lim_{|h| \to 0}\frac{|f(x_0+h)-f(x_0)-J(x_0)h|}{|h|}=0.$ In fact the Jacobian matrix has been the derivative of $f$ at $x_0$ although it's a matrix in lieu of number. But we should treat a number as a matrix in the general case. In the following definition of Fréchet derivative, you will see that we should treat something as linear functional.

## Definition

Let $f:U\to\mathbf{F}$ be a function where $U$ is an open subset of $\mathbf{E}$. We say $f$ is Fréchet differentiable at $x \in U$ if there is a bounded and linear operator $\lambda:\mathbf{E} \to \mathbf{F}$ such that $\lim_{\lVert y \rVert \to 0}\frac{\lVert f(x+y)-f(x)-\lambda y \rVert}{\lVert y \rVert}=0.$ We say that $\lambda$ is the derivative of $f$ at $x$, which will be denoted by $Df(x)$ or $f'(x)$. Notice that $\lambda \in L(\mathbf{E},\mathbf{F})$. If $f$ is differentiable at every point of $f$, then $f'$ is a map by $f':U \to L(\mathbf{E},\mathbf{F}).$

The definition above doesn't go too far from real functions defined on the real axis. Now we are assuming that both $\mathbf{E}$ and $\mathbf{F}$ are merely topological vector spaces, and still we can get the definition of Fréchet derivative (generalized).

Let $\varphi$ be a mapping of a neighborhood of $0$ of $\mathbf{E}$ into $\mathbf{F}$. We say that $\varphi$ is tangent to $0$ if given a neighborhood $W$ of $0$ in $\mathbf{F}$, there exists a neighborhood $V$ of $0$ in $\mathbf{E}$ such that $\varphi(tV) \subset o(t)W$ for some function of $o(t)$. For example, if both $\mathbf{E}$ and $\mathbf{F}$ are normed (not have to be Banach), then we get a usual condition by $\lVert \varphi(x) \rVert \leq \lVert x \rVert \psi(x)$ where $\lim_{\lVert x \rVert \to 0}\psi(x)=0$.

Still we assume that $\mathbf{E}$ and $\mathbf{F}$ are topological vector spaces. Let $f:U \to \mathbf{F}$ be a continuous map. We say that $f$ is differentiable at a point $x \in U$ if there exists some $\lambda \in L(\mathbf{E},\mathbf{F})$ such that for small $y$ we have $f(x+y)=f(x)+\lambda{y}+\varphi(y)$ where $\varphi$ is tangent to $0$. Notice that $\lambda$ is uniquely determined.

## Propositions

You must be familiar with some properties of derivative, but we are redoing these in Banach space.

### Chain rule

If $f: U \to V$ is differentiable at $x_0$, and $g:V \to W$ is differentiable at $f(x_0)$, then $g \circ f$ is differentiable at $x_0$, and $(g \circ f)'(x_0)=g'(f(x_0)) \circ f'(x_0)$

Proof. We are proving this in topological vector space. By definition, we already have some linear operator $\lambda$ and $\mu$ such that $f(x_0+y)=f(x_0)+\lambda{y}+\varphi(y) \\ g(f(x_0)+h)=g(f(x_0))+\mu{h}+\psi(h)$ where $\varphi$ and $\psi$ are tangent to $0$. Further, we got $f'(x_0)=\lambda \\ g'(f(x_0))=\mu$ To evaluate $g(f(x_0+y))$, notice that \begin{aligned} g(f(x_0+y))&=g[f(x_0)+(\lambda{y}+\varphi(y))] \\ &=g(f(x_0))+\mu(\lambda{y}+\varphi(y))+\psi(\lambda{y}+\varphi(y)) \\ &=g(f(x_0))+\mu\circ\lambda{y}+\mu\circ\varphi(y)+\psi(\lambda{y}+\varphi(y)) \end{aligned} It's clear that $\mu\circ\varphi(y)+\psi(\lambda{y}+\varphi(y))$ is tangent to $0$, and $\mu\circ\lambda$ is the linear map we are looking for. That is, $(g \circ f)'(x)=g'(f(x_0)) \circ f'(x_0).$

### Derivative of higher orders

From now on, we are dealing with Banach spaces. Let $U$ be an open subset of $\mathbf{E}$, and $f:U \to \mathbf{F}$ be differentiable at each point of $U$. If $f'$ is continuous, then we say that $f$ is of class $C^1$. The function of order $C^p$ where $p \geq 1$ is defined inductively. The $p$-th derivative $D^pf$ is defined as $D(D^{p-1}f)$ and is itself a map of $U$ into $L(\mathbf{E},L(\mathbf{E},\cdots,L(\mathbf{E},\mathbf{F})\cdots)))$ which is isomorphic to $L^p(\mathbf{E},\mathbf{F})$. A map $f$ is said to be of class $C^p$ if its $kth$ derivative $D^kf$ exists for $1 \leq k \leq p$, and is continuous. With the help of chain rule, and the fact that the composition of two continuous functions are continuous, we get

Let $U,V$ be open subsets of some Banach spaces. If $f:U \to V$ and $g: V \to \mathbf{F}$ are of class $C^p$, then so is $g \circ f$.

### Open subsets of Banach spaces as a category

We in fact get a category $\{(U,f_U)\}$ where $U$ is the object as an open subset of some Banach space, and $f_U$ is the morphism as a map of class $C^p$ mapping $U$ into another open set. To verify this, one only has to realize that the composition of two maps of class $C^p$ is still of class $C^p$ (as stated above).

We say that $f$ is of class $C^\infty$ if $f$ is of class $C^p$ for all integers $p \geq 1$. Meanwhile $C^0$ maps are the continuous maps.

## An example

We are going to evaluate the Fréchet derivative of a nonlinear functional. It is the derivative of a functional mapping an infinite dimensional space into $\mathbb{R}$ (instead of $\mathbb{R}$ to $\mathbb{R}$).

Consider the functional by \begin{aligned} \Gamma:C^0[0,1] &\to \mathbb{R} \\ u &\mapsto \int_{0}^{1}u^2(x)\sin\pi{x}dx. \end{aligned} where the norm is defined by $\lVert u \rVert = \sup_{x \in [0,1]}|u|.$

For $u\in C[0,1]$, we are going to find an linear operator $\lambda$ such that $\Gamma(u+\eta)=\Gamma(u)+\lambda{\eta}+\varphi(\eta),$ where $\varphi(\eta)$ is tangent to $0$.

Solution. By evaluating $\Gamma(u+\eta)$, we get \begin{aligned} \Gamma(u+\eta)&=\int_{0}^{1}(u+\eta)^2\sin\pi{x}dx \\ &= \Gamma(u)+2\int_{0}^{1}u\eta\sin\pi{x}dx+\int_{0}^{1}\eta^2\sin\pi{x}dx. \end{aligned} To prove that $\int_{0}^{1}\eta^2\sin{x}dx$ is the $\varphi(\eta)$ desired, notice that $\int_{0}^{1}\eta^2\sin\pi{x}dx \leq \lVert\eta\rVert^2\int_{0}^{1}\sin\pi{x}dx=2\lVert \eta \rVert^2.$ Therefore we have $0\leq\lim_{\lVert \eta \rVert \to 0}\frac{\int_{0}^{1}\eta^2\sin\pi{x}dx}{\lVert \eta \rVert} \leq \lim_{\lVert\eta\rVert\to0}2\lVert\eta\rVert=0$ as desired. The Fréchet derivative of $\Gamma$ at $u$ is defined by \begin{aligned} \Gamma'(u):C[0,1] &\to \mathbb{R} \\ \eta &\mapsto 2\int_{0}^{1}u\eta\sin\pi{x}dx. \end{aligned} It's hard to believe but, the derivative is not a number, nor a matrix, but a linear operator. But conversely, a real or complex number or matrix can be treated as a linear operator in the nature of things.