A Brief Introduction to Fréchet Derivative

The Fréchet derivative is a generalisation of the ordinary derivative. In general we work in Banach spaces, of which $\mathbb{R}$ is a special case. In particular, the spaces discussed are not required to be finite-dimensional. We use $\mathbf{E}$ and $\mathbf{F}$ to denote Banach spaces.


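A real-valued function $f(t)$ of a real variable, defined on some neighbourhood of $0$, is said to be of $o(t)$ if

$$\lim_{t \to 0}\frac{f(t)}{t}=0.$$

The derivative of a real function $f$ at a point $a$ is defined by

$$f'(a)=\lim_{t \to 0}\frac{f(a+t)-f(a)}{t}.$$

We also have this equivalent equation:

$$f(a+t)=f(a)+f'(a)t+o(t).$$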

Now suppose $f:U \subset \mathbb{R}^n \to \mathbb{R}^m$ where $U$ is an open set. The function $f$ is differentiable at $x_0 \in U$ if it satisfies the following conditions.

  1. All partial derivatives of $f$, i.e. $\frac{\partial f_i}{\partial x_j}$, exist at $x_0$ for all $i=1,\cdots,m$ and $j = 1,\cdots,n$ (which ensures that the Jacobian matrix exists and is well-defined).

  2. The Jacobian matrix $J(x_0)\in\mathbb{R}^{m\times n}$ satisfies

    $$\lim_{x \to x_0}\frac{\lVert f(x)-f(x_0)-J(x_0)(x-x_0)\rVert}{\lVert x-x_0 \rVert}=0.$$

    In fact the Jacobian matrix can be considered as the derivative of $f$ at $x_0$, although it is a matrix rather than a number. In the general case a number should itself be treated as a $1 \times 1$ matrix. In the following definition of the Fréchet derivative, you will see that both are best treated as linear maps.
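The second condition can be checked numerically. The sketch below is only an illustration: the map `f`, the base point and the direction are hypothetical choices, not taken from the text. It evaluates the quotient $\lVert f(x_0+h)-f(x_0)-J(x_0)h\rVert / \lVert h \rVert$ for shrinking $h$ and should show it tending to $0$.

```python
import math

def f(x, y):
    # Hypothetical example map f: R^2 -> R^2, chosen only for illustration.
    return (x * x * y, math.sin(x) + y)

def jacobian(x, y):
    # Hand-computed partial derivatives of f at (x, y), one tuple per row.
    return ((2 * x * y, x * x),
            (math.cos(x), 1.0))

def remainder_ratio(x0, y0, hx, hy):
    # ||f(x0 + h) - f(x0) - J(x0) h|| / ||h|| in the Euclidean norm.
    J = jacobian(x0, y0)
    f0 = f(x0, y0)
    f1 = f(x0 + hx, y0 + hy)
    r0 = f1[0] - f0[0] - (J[0][0] * hx + J[0][1] * hy)
    r1 = f1[1] - f0[1] - (J[1][0] * hx + J[1][1] * hy)
    return math.hypot(r0, r1) / math.hypot(hx, hy)

# Shrink h along the fixed unit direction (0.6, -0.8).
ratios = [remainder_ratio(1.0, 2.0, 0.6 * t, -0.8 * t) for t in (1e-1, 1e-2, 1e-3)]
for r in ratios:
    print(r)
```

The printed values should shrink roughly in proportion to $t$, which is what differentiability at $x_0$ predicts for a smooth map.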


Let $f:U\to\mathbf{F}$ be a function where $U$ is an open subset of $\mathbf{E}$. We say $f$ is Fréchet differentiable at $x \in U$ if there is a bounded linear operator $\lambda:\mathbf{E} \to \mathbf{F}$ such that

$$\lim_{\lVert y \rVert \to 0}\frac{\lVert f(x+y)-f(x)-\lambda y\rVert}{\lVert y \rVert}=0.$$

We say that $\lambda$ is the derivative of $f$ at $x$, which will be denoted by $Df(x)$ or $f'(x)$. Notice that $\lambda \in L(\mathbf{E},\mathbf{F})$. If $f$ is differentiable at every point of $U$, then $f'$ is a map given by

$$f':U \to L(\mathbf{E},\mathbf{F}), \quad x \mapsto f'(x).$$

The definition above doesn’t stray far from that for real functions defined on the real axis. Now assume that $\mathbf{E}$ and $\mathbf{F}$ are merely topological vector spaces; we can still define a (generalised) Fréchet derivative.

Let $\varphi$ be a mapping of a neighbourhood of $0$ in $\mathbf{E}$ into $\mathbf{F}$. We say that $\varphi$ is tangent to $0$ if, given a neighbourhood $W$ of $0$ in $\mathbf{F}$, there exists a neighbourhood $V$ of $0$ in $\mathbf{E}$ such that

$$\varphi(tV) \subset o(t)W$$

for some real function $o(t)$. For example, if both $\mathbf{E}$ and $\mathbf{F}$ are normed (not necessarily Banach), then this amounts to the usual condition

$$\lVert \varphi(x) \rVert \leq \lVert x \rVert \psi(x),$$

where $\lim_{\lVert x \rVert \to 0}\psi(x)=0$.

We still assume that $\mathbf{E}$ and $\mathbf{F}$ are topological vector spaces. Let $U$ be an open subset of $\mathbf{E}$ and let $f:U \to \mathbf{F}$ be a continuous map. We say that $f$ is differentiable at a point $x \in U$ if there exists some $\lambda \in L(\mathbf{E},\mathbf{F})$ such that for small $y$ we have

$$f(x+y)=f(x)+\lambda y+\varphi(y),$$

where $\varphi$ is tangent to $0$. Notice that $\lambda$ is uniquely determined. This definition can be easily tested on the real line.
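As a quick sanity check on the real line, the sketch below takes the hypothetical example $f(x)=x^3$ (not from the text) with the candidate linear map $\lambda y = 3x^2y$, defines the remainder $\varphi(y)=f(x+y)-f(x)-\lambda y$, and looks at $|\varphi(y)|/|y|$ as $y$ shrinks. For $\varphi$ to be tangent to $0$ this quotient has to tend to $0$.

```python
def f(x):
    # Hypothetical example function on the real line.
    return x ** 3

def lam(x, y):
    # Candidate derivative at x applied to the increment y: y -> 3 x^2 y.
    return 3 * x * x * y

def phi(x, y):
    # Remainder defined by f(x + y) = f(x) + lam(x, y) + phi(x, y).
    return f(x + y) - f(x) - lam(x, y)

x = 2.0
quotients = [abs(phi(x, y)) / abs(y) for y in (1e-1, 1e-2, 1e-3, 1e-4)]
for q in quotients:
    print(q)
```

Here $\varphi(y)=3xy^2+y^3$ exactly, so the quotient behaves like $6|y|$ at $x=2$, up to floating-point error.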

Basic concepts

You are certainly familiar with these properties of the derivative, but here we re-establish them in Banach spaces.

Chain rule

If $f: U \to V$ is differentiable at $x_0$, and $g:V \to W$ is differentiable at $f(x_0)$, then $g \circ f$ is differentiable at $x_0$, and

$$(g \circ f)'(x_0)=g'(f(x_0)) \circ f'(x_0).$$

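Proof. We prove this for topological vector spaces. By definition, there are linear operators $\lambda$ and $\mu$ such that

$$\begin{aligned} f(x_0+y) &= f(x_0)+\lambda y+\varphi(y), \\ g(f(x_0)+z) &= g(f(x_0))+\mu z+\psi(z), \end{aligned}$$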

where $\varphi$ and $\psi$ are tangent to $0$. Further, by the definition of the derivative,

$$\lambda=f'(x_0), \quad \mu=g'(f(x_0)).$$

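To evaluate $g(f(x_0+y))$, notice that

$$\begin{aligned} g(f(x_0+y)) &= g(f(x_0)+\lambda y+\varphi(y)) \\ &= g(f(x_0))+\mu(\lambda y+\varphi(y))+\psi(\lambda y+\varphi(y)) \\ &= g(f(x_0))+\mu\circ\lambda(y)+\mu\circ\varphi(y)+\psi(\lambda y+\varphi(y)). \end{aligned}$$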

It’s clear that $\mu\circ\varphi(y)+\psi(\lambda{y}+\varphi(y))$ is tangent to $0$, and $\mu\circ\lambda$ is the linear map we are looking for. That is,

$$(g \circ f)'(x_0)=\mu\circ\lambda=g'(f(x_0)) \circ f'(x_0).$$
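In finite dimensions the chain rule says that the Jacobian of $g \circ f$ is the product of the two Jacobians. The following sketch checks this numerically; the maps `f` and `g`, their hand-computed Jacobians `Df` and `Dg`, and the base point are all hypothetical choices made for illustration.

```python
import math

def f(x, y):
    # Hypothetical inner map f: R^2 -> R^2.
    return (x * y, x + y * y)

def Df(x, y):
    # Jacobian of f, computed by hand.
    return ((y, x), (1.0, 2 * y))

def g(u, v):
    # Hypothetical outer map g: R^2 -> R^2.
    return (math.sin(u) + v, u * v)

def Dg(u, v):
    # Jacobian of g, computed by hand.
    return ((math.cos(u), 1.0), (v, u))

def matmul(A, B):
    # Product of two 2x2 matrices stored as tuples of rows.
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def compose(x, y):
    return g(*f(x, y))

def numerical_jacobian(h, x, y, eps=1e-6):
    # Central differences in each coordinate direction.
    hxp, hxm = h(x + eps, y), h(x - eps, y)
    hyp, hym = h(x, y + eps), h(x, y - eps)
    return tuple(
        ((hxp[i] - hxm[i]) / (2 * eps), (hyp[i] - hym[i]) / (2 * eps))
        for i in range(2)
    )

x0, y0 = 0.5, -1.5
chain = matmul(Dg(*f(x0, y0)), Df(x0, y0))    # g'(f(x0)) composed with f'(x0)
approx = numerical_jacobian(compose, x0, y0)  # finite-difference (g o f)'(x0)
max_gap = max(abs(chain[i][j] - approx[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The largest entrywise gap should be on the order of the finite-difference error, far below the size of the entries themselves.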

Derivative of higher orders

From now on, we work in Banach spaces. Let $U$ be an open subset of $\mathbf{E}$, and let $f:U \to \mathbf{F}$ be differentiable at each point of $U$. If $f'$ is continuous, then we say that $f$ is of class $C^1$. Maps of class $C^p$ for $p \geq 1$ are defined inductively. The $p$-th derivative $D^pf$ is defined as $D(D^{p-1}f)$ and is itself a map of $U$ into $L(\mathbf{E},L(\mathbf{E},\cdots,L(\mathbf{E},\mathbf{F})\cdots))$, which is isomorphic to $L^p(\mathbf{E},\mathbf{F})$, the space of continuous $p$-multilinear maps of $\mathbf{E}\times\cdots\times\mathbf{E}$ into $\mathbf{F}$. A map $f$ is said to be of class $C^p$ if its $k$-th derivative $D^kf$ exists and is continuous for $1 \leq k \leq p$. With the help of the chain rule, and the fact that the composition of two continuous functions is continuous, we get:

Let $U,V$ be open subsets of some Banach spaces. If $f:U \to V$ and $g: V \to \mathbf{F}$ are of class $C^p$, then so is $g \circ f$.

Open subsets of Banach spaces as a category

We in fact get a category $\{(U,f_U)\}$, where the objects $U$ are open subsets of Banach spaces and the morphisms $f_U$ are maps of class $C^p$ from $U$ into another such open set. To verify this, one only has to observe that identity maps are of class $C^p$ and that the composition of two maps of class $C^p$ is still of class $C^p$ (as stated above).

We say that $f$ is of class $C^\infty$ if $f$ is of class $C^p$ for all integers $p \geq 1$. Meanwhile $C^0$ maps are the continuous maps.

An example

We are going to evaluate the Fréchet derivative of a nonlinear functional, that is, of a map from an infinite-dimensional space into $\mathbb{R}$ (instead of from $\mathbb{R}$ to $\mathbb{R}$).

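Consider the functional defined by

$$\Gamma: C[0,1] \to \mathbb{R}, \quad u \mapsto \int_{0}^{1}u^2(x)\sin{x}dx,$$

where the norm is defined by

$$\lVert u \rVert = \sup_{x \in [0,1]}|u(x)|.$$

For $u\in C[0,1]$, we are going to find a linear operator $\lambda$ such that

$$\Gamma(u+\eta)=\Gamma(u)+\lambda\eta+\varphi(\eta),$$

where $\varphi(\eta)$ is tangent to $0$.

Solution. By evaluating $\Gamma(u+\eta)$, we get

$$\begin{aligned} \Gamma(u+\eta) &= \int_{0}^{1}(u+\eta)^2\sin{x}dx \\ &= \int_{0}^{1}u^2\sin{x}dx+2\int_{0}^{1}u\eta\sin{x}dx+\int_{0}^{1}\eta^2\sin{x}dx \\ &= \Gamma(u)+2\int_{0}^{1}u\eta\sin{x}dx+\int_{0}^{1}\eta^2\sin{x}dx. \end{aligned}$$

To prove that $\int_{0}^{1}\eta^2\sin{x}dx$ is the $\varphi(\eta)$ desired, notice that

$$\left|\int_{0}^{1}\eta^2\sin{x}dx\right| \leq \lVert \eta \rVert^2\int_{0}^{1}\sin{x}dx \leq \lVert \eta \rVert^2.$$

Therefore we have

$$\lim_{\lVert \eta \rVert \to 0}\frac{|\varphi(\eta)|}{\lVert \eta \rVert}=0$$

as desired. The Fréchet derivative of $\Gamma$ at $u$ is given by

$$\Gamma'(u): \eta \mapsto 2\int_{0}^{1}u\eta\sin{x}dx.$$

This map is linear in $\eta$ and bounded, since $|\Gamma'(u)\eta| \leq 2\lVert u \rVert\lVert \eta \rVert$.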

It may be hard to believe, but the derivative here is neither a number nor a matrix: it is a linear operator. In essence, a real or complex number, or a matrix, can and should be treated as a linear operator as well.
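The result can be checked numerically. The sketch below assumes the functional is $\Gamma(u)=\int_0^1 u^2(x)\sin x\,dx$ with the sup norm on $C[0,1]$, which is what the remainder term $\int_0^1\eta^2\sin x\,dx$ suggests, and takes $\eta \mapsto 2\int_0^1 u\eta\sin x\,dx$ as the candidate derivative. The base point `u`, the direction and the midpoint-rule quadrature are hypothetical choices for illustration only.

```python
import math

N = 2000  # number of midpoint-rule cells on [0, 1]
XS = [(k + 0.5) / N for k in range(N)]
TS = (1e-1, 1e-2, 1e-3)  # sup norms of the perturbations tried below

def integrate(h):
    # Midpoint-rule approximation of the integral of h over [0, 1].
    return sum(h(x) for x in XS) / N

def Gamma(u):
    # Assumed functional: integral of u(x)^2 sin(x) over [0, 1].
    return integrate(lambda x: u(x) ** 2 * math.sin(x))

def dGamma(u, eta):
    # Candidate derivative at u applied to eta: 2 * integral of u eta sin(x).
    return 2 * integrate(lambda x: u(x) * eta(x) * math.sin(x))

u = math.exp  # hypothetical base point u(x) = e^x

def direction(x):
    # Hypothetical direction with sup norm 1 on [0, 1] (attained at x = 0).
    return math.cos(3 * x)

ratios = []
for t in TS:
    eta = lambda x, t=t: t * direction(x)  # perturbation with ||eta|| = t
    remainder = Gamma(lambda x: u(x) + eta(x)) - Gamma(u) - dGamma(u, eta)
    ratios.append(abs(remainder) / t)      # |phi(eta)| / ||eta||

for t, r in zip(TS, ratios):
    print(t, r)
```

Each ratio should stay below the corresponding $\lVert\eta\rVert$, in line with the bound $|\varphi(\eta)| \leq \lVert\eta\rVert^2$, and shrink linearly as $\lVert\eta\rVert \to 0$.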
