
Differential calculus in higher dimension

In this part of the course we work on the following skills:

  • Become comfortable working with coordinates in arbitrary dimension.
  • Develop an intuition for working with vector fields.
  • Understand the subtleties of derivatives in dimension greater than 1; evaluate and manipulate partial derivatives, directional derivatives and the Jacobian matrix.

See also the exercises associated to this part of the course.

Here we start to consider higher dimensional space. That is, instead of $\mathbb{R}$ we consider $\mathbb{R}^n$ for $n\in\mathbb{N}$. We will particularly focus on 2D and 3D but everything also holds in any dimension. Going beyond $\mathbb{R}$ we have more options for functions and correspondingly more options for derivatives. Various different notations are commonly used. Here we will primarily use $(x,y)\in\mathbb{R}^2$, $(x,y,z)\in\mathbb{R}^3$ or, more generally, $x=(x_1,x_2,\ldots,x_n)\in\mathbb{R}^n$ where $x_1\in\mathbb{R},\ldots,x_n\in\mathbb{R}$. For example, $\mathbb{R}^2$ is the plane and $\mathbb{R}^3$ is 3D space.

Definition (inner product)

For $x,y\in\mathbb{R}^n$,

$$x\cdot y = \sum_{k=1}^{n} x_k y_k \in \mathbb{R}.$$

We recall that the inner product being zero has a geometric meaning: the two vectors are orthogonal. We also recall that the "length" of a vector is given by the norm, defined as follows.

Definition (norm)

$$\|x\| = \sqrt{x\cdot x} = \Big(\sum_{k=1}^{n} x_k^2\Big)^{1/2}.$$

For example, in $\mathbb{R}^2$, $\|(x,y)\| = \sqrt{x^2+y^2}$. There are various convenient properties for working with norms and inner products, in particular the Cauchy–Schwarz inequality $|x\cdot y| \le \|x\|\,\|y\|$ and the triangle inequality $\|x+y\| \le \|x\| + \|y\|$.
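Both inequalities are easy to sanity-check numerically. The following sketch (the helper names `dot` and `norm` are ours, not from the notes) verifies Cauchy–Schwarz and the triangle inequality on random vectors:

```python
import math
import random

def dot(x, y):
    """Inner product x . y = sum_k x_k * y_k."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Norm ||x|| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))

random.seed(0)
for _ in range(1000):
    n = random.randint(1, 5)
    x = [random.uniform(-10, 10) for _ in range(n)]
    y = [random.uniform(-10, 10) for _ in range(n)]
    # Cauchy-Schwarz: |x . y| <= ||x|| ||y||
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-9
    # Triangle inequality: ||x + y|| <= ||x|| + ||y||
    s = [a + b for a, b in zip(x, y)]
    assert norm(s) <= norm(x) + norm(y) + 1e-9
```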

The primary higher-dimensional functions we consider in this course are:

  • Scalar fields: $f:\mathbb{R}^n\to\mathbb{R}$
  • Vector fields: $F:\mathbb{R}^n\to\mathbb{R}^n$
  • Paths: $\alpha:\mathbb{R}\to\mathbb{R}^n$
  • Change of coordinates: $x:\mathbb{R}^n\to\mathbb{R}^n$

These possibilities all fit into the general pattern of $f:\mathbb{R}^n\to\mathbb{R}^m$ for $n,m\in\mathbb{N}$, but tradition and the use of the function give us different terminology and symbols. Such functions are useful for representing various practical things, for example: gravitational force; temperature in a region; wind velocity; fluid flow; electric field; etc.

Open sets, closed sets, boundary, continuity

Let $a\in\mathbb{R}^n$, $r>0$. The open $n$-ball of radius $r$ and centre $a$ is written as

$$B(a,r) := \{x\in\mathbb{R}^n : \|x-a\| < r\}.$$

Definition (interior point)

Let $S\subseteq\mathbb{R}^n$. A point $a\in S$ is said to be an interior point if there is $r>0$ such that $B(a,r)\subseteq S$. The set of all interior points of $S$ is denoted $\operatorname{int}S$.

Definition (open set)

A set $S\subseteq\mathbb{R}^n$ is said to be open if all of its points are interior points, i.e., if $\operatorname{int}S = S$.

Interior points are the centre of a ball contained within the set

For example, open intervals, open disks, open balls, unions of open intervals, etc., are all open sets.

Lemma

Let $r>0$, $a\in\mathbb{R}^n$. The set $B(a,r)\subseteq\mathbb{R}^n$ is open.

Proof

Let $b\in B(a,r)$. It suffices to show that $b$ is an interior point. (1) Let $r_1 = \|b-a\| < r$. (2) Let $r_2 = (r-r_1)/2$. (3) We claim that $B(b,r_2)\subseteq B(a,r)$: in order to see this take any $c\in B(b,r_2)$ and observe that

$$\|c-a\| \le \|c-b\| + \|b-a\| \le r_2 + r_1 = \frac{r+r_1}{2} < r.$$

Observe that the radius of the ball will be small for points close to the boundary.

Definition (Cartesian product)

If $A_1\subseteq\mathbb{R}$, $A_2\subseteq\mathbb{R}$ then the Cartesian product is defined as

$$A_1\times A_2 := \{(x,y) : x\in A_1,\ y\in A_2\} \subseteq \mathbb{R}^2.$$

Analogously the Cartesian product can be defined in higher dimensions: if $A_1\subseteq\mathbb{R}^m$, $A_2\subseteq\mathbb{R}^n$ then the Cartesian product $A_1\times A_2$ is defined as the set of all points $(x_1,\ldots,x_m,y_1,\ldots,y_n)\in\mathbb{R}^{m+n}$ such that $(x_1,\ldots,x_m)\in A_1$ and $(y_1,\ldots,y_n)\in A_2$.

Lemma

If $A_1, A_2$ are open subsets of $\mathbb{R}$ then $A_1\times A_2$ is an open subset of $\mathbb{R}^2$.

Proof

Let $a = (a_1,a_2)\in A_1\times A_2\subseteq\mathbb{R}^2$. Since $A_1$ is open there exists $r_1>0$ such that $B(a_1,r_1)\subseteq A_1$, and similarly for $A_2$. Let $r = \min\{r_1,r_2\}$. This all means that $B(a,r)\subseteq B(a_1,r_1)\times B(a_2,r_2)\subseteq A_1\times A_2$.

If A1,A2 are intervals then A1×A2 is a rectangle

Discussing the "interior" of the set naturally suggests the topic of the "boundary" of the set. In the following definitions we develop this idea.

Definition (exterior points)

Let $S\subseteq\mathbb{R}^n$. A point $a\in\mathbb{R}^n$ is said to be an exterior point of $S$ if there exists $r>0$ such that $B(a,r)\cap S = \emptyset$. The set of all exterior points of $S$ is denoted $\operatorname{ext}S$.

Observe that $\operatorname{ext}S$ is an open set. We use the notation $S^c = \mathbb{R}^n\setminus S$ and we say that $S^c$ is the complement of the set $S$.

Definition (boundary)

The set $\mathbb{R}^n\setminus(\operatorname{int}S\cup\operatorname{ext}S)$ is called the boundary of $S\subseteq\mathbb{R}^n$ and is denoted $\partial S$.

Definition (closed)

A set $S\subseteq\mathbb{R}^n$ is said to be closed if $\partial S\subseteq S$.

Lemma

$S$ is open $\iff$ $S^c$ is closed.

Proof

Observe that $\mathbb{R}^n = \operatorname{int}S \cup \partial S \cup \operatorname{ext}S$ (disjointly). If $x\in\partial S$ then, for every $r>0$, the ball $B(x,r)$ intersects both $S$ and $S^c$, and so $x\in\partial(S^c)$. Similarly with $S$ and $S^c$ swapped, and so $\partial S = \partial(S^c)$. If $S$ is open then $\operatorname{int}S = S$ and $S^c = \operatorname{ext}S\cup\partial S = \operatorname{ext}S\cup\partial(S^c)$, so $\partial(S^c)\subseteq S^c$ and $S^c$ is closed. If $S$ is not open then there exists $a\in S\setminus\operatorname{int}S$. Such an $a$ belongs to $\partial S = \partial(S^c)$ but not to $S^c$, hence $S^c$ is not closed.

Limits and continuity

Let $S\subseteq\mathbb{R}^n$ and $f:S\to\mathbb{R}^m$. If $a\in\mathbb{R}^n$, $b\in\mathbb{R}^m$ we write $\lim_{x\to a}f(x) = b$ to mean that $\|f(x)-b\|\to 0$ as $\|x-a\|\to 0$. Observe how, if $n=m=1$, this is the familiar notion of the limit for functions on $\mathbb{R}$.

Definition (Continuous)

A function $f$ is said to be continuous at $a$ if $f$ is defined at $a$ and $\lim_{x\to a}f(x) = f(a)$. We say $f$ is continuous on $S$ if $f$ is continuous at each point of $S$.

Even functions which look "nice" can fail to be continuous as we can see in the following example.

Example (continuity in higher dimensions)

Let $f$ be defined, for $(x,y)\neq(0,0)$, as

$$f(x,y) = \frac{xy}{x^2+y^2}$$

and $f(0,0) = 0$. What is the behaviour of $f$ when approaching $(0,0)$ along the following lines?

  • Along $\{x=0\}$: $f(0,t) = 0$.
  • Along $\{y=0\}$: $f(t,0) = 0$.
  • Along $\{x=y\}$: $f(t,t) = \tfrac12$.
  • Along $\{x=-y\}$: $f(t,-t) = -\tfrac12$.

Since the value depends on the direction of approach, $f$ has no limit at $(0,0)$ and so is not continuous there.
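The direction-dependence can be checked numerically; this short sketch evaluates $f$ along the four lines above as $t\to 0$:

```python
def f(x, y):
    """f(x, y) = xy / (x^2 + y^2) for (x, y) != (0, 0), with f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x ** 2 + y ** 2)

# Approach (0, 0) along different lines: the value depends on the
# direction, so f has no limit at the origin.
for t in [0.1, 0.01, 0.001]:
    assert f(0, t) == 0.0                  # along {x = 0}
    assert f(t, 0) == 0.0                  # along {y = 0}
    assert abs(f(t, t) - 0.5) < 1e-12      # along {x = y}
    assert abs(f(t, -t) + 0.5) < 1e-12     # along {x = -y}
```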

Theorem

Suppose that $\lim_{x\to a}f(x) = b$ and $\lim_{x\to a}g(x) = c$. Then

  1. $\lim_{x\to a}(f(x)+g(x)) = b+c$,
  2. $\lim_{x\to a}\lambda f(x) = \lambda b$ for every $\lambda\in\mathbb{R}$,
  3. $\lim_{x\to a}f(x)\cdot g(x) = b\cdot c$,
  4. $\lim_{x\to a}\|f(x)\| = \|b\|$.

We prove a couple of the parts of the above theorem here; the other parts are left as exercises.

Proof of part 3.

Observe that $f(x)\cdot g(x) - b\cdot c = (f(x)-b)\cdot(g(x)-c) + b\cdot(g(x)-c) + c\cdot(f(x)-b)$. By the triangle inequality and Cauchy–Schwarz,

$$|f(x)\cdot g(x) - b\cdot c| \le \|f(x)-b\|\,\|g(x)-c\| + \|b\|\,\|g(x)-c\| + \|c\|\,\|f(x)-b\|.$$

Since we already know that $\|f(x)-b\|\to 0$ and $\|g(x)-c\|\to 0$ as $x\to a$, this implies that $f(x)\cdot g(x) - b\cdot c\to 0$.

Proof of part 4.

Taking $f=g$ in part 3 gives $\lim_{x\to a}\|f(x)\|^2 = \|b\|^2$; taking square roots gives the result.

When writing a vector field (or similar functions) it is often convenient to divide the higher-dimensional function into smaller parts. We call these parts the components of a vector field. For example $F(x) = (F_1(x), F_2(x))$ in 2D, $F(x) = (F_1(x), F_2(x), F_3(x))$ in 3D, etc.

Theorem

Let $F(x) = (F_1(x), F_2(x))$. Then $F$ is continuous if and only if $F_1$ and $F_2$ are continuous.

Proof

We prove the two implications separately.

  • ($\Rightarrow$) Let $e_1 = (1,0)$, $e_2 = (0,1)$ and observe that $F_k(x) = F(x)\cdot e_k$. We have already shown that the continuity of two vector fields implies the continuity of their inner product.
  • ($\Leftarrow$) By definition of the norm, $\|F(x)-F(a)\|^2 = \sum_{k=1}^{2}(F_k(x)-F_k(a))^2$, and we know $F_k(x)-F_k(a)\to 0$ as $\|x-a\|\to 0$.

In higher dimensions the analogous statement is true for the vector field $F(x) = (F_1(x),\ldots,F_m(x))$ with exactly the same proof, i.e., $F$ is continuous if and only if each $F_k$ is continuous.

Example (polynomials)

A polynomial in $n$ variables is a scalar field on $\mathbb{R}^n$ of the form

$$f(x_1,\ldots,x_n) = \sum_{k_1=0}^{j}\cdots\sum_{k_n=0}^{j} c_{k_1,\ldots,k_n}\, x_1^{k_1}\cdots x_n^{k_n}.$$

E.g., $f(x,y) := x + 2xy - x^2$ is a polynomial in $2$ variables. Polynomials are continuous everywhere in $\mathbb{R}^n$ because they are finite sums of products of continuous scalar fields.

Example (rational functions)

A rational function is a scalar field

$$f(x) = \frac{p(x)}{q(x)}$$

where $p(x)$ and $q(x)$ are polynomials. A rational function is continuous at every point $x$ such that $q(x)\neq 0$.

As described in the following result, continuity is preserved, in an intuitive way, under composition of functions.

Theorem

Suppose $S\subseteq\mathbb{R}^l$, $T\subseteq\mathbb{R}^m$, $f:S\to\mathbb{R}^m$, $g:T\to\mathbb{R}^n$ and that $f(S)\subseteq T$ so that

$$(g\circ f)(x) = g(f(x))$$

makes sense. If $f$ is continuous at $a\in S$ and $g$ is continuous at $f(a)$ then $g\circ f$ is continuous at $a$.

Proof

Since $f$ is continuous at $a$,

$$\lim_{x\to a}\|g(f(x)) - g(f(a))\| = \lim_{y\to f(a)}\|g(y) - g(f(a))\| = 0.$$

Example

We can consider the scalar field $f(x,y) = \sin(x^2+y) + xy$ as a composition of continuous functions, and hence it is continuous.

Derivatives of scalar fields

Plot where colour represents the value of $f(x,y) = x^2+y^2$. The change in $f$ depends on the direction.

We can imagine, for example in the figure, that in higher dimensions, the derivative of a scalar field depends on the direction. This motivates the following.

Definition (directional derivative)

Let $S\subseteq\mathbb{R}^n$ and $f:S\to\mathbb{R}$. For any $a\in\operatorname{int}S$ and $v\in\mathbb{R}^n$ with $\|v\|=1$, the directional derivative of $f$ with respect to $v$ is defined as

$$D_vf(a) = \lim_{h\to 0}\frac{1}{h}\big(f(a+hv) - f(a)\big).$$

When $h$ is small we can guarantee that $a+hv\in S$ because $a\in\operatorname{int}S$, so this definition makes sense.
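The limit can be approximated by a finite difference. A small sketch (the helper `directional_derivative` is ours, using a central difference) for $f(x,y) = x^2+y^2$:

```python
def f(x, y):
    return x ** 2 + y ** 2

def directional_derivative(f, a, v, h=1e-6):
    """Central-difference approximation of D_v f(a) for a unit vector v."""
    ax, ay = a
    vx, vy = v
    return (f(ax + h * vx, ay + h * vy) - f(ax - h * vx, ay - h * vy)) / (2 * h)

a = (1.0, 2.0)
v = (3 / 5, 4 / 5)                          # unit vector: ||v|| = 1
approx = directional_derivative(f, a, v)
exact = 2 * a[0] * v[0] + 2 * a[1] * v[1]   # grad f = (2x, 2y), dotted with v
assert abs(approx - exact) < 1e-6
```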

Theorem

Suppose $S\subseteq\mathbb{R}^n$, $f:S\to\mathbb{R}$, $a\in\operatorname{int}S$ and $\|v\|=1$. Let $g(t) := f(a+tv)$. If one of the derivatives $g'(t)$ or $D_vf(a+tv)$ exists then the other also exists and

$$g'(t) = D_vf(a+tv).$$

In particular $g'(0) = D_vf(a)$.

Proof

By definition $\frac{1}{h}\big(g(t+h)-g(t)\big) = \frac{1}{h}\big(f(a+tv+hv) - f(a+tv)\big)$, and the result follows by letting $h\to 0$.

The following result is useful for proving later results.

Theorem (mean value)

Assume that $D_vf(a+tv)$ exists for each $t\in[0,1]$. Then for some $\theta\in(0,1)$,

$$f(a+v) - f(a) = D_vf(z), \quad\text{where } z = a+\theta v.$$

Proof

Apply the mean value theorem to $g(t) = f(a+tv)$ on $[0,1]$.

The following notation is convenient. For any $k\in\{1,2,\ldots,n\}$, let $e_k$ be the $n$-dimensional unit vector whose entries are all zero except the $k$th, which is equal to $1$. I.e., $e_1 = (1,0,\ldots,0)$, $e_2 = (0,1,0,\ldots,0)$, $e_n = (0,\ldots,0,1)$.

Definition (partial derivatives)

We define the partial derivative in $x_k$ of $f(x_1,\ldots,x_n)$ at $a$ as

$$\frac{\partial f}{\partial x_k}(a) = D_{e_k}f(a).$$

Remark

Various symbols are used for partial derivatives: $\frac{\partial f}{\partial x_k}(a) = D_kf(a) = \partial_kf(a)$. If a function is written $f(x,y)$ we write $\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}$ for the partial derivatives. Similarly in higher dimension.

In practice, to compute the partial derivative $\frac{\partial f}{\partial x_k}$, one treats all the other variables $x_j$, $j\neq k$, as constants and differentiates with respect to $x_k$. We will see this rigorously in a moment.

If $f:\mathbb{R}\to\mathbb{R}$ is differentiable, then we know that, when $x$ is close to $a$,

$$f(x) \approx f(a) + (x-a)f'(a).$$

More precisely, we know that $f(x) = f(a) + (x-a)f'(a) + \epsilon(x-a)$ where $|\epsilon(x-a)| = o(|x-a|)$. (This is little-$o$ notation and here means that $|f(x)-f(a)-(x-a)f'(a)|/|x-a|\to 0$ as $|x-a|\to 0$.) This way of seeing differentiability is convenient for the higher-dimensional definition of differentiability.

Definition (differentiable)

Let $S\subseteq\mathbb{R}^n$ be open, $f:S\to\mathbb{R}$. We say that $f$ is differentiable at $a\in S$ if there exists a linear transformation $df_a:\mathbb{R}^n\to\mathbb{R}$ such that, for $x\in B(a,r)\subseteq S$,

$$f(x) = f(a) + df_a(x-a) + \epsilon(x-a)$$

where $|\epsilon(x-a)| = o(\|x-a\|)$.

For future convenience we introduce the following notation.

Definition (gradient)

The gradient of the scalar field $f(x,y,z)$ at the point $a$ is

$$\nabla f(a) = \Big(\frac{\partial f}{\partial x}(a),\ \frac{\partial f}{\partial y}(a),\ \frac{\partial f}{\partial z}(a)\Big).$$

In general, when working in $\mathbb{R}^n$ for some $n\in\mathbb{N}$, the gradient of the scalar field $f(x_1,\ldots,x_n)$ at the point $a$ is

$$\nabla f(a) = \Big(\frac{\partial f}{\partial x_1}(a),\ \frac{\partial f}{\partial x_2}(a),\ \ldots,\ \frac{\partial f}{\partial x_n}(a)\Big).$$

Theorem

If $f$ is differentiable at $a$ then $df_a(v) = \nabla f(a)\cdot v$. This means that, for $x\in B(a,r)$,

$$f(x) = f(a) + \nabla f(a)\cdot(x-a) + \epsilon(x-a)$$

where $|\epsilon(x-a)| = o(\|x-a\|)$. Moreover, for any vector $v$ with $\|v\| = 1$,

$$D_vf(a) = \nabla f(a)\cdot v.$$

Proof

Since $f$ is differentiable there exists a linear transformation $df_a:\mathbb{R}^n\to\mathbb{R}$ such that $f(a+hv) = f(a) + h\,df_a(v) + \epsilon(hv)$ and hence

$$D_vf(a) = \lim_{h\to 0}\frac{1}{h}\big(f(a+hv) - f(a)\big) = \lim_{h\to 0}\frac{1}{h}\big(h\,df_a(v) + \epsilon(hv)\big) = df_a(v).$$

In particular $df_a(e_k) = D_{e_k}f(a) = \frac{\partial f}{\partial x_k}(a)$, and so by linearity $df_a(v) = \sum_{k} v_k\,df_a(e_k) = \nabla f(a)\cdot v$.
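The identity $D_vf(a) = \nabla f(a)\cdot v$ can be tested numerically. The sketch below (the helper names are ours) compares a central-difference directional derivative with the analytic gradient of the sample function $f(x,y) = xe^y + \sin(xy)$ over several unit directions:

```python
import math

def f(x, y):
    return x * math.exp(y) + math.sin(x * y)

def grad_f(x, y):
    """Analytic gradient: (e^y + y cos(xy), x e^y + x cos(xy))."""
    return (math.exp(y) + y * math.cos(x * y),
            x * math.exp(y) + x * math.cos(x * y))

def D_v(f, a, v, h=1e-6):
    """Central-difference approximation of the directional derivative."""
    return (f(a[0] + h * v[0], a[1] + h * v[1])
            - f(a[0] - h * v[0], a[1] - h * v[1])) / (2 * h)

a = (0.7, -0.3)
g = grad_f(*a)
for theta in [0.0, 0.5, 1.3, 2.9]:          # several unit directions
    v = (math.cos(theta), math.sin(theta))
    assert abs(D_v(f, a, v) - (g[0] * v[0] + g[1] * v[1])) < 1e-5
```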

Theorem

If f is differentiable at a, then it is continuous at a.

Proof

Observe that $|f(a+v) - f(a)| = |df_a(v) + \epsilon(v)|$. This means that

$$|f(a+v) - f(a)| \le \|\nabla f(a)\|\,\|v\| + |\epsilon(v)|$$

and so this tends to $0$ as $\|v\|\to 0$.

Theorem

Suppose that $f(x_1,\ldots,x_n)$ is a scalar field. If the partial derivatives $\partial_1f(x),\ldots,\partial_nf(x)$ exist for all $x\in B(a,r)$ and are continuous at $a$ then $f$ is differentiable at $a$.

Proof

For convenience define the vectors

$$v = (v_1,v_2,\ldots,v_n), \qquad u_k = (v_1,v_2,\ldots,v_k,0,\ldots,0).$$

Observe that

$$u_k - u_{k-1} = v_ke_k, \qquad u_0 = (0,0,\ldots,0), \qquad u_n = v.$$

Using the mean value theorem we know that there exists $z_k = u_{k-1} + \theta_kv_ke_k$, $\theta_k\in(0,1)$, such that $f(a+u_k) - f(a+u_{k-1}) = v_kD_{e_k}f(a+z_k)$. Consequently

$$f(a+v) - f(a) = \sum_{k=1}^{n}\big(f(a+u_k) - f(a+u_{k-1})\big) = \sum_{k=1}^{n}v_kD_{e_k}f(a+z_k) = \sum_{k=1}^{n}v_kD_{e_k}f(a) + \sum_{k=1}^{n}v_k\big(D_{e_k}f(a+z_k) - D_{e_k}f(a)\big).$$

To conclude, observe that the second sum is $o(\|v\|)$ as $\|v\|\to 0$ by the continuity of the partial derivatives at $a$, and that the first sum is equal to $\nabla f(a)\cdot v$.

Chain rule

When we are working in $\mathbb{R}$ we know that, if $g$ and $h$ are differentiable, then $f(t) = g\circ h(t)$ is also differentiable and $f'(t) = g'(h(t))\,h'(t)$. This is called the chain rule and is frequently very useful in calculating derivatives. We now investigate how this extends to higher dimensions.

Example

Suppose that $\alpha:\mathbb{R}\to\mathbb{R}^3$ describes the position $\alpha(t)$ at time $t$ and that $f:\mathbb{R}^3\to\mathbb{R}$ describes the temperature $f(\alpha)$ at a point $\alpha$. The temperature at time $t$ is equal to $g(t) = f(\alpha(t))$. We want to calculate $g'(t)$ because this is the change in temperature with respect to time.

In situations like the above example it is convenient to consider the derivative of a path $\alpha:\mathbb{R}\to\mathbb{R}^n$. Suppose it has the form $\alpha(t) = (\alpha_1(t),\ldots,\alpha_n(t))$. We define the derivative as

$$\alpha'(t) := (\alpha_1'(t),\ldots,\alpha_n'(t)).$$

Here $\alpha'$ is a vector-valued function which represents the "direction of movement".

$\alpha(t) = (\cos t, \sin t, t)$, $t\in\mathbb{R}$

Theorem

Let $S\subseteq\mathbb{R}^n$ be open and $I\subseteq\mathbb{R}$ an interval. Let $x:I\to S$ and $f:S\to\mathbb{R}$ and define, for $t\in I$,

$$g(t) = f(x(t)).$$

Suppose that $t\in I$ is such that $x'(t)$ exists and $f$ is differentiable at $x(t)$. Then $g'(t)$ exists and

$$g'(t) = \nabla f(x(t))\cdot x'(t).$$

Proof

Since $f$ is differentiable, $f(y) - f(x) = \nabla f(x)\cdot(y-x) + \epsilon(x, y-x)$ where $|\epsilon(x, y-x)| = o(\|y-x\|)$. Let $h>0$ be small.

$$\frac1h\big[g(t+h) - g(t)\big] = \frac1h\big[f(x(t+h)) - f(x(t))\big] = \frac1h\nabla f(x(t))\cdot\big(x(t+h) - x(t)\big) + \frac1h\epsilon\big(x(t), x(t+h)-x(t)\big).$$

Observe that $\frac1h(x(t+h) - x(t))\to x'(t)$ as $h\to 0$, while the error term vanishes because $\|x(t+h)-x(t)\| = O(h)$.

Example

A particle moves in a circle and its position at time $t\in[0,2\pi]$ is given by

$$x(t) = (\cos t, \sin t).$$

The temperature at a point $y = (y_1,y_2)$ is given by the function $f(y) := y_1 + y_2$. The temperature the particle experiences at time $t$ is given by $g(t) = f(x(t))$. The change in temperature:

$$g'(t) = \nabla f(x(t))\cdot x'(t) = (1,1)\cdot(-\sin t, \cos t) = \cos t - \sin t.$$

$x(t)$ is the position of a particle.
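We can confirm the chain-rule computation by comparing it with a numerical derivative of $g$:

```python
import math

def x(t):
    """Position on the circle."""
    return (math.cos(t), math.sin(t))

def f(y1, y2):
    """Temperature at a point."""
    return y1 + y2

def g(t):
    return f(*x(t))

# Chain rule prediction: g'(t) = (1, 1) . (-sin t, cos t) = cos t - sin t
h = 1e-6
for t in [0.0, 1.0, 2.5, 5.0]:
    numeric = (g(t + h) - g(t - h)) / (2 * h)
    assert abs(numeric - (math.cos(t) - math.sin(t))) < 1e-6
```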

Level sets & tangent planes

Let $S\subseteq\mathbb{R}^2$, $f:S\to\mathbb{R}$. Suppose $c\in\mathbb{R}$ and let

$$L(c) = \{x\in S : f(x) = c\}.$$

The set $L(c)$ is called the level set. In general this set can be empty or it can be all of $S$. However the set $L(c)$ is often a curve, and this is the case of interest. This is the same notion as that of contour lines on a map. Suppose the curve through $a\in L(c)$ is parametrized by a differentiable path $x:I\to\mathbb{R}^2$, i.e., $x(t_a) = a$ for some $t_a\in I$ and

$$f(x(t)) = c$$

for all $t\in I$. Then

  • $\nabla f(a)$ is normal to the curve at $a$,
  • the tangent line at $a$ is $\{x\in\mathbb{R}^2 : \nabla f(a)\cdot(x-a) = 0\}$.

This is because the chain rule implies that $\nabla f(x(t))\cdot x'(t) = 0$.

Example

Let $f(x_1,x_2,x_3) := x_1^2 + x_2^2 + x_3^2$.

  • If $c>0$ then $L(c)$ is a sphere (of radius $\sqrt{c}$),
  • $L(0)$ is the single point $(0,0,0)$,
  • If $c<0$ then $L(c)$ is empty.

Example

Let $f(x_1,x_2,x_3) := x_1^2 + x_2^2 - x_3^2$. See figure.

  • If $c>0$ then $L(c)$ is a one-sheeted hyperboloid,
  • $L(0)$ is an infinite cone,
  • If $c<0$ then $L(c)$ is a two-sheeted hyperboloid.

Figures: sphere; two-sheeted hyperboloid; infinite cone; one-sheeted hyperboloid.

Let $f$ be a differentiable scalar field on $S\subseteq\mathbb{R}^3$ and suppose that the level set $L(c) = \{x\in S : f(x) = c\}$ defines a surface.

  • The gradient $\nabla f(a)$ is normal to every curve $\alpha(t)$ in the surface which passes through $a$,
  • the tangent plane at $a$ is $\{x\in\mathbb{R}^3 : \nabla f(a)\cdot(x-a) = 0\}$.

The same argument as in $\mathbb{R}^2$ works in $\mathbb{R}^n$.

Tangent plane and normal vector

Derivatives of vector fields

Essentially everything discussed above for scalar fields extends to vector fields in a predictable way. This is because of the linearity and that we can consider each component of the vector field independently.

Definition (directional derivative)

Let $S\subseteq\mathbb{R}^n$ and $F:S\to\mathbb{R}^m$. For any $a\in\operatorname{int}S$ and $v\in\mathbb{R}^n$ the derivative of the vector field $F$ with respect to $v$ is defined as

$$D_vF(a) := \lim_{h\to 0}\frac1h\big(F(a+hv) - F(a)\big).$$

Remark

If we use the notation $F = (F_1,\ldots,F_m)$, i.e., we write the function using the "components" where each $F_k$ is a scalar field, then $D_vF = (D_vF_1,\ldots,D_vF_m)$.

Definition (differentiable)

We say that $F:\mathbb{R}^n\to\mathbb{R}^m$ is differentiable at $a$ if there exists a linear transformation $dF_a:\mathbb{R}^n\to\mathbb{R}^m$ such that, for $x\in B(a,r)$,

$$F(x) = F(a) + dF_a(x-a) + \epsilon(x-a),$$

where $\|\epsilon(x-a)\| = o(\|x-a\|)$.

Theorem

If $F$ is differentiable at $a$ then $F$ is continuous at $a$ and $dF_a(v) = D_vF(a)$.

Proof

Same as for the case of scalar fields $f:\mathbb{R}^n\to\mathbb{R}$.

Jacobian matrix & the chain rule

The relevant differential for higher-dimensional functions is the Jacobian matrix.

Definition (Jacobian matrix)

Suppose that $F:\mathbb{R}^2\to\mathbb{R}^2$ and use the notation $F(x,y) = (F_1(x,y), F_2(x,y))$. The Jacobian matrix of $F$ at $a$ is defined as

$$DF(a) = \begin{pmatrix} \frac{\partial F_1}{\partial x}(a) & \frac{\partial F_1}{\partial y}(a) \\ \frac{\partial F_2}{\partial x}(a) & \frac{\partial F_2}{\partial y}(a) \end{pmatrix}.$$

The Jacobian matrix is defined analogously in any dimension, i.e., if $F:\mathbb{R}^n\to\mathbb{R}^m$ then the Jacobian at $a$ is

$$DF(a) = \begin{pmatrix} \partial_1F_1(a) & \partial_2F_1(a) & \cdots & \partial_nF_1(a) \\ \partial_1F_2(a) & \partial_2F_2(a) & \cdots & \partial_nF_2(a) \\ \vdots & \vdots & & \vdots \\ \partial_1F_m(a) & \partial_2F_m(a) & \cdots & \partial_nF_m(a) \end{pmatrix}.$$

If we choose a basis then any linear transformation $\mathbb{R}^n\to\mathbb{R}^m$ can be written as an $m\times n$ matrix, and we find that $dF_a(v) = DF(a)\,v$.

Let $S\subseteq\mathbb{R}^n$ and $F:S\to\mathbb{R}^m$. If $F$ is differentiable at $a\in S$ then, for all $x\in B(a,r)\subseteq S$,

$$F(x) = F(a) + DF(a)(x-a) + \epsilon(x-a)$$

where $\|\epsilon(x-a)\| = o(\|x-a\|)$. This is like a first-order Taylor expansion in higher dimensions.

Here we see that in higher dimensions we have a matrix form of the chain rule.

Theorem

Let $S\subseteq\mathbb{R}^l$, $T\subseteq\mathbb{R}^m$ be open. Let $f:S\to T$ and $g:T\to\mathbb{R}^n$ and define

$$h = g\circ f : S\to\mathbb{R}^n.$$

Let $a\in S$. Suppose that $f$ is differentiable at $a$ and $g$ is differentiable at $f(a)$. Then $h$ is differentiable at $a$ and

$$Dh(a) = Dg(f(a))\ Df(a).$$

Proof

Let $u = f(a+v) - f(a)$. Since $f$ and $g$ are differentiable,

$$h(a+v) - h(a) = g(f(a+v)) - g(f(a)) = Dg(f(a))\big(f(a+v) - f(a)\big) + \epsilon_g(u) = Dg(f(a))\,Df(a)\,v + Dg(f(a))\,\epsilon_f(v) + \epsilon_g(u).$$

Each of the last two terms is $o(\|v\|)$ as $\|v\|\to 0$.

Example (polar coordinates)

Here we consider polar coordinates and calculate the Jacobian of this transformation. We can write the change of coordinates

$$(r,\theta)\mapsto(r\cos\theta,\ r\sin\theta)$$

as the function $f(r,\theta) = (x(r,\theta), y(r,\theta))$ where $f:(0,\infty)\times[0,2\pi)\to\mathbb{R}^2$. We calculate the Jacobian matrix of this transformation:

$$Df(r,\theta) = \begin{pmatrix} \frac{\partial x}{\partial r}(r,\theta) & \frac{\partial x}{\partial\theta}(r,\theta) \\ \frac{\partial y}{\partial r}(r,\theta) & \frac{\partial y}{\partial\theta}(r,\theta) \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}.$$

In particular we see that $\det Df(r,\theta) = r$, the familiar value used in change of variables with polar coordinates.
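The identity $\det Df(r,\theta) = r$ can be confirmed numerically with finite differences (the helper `jacobian` is ours):

```python
import math

def f(r, theta):
    """Polar-to-Cartesian change of coordinates."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(f, r, theta, h=1e-6):
    """2x2 Jacobian matrix approximated by central differences."""
    xr = (f(r + h, theta)[0] - f(r - h, theta)[0]) / (2 * h)
    xt = (f(r, theta + h)[0] - f(r, theta - h)[0]) / (2 * h)
    yr = (f(r + h, theta)[1] - f(r - h, theta)[1]) / (2 * h)
    yt = (f(r, theta + h)[1] - f(r, theta - h)[1]) / (2 * h)
    return ((xr, xt), (yr, yt))

for r, theta in [(1.0, 0.3), (2.5, 1.7), (0.4, 4.0)]:
    (a, b), (c, d) = jacobian(f, r, theta)
    det = a * d - b * c
    assert abs(det - r) < 1e-6          # det Df(r, theta) = r
```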

Suppose now that we wish to calculate derivatives of $h := g\circ f$ for some $g:\mathbb{R}^2\to\mathbb{R}$. Here we take advantage of the theorem concerning multiplication of Jacobians:

$$Dh(r,\theta) = Dg(f(r,\theta))\ Df(r,\theta),$$

that is,

$$\begin{pmatrix} \frac{\partial h}{\partial r}(r,\theta) & \frac{\partial h}{\partial\theta}(r,\theta) \end{pmatrix} = \begin{pmatrix} \frac{\partial g}{\partial x}(f(r,\theta)) & \frac{\partial g}{\partial y}(f(r,\theta)) \end{pmatrix} \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}.$$

In other words, we have shown that

$$\frac{\partial h}{\partial r}(r,\theta) = \frac{\partial g}{\partial x}(r\cos\theta, r\sin\theta)\cos\theta + \frac{\partial g}{\partial y}(r\cos\theta, r\sin\theta)\sin\theta,$$
$$\frac{\partial h}{\partial\theta}(r,\theta) = -r\,\frac{\partial g}{\partial x}(r\cos\theta, r\sin\theta)\sin\theta + r\,\frac{\partial g}{\partial y}(r\cos\theta, r\sin\theta)\cos\theta.$$
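As a sanity check of these formulas, the sketch below takes the concrete choice $g(x,y) = x^2y$ (our example, not from the notes) and compares the chain-rule expressions with numerical derivatives of $h$:

```python
import math

def g(x, y):
    return x ** 2 * y

def h(r, theta):
    return g(r * math.cos(theta), r * math.sin(theta))

def partials_from_chain_rule(r, theta):
    """h_r and h_theta via Dh = Dg(f) Df, using g_x = 2xy and g_y = x^2."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    gx, gy = 2 * x * y, x ** 2
    h_r = gx * math.cos(theta) + gy * math.sin(theta)
    h_theta = -r * gx * math.sin(theta) + r * gy * math.cos(theta)
    return h_r, h_theta

step = 1e-6
for r, theta in [(1.2, 0.4), (2.0, 2.1)]:
    h_r, h_theta = partials_from_chain_rule(r, theta)
    assert abs((h(r + step, theta) - h(r - step, theta)) / (2 * step) - h_r) < 1e-5
    assert abs((h(r, theta + step) - h(r, theta - step)) / (2 * step) - h_theta) < 1e-5
```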

Implicit functions & partial derivatives

Just like with ordinary derivatives, we can take higher-order partial derivatives. For convenience, when we want to write $\frac{\partial}{\partial y}\frac{\partial}{\partial x}f(x,y)$, i.e., differentiate first with respect to $x$ and then with respect to $y$, we write instead $\frac{\partial^2 f}{\partial y\,\partial x}(x,y)$. The analogous notation is used for higher derivatives and any other choice of coordinates.

We first consider the question of when

$$\frac{\partial^2 f}{\partial y\,\partial x}(x,y) \stackrel{?}{=} \frac{\partial^2 f}{\partial x\,\partial y}(x,y).$$

Example (partial derivative problem)

Let $f:\mathbb{R}^2\to\mathbb{R}$ be defined as $f(0,0) = 0$ and, for $(x,y)\neq(0,0)$,

$$f(x,y) := \frac{xy(x^2-y^2)}{x^2+y^2}.$$

We calculate that $\frac{\partial^2 f}{\partial y\,\partial x}(0,0) = -1$ but $\frac{\partial^2 f}{\partial x\,\partial y}(0,0) = 1$.
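The unequal mixed partials can be observed numerically: approximate the first partials by a central difference with a very small step, then difference those with a larger step. The helper names below are ours:

```python
def f(x, y):
    if x == 0 and y == 0:
        return 0.0
    return x * y * (x ** 2 - y ** 2) / (x ** 2 + y ** 2)

def fx(x, y, h=1e-7):
    """Central-difference approximation of the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-7):
    """Central-difference approximation of the partial derivative in y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3  # outer step, much larger than the inner step h
mixed_yx = (fx(0, k) - fx(0, -k)) / (2 * k)   # d/dy (df/dx) at (0, 0)
mixed_xy = (fy(k, 0) - fy(-k, 0)) / (2 * k)   # d/dx (df/dy) at (0, 0)
assert abs(mixed_yx - (-1.0)) < 1e-3
assert abs(mixed_xy - 1.0) < 1e-3
```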

Theorem

Let $f:S\to\mathbb{R}$ be a scalar field such that the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ and $\frac{\partial^2 f}{\partial y\,\partial x}$ exist on an open set $S\subseteq\mathbb{R}^2$ containing $x$. Further assume that $\frac{\partial^2 f}{\partial y\,\partial x}$ is continuous on $S$. Then the derivative $\frac{\partial^2 f}{\partial x\,\partial y}(x)$ exists and

$$\frac{\partial^2 f}{\partial x\,\partial y}(x) = \frac{\partial^2 f}{\partial y\,\partial x}(x).$$

In many cases we can choose to write a given curve/function either in implicit or explicit form.

  • Implicit: $x^2 - y = 0$; explicit: $y(x) = x^2$.
  • Implicit: $x^2 + y^2 = 1$; explicit: $y(x) = \pm\sqrt{1-x^2}$, $|x|\le 1$.
  • Implicit: $x^2 - y^2 - 1 = 0$; explicit: $y(x) = \pm\sqrt{x^2-1}$, $|x|\ge 1$.
  • Implicit: $x^2 + y^2e^y - 4 = 0$; explicit: a mess?
  • Implicit: $x^2y^4 - 3 = \sin(xy)$; explicit: a huge mess?

Given the above observation, the following method of calculating derivatives is sometimes useful. Suppose that some $f:\mathbb{R}^2\to\mathbb{R}$ is given and we suppose there exists some $y:\mathbb{R}\to\mathbb{R}$ such that

$$f(x, y(x)) = 0 \quad\text{for all } x.$$

Let $h(x) := f(x, y(x))$ and note that $h(x) = 0$ for all $x$, hence $h'(x) = 0$. Here we are using the idea that $h = f\circ g$ where $g(x) = (x, y(x))$. By the chain rule $h'(x)$ is equal to

$$\begin{pmatrix} \frac{\partial f}{\partial x}(x, y(x)) & \frac{\partial f}{\partial y}(x, y(x)) \end{pmatrix} \begin{pmatrix} 1 \\ y'(x) \end{pmatrix} = 0.$$

Consequently, provided $\frac{\partial f}{\partial y}(x, y(x))\neq 0$,

$$y'(x) = -\frac{\frac{\partial f}{\partial x}(x, y(x))}{\frac{\partial f}{\partial y}(x, y(x))}.$$
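As a check, for the circle $f(x,y) = x^2+y^2-1$ the formula gives $y'(x) = -x/y$, which matches the derivative of the explicit branch $y(x) = \sqrt{1-x^2}$:

```python
import math

def f(x, y):
    """Implicit equation of the unit circle: f(x, y) = 0."""
    return x ** 2 + y ** 2 - 1

def y_explicit(x):
    """Upper branch y(x) = sqrt(1 - x^2)."""
    return math.sqrt(1 - x ** 2)

def y_prime_implicit(x, y):
    """y'(x) = -f_x / f_y = -(2x) / (2y) = -x / y."""
    return -x / y

h = 1e-6
for x in [0.0, 0.3, -0.5, 0.8]:
    numeric = (y_explicit(x + h) - y_explicit(x - h)) / (2 * h)
    assert abs(numeric - y_prime_implicit(x, y_explicit(x))) < 1e-6
```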