Scalespace or Differential Geometry


Differential geometry applies techniques from multivariable calculus and linear algebra to study features of curves and surfaces of dimension 1, 2, and higher.

This webpage gives an overview grayscale scale space or differential geometry as it can be used in digital image processing, from digital microscopy and High Content Screening (HCS) to medical imaging and astronomy.
The geometric properties of differential gray-value invariants allow for the selection of structural image features for analysis and quantification. Although this article is limited to 2D, the scale space approach can be used for 3D and even 4D image processing. The basic principle is "what you describe is what you get".
For an introduction to the spatial color model see color scale space, which is another application of scale space applied to color image processing in microscopy and High Content Screening.

Derivatives of a 1D function

A parabole y=x2+x and its first (fx=2*x+1) and second order (fxx=2) derivatives.

The principal interpretations of derivatives of a function f concern instantaneous rate of change and tangency. In mathematics the derivatives of a one-dimensional function f are denoted as:
fx=dy/dx = 1st order derivative (the slope of the function)
fxx=d2y/dx2 = 2nd order derivative (the curvature of the function, or the rate of change of the slope)
fxxx=d3y/dx3 = 3rd order derivative (the rate of change of the curvature of the function)

An image as a 2D function

Image 3D plot
A gray scale 2D image and the gray values (intensities) plotted in a 3D view. The x,y axis is the position of the pixel in the 2D plane of the image, while the z-axis is the gray value or intensity of the pixel.

The principles of derivatives can be used on 2D images, which becomes clear if we plot the intensity profile of the image in 3D view.
The intensity profile of the image constitutes a landscape of (dark) valleys and (bright) hills and ridges, which form the curves of a 2D function.
The differentiation of the 2D image is done by convolving the original image with the derivatives of a 2D Gaussian kernel or, equivalently, by multiplying the Fourier transform of the original image with the Fourier transform of the Gaussian and doing the reverse Fourier transform of the result.

The same principle can be extended to 3D images, but this is beyond the current scope of this webpage.

The Gaussian kernel

Gauss kernel

The Gaussian kernel is the unique kernel for differentiation for an uncommitted front-end (eg. human eye).
Differentiating an image is done by convolving the image in the spatial domain with a Gaussian kernel with a standard deviation sigma.
The sigma parameter of the Gaussian kernel determines the size of the features that are resolved, ie. it is a measure for the aperture of the filter.

Derivatives of the Gaussian kernel

Zero order derivative

Zero order Gaussian
Zero order derivative of a Gaussian kernel in 1D, sigma is 10.0.

First order derivative

First order Gaussian
First order derivative of a Gaussian kernel in 1D, sigma is 10.0.

Second order derivative

Second order Gaussian
Second order derivative of a Gaussian kernel in 1D, sigma is 10.0.

Third order derivative

Third order Gaussian
Third order derivative of a Gaussian kernel in 1D, sigma is 10.0

Differentiating with the Gaussian kernel

Convolving an image with a Gaussian kernel and derivatives of the Gaussian kernel, highlights features of the image with a size larger than the chosen scale sigma.
The examples shown here are for 2D images, but can be extended to 3D images.

Original image
The original 256x256 pixels image from which the following differentiated images are derived.

Zero order derivative Zero order x derivative, 3D Plot
Zero order Gaussian derivative in x and y direction, just blurs the image. Sigma (scale) is 2.
Original resolution is 2562

First order x derivative First order x derivative, 3D Plot
First order Gaussian derivative in x direction. Sigma (scale) is 2.

Second order x derivative Second order x derivative, 3D Plot
Second order Gaussian derivative in x direction. Sigma (scale) is 2.


2D grayscale

First Order

Intensity gradient 3D plot of intensity gradient
The image on the left shows the response intensity of the Canny filter (bright means high gradient strength), which is also shown in 3D on the right, the scale, sigma, is 3.

Intensity gradient = Canny = || V L|| = Lw

Second Order

Gradient2 = LiLi = Lx2 + Ly2
Laplacian = Trace of Hessian matrix = trHf = Lii = Lxx + Lyy
Determinant of Hessian matrix = |Hf| = LxxLyy-Lxy2
Euclidian shortening flow = Lvv = Ls
Ridge strength = Lvv
Derivative of gradient in gradient direction = Lww (if Lww=0, Lw maximal)
Deviation from flatness = D = Sqrt(tr H2) = Sqrt(LijLji) = Sqrt(Lxx2 + 2Lxy2 + Lyy2)
Isophote curvature = K = -Lvv / Lw
Flow-line curvature = -Lvw / Lw
Isophote density per unit luminance = -Lww / Lw

Principal intensity surface curvature in principal coordinates in 2D:
Kp1 = Lxx/Sqrt(Lx2 + Ly2)
Kp2 = Lyy/Sqrt(Lx2 + Ly2)

Inflexion strength = LvvLw5-3LvvLvwLw4

Umbilicity = U = (trH)2/TrH2 - 1 = (Lxx + Lyy)2 / D2 = Lll2 / LijLji

Elongatedness = E = ((trH)2(2trH2-(trH)2))/(2trH2)2 = (Lxx2 + Lyy2)/D4

Higher Order

T-junction = ((LvvLvvLvwLvw+LvvLvvLwwLww)/Lw2-2LvvLvvvLvw/Lw-2LvvLvvwLww/Lw+(LvvvLvvv+LvvwLvvw))/Lw2

Resultants and Discriminants

All resultants and discriminants are Euclidian and affine invariant

R12 = Corner strength = Lx2Lyy-2LyLxLxy+ Ly2Lxx = LvvLw2

D1 = 1
D2 = detH = -Lxy2+Lyy + Lxx or Lxx+Lyy-Lxy2

Partial Derivative Equations


Lx = 0
Ly = 0
LxxLyy-Lxy2 >= 0
|Lxx| > threshold
threshold determines salient features


Lx = 0
Ly = 0
LxxLyy-Lxy2 >= 0
Lxx < -threshold
threshold determines salient features


Lx = 0
Ly = 0
LxxLyy-Lxy2 >= 0
Lxx > threshold
threshold determines salient features

Saddle points

Lx = 0
Ly = 0
LxxLyy-Lxy2 < 0

Parabolic points

Lx = 0
Ly = 0
LxxLyy-Lxy2 = 0


affine corner detection: LwwLw2
with high curvature: LwwLw2 > threshold


Bright ridges: Lqq < -4.0 * threshold / sigma2
Dark ridges: Lpp > 4.0 * threshold / sigma2
threshold determines salient features

Gauge coordinates

Coordinate axes in space can be chosen arbitrarily. The space and its properties do not change when we change the coordinate axes. But if coordinate systems are arbitrary, what then is the value of derivatives in the axis direction? The individual derivatives (fx, fxy, ...) are of no use. It is the collection of all derivatives in the N-jet that are of importance as it has geometrical meaning indepedent of the arbitrary coordinate system.
Gauge coordinates are coordinates with respect to axes that are attached to the function geometry. A gauge coordinate system therefore is independent of the natural coordinate axes.
Examples of gauge systems are the gradient gauge and the principal curvature gauge, in which all points in the image have their own gauge coordinate system attached to it.

The word gauge (in gauge theory) was introduced as the German word ma�stab by H. Weyl (1885-1955) in 1918 in Sitzungsber. d. Preuss. Akad. d. Wissensch. 30 May 475 (OED2).

The Gradient Gauge

Gradient Gauge

A special notation is used in image processing for this gauge system: the w,v-coordinates. w is the coordinate in the gradient direction and v is the coordinate perpendicular to the gradient direction.

{v,w} are gradient gauge coordinates, w along and v perpendicular on the gradient; w and v are unit vectors. Lw is the derivative of the luminance field in the w direction.
Lv = 0, Lw = |Grad L|
Lvv, Lwv, Lww = 2nd order, etc.
Lww is also called the SDGD (Second Derivative in Gradient Direction).
Lw gives us information about the local change in intensity of an image, which is a first order derivative phenomenon.

The Curvature Gauge

Curvature Gauge

In situations where the gradient gauge vanishes (Lx=0, Ly=0), the second order differentials may indicate that something interesting is happening, for instance the points along the axis of a line.
A gauge system suitable for this phenomenon is the curvature gauge

The local second order differentials around a point in a 2D image can be grouped into a Hessian matrix in which all second order derivatives are gathered, it describes the local second-order profile of intensity variations in a 2D image.

Hessian Matrix

The partial second order derivatives of a 2D image L(x,y) are represented like Lxx, Lxy, etc. .
The Hessian matrix of second order derivatives is a symmetric matrix and can be transformed to a diagonal matrix, using a rotation matrix. The coordinate frame that diagonalizes the Hessian matrix is often denoted as the (p,q)-frame, it is built from the eigenvectors of the Hessian matrix.
The eigenvectors of the Hessian, p and q, point in the directions of the minimal and maximal directional second order derivatives.
The eigenvalues of the Hessian are the second order differentials fpp and fqq or Lpp and Lqq.
{p,q} are the Hessian gauge coordinates, this diagonalizes the Hessian (eigenvalues of the Hessian matrix), by definition Lpq = 0

The trace (trHf) of a matrix equals the sum of its diagonal elements, it is a sort of Laplacian, it might be called the mean curvature and it describes the curvature at patch boundaries.
The sign of the trace indicates whether creases are ridges or valleys and its magnitude reflects the angle at which two linear patches meet.
The trace of the Hessian=trHf=fxx+fyy= Lxx+Lyy=Lww+Lvv

The determinant (|Hf|) of the Hessian matrix is:
|Hf|=fxxfyy-fxy2= LxxLyy-Lxy2=LwwLvv-Lvw2

The determinant is sometimes called deviation of flatness, and it indicates the overall change in first-order information; magnitude and direction.

The term Hessian was coined by James Joseph Sylvester (1814-1897), named for Otto Hesse, who had used the term functional determinants.

Notation and conventions

Evaluation of differential expressions:
L = the image
Lx = 1st order x derivative
Ly = 1st order y derivative
Lxx = 2nd order x derivative, etc.

Li = 1st order Partial Differential Equation (PDE) according to summation convention
"i" sums over all dimensions, i.e. Li = Lx+Ly
Li is a vector:{Lx,Ly,Lz}. Summation convention on paired indices

Lii = 2nd order, Lii = Lxx+Lyy
Lij = 2nd order, Lij = Hessian matrix

Taylor series of a function

A way to approximate an image is to perform a Taylor expansion of the local image structure. Locally, i.e. in a small neighborhoud of a point, an image (as any function) can be represented by its Taylor expansion. A Taylor series is a series expansion of a function, it is the most basic tool for approximation of functions.
Consider a function f(x) that is smooth - when we say ``smooth'', what we mean is that its derivatives exist and are bounded (for the following discussion, we need f to have (n+1)derivatives). We would like to approximate f(x) near the point x=a, and we can do it as follows with a 1-D Taylor series:

Taylor series

If we look at the infinite sum (let n-> infinity), then the resulting infinite sum is called the Taylor series of f(x) about x=a.
This result is also know as Taylor's Theorem.

If a=0, the expansion is known as a Maclaurin series:

Maclaurin series

An alternative form of the 1-D Taylor series may be obtained by letting

so that

Substitute this result into () to give

A Taylor series of a function in two variables f(x,y) is given by

A Taylor expansion is useful for calculating image values between pixels and extrapolation. Truncating the Taylor expansion at some finite number N, we obtain an approximated representation of the function (image). Since the complete expansion (N=infinity) is a unique description of the function f, the approximated expansion can be the same for different images.
The local N-jet is definded as the complete set of images which all give the same Taylor expansion at that point up to order N. This is the set of images which are equal in the first N derivatives at that point.
The scale has be to taken into account in the Taylor-expansion, The scale is the level of resolution at which we look at an image, only resolving features at a certain level of resolution (the aperture of the filter in image analysis). All derivatives are scaled derivatives.

Practical applications in image analysis

Minima and Maxima

A necessary condition that a function f assumes a relative maximum or a relative minimum at (x0, y0) is that:
fx = fy = 0 at (x0,y0)
If this condition is true then in summary the following criteria can be used.
All subsequent images were analyzed with sigma = 5.0 .

Relative Maximum

Relative maximum

Relative Minimum

Relative minimum

Saddle Point

Saddle point

Elliptic Patches

Relative Maximum Region

Relative maximum region

Relative Minimum Region

Relative minimum region

Saddle Region

Saddle point region


By using for instance Lxx/L in the above mentioned equations, a normalisation relative to the image intensity is obtained.
By selecting for instance Lxx/L > Threshold or Lxx/L < Threshold, a noise reduction is performed, as only salient features are selected.

Tracing lines and ridges

Dark ridge Dark ridge differentiated
Tracing the neurites of a neuron with differential geometry as an example of a line detector.
The equation used for detecting a dark ridge, with sigma 2.0 and threshold 1.0:
Lpp > 4.0 * threshold / s2

The purpose of a line filter is the discrimination of line structures from other image structures.
A line or a bar is a second order differential structure (see Hessian matrix). The points along the axis have a gradient near zero. A gauge system that is capable of capturing these local phenomena is the above mentioned principal curvature gauge.
As we have seen, the eigenvectors of the Hessian matrix, p and q, point in the directions of the minimal and maximal directional second order derivatives.
The difference between Lqq and Lpp determines the directionality of the local image structure.
If Lqq = Lpp, there is no preferred orientation; if Lqq >> Lpp this indicates a line structure.
In short, we have a line when Lqq - Lpp > Th and Lx = 0, Ly = 0. The parameter Th determines the saliency of the line.

The larger the difference between the trace of the Hessian (trHf) and the determinant (|Hf|), the larger is the directionality of the local 2D image structure and as such the probability for a line.

With respect to the Cartesian coordinate frame (x,y):

p = (fxx-fyy- sqrt((trHf)2-4*|Hf|)) / (2*fxy)

q = (fxx-fyy+ sqrt((trHf)2-4*|Hf|)) / (2*fxy)

or in L notation:

p = (Lxx-Lyy- sqrt((trHf)2-4*|Hf|)) / (2*Lxy)

q = (Lxx-Lyy+ sqrt((trHf)2-4*|Hf|)) / (2*Lxy)

The eigenvalues of the Hessian, expressed as a function of the trace (trHf) and the determinant (|Hf|) of the Hessian matrix:
Lpp = 0.5*(trHf- sqrt((trHf)2-4*|Hf|))

Lqq = 0.5*(trHf+ sqrt((trHf)2-4*|Hf|))

The eigenvalues of the Hessian matrix in L notation, expressed in Cartesian coordinates:
Lpp = 0.5*(Lyy+Lxx- sqrt((Lyy+Lxx)2-4*(LyyLxx-Lyx2)))

Lqq = 0.5*(Lyy+Lxx+ sqrt((Lyy+Lxx)2-4*(LyyLxx-Lyx2)))

The eigenvalues of the Hessian matrix in L notation expressed in gauge coordinates:
Lpp = 0.5*(Lww+Lvv- sqrt((Lww+Lvv)2-4*(LwwLvv-Lvw2)))

Lqq = 0.5*(Lww+Lvv+ sqrt((Lww+Lvv)2-4*(LwwLvv-Lvw2)))


For additonal information you may also contact my former colleague Jan-Mark Geusebroek (UvA) or Prof. Bart M. ter Haar Romeny (T.U. Eindhoven).

See also


I am indebted, for their pioneering work on automated digital microscopy and High Content Screening (HCS) (1988-2001), to my former colleagues at Janssen Pharmaceutica (1997-2001), such as Frans Cornelissen, Hugo Geerts, Jan-Mark Geusebroek, Roger Nuyens, Rony Nuydens, Luk Ver Donck and their colleagues.

Many thanks also to the pioneers of Nanovid microscopy at Janssen Pharmaceutica, Marc De Brabander, Jan De Mey, Hugo Geerts, Marc Moeremans, Rony Nuydens and their colleagues. I also want to thank all those scientists who have helped me with general information and articles.

The author of this Webpage is Peter Van Osta