An Interactive Guide to the Curse of Dimensionality

Our intuition for space breaks in high dimensions. Explore why, and see how it impacts machine learning.

1. The Emptiness of High-Dimensional Space

The most fundamental aspect of the curse is that volume grows exponentially with dimension. As we add dimensions, any fixed set of data points becomes increasingly sparse. This visualization shows how a "local" neighborhood must stretch to capture even a tiny fraction of the data: for points distributed uniformly in a unit hypercube, a sub-cube that captures a fraction $r$ of the volume must have edge length $e_d(r) = r^{1/d}$, where $d$ is the dimension. Watch how the required edge length, $e$, approaches 1 (the full width of the space) as you increase the dimension.

For example, with $d = 10$ and $r = 0.1$, the required edge length is $0.1^{1/10} \approx 0.794$: a "local" hypercube capturing just 10% of the data must span 79.4% of each axis.
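The formula above can be checked directly. This minimal sketch (the function name `required_edge_length` is ours, not from the guide) computes $r^{1/d}$ for a few dimensions:

```python
def required_edge_length(r: float, d: int) -> float:
    """Edge length of a sub-hypercube that captures a fraction r
    of data distributed uniformly in the unit hypercube [0, 1]^d."""
    return r ** (1.0 / d)

# To capture 10% of the data, the "local" cube must keep growing:
for d in (1, 2, 10, 100):
    print(f"d={d:>3}: edge = {required_edge_length(0.1, d):.3f}")
# d=  1: edge = 0.100
# d=  2: edge = 0.316
# d= 10: edge = 0.794
# d=100: edge = 0.977
```

By $d = 100$ the "neighborhood" covers nearly 98% of every axis, so the notion of locality has effectively collapsed.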

2. The Paradoxical Volume of a Hypersphere

Our intuition suggests that the volume of a sphere should always grow with dimension. The opposite is true. The volume of a unit hypersphere (radius = 1) increases until dimension 5, then begins a steady, inexorable decline towards zero. This happens because in high dimensions, most of the volume of an enclosing hypercube is in its "corners," leaving almost no space for the inscribed sphere.
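The closed form for the volume of a unit $d$-ball is $V_d = \pi^{d/2} / \Gamma(d/2 + 1)$, which makes the peak easy to verify. A minimal sketch using only the standard library (the helper name `unit_ball_volume` is ours):

```python
from math import pi, gamma

def unit_ball_volume(d: int) -> float:
    """Volume of the unit d-ball: V_d = pi^(d/2) / Gamma(d/2 + 1)."""
    return pi ** (d / 2) / gamma(d / 2 + 1)

# Scan dimensions 1..20 and find where the volume peaks.
peak = max(range(1, 21), key=unit_ball_volume)
print(f"peak dimension: {peak}")          # peak dimension: 5
print(f"V_5  = {unit_ball_volume(5):.4f}")
print(f"V_20 = {unit_ball_volume(20):.6f}")
```

The volume tops out near 5.26 at $d = 5$ and has already fallen below 0.03 by $d = 20$, on its way to zero.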

The All-Encompassing Shell

Even more strangely, the tiny amount of volume that remains is not in the center of the hypersphere. It's almost entirely concentrated in a paper-thin shell near the surface. The formula is $1 - (1-\epsilon)^d$, where $\epsilon$ is the thickness of the shell as a fraction of the radius. This means if you sample a point uniformly from a high-dimensional sphere, it's almost guaranteed to be near the boundary.

For example, with $\epsilon = 0.05$ and $d = 100$, the fraction of volume in the shell is $1 - 0.95^{100} \approx 99.41\%$: virtually all of the volume lies within the outer 5% of the radius.
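The shell formula is a one-liner, so the concentration effect is easy to tabulate. A minimal sketch (the function name `shell_fraction` is ours):

```python
def shell_fraction(eps: float, d: int) -> float:
    """Fraction of a unit d-ball's volume in the outer shell of
    thickness eps (as a fraction of the radius): 1 - (1 - eps)^d."""
    return 1.0 - (1.0 - eps) ** d

# A 5%-thick shell holds almost everything once d is large:
for d in (2, 10, 100, 500):
    print(f"d={d:>3}: {shell_fraction(0.05, d):.2%} of volume in shell")
```

With $\epsilon = 0.05$, the shell holds under 10% of the volume at $d = 2$ but about 99.41% at $d = 100$, which is why a uniform sample from a high-dimensional ball almost surely lands near the surface.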

3. The Great Convergence of Distances

This is the most damaging consequence for many ML algorithms. As dimension increases, the pairwise distances between points drawn from the same distribution concentrate around a common value. The gap between the "nearest" and "farthest" neighbor becomes negligible relative to the distances themselves, making distance-based algorithms like k-NN and clustering unstable. Watch the histogram of pairwise distances morph from a broad distribution into a sharp spike as you increase the dimension. The table tracks the "Relative Contrast" `(Max - Min) / Min`, which plummets towards zero.

Table columns: Dimension | Mean | Std. Dev. | Min | Max | Relative Contrast
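The concentration of distances can be reproduced with a small simulation. This sketch (function name `relative_contrast` and the sample size are our choices) draws points uniformly from the unit hypercube and computes the same `(Max - Min) / Min` statistic the table tracks:

```python
import random
from math import dist  # Euclidean distance, Python 3.8+

def relative_contrast(d: int, n: int = 100) -> float:
    """(Max - Min) / Min over all pairwise Euclidean distances
    among n points sampled uniformly from [0, 1]^d."""
    pts = [[random.random() for _ in range(d)] for _ in range(n)]
    dists = [dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]]
    dmin, dmax = min(dists), max(dists)
    return (dmax - dmin) / dmin

random.seed(0)  # fixed seed for reproducibility
for d in (2, 10, 100, 1000):
    print(f"d={d:>4}: relative contrast = {relative_contrast(d):.2f}")
```

In low dimensions the nearest pair can be orders of magnitude closer than the farthest; by $d = 1000$ the contrast is a small fraction of the mean distance, so "nearest neighbor" carries almost no information.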