17 Lab 01 — Visualizing Activation Regions

Anchor chapter: Chapter 1 — ReLU Networks as CPWA Maps.

Goal. Plot the bent-hyperplane arrangement of a [2, 4, 1] ReLU net on \(\mathbb{R}^2\) and observe how each region corresponds to a binary activation pattern.

Build the [2, 4, 1] running network in NumPy, sweep the input plane on a grid, and record the activation pattern at each point. Plot the resulting bent-hyperplane arrangement. Then perturb the weights and watch the arrangement change, and identify a set of weights that attains the maximum number of regions for a 2D input with 4 hidden neurons. The lab is the geometric foundation for the sheaf construction in Ch. 7: the activation regions you plot here are exactly the polyhedral cells over which the state-dependent sheaf \(\mathcal{F}_t\) is constant.

Runs in your browser

This lab requires only NumPy and Matplotlib, loaded automatically via Pyodide. Code cells run directly in the page via WebAssembly — no local Python installation needed.

Plan B — run in Jupyter

Prefer a local Jupyter environment? Download lab-01-activation-regions.ipynb

17.1 Setup

17.2 1. Build the object

Construct the [2, 4, 1] ReLU network: weight matrices W1 (4×2), W2 (1×4) and bias vectors b1, b2. A forward pass composes an affine map, a ReLU, and a final affine map. The activation pattern \(\sigma(x) \in \{0,1\}^4\) records which hidden neurons are active at input \(x\); two inputs with the same pattern lie in the same polyhedral region of the bent-hyperplane arrangement.
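The construction described above can be sketched as follows. This is a minimal sketch, not the lab's own cell: the seed, the standard-normal initialisation, and the function names `forward` and `activation_pattern` are assumptions made for illustration, and the lab's default weights may differ.

```python
import numpy as np

# Assumption: seed and standard-normal initialisation are illustrative only.
rng = np.random.default_rng(0)

W1 = rng.standard_normal((4, 2))   # hidden layer: 4 neurons, 2 inputs
b1 = rng.standard_normal(4)
W2 = rng.standard_normal((1, 4))   # output layer: 1 output, 4 hidden units
b2 = rng.standard_normal(1)


def forward(x):
    """Return f(x) = W2 ReLU(W1 x + b1) + b2 for x of shape (2,) or (n, 2)."""
    x = np.atleast_2d(x)
    pre = x @ W1.T + b1             # pre-activations, shape (n, 4)
    hidden = np.maximum(pre, 0.0)   # ReLU
    return hidden @ W2.T + b2       # outputs, shape (n, 1)


def activation_pattern(x):
    """Return sigma(x) in {0,1}^4 for each input row, shape (n, 4)."""
    x = np.atleast_2d(x)
    return (x @ W1.T + b1 > 0).astype(int)


x = np.array([0.3, -0.7])
print("f(x)     =", forward(x))
print("sigma(x) =", activation_pattern(x))
```

A neuron is counted as active when its pre-activation is strictly positive; points lying exactly on a zero-crossing locus are assigned the inactive side, which is a convention rather than a necessity.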

17.3 2. Verify a theorem / run an experiment

We sweep a 300×300 grid over \([-2,2]^2\), compute the activation pattern \(\sigma(x) \in \{0,1\}^4\) at every point, and encode each pattern as an integer index. The left panel shades each polyhedral region by its index, making the bent-hyperplane arrangement visible; the right panel overlays the four zero-crossing loci \(\{x : (W_1 x + b_1)_j = 0\}\), one per neuron. Warren’s bound guarantees at most \(\sum_{i=0}^{2}\binom{4}{i} = 1 + 4 + 6 = 11\) distinct regions for four lines (hyperplanes in \(\mathbb{R}^2\)); the print statement checks how many the random initialisation actually achieves. Because the grid covers only \([-2,2]^2\) at finite resolution, the printed count is a lower bound on the number of regions in the full plane: regions lying outside the window, or thinner than a grid cell, are missed.
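One way to carry out the sweep described above is sketched below. The seed, the binary encoding of patterns, the colour map, and the helper name `plot_regions` are assumptions for illustration; only the first-layer parameters are needed, since the pattern depends on \(W_1\) and \(b_1\) alone. Matplotlib is imported inside the plotting helper so the counting part runs with NumPy only.

```python
import numpy as np
from math import comb

# Assumption: illustrative seed and initialisation, not the lab's defaults.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 2))
b1 = rng.standard_normal(4)

# 300 x 300 grid over [-2, 2]^2.
xs = np.linspace(-2.0, 2.0, 300)
X, Y = np.meshgrid(xs, xs)
points = np.column_stack([X.ravel(), Y.ravel()])    # shape (90000, 2)

pre = points @ W1.T + b1                            # pre-activations, (90000, 4)
patterns = (pre > 0).astype(int)                    # sigma(x) per grid point
codes = patterns @ (1 << np.arange(4))              # binary code in 0..15
region_index = codes.reshape(X.shape)

n_regions = np.unique(codes).size
bound = sum(comb(4, i) for i in range(3))           # i = 0, 1, 2
print(f"regions seen on the grid: {n_regions} (bound: {bound})")


def plot_regions():
    """Left: regions shaded by pattern index. Right: zero-crossing loci."""
    import matplotlib.pyplot as plt

    fig, (ax_l, ax_r) = plt.subplots(1, 2, figsize=(10, 4))
    ax_l.pcolormesh(X, Y, region_index, cmap="tab20", shading="auto")
    ax_l.set_title("Activation regions")
    for j in range(4):
        # The level-0 contour of neuron j's pre-activation is its hyperplane.
        ax_r.contour(X, Y, pre[:, j].reshape(X.shape), levels=[0.0])
    ax_r.set_title("Zero-crossing loci")
    for ax in (ax_l, ax_r):
        ax.set_aspect("equal")
    return fig
```

Calling `plot_regions()` and then `matplotlib.pyplot.show()` displays the two panels. Encoding each pattern as a 4-bit integer keeps the region count a single `np.unique` call.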

17.4 Exercises

Warren’s bound. The maximum number of regions for a single hidden layer with \(m\) neurons and \(d\)-dimensional input is \(\sum_{i=0}^{d} \binom{m}{i}\). For \(m=4\), \(d=2\) this equals 11. Does the random initialisation achieve the maximum? Try 50 different seeds and plot a histogram of the number of regions.
Weight perturbation. Starting from the lab’s default weights, add \(\varepsilon \cdot \mathcal{N}(0, I)\) noise for \(\varepsilon \in \{0.01, 0.1, 0.5, 1.0\}\). Plot the activation regions for each \(\varepsilon\) and describe how the arrangement changes.
Affine consistency check. Pick one activation region (e.g., the one containing the origin). For 20 random points in that region, verify that the network output \(f(x)\) equals the affine map \((W_2 \operatorname{diag}(\sigma_0) W_1) x + (W_2 \operatorname{diag}(\sigma_0) b_1 + b_2)\) to floating-point precision.
Wider network. Replace the [2,4,1] network with [2,8,1] and [2,16,1]. Plot the bent-hyperplane arrangements and verify that wider networks create finer-grained decompositions.
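As a possible starting point for the affine consistency check, the sketch below builds the affine map for the region containing the origin and compares it with the network output on sampled points. The seed, the sampling scale of 0.5, and the rejection-sampling approach are assumptions chosen for illustration, not the lab's prescribed method.

```python
import numpy as np

# Assumption: illustrative seed and initialisation, not the lab's defaults.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 2))
b1 = rng.standard_normal(4)
W2 = rng.standard_normal((1, 4))
b2 = rng.standard_normal(1)


def forward(x):
    """f(x) = W2 ReLU(W1 x + b1) + b2 for inputs of shape (n, 2)."""
    return np.maximum(x @ W1.T + b1, 0.0) @ W2.T + b2


# Activation pattern of the region containing the origin.
x0 = np.zeros(2)
sigma0 = (W1 @ x0 + b1 > 0).astype(float)

# Affine map on that region: A x + c.
D = np.diag(sigma0)
A = W2 @ D @ W1            # shape (1, 2)
c = W2 @ D @ b1 + b2       # shape (1,)

# Rejection sampling: draw candidates near the origin and keep up to 20
# whose activation pattern matches sigma0.
candidates = x0 + 0.5 * rng.standard_normal((10000, 2))
same_region = np.all((candidates @ W1.T + b1 > 0) == sigma0.astype(bool), axis=1)
samples = candidates[same_region][:20]

max_err = np.max(np.abs(forward(samples) - (samples @ A.T + c)))
print(f"{len(samples)} points checked, max |f(x) - (Ax + c)| = {max_err:.2e}")
```

If the region around the origin is very small, fewer than 20 candidates may survive; shrinking the sampling scale or drawing more candidates addresses that.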