11  The Neural Sheaf Heat Equation: Identity Output

Purpose. Proves exponential convergence of the sheaf heat equation on the neural sheaf with identity output, despite ReLU switching.

11.1 Key concepts & results

  • The sheaf heat equation ẋ = −L_F(σ(t)) x with state-dependent activation pattern σ(t).
  • Common Lyapunov function: Dirichlet energy E(x) = ½‖δ(σ)x‖² is decreasing across all activation patterns.
  • Theorem 4.1: exponential convergence to the forward-pass value; rate bounded below by the worst-case spectral gap over activation patterns.
  • Filippov framework + uniform positive-definiteness gives convergence despite switching.

Prerequisites: Ch 3, Ch 6, Ch 8

11.2 Motivating example

Take the [2, 4, 1] sheaf with input pinned to \(x_{\text{in}} = (1, -1)\) but initialize the interior cochain \(x(0)\) at something arbitrary — say i.i.d. Gaussian noise far from the forward-pass value. Run the sheaf heat equation \(\dot{x} = -L_\mathcal{F}(\sigma(t)) \, x\) numerically. Plot \(\|x(t) - \texttt{forward}(x_{\text{in}})\|\) on a semi-log axis. The curve decays roughly as a straight line, with small visible “kinks” at each instant where one of the hidden pre-activations \(z^{(1)}_j\) crosses zero and the sheaf switches to a new activation pattern \(\sigma\). Despite those switches, the envelope stays exponential — by the end, \(x(t)\) has converged to the unique harmonic extension, which Ch. 8 identified as the forward pass itself. This is Theorem 4.1 of [1] in action: the identity-output heat equation converges exponentially, at a rate bounded below by the worst-case spectral gap over activation patterns.

11.3 Intuition

A switched linear system — a system whose state matrix jumps between finitely many choices — can behave badly. Even when every branch is individually Hurwitz (exponentially stable), switching between them too fast or at the wrong times can destabilize the trajectory; counterexamples are classical in control theory. So on the face of it, running the ReLU heat equation should be frightening: the Laplacian \(L_\mathcal{F}(\sigma)\) changes every time an interior neuron crosses its activation boundary, and those boundaries can be crossed arbitrarily often during a single trajectory.

The miracle — the central pleasant surprise of [1] §4 — is that all branches of this switched system share a common Lyapunov function, and it is the object we cared about anyway: Dirichlet energy \(E(x) = \tfrac{1}{2} \|\delta(\sigma) x\|^2\). The unitriangular factorization of Ch. 8 implies that for every activation pattern, \(L_\mathcal{F}(\sigma)\) restricted to the free vertices is positive definite, with a spectrum uniformly bounded away from zero. So along any trajectory — whatever sequence of activation patterns is visited — \(E\) decays exponentially, at rate at least \(2 \lambda_{\min}^{\text{free}}\), where \(\lambda_{\min}^{\text{free}}\) is the minimum over all patterns of the smallest eigenvalue of the free Laplacian. Exponential decay of \(E\) forces exponential convergence of \(x\) to the unique equilibrium (the harmonic extension = forward pass).
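
The uniform positive-definiteness claim can be spot-checked numerically. The sketch below assumes a construction mirroring Ch. 8's unitriangular factorization for a hypothetical [2, 3, 3, 1] network: the coboundary restricted to the free vertices is block lower unitriangular for every activation pattern, so its Gram matrix (the free Laplacian) has determinant 1 and is positive definite; enumerating all \(2^3\) patterns gives the worst-case spectral gap:

```python
import itertools
import numpy as np

# Spot-check of uniform positive-definiteness for a [2, 3, 3, 1] network
# (assumed construction, mirroring Ch. 8): the coboundary restricted to
# the free vertices is block lower unitriangular for EVERY activation
# pattern, so the free Laplacian L = J^T J is positive definite.
rng = np.random.default_rng(1)
W2 = rng.standard_normal((3, 3))   # hidden->hidden weights (sigma acts here)
W3 = rng.standard_normal((1, 3))   # hidden->output weights (identity output)

n1, n2, n3 = 3, 3, 1
gaps = []
for sigma in itertools.product([0.0, 1.0], repeat=3):
    D = np.diag(sigma)
    # Free-vertex coboundary Jacobian, block lower unitriangular in (x1,x2,x3)
    n = n1 + n2 + n3
    J = np.eye(n)
    J[n1:n1 + n2, :n1] = -D @ W2
    J[n1 + n2:, n1:n1 + n2] = -W3
    L = J.T @ J                      # free Laplacian for this pattern
    gaps.append(np.linalg.eigvalsh(L)[0])

lam_min_free = min(gaps)             # worst-case spectral gap over patterns
print(lam_min_free)                  # strictly positive, since det J = 1
```

The worst-case gap `lam_min_free` is exactly the constant governing the decay rate \(2\lambda_{\min}^{\text{free}}\) in the bound above.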

The Filippov framework (Ch. 3) is what lets us even talk about the dynamics on measure-zero boundary sets where \(\sigma\) is ambiguous: it promotes the ODE to a differential inclusion, and the Clarke subdifferential of \(E\) takes over the role of \(\nabla E\) at the switching surfaces. The argument is two ingredients — Filippov existence + common Lyapunov function — glued by the unitriangular identity. Everything else in this book that says “despite ReLU, things converge” will be a decorated version of this argument (Ch. 10 adds a nonlinear output; Ch. 12 couples a slow weight flow).
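
The Filippov mechanism can be seen in one dimension. Consider a single inactive neuron with energy \(E(x) = \tfrac{1}{2}(x - \sigma z)^2\), \(z < 0\), \(\sigma = \mathbb{1}[x \ge 0]\): both branches of the flow point toward the switching surface \(x = 0\), so the Filippov solution reaches it in finite time and stays there. A minimal sketch (the scalar setup is our illustration, not an example from [1]):

```python
# One inactive ReLU neuron: E(x) = 0.5*(x - sigma*z)^2 with z < 0 and
# sigma = 1[x >= 0].  For x > 0 the flow is x' = -(x - z) < 0; for x < 0
# it is x' = -x > 0.  Both branches point at the switching surface x = 0,
# so the Filippov solution reaches 0 in finite time and slides there.
z = -2.0
dt = 0.01
x = 3.0
traj = [x]
for _ in range(2000):
    sigma = 1.0 if x >= 0 else 0.0
    x -= dt * (x - sigma * z)        # Euler step of the selected branch
    traj.append(x)

# The discrete trajectory overshoots the surface by O(dt), then relaxes
# onto x = 0, the Clarke stationary point of E.
print(max(abs(v) for v in traj[-200:]))
```

Note that the classical ODE has no solution that stays at \(x = 0\) (neither branch vanishes there); only the differential inclusion does, which is why the Filippov machinery of Ch. 3 is not optional.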

Intuition device (planned): Animation overlaying (i) trajectory in cochain space, (ii) time-axis with shaded activation-pattern epochs, (iii) semi-log energy decay.

11.4 Formal development

[TO FILL: formal development — definitions, statements, careful notation]

11.5 Theorem demonstrations

[TO FILL: proofs / proof sketches of the key results named above. Proofs should come *after* the intuition section, as agreed.]

11.6 Worked examples

[TO FILL: worked example(s) carried out by hand]

11.7 Coding lab

Lab: \(\texttt{lab-09-relu-heat-equation}\). [TO FILL: one-paragraph description of the lab's goal]

11.8 Exercises

[TO FILL: 3–6 exercises, graded from warm-up to project-level]

11.9 Further reading

[TO FILL: annotated paragraph of 3–6 references]

11.10 FAQ / common misconceptions

[TO FILL: short Q&A for things readers frequently get wrong]