Vortex Dynamics

🧒
Explain Like I'm 5

Imagine each neuron is like a whirlpool 🌀 in a big ocean of data!

  • 🌊 Regular neurons just push water in straight lines (like a fan)
  • 🌀 ⵟ-neurons create spinning whirlpools that PULL things toward them
  • 🏆 The strongest whirlpool "wins" and claims that part of the ocean

When you have many whirlpools (neurons), they divide up the ocean into territories. Each whirlpool is in charge of the water closest to it and pointing toward it!

🌀 Multi-Neuron Vortex Field Competition

When multiple ⵟ-neurons compete to classify an input, each creates a gravitational-like "vortex field" that attracts inputs based on both alignment and proximity.

🎮 Interactive demo: Multi-Neuron Vortex Competition (click the canvas to add data points; drag the colored neuron dots to reposition them)

📐 Decision Boundary Mathematics

The decision boundary between two neurons is defined where their responses are equal:

$$\frac{\langle \mathbf{w}_i, \mathbf{x} \rangle^2}{\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon} = \frac{\langle \mathbf{w}_j, \mathbf{x} \rangle^2}{\|\mathbf{w}_j - \mathbf{x}\|^2 + \epsilon}$$

Cross-multiplying and rearranging:

$$\langle \mathbf{w}_i, \mathbf{x} \rangle^2 (\|\mathbf{w}_j - \mathbf{x}\|^2 + \epsilon) = \langle \mathbf{w}_j, \mathbf{x} \rangle^2 (\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon)$$
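
To make the curvature concrete, here is a minimal NumPy sketch (ours, not the source's code; the two prototypes, the `boundary_x` bisection helper, and $\epsilon = 0.01$ are all illustrative assumptions). It locates three points on the level set $s_i = s_j$ and checks that they are not collinear, which a hyperplane boundary would require:

```python
import numpy as np

EPS = 1e-2  # assumed value; the section leaves epsilon unspecified

def score(w, x):
    # ⵟ-product: squared alignment divided by squared distance plus eps.
    return np.dot(w, x) ** 2 / (np.linalg.norm(w - x) ** 2 + EPS)

w_i = np.array([1.0, 0.0])
w_j = np.array([0.0, 2.0])

def boundary_x(y, lo=0.0, hi=3.0, iters=60):
    # Bisect s_i - s_j along the horizontal line at height y;
    # for these prototypes f(lo) < 0 < f(hi) at each height used below.
    f = lambda x: score(w_i, np.array([x, y])) - score(w_j, np.array([x, y]))
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return (lo + hi) / 2

a, b, c = (np.array([boundary_x(y), y]) for y in (0.5, 1.0, 1.5))
d1, d2 = b - a, c - a
# Collinear points would give zero here, so a hyperplane (straight line
# in 2D) boundary would print ~0. The ⵟ boundary prints a clearly
# nonzero value: the level set is curved.
print("collinearity test:", d1[0] * d2[1] - d1[1] * d2[0])
```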
🔑
Key Difference from Linear: Unlike linear neurons, which produce hyperplane boundaries, ⵟ-neurons produce curved algebraic surfaces, like equipotential contours in a gravitational field.

🧲 Space-Partitioning Properties

📍
Distance Penalization

The inverse-square denominator enforces spatial selectivity. Responses are largest near $\mathbf{x} \approx \mathbf{w}$ and decay for distant inputs.

🎯
Alignment Weighting

$\text{ⵟ}(\mathbf{w}, \mathbf{x}) = 0$ if and only if $\mathbf{w} \perp \mathbf{x}$. High values require both alignment AND proximity.

📊
Global Boundedness

For fixed $\mathbf{w}$, the response is globally bounded: it is nonnegative everywhere and peaks near $\mathbf{x} = \mathbf{w}$, where $\text{ⵟ}(\mathbf{w}, \mathbf{w}) = \|\mathbf{w}\|^4/\epsilon$. Each neuron has a bounded activation landscape.

🌀
Nonlinear Regions

Decision boundaries form curved surfaces — level sets akin to equipotential surfaces in physics.
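
These properties are easy to spot-check numerically. The following sketch is illustrative only; the prototype, the probe distribution, and $\epsilon = 0.01$ are our assumptions:

```python
import numpy as np

EPS = 1e-2  # assumed epsilon

def score(w, x):
    return np.dot(w, x) ** 2 / (np.linalg.norm(w - x) ** 2 + EPS)

w = np.array([1.0, 0.0])

# Distance penalization: the response decays as x recedes from w along its ray.
print([round(score(w, t * w), 2) for t in (1, 2, 5, 10)])  # [100.0, 3.96, 1.56, 1.23]

# Alignment weighting: an orthogonal input scores exactly zero.
print(score(w, np.array([0.0, 5.0])))  # 0.0

# Global boundedness: 10^4 random probes stay on the order of the
# peak value score(w, w) = ||w||^4 / EPS = 100.
probes = np.random.default_rng(0).normal(scale=3.0, size=(10_000, 2))
print(round(max(score(w, x) for x in probes), 2))
```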

⚔️ Softmax Competition

When normalized with softmax, ⵟ-product scores create "territorial" competition:

$$p_i = \frac{\exp\left(\frac{\langle \mathbf{w}_i, \mathbf{x} \rangle^2}{\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon}\right)}{\sum_{j=1}^{C} \exp\left(\frac{\langle \mathbf{w}_j, \mathbf{x} \rangle^2}{\|\mathbf{w}_j - \mathbf{x}\|^2 + \epsilon}\right)}$$
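
A minimal sketch of this territorial competition (our illustration; the helper names, the three prototypes, and $\epsilon = 0.01$ are assumptions, not the source's code):

```python
import numpy as np

EPS = 1e-2  # assumed epsilon

def yat_scores(W, x):
    # Raw ⵟ-product score of input x against every prototype row of W.
    return (W @ x) ** 2 / (np.sum((W - x) ** 2, axis=1) + EPS)

def compete(W, x):
    s = yat_scores(W, x)
    p = np.exp(s - s.max())  # stabilized softmax
    return p / p.sum()

W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # three prototypes
for x in ([0.9, 0.1], [0.2, 1.1], [-0.8, -0.9]):
    p = compete(W, np.array(x))
    print(x, "-> neuron", int(p.argmax()), "wins with p =", round(float(p.max()), 3))
```

Each input lands in the territory of the prototype it is both aligned with and close to, mirroring the whirlpool picture above.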
📐 Extended Math Derivation
Step 1: Define Individual Scores

Let $s_i$ denote the raw ⵟ-product score for neuron $i$:

$$s_i(\mathbf{x}) = \text{ⵟ}(\mathbf{w}_i, \mathbf{x}) = \frac{(\mathbf{w}_i^\top \mathbf{x})^2}{\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon}$$
Step 2: Temperature-Scaled Softmax

With temperature $\tau > 0$ controlling sharpness:

$$p_i = \frac{\exp(s_i / \tau)}{\sum_{j=1}^{C} \exp(s_j / \tau)} = \text{softmax}\left(\frac{\mathbf{s}}{\tau}\right)_i$$

• $\tau \to 0$: Hard assignment (argmax) → sharp territorial boundaries
• $\tau \to \infty$: Uniform distribution → no competition
• $\tau = 1$: Standard softmax
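
A quick sketch of these three regimes (ours; the score vector is an arbitrary illustrative example):

```python
import numpy as np

def softmax(s, tau):
    # Temperature-scaled softmax, stabilized by subtracting the max.
    z = np.exp((s - s.max()) / tau)
    return z / z.sum()

s = np.array([2.0, 1.0, 0.5])  # example raw ⵟ-scores
for tau in (0.1, 1.0, 10.0):
    print(f"tau={tau:>4}:", np.round(softmax(s, tau), 3))
# tau=0.1 -> ~[1, 0, 0] (hard, territorial); tau=10 -> ~[0.36, 0.33, 0.31]
```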

Step 3: Gradient w.r.t. Input

The gradient of the log-probability for the correct class $c$:

$$\nabla_{\mathbf{x}} \log p_c = \frac{1}{\tau}\left(\nabla_{\mathbf{x}} s_c - \sum_{j=1}^{C} p_j \nabla_{\mathbf{x}} s_j\right)$$
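
This identity holds for any differentiable scores, so it can be verified with finite differences alone. The sketch below (ours; the prototypes, step size, and constants are assumptions) checks it for the ⵟ-scores:

```python
import numpy as np

EPS, TAU, H = 1e-2, 1.0, 1e-6  # assumed epsilon, temperature, step size

def scores(W, x):
    # Raw ⵟ-product scores s_j(x) for every prototype row of W.
    return (W @ x) ** 2 / (np.sum((W - x) ** 2, axis=1) + EPS)

def log_pc(W, x, c):
    # log softmax(s / tau)[c], computed stably.
    z = scores(W, x) / TAU
    return z[c] - (z.max() + np.log(np.sum(np.exp(z - z.max()))))

def num_grad(f, x, h=H):
    # Central-difference gradient of a scalar function at x.
    g = np.zeros_like(x)
    for k in range(x.size):
        e = np.zeros_like(x); e[k] = h
        g[k] = (f(x + e) - f(x - e)) / (2 * h)
    return g

W = np.array([[1.0, 0.3], [-0.4, 0.9]])
x, c = np.array([0.5, 0.2]), 0

lhs = num_grad(lambda v: log_pc(W, v, c), x)  # direct grad of log p_c
z = scores(W, x) / TAU
p = np.exp(z - z.max()); p /= p.sum()
grads = np.array([num_grad(lambda v, j=j: scores(W, v)[j], x)
                  for j in range(W.shape[0])])
rhs = (grads[c] - p @ grads) / TAU            # right side of the identity
print(np.allclose(lhs, rhs, atol=1e-5))       # True
```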
Step 4: Gradient of ⵟ-Product

Using the quotient rule on $s_i = \frac{u^2}{v}$ where $u = \mathbf{w}_i^\top \mathbf{x}$ and $v = \|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon$:

$$\nabla_{\mathbf{x}} s_i = \frac{2u \cdot \mathbf{w}_i \cdot v - u^2 \cdot 2(\mathbf{x} - \mathbf{w}_i)}{v^2}$$

Simplifying:

$$\nabla_{\mathbf{x}} s_i = \frac{2(\mathbf{w}_i^\top \mathbf{x})}{\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon}\left[\mathbf{w}_i - \frac{(\mathbf{w}_i^\top \mathbf{x})(\mathbf{x} - \mathbf{w}_i)}{\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon}\right]$$
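
The simplified closed form can be checked against a central-difference approximation. A short sketch (ours; it reuses the assumed $\epsilon = 0.01$):

```python
import numpy as np

EPS = 1e-2  # assumed epsilon, as in the earlier sketches

def s(w, x):
    # ⵟ-product score for a single prototype.
    return np.dot(w, x) ** 2 / (np.linalg.norm(w - x) ** 2 + EPS)

def grad_s(w, x):
    # Closed form from Step 4: (2u/v) * [w - u (x - w) / v].
    u = np.dot(w, x)                      # u = w^T x
    v = np.linalg.norm(w - x) ** 2 + EPS  # v = ||w - x||^2 + eps
    return (2 * u / v) * (w - u * (x - w) / v)

w = np.array([1.0, 0.4])
x = np.array([0.3, -0.2])
h = 1e-6
numeric = np.array([(s(w, x + h * e) - s(w, x - h * e)) / (2 * h)
                    for e in np.eye(2)])
print(np.allclose(grad_s(w, x), numeric, atol=1e-5))  # True
```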
Step 5: Convergence Behavior

As $\mathbf{x} \to \mathbf{w}_i$ (input approaches weight):

$$\lim_{\mathbf{x} \to \mathbf{w}_i} s_i = \frac{\|\mathbf{w}_i\|^4}{\epsilon} \quad \text{(peak response at the prototype)}$$

When $\mathbf{x} \perp \mathbf{w}_i$ (orthogonal input):

$$s_i = 0 \quad \Rightarrow \quad p_i \le \frac{1}{C} \quad \text{(loses the competition)}$$

because every rival contributes $\exp(s_j/\tau) \ge 1$ to the softmax denominator, a zero-scoring neuron can at best tie for a uniform share.

As training progresses, the softmax approaches a delta distribution — the winning neuron gets probability → 1, others → 0. This creates sharp territorial boundaries.
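
A quick numeric confirmation of the Step 5 limits (ours; same assumed $\epsilon$):

```python
import numpy as np

EPS = 1e-2  # assumed epsilon

def s(w, x):
    return np.dot(w, x) ** 2 / (np.linalg.norm(w - x) ** 2 + EPS)

w = np.array([0.8, 0.6])                           # a unit-norm prototype
print(s(w, w), "=", np.linalg.norm(w) ** 4 / EPS)  # 100.0 = 100.0
print(s(w, np.array([-0.6, 0.8])))                 # orthogonal input -> 0.0
```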

🔄 Orthogonality and Competitive Dynamics

When two prototypes develop disjoint support regions, they become orthogonal:

Orthogonality-Entropy Connection:
When $\mathbf{w}_i \perp \mathbf{w}_j$ (orthogonal): $$\text{ⵟ}(\mathbf{w}_i, \mathbf{w}_j) = 0 \quad \text{and} \quad H(\mathbf{w}_i, \mathbf{w}_j) = \infty$$ The cross-entropy diverges because the zero score assigns zero probability to the rival prototype, and $-\log 0 = \infty$. This creates strong pressure for territorial separation.