Explain Like I'm 5
Imagine each neuron is like a whirlpool 🌀 in a big ocean of data!
- 🌊 Regular neurons just push water in straight lines (like a fan)
- 🌀 ⵟ-neurons create spinning whirlpools that PULL things toward them
- 🏆 The strongest whirlpool "wins" and claims that part of the ocean
When you have many whirlpools (neurons), they divide up the ocean into territories. Each whirlpool is in charge of the water closest to it and pointing toward it!
🌀 Multi-Neuron Vortex Field Competition
When multiple ⵟ-neurons compete to classify an input, each creates a gravity-like "vortex field" that attracts inputs based on both alignment and proximity: $$\text{ⵟ}(\mathbf{w}, \mathbf{x}) = \frac{(\mathbf{w}^\top \mathbf{x})^2}{\|\mathbf{w} - \mathbf{x}\|^2 + \epsilon}$$
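A minimal sketch of this competition, assuming the ⵟ-product score s(w, x) = (w·x)² / (‖w − x‖² + ε) used throughout this section (the helper names, prototype values, and ε are all illustrative):

```python
import numpy as np

def yat_score(w, x, eps=1e-6):
    """ⵟ-product: squared alignment divided by squared distance (plus eps)."""
    return (w @ x) ** 2 / (np.sum((w - x) ** 2) + eps)

def winner(weights, x, eps=1e-6):
    """Index and scores of the neuron whose vortex field is strongest at x."""
    scores = np.array([yat_score(w, x, eps) for w in weights])
    return int(np.argmax(scores)), scores

# Three prototype "whirlpools" in 2-D (illustrative values)
weights = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-1.0, -1.0])]
idx, scores = winner(weights, np.array([0.9, 0.1]))  # input near the first prototype
```

In the whirlpool picture, `winner` returns the index of the neuron whose territory contains the input.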
📐 Decision Boundary Mathematics
The decision boundary between two neurons $i$ and $j$ is the set of inputs where their responses are equal: $$\frac{(\mathbf{w}_i^\top \mathbf{x})^2}{\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon} = \frac{(\mathbf{w}_j^\top \mathbf{x})^2}{\|\mathbf{w}_j - \mathbf{x}\|^2 + \epsilon}$$
Cross-multiplying and rearranging: $$(\mathbf{w}_i^\top \mathbf{x})^2 \left(\|\mathbf{w}_j - \mathbf{x}\|^2 + \epsilon\right) = (\mathbf{w}_j^\top \mathbf{x})^2 \left(\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon\right)$$
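A quick numerical sanity check of the equal-response condition (a sketch; `yat`, the bisection search, and the prototype values are illustrative). Since the response is maximal at a neuron's own weight, the difference of responses changes sign along the segment between two prototypes, so bisection finds a boundary point, where the cross-multiplied form also holds:

```python
import numpy as np

EPS = 1e-6

def yat(w, x):
    return (w @ x) ** 2 / (np.sum((w - x) ** 2) + EPS)

w_i = np.array([1.0, 0.2])
w_j = np.array([0.1, 1.1])

# Response difference along the segment from w_i (t=0) to w_j (t=1):
# positive at t=0, negative at t=1, so a boundary crossing lies between.
f = lambda t: yat(w_i, (1 - t) * w_i + t * w_j) - yat(w_j, (1 - t) * w_i + t * w_j)

lo, hi = 0.0, 1.0
for _ in range(60):                     # bisection to machine precision
    mid = 0.5 * (lo + hi)
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
x_star = (1 - lo) * w_i + lo * w_j      # a point on the decision boundary

# The cross-multiplied form of the boundary condition holds at x_star:
lhs = (w_i @ x_star) ** 2 * (np.sum((w_j - x_star) ** 2) + EPS)
rhs = (w_j @ x_star) ** 2 * (np.sum((w_i - x_star) ** 2) + EPS)
```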
🧲 Space-Partitioning Properties
Distance Penalization
The inverse-square denominator enforces spatial selectivity. Responses are largest near $\mathbf{x} \approx \mathbf{w}$ and decay for distant inputs.
Alignment Weighting
$\text{ⵟ}(\mathbf{w}, \mathbf{x}) = 0$ if and only if $\mathbf{w} \perp \mathbf{x}$. High values require both alignment AND proximity.
Global Boundedness
For fixed $\mathbf{w}$: $0 \le \text{ⵟ}(\mathbf{w}, \mathbf{x}) \le \|\mathbf{w}\|^4/\epsilon + \|\mathbf{w}\|^2$, with the peak value $\|\mathbf{w}\|^4/\epsilon$ attained at $\mathbf{x} = \mathbf{w}$. Each neuron has a bounded activation landscape.
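A Monte-Carlo sanity check of boundedness, assuming the same ⵟ definition (ε, the weight vector, and the sampling distribution are arbitrary choices). The ‖w‖² slack in the bound covers inputs slightly beyond w along its own direction; the value ‖w‖⁴/ε is hit exactly at x = w:

```python
import numpy as np

EPS = 1e-2
rng = np.random.default_rng(0)

def yat(w, x):
    return (w @ x) ** 2 / (np.sum((w - x) ** 2) + EPS)

w = np.array([1.0, 2.0])
norm_sq = float(np.sum(w ** 2))          # ||w||^2 = 5
bound = norm_sq ** 2 / EPS + norm_sq     # ||w||^4/eps + ||w||^2 = 2505

samples = [yat(w, rng.normal(scale=3.0, size=2)) for _ in range(20_000)]
peak_at_w = yat(w, w)                    # ||w||^4 / eps = 2500
```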
Nonlinear Regions
Decision boundaries form curved surfaces — level sets akin to equipotential surfaces in physics.
⚔️ Softmax Competition
When normalized with softmax, ⵟ-product scores create "territorial" competition:
📐 Extended Math Derivation
Step 1: Define Individual Scores
Let $s_i$ denote the raw ⵟ-product score for neuron $i$: $$s_i = \text{ⵟ}(\mathbf{w}_i, \mathbf{x}) = \frac{(\mathbf{w}_i^\top \mathbf{x})^2}{\|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon}$$
Step 2: Temperature-Scaled Softmax
With temperature $\tau > 0$ controlling sharpness: $$p_i = \frac{\exp(s_i/\tau)}{\sum_j \exp(s_j/\tau)}$$
• $\tau \to 0$: Hard assignment (argmax) → sharp territorial boundaries
• $\tau \to \infty$: Uniform distribution → no competition
• $\tau = 1$: Standard softmax
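A small sketch of how τ reshapes the competition (the score values are illustrative):

```python
import numpy as np

def softmax(scores, tau):
    """Temperature-scaled softmax: p_i ∝ exp(s_i / tau)."""
    z = np.asarray(scores, dtype=float) / tau
    z -= z.max()                    # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

raw = [4.0, 2.0, 1.0]               # illustrative raw ⵟ-product scores
cold = softmax(raw, tau=0.05)       # τ→0: near-argmax, sharp territories
warm = softmax(raw, tau=1.0)        # τ=1: standard softmax
hot  = softmax(raw, tau=100.0)      # τ→∞: near-uniform, competition washes out
```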
Step 3: Gradient w.r.t. Input
The gradient of the log-probability for the correct class $c$: $$\nabla_{\mathbf{x}} \log p_c = \frac{1}{\tau}\left(\nabla_{\mathbf{x}} s_c - \sum_i p_i \, \nabla_{\mathbf{x}} s_i\right)$$
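The softmax identity ∇ₓ log p_c = (1/τ)(∇ₓ s_c − Σᵢ pᵢ ∇ₓ sᵢ) can be checked against central finite differences. This sketch assumes the ⵟ gradient derived in Step 4 below; all names, ε, τ, and vectors are illustrative:

```python
import numpy as np

EPS = 1e-3
TAU = 0.5

def yat(w, x):
    return (w @ x) ** 2 / (np.sum((w - x) ** 2) + EPS)

def yat_grad_x(w, x):
    # grad_x s = (2u / v^2) * (v*w + u*(w - x)), u = w.x, v = ||w-x||^2 + eps
    u = w @ x
    v = np.sum((w - x) ** 2) + EPS
    return (2 * u / v ** 2) * (v * w + u * (w - x))

def log_prob(weights, x, c):
    """log p_c under the temperature-scaled softmax of ⵟ scores."""
    s = np.array([yat(w, x) for w in weights]) / TAU
    return s[c] - (s.max() + np.log(np.sum(np.exp(s - s.max()))))

weights = [np.array([1.0, 0.3]), np.array([-0.4, 0.9])]
x = np.array([0.5, 0.2])
c = 0

s = np.array([yat(w, x) for w in weights])
p = np.exp((s - s.max()) / TAU)
p /= p.sum()
grads = np.array([yat_grad_x(w, x) for w in weights])
analytic = (grads[c] - p @ grads) / TAU

# Independent check: central finite differences on log p_c
h = 1e-6
numeric = np.array([
    (log_prob(weights, x + h * e, c) - log_prob(weights, x - h * e, c)) / (2 * h)
    for e in np.eye(2)
])
```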
Step 4: Gradient of ⵟ-Product
Using the quotient rule on $s_i = \frac{u^2}{v}$ where $u = \mathbf{w}_i^\top \mathbf{x}$ and $v = \|\mathbf{w}_i - \mathbf{x}\|^2 + \epsilon$: $$\nabla_{\mathbf{x}} s_i = \frac{2u \, v \, \nabla_{\mathbf{x}} u - u^2 \, \nabla_{\mathbf{x}} v}{v^2}, \qquad \nabla_{\mathbf{x}} u = \mathbf{w}_i, \qquad \nabla_{\mathbf{x}} v = 2(\mathbf{x} - \mathbf{w}_i)$$
Simplifying: $$\nabla_{\mathbf{x}} s_i = \frac{2u}{v^2}\left(v\,\mathbf{w}_i + u\,(\mathbf{w}_i - \mathbf{x})\right)$$
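A finite-difference check of the simplified gradient (a sketch; ε and the test vectors are arbitrary):

```python
import numpy as np

EPS = 1e-3

def yat(w, x):
    return (w @ x) ** 2 / (np.sum((w - x) ** 2) + EPS)

def yat_grad_x(w, x):
    """Simplified quotient-rule gradient: (2u / v^2) * (v*w + u*(w - x))."""
    u = w @ x
    v = np.sum((w - x) ** 2) + EPS
    return (2 * u / v ** 2) * (v * w + u * (w - x))

w = np.array([0.8, -0.3, 0.5])
x = np.array([0.2, 0.4, -0.1])

analytic = yat_grad_x(w, x)

# Central finite differences as an independent check
h = 1e-6
numeric = np.array([
    (yat(w, x + h * e) - yat(w, x - h * e)) / (2 * h)
    for e in np.eye(3)
])
```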
Step 5: Convergence Behavior
As $\mathbf{x} \to \mathbf{w}_i$ (input approaches weight): $u \to \|\mathbf{w}_i\|^2$ and $v \to \epsilon$, so $s_i \to \|\mathbf{w}_i\|^4/\epsilon$, the neuron's maximal response.
As $\mathbf{x} \perp \mathbf{w}_i$ (orthogonal): $u = 0$, so $s_i = 0$ and $\nabla_{\mathbf{x}} s_i = \mathbf{0}$; the neuron neither responds nor receives gradient.
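Both limits are easy to confirm numerically under the same assumed ⵟ definition (ε and the vectors are illustrative):

```python
import numpy as np

EPS = 1e-4

def yat(w, x):
    return (w @ x) ** 2 / (np.sum((w - x) ** 2) + EPS)

def yat_grad_x(w, x):
    u = w @ x
    v = np.sum((w - x) ** 2) + EPS
    return (2 * u / v ** 2) * (v * w + u * (w - x))

w = np.array([1.0, 1.0])

at_w = yat(w, w)                      # ||w||^4 / eps = 4 / 1e-4: maximal response
x_perp = np.array([1.0, -1.0])        # orthogonal to w
at_perp = yat(w, x_perp)              # 0: the score vanishes
grad_perp = yat_grad_x(w, x_perp)     # zero vector: no gradient flows here
```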
As training progresses, the softmax approaches a delta distribution — the winning neuron gets probability → 1, others → 0. This creates sharp territorial boundaries.
🔄 Orthogonality and Competitive Dynamics
When two prototypes develop disjoint support regions, they become orthogonal:
When $\mathbf{w}_i \perp \mathbf{w}_j$ (orthogonal): $$\text{ⵟ}(\mathbf{w}_i, \mathbf{w}_j) = 0 \quad \text{and} \quad H(\mathbf{w}_i, \mathbf{w}_j) = \infty$$ The infinite cross-entropy creates strong pressure for territorial separation.
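A tiny demonstration, reading $H$ here as $-\log$ of the mutual ⵟ score (that reading, like the prototype values, is an assumption for illustration):

```python
import numpy as np

EPS = 1e-6

def yat(w, x):
    return (w @ x) ** 2 / (np.sum((w - x) ** 2) + EPS)

w_i = np.array([1.0, 0.0])
w_j = np.array([0.0, 1.0])            # orthogonal prototype

mutual = yat(w_i, w_j)                # exactly 0: the dot product vanishes

# Cross-entropy reading (an assumption): H = -log of the mutual score
with np.errstate(divide="ignore"):
    H = -np.log(mutual)               # diverges to +inf
```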