โ† Back to Theory Analysis

MNIST Deep Analysis

🧒 Explain Like I'm 5

Teaching a computer to recognize handwritten numbers is like teaching a friend to spot your drawing! 🎨

  • 📷 Each number is a picture made of tiny squares (pixels)
  • 🧠 The computer learns a "perfect example" of each digit 0-9
  • ✨ NMN neurons are special: they remember the BEST version of each number!

Try drawing below, and the computer will guess what number you wrote! ✍️

โœ๏ธ Draw a Digit (0-9)
Predictions (simulated NMN):
Draw a digit!

🔬 Prototype Learning Analysis

When trained on MNIST, NMN neurons learn class prototypes, idealized representations of each digit (see the code sketch after the list below):

Observation: NMN weight vectors converge to patterns that are:
  • ✅ Maximally parallel to their class's data distribution
  • ✅ As close as possible to the class centroid when the data allows
  • ✅ In superposition states when intra-class variance is high
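
To make this concrete, here is a minimal NumPy sketch of the prototype scheme. It assumes the NMN response takes the form $(\mathbf{w} \cdot \mathbf{x})^2 / (\|\mathbf{w} - \mathbf{x}\|^2 + \epsilon)$ used in the robustness section below; `nmn_response`, `fit_prototypes`, and the centroid initialization are illustrative choices, not the actual training procedure:

```python
import numpy as np

EPS = 1e-6

def nmn_response(w: np.ndarray, x: np.ndarray, eps: float = EPS) -> float:
    """NMN activation: squared alignment, amplified as x approaches w.

    Assumes the form (w . x)^2 / (||w - x||^2 + eps) from the
    inversion-robustness formula in this analysis.
    """
    return float(np.dot(w, x) ** 2 / (np.sum((w - x) ** 2) + eps))

def fit_prototypes(images: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Illustrative prototype initializer: per-class pixel centroids.

    Actual training would push these further toward maximal parallelism
    with each class's distribution; centroids are just a plausible start.
    """
    flat = images.reshape(len(images), -1).astype(np.float64)
    n_classes = int(labels.max()) + 1
    return np.stack([flat[labels == c].mean(axis=0) for c in range(n_classes)])

def predict(prototypes: np.ndarray, image: np.ndarray) -> int:
    """Classify a digit as the class whose prototype responds most strongly."""
    x = image.reshape(-1).astype(np.float64)
    return int(np.argmax([nmn_response(w, x) for w in prototypes]))
```

Because each prototype lives in pixel space, reshaping a row of `prototypes` back to 28×28 gives the "visible digit templates" noted in the comparison table at the end of this section.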

🌌 Superposition States

When minimizing distance becomes challenging (high intra-class variance), NMN prototypes can exist in a superposition state:

💡 Example: The digit "1" can be written vertically straight or with a slant. Rather than picking one, the NMN prototype learns an "average" direction that maximizes parallelism with both variants, prioritizing alignment over strict proximity.
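
A toy 2-D illustration of this compromise (hypothetical vectors standing in for the two writing styles, not real MNIST strokes): the averaged prototype is equally parallel to both variants even though it coincides with neither.

```python
import numpy as np

# Hypothetical 2-D stand-ins for two styles of the digit "1".
straight = np.array([0.0, 1.0])                 # vertical stroke
slanted = np.array([np.sin(0.4), np.cos(0.4)])  # same stroke, ~23 deg tilt

# Superposed prototype: average direction, renormalized to unit length.
proto = (straight + slanted) / np.linalg.norm(straight + slanted)

# All vectors are unit length, so the dot product is the cosine.
print(f"cos(proto, straight) = {proto @ straight:.4f}")  # ~0.9801
print(f"cos(proto, slanted)  = {proto @ slanted:.4f}")   # ~0.9801
print(f"dist(proto, straight) = {np.linalg.norm(proto - straight):.4f}")  # ~0.20
```

Neither variant sits at distance zero from the prototype, but both see near-maximal parallelism: exactly the alignment-over-proximity trade-off described above.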

🔄 Robustness to Inversion

An interesting property: because the response depends on the squared dot product $(\mathbf{w} \cdot \mathbf{x})^2$, an inverted input $-\mathbf{x}$ receives nearly the same response as the original:

$$\text{ⵟ}(\mathbf{w}, -\mathbf{x}) = \frac{(\mathbf{w} \cdot (-\mathbf{x}))^2}{\|\mathbf{w} - (-\mathbf{x})\|^2 + \epsilon} = \frac{(\mathbf{w} \cdot \mathbf{x})^2}{\|\mathbf{w} + \mathbf{x}\|^2 + \epsilon} \approx \text{ⵟ}(\mathbf{w}, \mathbf{x})$$

The numerator is exactly invariant under $\mathbf{x} \to -\mathbf{x}$; only the denominator shifts from $\|\mathbf{w} - \mathbf{x}\|^2$ to $\|\mathbf{w} + \mathbf{x}\|^2$, which is why the equality is approximate. This provides natural robustness to certain image transformations, such as contrast inversion of zero-centered inputs!
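
A quick self-contained numeric check of this property, under the same assumed activation form: for a typical high-dimensional input that is far from collinear with the prototype, the two responses land close together.

```python
import numpy as np

def nmn_response(w, x, eps=1e-6):
    # (w . x)^2 / (||w - x||^2 + eps), as in the formula above.
    return np.dot(w, x) ** 2 / (np.sum((w - x) ** 2) + eps)

rng = np.random.default_rng(0)
w = rng.normal(size=784)  # prototype-sized vector (28 x 28 pixels)
x = rng.normal(size=784)  # an arbitrary input

print(f"response(x)  = {nmn_response(w, x):.4f}")
print(f"response(-x) = {nmn_response(w, -x):.4f}")
# The numerator (w . x)^2 is exactly the same in both cases; only the
# denominator shifts between ||w - x||^2 and ||w + x||^2, so the two
# responses agree closely whenever |w . x| is small relative to
# ||w||^2 + ||x||^2.
```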

📊 Representation Quality

| Metric | Linear + ReLU | NMN Layer |
|---|---|---|
| Class Separability | Hyperplane boundaries | Curved vortex boundaries |
| Prototype Interpretability | Abstract directions | Visible digit templates |
| Inversion Robustness | Not inherent | Built-in via $(\cdot)^2$ |
| Decision Geometry | Linear | Non-linear, localized |
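
One way to see the "Non-linear, localized" row concretely is to scan inputs $\mathbf{x} = t\mathbf{w}$ along a unit prototype direction: a plain dot-product (ReLU-style) score grows without bound, while the assumed NMN response peaks sharply at $t = 1$ and decays. The scan and `relu_score` are illustrative, not a benchmark:

```python
import numpy as np

def relu_score(w, x):
    # Linear-unit score: positive part of the dot product.
    return max(np.dot(w, x), 0.0)

def nmn_response(w, x, eps=1e-6):
    return np.dot(w, x) ** 2 / (np.sum((w - x) ** 2) + eps)

rng = np.random.default_rng(1)
w = rng.normal(size=64)
w /= np.linalg.norm(w)  # unit prototype

for t in [0.5, 1.0, 2.0, 4.0, 8.0]:
    x = t * w  # input sliding along the prototype direction
    print(f"t={t:4.1f}  relu={relu_score(w, x):6.2f}  "
          f"nmn={nmn_response(w, x):12.2f}")
# ReLU output climbs linearly with t; the NMN response spikes at t = 1
# (x nearly coincides with w, so the denominator collapses) and then
# falls back toward 1 as x moves away: a localized, template-matching
# decision geometry rather than a half-space.
```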