Metadata-Version: 2.4
Name: maya-encoding
Version: 0.1.0
Summary: Maya-inspired numerical encodings for machine learning: Vigesimal Feature Decomposition (VFD) and Maya Calendar Encoding (MCE)
Project-URL: Homepage, https://github.com/danielregalado/maya-encoding
Project-URL: Documentation, https://danielregalado.github.io/maya-encoding
Project-URL: Repository, https://github.com/danielregalado/maya-encoding
Project-URL: Issues, https://github.com/danielregalado/maya-encoding/issues
Author-email: Daniel Regalado <dxr1491@miami.edu>
License-Expression: MIT
License-File: LICENSE
Keywords: calendar,encoding,feature-engineering,machine-learning,maya,scikit-learn,time-series,vigesimal
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.9
Requires-Dist: numpy>=1.21
Requires-Dist: scikit-learn>=1.0
Provides-Extra: all
Requires-Dist: matplotlib>=3.5; extra == 'all'
Requires-Dist: mkdocs; extra == 'all'
Requires-Dist: mkdocs-material; extra == 'all'
Requires-Dist: mkdocstrings[python]; extra == 'all'
Requires-Dist: mypy; extra == 'all'
Requires-Dist: pandas>=1.3; extra == 'all'
Requires-Dist: pytest-cov; extra == 'all'
Requires-Dist: pytest>=7.0; extra == 'all'
Requires-Dist: ruff; extra == 'all'
Requires-Dist: scipy>=1.7; extra == 'all'
Requires-Dist: seaborn>=0.12; extra == 'all'
Requires-Dist: xgboost>=1.5; extra == 'all'
Provides-Extra: benchmarks
Requires-Dist: pandas>=1.3; extra == 'benchmarks'
Requires-Dist: scipy>=1.7; extra == 'benchmarks'
Requires-Dist: seaborn>=0.12; extra == 'benchmarks'
Requires-Dist: xgboost>=1.5; extra == 'benchmarks'
Provides-Extra: dev
Requires-Dist: matplotlib>=3.5; extra == 'dev'
Requires-Dist: mypy; extra == 'dev'
Requires-Dist: pandas>=1.3; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs; extra == 'docs'
Requires-Dist: mkdocs-material; extra == 'docs'
Requires-Dist: mkdocstrings[python]; extra == 'docs'
Provides-Extra: viz
Requires-Dist: matplotlib>=3.5; extra == 'viz'
Description-Content-Type: text/markdown

# maya-encoding

[![CI](https://github.com/danielregalado/maya-encoding/actions/workflows/ci.yml/badge.svg)](https://github.com/danielregalado/maya-encoding/actions/workflows/ci.yml)
[![PyPI version](https://badge.fury.io/py/maya-encoding.svg)](https://badge.fury.io/py/maya-encoding)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**Maya-inspired numerical encodings for machine learning.**

Two sklearn-compatible transformers that use the mathematical structure of the ancient Maya number system and calendar to create richer feature representations:

- **VFDEncoder** (Vigesimal Feature Decomposition) — Decomposes numbers into the Maya base-20 system, splitting each vigesimal digit into bars (quotient by 5) and dots (remainder mod 5), giving models multi-scale numerical structure for free.
- **MayaCalendarEncoder** (Maya Calendar Encoding) — Converts dates into features from the Tzolk'in (260-day), Haab' (365-day), and Long Count calendars, providing interlocking cyclical patterns at multiple time scales.

## Installation

```bash
pip install maya-encoding
```

With visualization support:
```bash
pip install "maya-encoding[viz]"
```

## Quick Start

### VFD: Numeric Feature Encoding

```python
from maya_encoding import VFDEncoder
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestRegressor

# VFD decomposes numbers into vigesimal digits, bars, and dots
encoder = VFDEncoder(components='full')

# Works seamlessly in sklearn pipelines
pipe = Pipeline([
    ('encode', encoder),
    ('model', RandomForestRegressor())
])
pipe.fit(X_train, y_train)
```

How it works: the number **347** becomes:
```
347 = 17×20 + 7

Level 0 (ones):     digit=7,  bars=1, dots=2
Level 1 (twenties): digit=17, bars=3, dots=2

Feature vector: [7, 1, 2, 17, 3, 2] (or normalized to [0,1])
```

This gives the model three "zoom levels" per number — coarse magnitude (digits), medium grouping (bars), and fine residual (dots).
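The decomposition above can be sketched in a few lines of plain Python (a minimal illustration, not the library's implementation — the real `VFDEncoder` also handles normalization, negatives, and floats):

```python
def vfd_decompose(n: int, n_levels: int = 2) -> list[int]:
    """Decompose a non-negative integer into vigesimal digits,
    splitting each digit into bars (fives) and dots (ones)."""
    features = []
    for _ in range(n_levels):
        digit = n % 20                              # vigesimal digit at this level
        features += [digit, digit // 5, digit % 5]  # digit, bars, dots
        n //= 20
    return features

print(vfd_decompose(347))  # → [7, 1, 2, 17, 3, 2]
```

Each group of three entries is one "zoom level": the digit itself, its bar count, and its dot count.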

### MCE: Temporal Feature Encoding

```python
from maya_encoding import MayaCalendarEncoder

# Encode dates using Maya calendar cycles
encoder = MayaCalendarEncoder(
    components=['tzolkin', 'haab', 'long_count'],
    cyclical=True,  # Add sine/cosine for smooth cycle boundaries
)

X_temporal = encoder.fit_transform(df['date'])
```

The Maya calendar provides interlocking cycles with largely coprime periods (13, 20, 260, 365, 360), capturing multi-scale temporal patterns that standard sine/cosine encoding only achieves with manual period selection.
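As a rough illustration of the idea (not the library's internals, and ignoring the epoch correlation that `MayaCalendarEncoder` handles), the interlocking cycle positions for a day count reduce to modular arithmetic plus sine/cosine pairs:

```python
import math

def cycle_features(day: int, periods=(13, 20, 260, 365)) -> list[float]:
    """Map a day count to a phase plus sin/cos pair for each cycle period."""
    feats = []
    for p in periods:
        phase = (day % p) / p  # position within the cycle, in [0, 1)
        feats += [phase,
                  math.sin(2 * math.pi * phase),
                  math.cos(2 * math.pi * phase)]
    return feats

# Days 0 and 260 coincide in the 13-, 20-, and 260-day cycles,
# but sit at different positions in the 365-day Haab' cycle.
f0, f260 = cycle_features(0), cycle_features(260)
```

Because the periods rarely align, the combined feature vector disambiguates days that any single cycle would confuse.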

## Why Maya Encoding?

**The problem:** When you feed a number like 347 to a model, it knows nothing about the number's structure. It has to learn from scratch that 347 is close to 350, large compared to 5, and related to other numbers through its divisibility.

**The solution:** The Maya vigesimal system naturally decomposes numbers into a hierarchy:
- **Vigesimal digits** (×20): coarse magnitude
- **Bars** (×5): medium grouping
- **Dots** (×1): fine residual

This is a strict information superset — the model can ignore the extra features via regularization if they're not useful, but gets multi-scale structure for free if they are.
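"Superset" here is meant literally: the original value is exactly recoverable from the digit/bar/dot features, so nothing is lost. A quick check, using the hypothetical flat feature layout from the 347 example above:

```python
def vfd_reconstruct(features: list[int]) -> int:
    """Invert a [digit, bars, dots] * levels feature vector back to the integer."""
    value = 0
    for level in range(len(features) // 3):
        digit, bars, dots = features[3 * level : 3 * level + 3]
        assert digit == bars * 5 + dots  # bars/dots merely re-encode the digit
        value += digit * 20 ** level     # place value in base 20
    return value

print(vfd_reconstruct([7, 1, 2, 17, 3, 2]))  # → 347
```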

For temporal data, the Maya calendar's three interlocking cycles (Tzolk'in 260-day, Haab' 365-day, Long Count) provide features with largely coprime periods that capture patterns standard time encodings miss.

## API Reference

### VFDEncoder

| Parameter | Default | Description |
|-----------|---------|-------------|
| `n_levels` | `'auto'` | Vigesimal levels (auto-detected from data) |
| `components` | `'full'` | `'full'`, `'lite'` (digits only), `'bars_dots'` |
| `normalize` | `True` | Normalize to [0,1] |
| `handle_negative` | `'abs_sign'` | `'abs_sign'`, `'shift'`, `'error'` |
| `handle_float` | `'scale'` | `'scale'`, `'round'`, `'integer_part'` |
| `scale_factor` | `'auto'` | Auto-detected from decimal precision |

### MayaCalendarEncoder

| Parameter | Default | Description |
|-----------|---------|-------------|
| `components` | `['tzolkin', 'haab', 'long_count']` | Calendar systems to use |
| `tzolkin_encoding` | `'separate'` | `'separate'` (2 features) or `'combined'` (1 feature) |
| `haab_encoding` | `'hierarchical'` | `'hierarchical'` (with bars/dots) or `'flat'` |
| `long_count_levels` | `3` | 1-5: kin, uinal, tun, katun, baktun |
| `cyclical` | `True` | Add sine/cosine pairs |
| `epoch` | `'gmt'` | `'gmt'` (standard), `'spinden'`, or custom JDN |
| `wayeb_flag` | `True` | Binary feature for the 5-day Wayeb' period |

## Visualization

```python
from maya_encoding.visualization.glyphs import plot_maya_number, render_maya_text

# Text rendering
print(render_maya_text(347))

# Matplotlib rendering
plot_maya_number(347)
```

## Development

```bash
git clone https://github.com/danielregalado/maya-encoding.git
cd maya-encoding
pip install -e ".[dev]"
pytest
```

Run benchmarks:
```bash
python benchmarks/run_vfd_benchmarks.py
python benchmarks/run_mce_benchmarks.py
```

## Citation

If you use maya-encoding in your research, please cite:

```bibtex
@software{regalado2026maya,
  author = {Regalado, Daniel},
  title = {maya-encoding: Maya-Inspired Numerical Encodings for Machine Learning},
  year = {2026},
  url = {https://github.com/danielregalado/maya-encoding}
}
```

## License

MIT License. See [LICENSE](LICENSE) for details.
