Metadata-Version: 2.4
Name: eulerstack
Version: 0.0.1.dev2
Summary: YAML-driven modular LLM assembler with Hugging Face compatibility
License: Apache-2.0
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.1
Requires-Dist: transformers>=4.40
Requires-Dist: pyyaml>=6.0
Requires-Dist: click>=8.0
Dynamic: license-file

# EulerStack

**A YAML-driven modular LLM assembler with Hugging Face compatibility.**

[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)

🌐 **Language:** English · [한국어](README.ko.md)

---

EulerStack lets you **describe a transformer-family architecture as a YAML spec**, validate it against a strict schema, estimate parameters, and compile it to either a JSON runtime config or a standard Hugging Face model directory that you can immediately plug into `transformers`, PEFT, vLLM, or any downstream training framework.

It is an **architecture assembly tool**, not a training framework. EulerStack stops at a clean, randomly-initialised, structurally-valid model. From there you bring your own data and your favourite trainer.

## Why EulerStack?

Want to try DeepSeek-V3's MLA attention on top of your Llama baseline?

- **Code path**: fork `modeling_llama.py`, rewrite `LlamaAttention`, patch the KV-cache, re-map the state-dict. **~200–300 lines of diff.** The intent — "try MLA" — is one line buried in hundreds.
- **EulerStack path**:
  ```diff
  -      attention: { qkv_bias: false }
  +      attention: { qkv_bias: false, latent_dim: 384 }
  ```
  **One line.**

That ratio — *idea* vs *mechanical plumbing* — is the whole pitch:

- **Changes are tiny.** Swap attention for Mamba, add MoE to every 4th layer, enable 2-phase reasoning — each is **1–5 YAML lines**, not a refactor.
- **The diff *is* the design decision.** Two months later, you still know what you changed and why. `modeling_custom.py` diffs lose that intent inside the plumbing.
- **You can discuss it like a blueprint.** Reviewers read the spec, not spelunk through PyTorch. Architecture debates happen on a document, not in code comments.
- **Lintable before any GPU.** Parameter counts, head-dim sanity, KV-cache budgets — all caught pre-training.
- **Output is vanilla HuggingFace.** Plugs into `transformers`, PEFT, vLLM, etc. No lock-in, no custom runtime.

## Installation

Requires Python 3.10+.

**From PyPI (recommended):**

```bash
pip install eulerstack
```

**From source (for development or the latest `main`):**

```bash
git clone https://github.com/<your-org>/eulerstack.git
cd eulerstack
pip install -e .
```

Either way, the `eulerstack` CLI is installed on your `PATH`.

Core runtime dependencies: `torch >= 2.1`, `transformers >= 4.40`, `pyyaml`, `click`.

## Quickstart

The CLI speaks five languages (`ko` / `en` / `zh` / `ja` / `es`). The default is Korean; pass `--lang en` or set `EULERSTACK_LANG=en` for English.

```bash
# See the bundled presets
eulerstack --lang en presets list

# Validate a spec (schema check only)
eulerstack --lang en validate --preset configs/presets/llm_2b_simple.yml

# Validate with a full realism report (param estimates, sanity checks, warnings)
eulerstack --lang en validate --preset my_model.yml --report

# Explain what the spec describes in human-readable form
eulerstack --lang en explain --preset configs/presets/arch_beginner_llama.yml

# Compile to a runtime JSON config
eulerstack --lang en compile --preset my_model.yml --output compiled.json

# Compile to a Hugging Face model directory (config.json + model.safetensors)
eulerstack --lang en compile --preset my_model.yml --output-dir ./my_model_hf
```

The `--output-dir` form writes a directory that loads directly with `transformers`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("./my_model_hf")
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # per the spec's tokenizer_contract
```

Weights are randomly initialised. **Training is explicitly out of scope** — see *Where EulerStack Fits* below.
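
Even with random weights, a forward pass makes a useful smoke test that the exported shapes line up. A minimal sketch (paths and tokenizer follow the Quickstart above):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("./my_model_hf")
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # per tokenizer_contract

# Random weights produce gibberish text, but a forward pass confirms
# the assembled architecture is structurally sound end to end.
inputs = tokenizer("hello world", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Expect (batch, seq_len, vocab_size) per the spec.
print(logits.shape)
```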

## What a Spec Looks Like

A minimal decoder-only model:

```yaml
schema_version: 1

model:
  name: "my-llm"
  d_model: 2048
  vocab_size: 32000
  max_seq_len: 4096
  n_heads: 16
  mlp_ratio: 4
  dtype: bfloat16

tokenizer_contract:
  type: hf
  pretrained: gpt2

embedding:
  type: learned
  positional: rope
  rope_theta: 500000.0
  tie_word_embeddings: true

layer_templates:
  decoder:
    mixer:
      type: attention        # or: mamba, retnet, hyena, linear_attention, ...
      attention: {}
    ffn:
      type: gated_mlp        # or: moe, mlp
      activation: swiglu
    norm:
      type: rmsnorm
      position: pre

layer_schedule:
  - template: decoder
    repeat: 24

head:
  type: causal_lm
  tie_weights: true
```
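
For intuition, a back-of-envelope estimate of the spec above (a hand calculation under stated assumptions; `validate --report` gives the authoritative count):

```python
# Rough parameter count for the 24-layer spec above. Assumptions:
# gated MLP hidden size = mlp_ratio * d_model with three projections
# (gate, up, down); attention has four d_model x d_model projections;
# tied embeddings counted once; norms and biases ignored.
d_model, vocab, layers, ratio = 2048, 32_000, 24, 4

attn = 4 * d_model * d_model              # q, k, v, o projections
ffn = 3 * d_model * (ratio * d_model)     # gate, up, down
emb = vocab * d_model                     # tied with the LM head

total = layers * (attn + ffn) + emb
print(f"~{total / 1e9:.2f}B parameters")  # ~1.68B
```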

Hybrid and MoE models are expressed the same way — you define multiple `layer_templates` and arrange them in `layer_schedule`. See the `configs/presets/` directory for working examples, including attention-free models, MoE-every-N-layers, and mixed-mixer stacks.
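
As a sketch, an MoE-every-4th-layer stack in the same style (the `num_experts` / `top_k` keys are illustrative assumptions; the shipped `llm_*_moe` presets show the exact field names):

```yaml
layer_templates:
  dense:
    mixer: { type: attention, attention: {} }
    ffn:   { type: gated_mlp, activation: swiglu }
    norm:  { type: rmsnorm, position: pre }
  sparse:
    mixer: { type: attention, attention: {} }
    ffn:   { type: moe, num_experts: 8, top_k: 2 }  # illustrative keys
    norm:  { type: rmsnorm, position: pre }

layer_schedule:   # 8 layers, MoE on every 4th
  - { template: dense,  repeat: 3 }
  - { template: sparse, repeat: 1 }
  - { template: dense,  repeat: 3 }
  - { template: sparse, repeat: 1 }
```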

## Architecture

EulerStack is a five-stage pipeline; each stage has one job.

```
  YAML spec (DSL)
       │
       ▼
  ┌──────────┐   validate — schema v1, cross-field checks, realism warnings
  │  Schema  │
  └──────────┘
       │
       ▼
  ┌──────────┐   normalize_to_ir — typed, canonical in-memory representation
  │    IR    │
  └──────────┘
       │
       ▼
  ┌──────────┐   compile_ir — materialise layer list, param count, runtime config
  │ Compiler │
  └──────────┘
       │
       ├──► JSON runtime config
       │
       └──► Hugging Face model directory (PreTrainedModel + safetensors)
```

A few details worth knowing:

- **Schema v1** is versioned and strict. Unknown keys are errors (with one
  exception: reserved prefixes `experimental.*` / `future.*` / `vendor.*.*` are
  accepted as warnings so plugins and in-progress research can coexist).
- **Mixer types** are pluggable: attention (with GQA / sliding-window / RoPE / ALiBi), Mamba / Mamba2, RetNet, Hyena, linear attention, and more. Adding a new mixer means implementing one block class and registering it — no changes to the schema or compiler.
- **FFN types** include dense MLP, gated MLP (SwiGLU / GeGLU), and MoE (top-k routing, capacity factor, router z-loss).
- **Outputs are vanilla Hugging Face**. There is no EulerStack runtime — the exported directory is indistinguishable from any `AutoModelForCausalLM.from_pretrained()` target, so all the standard ecosystem tooling (PEFT, LoRA, bitsandbytes, accelerate, DeepSpeed, vLLM, SGLang, llama.cpp converters where applicable) just works, as the sketch below shows.
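
Attaching LoRA adapters with PEFT, for example, needs nothing EulerStack-specific. A minimal sketch, assuming the exported model names its attention projections `q_proj` / `v_proj` (verify with `model.named_modules()`):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("./my_model_hf")

# target_modules depends on the exported module names; q_proj/v_proj
# is an assumption here, not a guarantee of EulerStack's naming.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```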

## Presets

`configs/presets/` ships with **52 ready-to-compile specs**, organised as a
three-tier progression from industry-standard canon to v1's new architecture features.

### Tier 1 — Validated industrial

Production-grade baselines. Training recipes are well-studied; failure modes
are known.

- `arch_beginner_gpt2`, `arch_beginner_llama` — classic Transformer and Llama-2/3 style
- `arch_intermediate_mistral`, `arch_intermediate_gemma2`, `arch_intermediate_qwen_longctx` — modern attention patterns
- `llm_0p1b_{simple,mistral}` — Stage-1 / CPT warm-up (sovereign-foundation pilot)
- `llm_*_simple`, `llm_*_mistral` across 0.8B / 2B / 4B / 16B

### Tier 2 — Recent / complex (hybrid, MoE, KV-compressed)

Modern research consensus running in production systems.

- `arch_advanced_{jamba, samba, retnet}` — hybrid and attention-free lines
- `arch_advanced_mla` — **MLA** (DeepSeek-V3 2024, runtime Core)
- `arch_advanced_mod` — **Mixture-of-Depths** (Raposo ICML 2024, runtime Component)
- `arch_expert_*` (9 presets, some speculative) — MoE × mixer × depth explorations
- `arch_expert_*_mini` (6 small-scale experts) — ablation-ready for single-GPU
- `llm_*_jamba`, `llm_*_moe`, `llm_*_mla` across 0.1B / 0.8B / 2B / 4B / 16B (MoE skipped at 0.1B)

### Tier 3 — v1 experimental (new advanced architecture features at arch-scale)

Three `arch_expert_*` presets (~1.2–1.4B) that each showcase one of the
advanced architecture features. Schema-complete; runtime partial — the full
spec round-trips via `config.v1_extensions`.

| Preset | Feature | Research basis |
|--------|-----------|----------------|
| `arch_expert_reasoning_r1` | `execution_modes` + `transition` (2-phase think/answer) | DeepSeek-R1 (2025), Quiet-STaR (NeurIPS 2024) |
| `arch_expert_titans_memory` | `template.memory` (parametric + test-time update) | Titans (Google 2024-2025) |
| `arch_expert_dual_stream` | `parallel:` monoidal schedule (Mamba ∥ Attention) | Jamba × PaLM generalization |

Presets are starting points, not the ceiling. EulerStack can assemble models
of essentially any size — the schema has no size cap.
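
For instance, scaling the minimal spec from earlier up a size class is a two-hunk diff (illustrative numbers; re-run `validate --report` to see the new estimate):

```diff
 model:
-  d_model: 2048
+  d_model: 3072

 layer_schedule:
   - template: decoder
-    repeat: 24
+    repeat: 32
```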

## Where EulerStack Fits (End-to-End Pipeline)

EulerStack is deliberately a **narrow tool**: it produces a well-formed, randomly-initialised Hugging Face model. A realistic LLM pipeline looks like this, and EulerStack owns only the first box.

```
  ┌──────────────┐    ┌──────────────┐    ┌───────────────┐    ┌────────────┐    ┌────────┐
  │  EulerStack  │ -> │   Pretrain   │ -> │ Post-training │ -> │  Evaluate  │ -> │  Serve │
  │  (this tool) │    │  your choice │    │  SFT / DPO /  │    │ your suite │    │  your  │
  │              │    │  of trainer  │    │   RLHF / etc. │    │            │    │  stack │
  └──────────────┘    └──────────────┘    └───────────────┘    └────────────┘    └────────┘
        ^
   YAML spec in,
  HF model out
```

Because the output is a standard `PreTrainedModel`, you can pair EulerStack with any training stack you already trust:

- **Pretraining / continued pretraining:** Megatron-LM, NeMo, TorchTitan, Hugging Face Trainer, Composer, Levanter, GPT-NeoX.
- **Fine-tuning (full / LoRA / QLoRA):** PEFT, TRL, Axolotl, Unsloth, LLaMA-Factory.
- **Alignment:** TRL (DPO / PPO / KTO), OpenRLHF.
- **Serving:** vLLM, SGLang, TGI, TensorRT-LLM — any engine that loads `transformers` checkpoints.

This scope separation is intentional. Training is a fast-moving space with strong, well-maintained tools; EulerStack does not try to re-implement any of them. What it does do is give you a **stable, reviewable, reproducible starting point** so that every downstream step operates on an architecture whose shape is explicit and auditable.

A typical workflow:

```bash
# 1. Design and validate an architecture
eulerstack --lang en validate --preset my_model.yml --report

# 2. Export a Hugging Face model directory (random weights)
eulerstack --lang en compile --preset my_model.yml --output-dir ./my_model_hf

# 3. Hand off to your training stack of choice, e.g. with transformers Trainer:
#    model = AutoModelForCausalLM.from_pretrained("./my_model_hf")
#    trainer = Trainer(model=model, train_dataset=..., ...)
#    trainer.train()
```
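
Expanded into runnable form, step 3 could look like the following with the Hugging Face `Trainer` (the wikitext corpus and hyperparameters here are placeholders, not a recommended recipe):

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM, AutoTokenizer,
    DataCollatorForLanguageModeling, Trainer, TrainingArguments,
)

model = AutoModelForCausalLM.from_pretrained("./my_model_hf")
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # per tokenizer_contract
tokenizer.pad_token = tokenizer.eos_token

# Placeholder corpus; substitute your own pretraining data.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./ckpts", per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```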

## Project Layout

```
eulerstack/
├── eulerstack/          # Python package
│   ├── spec/            # Schema, validation, parameter estimation, reports
│   ├── ir/              # Typed intermediate representation + normalizer
│   ├── compiler/        # IR -> runtime config / HF model directory
│   ├── components/      # Attention, Mamba, RetNet, Hyena, MoE, norms, ...
│   ├── blocks/          # Layer templates composed from components
│   ├── assembler/       # Layer-schedule materialisation
│   ├── hf/              # Hugging Face export (config.json, safetensors)
│   ├── cli/             # `eulerstack` command
│   └── i18n/            # 5-language CLI message catalog
├── configs/presets/     # 52 ready-to-compile YAML specs
├── examples/            # Runnable scripts (compile → export → load → generate)
├── tests/               # Unit + smoke tests
└── pyproject.toml
```

## Tutorials

Full, searchable online tutorials are published at:

🌐 **[eulerwa.com/en/products/eulerstack/tutorials/](https://eulerwa.com/en/products/eulerstack/tutorials/)**

The offline copy under [docs/tutorials/en/](docs/tutorials/en/) mirrors the
site and is the best place to start if you prefer to read locally. Key
entry points:

- [Tutorial 0 — Where EulerStack Fits](docs/tutorials/en/00_positioning.md)
  explains why EulerStack is an **Architecture Description Language (ADL)
  for LLMs**, not a training framework.
- [Tutorial 2 — Use Presets](docs/tutorials/en/02_use_presets.md) walks
  through the 52 shipped presets organised in three tiers.
- [Tutorial 10 — Paper → YAML](docs/tutorials/en/10_paper_to_yaml.md)
  ports DeepSeek-V3 / Jamba / DeepSeek-R1 / Titans into YAML through a
  professor/student dialogue.

## Examples

See [examples/](examples/) for runnable scripts:

- `01_compile_and_export.py` — compile a preset and save as an HF model directory.
- `02_load_and_generate.py` — load the exported model with `transformers` and generate.
- `03_architecture_evolution.py` — compare several architectures side by side.

## Testing

```bash
python -m pytest tests/ -v
```

The unit suite covers schema validation, IR normalisation, compilation, parameter estimation, report generation, and CLI behaviour for every bundled preset.

## Contributing

Contributions are welcome. Please open an issue to discuss substantial changes (new mixer types, schema changes, new presets) before sending a PR. For small fixes or clarifications, a PR is fine on its own.

When adding a new component (e.g. a new mixer), the rough checklist is as follows (a hypothetical sketch of steps 1 and 2 appears after the list):

1. Implement the block under `eulerstack/components/` or `eulerstack/blocks/`.
2. Register it so the schema accepts it.
3. Add a minimal preset in `configs/presets/` that exercises it.
4. Add tests alongside the existing suite in `tests/`.
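
A hypothetical sketch of steps 1 and 2. The `register_mixer` import and the constructor signature are illustrative assumptions, not EulerStack's actual registry API; mirror an existing component under `eulerstack/components/` for the real pattern:

```python
import torch
from torch import nn

# Hypothetical registration hook, for illustration only; the real
# one lives in eulerstack/components/ and may differ.
from eulerstack.components import register_mixer  # hypothetical import


@register_mixer("identity")  # the name the schema would then accept
class IdentityMixer(nn.Module):
    """A do-almost-nothing mixer: a minimal registration example."""

    def __init__(self, d_model: int, **kwargs):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.proj(hidden_states)
```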

## License

Licensed under the **Apache License, Version 2.0**. See [LICENSE](LICENSE) for the full text.

Copyright © 2026 Eulerwa Inc.

## Contact

**Eulerwa Inc.**
🌐 Website: [eulerwa.com](https://eulerwa.com)
📚 Tutorials: [eulerwa.com/en/products/eulerstack/tutorials/](https://eulerwa.com/en/products/eulerstack/tutorials/)
📧 Tech contact: [tech@eulerwa.com](mailto:tech@eulerwa.com)
