Metadata-Version: 2.4
Name: finlang
Version: 0.7.3
Summary: FinLang: a deterministic, auditable DSL for financial rules
Author: FinLang Ltd
License-Expression: AGPL-3.0-only
Project-URL: Homepage, https://finlang.io
Project-URL: Repository, https://github.com/FinLang-Ltd/finlang
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Topic :: Office/Business :: Financial
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=2.0
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: check-manifest; extra == "dev"
Requires-Dist: PyYAML; extra == "dev"
Provides-Extra: bench
Requires-Dist: matplotlib; extra == "bench"
Requires-Dist: numpy; extra == "bench"
Provides-Extra: fastio
Requires-Dist: pyarrow; extra == "fastio"
Dynamic: license-file

# FinLang — The Financial Rules Engine

**Deterministic. Auditable. Global.**  
Designed for explainable processing in regulated environments.

[![PyPI version](https://badge.fury.io/py/finlang.svg)](https://badge.fury.io/py/finlang)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![Build Status](https://img.shields.io/badge/build-passing-brightgreen)](https://github.com/FinLang-Ltd/finlang)
[![Python versions](https://img.shields.io/pypi/pyversions/finlang.svg)](https://pypi.org/project/finlang/)

---

## 🌐 Overview

**FinLang** is a domain-specific language (DSL) and high-performance CLI engine for financial transaction processing.  
It replaces opaque machine-learning categorization with **transparent, deterministic rules** — delivering explainability, auditability, and global compatibility.

> **Built for audit-friendly logic and deterministic processing.**  
> A deterministic alternative where explainability and reproducibility matter.

---

## 📝 The FinLang DSL

FinLang rules are human-readable, Git-friendly, and designed for precision.  
The engine processes rules top-to-bottom; the last matching rule sets the category, while flags accumulate.

```fin
# Example: Basic categorization and flagging
rule "GROCERIES: Tesco" {
  match:
    - counterparty ~ "*TESCO*"
  set:
    - category = "Groceries"
    - flags += "Supermarket"
}

# Example: Numeric range and exact match
rule "TRAVEL: High Value Flight" {
  match:
    - counterparty == "BRITISH AIRWAYS"
    - amount in -5000.00 .. -500.00
  set:
    - category = "Travel"
    - flags += "HighValue"
}
```
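
The evaluation order described above (top-to-bottom, last matching rule sets the category, flags accumulate) can be modelled in a few lines of Python. This is only an illustrative sketch, not FinLang's implementation: the `Rule` class, the `apply_rules` function, the second rule, the use of `fnmatchcase` for the `~` wildcard operator, and the inclusive treatment of the `..` range are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Callable, Optional


@dataclass
class Rule:
    """Hypothetical in-memory stand-in for one `.fin` rule."""
    name: str
    matches: Callable[[dict], bool]
    category: Optional[str] = None
    flags: list = field(default_factory=list)


def apply_rules(txn: dict, rules: list) -> dict:
    """Evaluate rules in order: the last match sets category, flags accumulate."""
    category = None
    flags = []
    for rule in rules:
        if not rule.matches(txn):
            continue
        if rule.category is not None:
            category = rule.category  # a later match overwrites an earlier one
        for flag in rule.flags:
            if flag not in flags:  # append-only, first-seen order, no duplicates
                flags.append(flag)
    return {**txn, "category": category, "flags": flags}


rules = [
    Rule("GROCERIES: Tesco",
         lambda t: fnmatchcase(t["counterparty"], "*TESCO*"),
         category="Groceries", flags=["Supermarket"]),
    # Hypothetical second rule, added only to show the overwrite behaviour.
    Rule("RETAIL: High Value",
         lambda t: -5000.00 <= t["amount"] <= -500.00,
         category="Retail", flags=["HighValue", "Supermarket"]),
]

result = apply_rules({"counterparty": "TESCO STORES 1234", "amount": -650.00}, rules)
print(result["category"], result["flags"])  # Retail ['Supermarket', 'HighValue']
```

Both rules match the sample transaction, so the category comes from the later rule while the flags from both are kept once each.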

---

## ⚙️ Key Features (v0.7.2)

| Feature | Description |
|:--|:--|
| **Deterministic DSL** | Human-readable `.fin` rules language — explainable logic, Git-friendly. |
| **High-Performance Engine** | Vectorized core (Pandas + NumPy + PyArrow); ≈26.6K rows/sec validated throughput on the 5M-row benchmark. |
| **Dual Backend** | Standard (`Engine: c`) or FastIO (`Engine: pyarrow`) with automatic fallback. |
| **Growth Loop** | Automated Discover → Suggest → Categorize workflow — 97.8% success on addressable patterns. |
| **Global I18n Support** | US/UK/EU/Commonwealth formats, £ € $ ¥ ₹ stripping, localized decimals/dates/delimiters. |
| **Audit Trail System** | Every decision logged (before/after state diffs); stateless for reproducibility. |
| **CR/DR Semantics** | Case-insensitive CR/DR, accounting negatives `(123.45)`, trailing minus `123.45-`. |
| **Amount Synthesis** | Auto-computes `amount = abs(credit) - abs(debit)` across 9 edge cases. |
| **Strict Parsing** | Locale-aware normalization with configurable thresholds (`--strict-parse`). |
| **Flag Integrity** | Append-only (`flags +=`) with deterministic deduplication. |
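
The accounting-negative formats and the amount synthesis formula from the table above can be sketched in Python as follows. `parse_amount` and `synthesize_amount` are hypothetical helper names, treating a blank or missing cell as zero is an assumption, and the sketch does not attempt to reproduce the 9 edge cases or the CR/DR markers the engine handles.

```python
from decimal import Decimal
from typing import Optional


def parse_amount(text: str) -> Decimal:
    """Parse plain numbers, accounting negatives `(123.45)`, and trailing minus `123.45-`."""
    s = text.strip()
    negative = False
    if s.startswith("(") and s.endswith(")"):
        negative, s = True, s[1:-1]
    elif s.endswith("-"):
        negative, s = True, s[:-1]
    value = Decimal(s.replace(",", "").strip())  # assumes "," as thousands separator
    return -value if negative else value


def synthesize_amount(credit: Optional[str], debit: Optional[str]) -> Decimal:
    """amount = abs(credit) - abs(debit); blank cells count as zero (assumption)."""
    c = abs(parse_amount(credit)) if credit and credit.strip() else Decimal("0")
    d = abs(parse_amount(debit)) if debit and debit.strip() else Decimal("0")
    return c - d


print(parse_amount("(123.45)"))             # -123.45
print(synthesize_amount("", "50.00"))       # -50.00
print(synthesize_amount("1,200.00", None))  # 1200.00
```

Using `Decimal` rather than `float` keeps the arithmetic exact, which suits a deterministic pipeline.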

---

## 📦 Installation

**Requirements:** Python 3.10–3.14

**From PyPI (Recommended):**
```bash
pip install finlang
```

**With Fast I/O (PyArrow):**
```bash
pip install "finlang[fastio]"
```
*(Enables `--fastio` for accelerated CSV I/O.)*

**From Source (Development):**
```bash
git clone https://github.com/FinLang-Ltd/finlang.git
cd finlang
pip install -e ".[fastio]"
```

---

## 🚀 Quick Start: The 4-Step Growth Loop

1️⃣ **Initial Categorization**
```bash
finlang --input transactions.csv --output baseline.csv \
  --rules my_rules.fin --include-pack retail,transport
```

2️⃣ **Discover Gaps**
```bash
finlang-discover --input baseline.csv \
  --candidates candidates.csv --all-candidates all_candidates.csv \
  --min-count 5
```

3️⃣ **Suggest Rules (Exact Mode Recommended)**
```bash
finlang-suggest --input candidates.csv --output suggested_rules.fin \
  --rules my_rules.fin --emit-match exact
```

4️⃣ **Merge and Re-run**
```bash
cat my_rules.fin suggested_rules.fin > merged.fin
finlang --input transactions.csv --output improved.csv \
  --rules merged.fin --include-pack retail,transport
```

✅ **Expected Result:** 5–10% coverage improvement; zero duplicates in `exact` mode.

---

## 📊 Performance Benchmarks

Measured with `--audit-mode none` (max throughput).

| Dataset | Test | Rules | Time (s) | Rows/sec | Notes |
|:--|:--|:--:|:--:|:--:|:--|
| 100 K (UK Synthetic) | Growth Loop | 121 | 2.54 | **39,370** ✅ | Baseline |
| 100 K (after Growth Loop) | Growth Loop | 764 | 4.96 | **20,161** ✅ | +6.3× rules → ≈ 2× slower |
| **5M × 50 cols** | Benchmark Harness | — | **187.76** | **26,600** ✅ | High volume validation |

> **v0.7.2 improvement:** 10% faster than v0.6.4 (208s → 188s), +12% throughput.  
> **Audit Overhead:** Enabling `--audit-mode lite/full` **reduces throughput by ≈38%** due to diff calculation; provides full decision provenance.  
>
> **Note:** These figures are validated benchmark results from controlled tests (5M × 50 columns). Actual performance varies depending on dataset, ruleset, and audit mode.  
> See [`docs/benchmarks.md`](docs/benchmarks.md) for details.

---

## 🔐 Cryptographic Integrity Verification (Benchmark)

SHA-256 fingerprint verification benchmarked on large datasets:

| Rows | Full Validation | Engine (FastIO) | Result |
|:--:|:--:|:--:|:--|
| 5M | ~5 min | 133K rows/s | ✅ All fingerprints match |
| 10M | ~10 min | 156K rows/s | ✅ All fingerprints match |
| **20M** | **~20 min** | **159K rows/s** | **✅ All fingerprints match** |

> **What this benchmark validated:** Every row's immutable fields (`date`, `amount`, `counterparty`) were verified via SHA-256 hash before and after engine processing. Zero cross-row contamination detected. Zero data corruption detected.
>
> **Note:** This benchmark was performed in the test suite. SHA-256 verification is not currently part of the standard runtime CLI — it is included for validation purposes and will be available as a CLI flag in a future release.
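
A minimal sketch of this kind of check, using Python's standard `hashlib`, is shown below. It is not the test-suite code: the function names, the unit-separator join, and the string encoding of each field are assumptions chosen for illustration.

```python
import hashlib

# Fields the benchmark describes as immutable.
IMMUTABLE_FIELDS = ("date", "amount", "counterparty")


def row_fingerprint(row: dict) -> str:
    """SHA-256 over the immutable fields, joined with a unit separator (assumed encoding)."""
    payload = "\x1f".join(str(row[name]) for name in IMMUTABLE_FIELDS)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_rows(before: list, after: list) -> list:
    """Return the indices of rows whose fingerprint differs after processing."""
    if len(before) != len(after):
        raise ValueError("row count changed during processing")
    return [
        i for i, (b, a) in enumerate(zip(before, after))
        if row_fingerprint(b) != row_fingerprint(a)
    ]


before = [{"date": "2024-01-05", "amount": "-12.50", "counterparty": "TESCO STORES"}]
# Processing may add columns such as category, but must not touch immutable fields.
after = [{**before[0], "category": "Groceries"}]
print(changed_rows(before, after))  # []
```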

---

## 🌍 Internationalization Matrix

| Region | Example Number | Date Order | CLI Flags |
|:--|:--:|:--:|:--|
| 🇺🇸 US / 🇨🇦 Canada | 1,234.56 | MM/DD | (defaults) |
| 🇬🇧 UK / 🇦🇺 Commonwealth | 1,234.56 | DD/MM | `--dayfirst` |
| 🇪🇺 Continental Europe | 1.234,56 | DD/MM | `--decimal "," --thousands "." --dayfirst` |
| 🇨🇭 Switzerland | 1'234.56 | DD/MM | `--thousands "'" --dayfirst` |

**Auto-Detection and Normalization:** BOM-safe UTF-8 decoding, delimiter detection (`,`, `;`, `|`, tab), and automatic currency-symbol stripping.
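
To make the matrix concrete, here is a rough Python sketch of how a localized number string could be normalized. The `normalize_number` helper is hypothetical; its `decimal` and `thousands` parameters simply mirror the CLI flags of the same names, and the symbol list is taken from the feature table.

```python
from decimal import Decimal

# Currency symbols listed in the feature table.
CURRENCY_SYMBOLS = "£€$¥₹"


def normalize_number(text: str, decimal: str = ".", thousands: str = ",") -> Decimal:
    """Strip currency symbols, drop the thousands separator, and unify the decimal mark."""
    s = text.translate({ord(ch): None for ch in CURRENCY_SYMBOLS}).strip()
    s = s.replace(thousands, "")  # must run before the decimal mark is rewritten
    s = s.replace(decimal, ".")
    return Decimal(s)


print(normalize_number("$1,234.56"))                                # 1234.56
print(normalize_number("€1.234,56", decimal=",", thousands="."))    # 1234.56
print(normalize_number("1'234.56", thousands="'"))                  # 1234.56
```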

---

## 🧠 The Growth Loop Explained

> **Discover → Suggest → Categorize → Repeat**

FinLang's Growth Loop accelerates rule creation through data-driven discovery.

- **Discover** uncategorized counterparties  
- **Suggest** new rules in seconds (1:1 mapping in exact mode)  
- **Merge + Re-run** for incremental coverage gains  
- **Validated Result:** 97.8% success on addressable patterns  
- **ROI:** 8.8 transactions categorized per new rule  

📄 See: [`docs/growth_loop_best_practices.md`](docs/growth_loop_best_practices.md)
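
The Discover and Suggest steps can be approximated in Python as below. This is a simplified sketch, not the logic of `finlang-discover` or `finlang-suggest`: the function names, the `SUGGESTED:` rule-name prefix, and the `TODO` category placeholder are hypothetical, and only the rule syntax is taken from the DSL examples earlier in this README.

```python
from collections import Counter


def discover_candidates(rows: list, min_count: int = 5) -> list:
    """Count uncategorized counterparties and keep those seen at least min_count times."""
    counts = Counter(r["counterparty"] for r in rows if not r.get("category"))
    return [(name, n) for name, n in counts.most_common() if n >= min_count]


def suggest_exact_rule(counterparty: str, category: str = "TODO") -> str:
    """Emit one exact-match rule per counterparty (1:1 mapping); no quote escaping is done."""
    return (
        f'rule "SUGGESTED: {counterparty}" {{\n'
        f"  match:\n"
        f'    - counterparty == "{counterparty}"\n'
        f"  set:\n"
        f'    - category = "{category}"\n'
        f"}}\n"
    )


rows = (
    [{"counterparty": "ACME LTD", "category": ""}] * 6
    + [{"counterparty": "RARE SHOP", "category": ""}] * 2
    + [{"counterparty": "TESCO STORES", "category": "Groceries"}] * 7
)
for name, count in discover_candidates(rows, min_count=5):
    print(f"# seen {count} times")
    print(suggest_exact_rule(name))
```

Only `ACME LTD` clears the threshold here: `RARE SHOP` appears too rarely and `TESCO STORES` is already categorized.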

---

## 🧾 Known Limitations (v0.7.x)

- ⚠️ `--emit-match fuzzy` (the default) uses naive tokenization and may produce overly broad patterns (e.g. `*PLC*`).  
  → Use `--emit-match exact` for production workflows.  
- ⚠️ Hyphenated/apostrophe names may affect fuzzy matching (< 1% impact).  
- ⚠️ No support for non-Gregorian calendars or non-Western numerals.

---

## 📘 Documentation

- [`docs/release_notes/v0.7.2.md`](docs/release_notes/v0.7.2.md)  
- [`docs/runtime_contract.md`](docs/runtime_contract.md)  
- [`docs/cli_reference.md`](docs/cli_reference.md)  
- [`docs/rulepacks.md`](docs/rulepacks.md)  
- [`docs/benchmarks.md`](docs/benchmarks.md)  
- [`docs/growth_loop_best_practices.md`](docs/growth_loop_best_practices.md)  
- [`docs/amount_synthesis.md`](docs/amount_synthesis.md)  
- [`docs/i18n_examples.md`](docs/i18n_examples.md)  
- [`docs/stateless_processing.md`](docs/stateless_processing.md)

**Command-line help:**
```bash
finlang --help
finlang-discover --help
finlang-suggest --help
```

---

## 🧩 Example CLI Usage

```bash
finlang --input bank.csv --output categorized.csv \
  --rules examples/rules.demo.fin \
  --include-pack retail,transport,subs \
  --fastio --audit audit_log.json --audit-mode lite
```

---

## 📜 License & Commercial Use

FinLang is open source under the **GNU Affero General Public License v3.0 only (AGPL-3.0-only)**.  
Commercial licenses and enterprise support are available via **FinLang Ltd**.

📧 info@finlang.io  
🌐 https://finlang.io

---

## Contributing

Contributions are welcome! Before submitting a PR, please review and accept our
[Contributor Licence Agreement (CLA)](docs/legal/CLA.md).

---

## 📌 Version Summary

| Component | Version | Status |
|:--|:--|:--|
| Core Engine      | v0.7.2    | ✅ Production-Ready  |
| CLI Suite        | v0.7.2    | ✅ Validated         |
| Discover/Suggest | v0.7.2    | ✅ 97.8% success     |
| Integrity Test   | v0.7.2    | ✅ 20M rows verified |
| Docs             | v0.7.2    | ✅ Complete          |
| Python Support   | 3.10–3.14 | ✅ Tested            |
