Metadata-Version: 2.4
Name: dbx-sql-runner
Version: 0.1.0
Summary: A lightweight SQL transformation tool for Databricks SQL
Author-email: Sharma <munish7771@gmail.com>
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENCE.txt
Requires-Dist: networkx>=3.0
Requires-Dist: databricks-sql-connector[pyarrow]>=3.0
Requires-Dist: PyYAML>=6.0
Dynamic: license-file

# dbx-sql-runner

A lightweight, library-first SQL transformation tool for Databricks SQL, inspired by dbt.

## Features

- **Simple SQL Models**: Just write `.sql` files. No complex boilerplate.
- **Automated Dependency Management**: Reference other models using `{upstream_model}` and let the runner build the DAG for you.
- **Environment Aware**: Seamlessly switch between dev and prod using `profiles.yml` and environment variables.
- **Library Design**: Import `dbx_sql_runner` in your Python scripts (great for Airflow/Databricks Jobs) or run it via CLI.
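
The dependency management described above boils down to a topological sort of the model graph. As a rough illustration of the idea (not the runner's actual internals, which use `networkx` per the dependency list), here is a minimal sketch using Python's standard-library `graphlib`; the function name `plan_run_order` is hypothetical:

```python
import re
from graphlib import TopologicalSorter

def plan_run_order(models: dict[str, str]) -> list[str]:
    """Return an order in which every model runs after its upstreams.

    `models` maps a model name to its SQL text; any {placeholder} that
    names another model is treated as a dependency on that model.
    """
    deps: dict[str, set[str]] = {}
    for name, sql in models.items():
        refs = re.findall(r"\{(\w+)\}", sql)
        deps[name] = {ref for ref in refs if ref in models}
    # static_order() yields dependency-free nodes first, downstream last.
    return list(TopologicalSorter(deps).static_order())

models = {
    "staging_orders": "SELECT * FROM raw.orders",
    "daily_revenue": "SELECT order_date, SUM(amount) FROM {staging_orders} GROUP BY order_date",
}
print(plan_run_order(models))  # staging_orders before daily_revenue
```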

## Installation

### Local Development
To install the project in editable mode:

```bash
pip install -e .
```

### Production
To install the package normally:

```bash
pip install .
```

## Configuration (profiles.yml)
Create a `profiles.yml` file to store your credentials. **Do not commit this file to version control.**

```yaml
server_hostname: "dbc-xxxxxxxx-xxxx.cloud.databricks.com"
http_path: "/sql/1.0/warehouses/xxxxxxxxxxxxxxxx"
access_token: "${DBX_ACCESS_TOKEN}"  # Use env vars for security!
catalog: "my_catalog"
schema: "my_schema"
sources:
    my_source: "prod_catalog.schema.table"
```

## Usage

### 1. CLI (Easiest)
Run your project from the command line. By default, it looks for `profiles.yml` in the current directory.

```bash
# Run with default profile (profiles.yml)
dbx-sql-runner run

# Run with custom profile
dbx-sql-runner run --profile my_config.yml

# Preview execution plan
dbx-sql-runner build
```

### 2. Python (Advanced)
For fine-grained control (e.g., inside a Databricks Job):

```python
from dbx_sql_runner.api import run_project

# Run models in the 'models/' directory using the config from 'profiles.yml'
run_project(models_dir="models", config_path="profiles.yml")
```

## Project Structure
```text
.
├── models/                  # SQL files (.sql)
│   └── example.sql
├── dbx_sql_runner/          # Library source code
│   ├── adapters/            # Database Adapters
│   ├── project.py           # Model Loading & DAG
│   └── runner.py            # Execution Orchestrator
├── profiles.yml             # Configuration (gitignored)
├── pyproject.toml           # Project metadata
└── README.md
```

## Defining Models
Create `.sql` files in your `models/` directory. 
- Use header comments for metadata.
- Use `{upstream_model}` syntax for references; dependencies are inferred automatically.

```sql
-- name: my_table
-- materialized: table
-- partition_by: date, region

SELECT * FROM {source_view}
WHERE id > 100
```
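
The `-- key: value` header convention above can be parsed with a few lines of Python. This is a hypothetical sketch of the idea, not the runner's actual parser:

```python
import re

def parse_model_header(sql: str) -> dict[str, str]:
    """Extract '-- key: value' metadata from a model's leading comment lines."""
    meta: dict[str, str] = {}
    for line in sql.splitlines():
        m = re.match(r"--\s*(\w+)\s*:\s*(.+)", line.strip())
        if m:
            meta[m.group(1)] = m.group(2).strip()
        elif line.strip():
            break  # stop at the first non-comment, non-blank line (the SQL body)
    return meta
```

For the example model above, this would yield `{"name": "my_table", "materialized": "table", "partition_by": "date, region"}`.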
