Metadata-Version: 2.4
Name: dc-python-sdk
Version: 1.5.25
Summary: Data Connector Python SDK
Home-page: https://github.com/data-connector/dc-python-sdk
Author: DataConnector
Author-email: josh@dataconnector.com
Project-URL: Bug Tracker, https://github.com/data-connector/dc-python-sdk/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi
Requires-Dist: uvicorn
Requires-Dist: awslambdaric
Requires-Dist: requests
Requires-Dist: boto3>=1.40.0
Requires-Dist: openai
Provides-Extra: test
Requires-Dist: python-dotenv>=0.20.0; extra == "test"
Requires-Dist: faker>=13.12.0; extra == "test"
Requires-Dist: numpy>=1.21.6; extra == "test"
Requires-Dist: pandas>=1.3.5; extra == "test"
Provides-Extra: ai
Requires-Dist: openai; extra == "ai"
Dynamic: license-file

# Data Connector Python SDK

A comprehensive Python SDK for building robust data connectors with standardized error handling and graceful failure management.

[![PyPI version](https://badge.fury.io/py/dc-python-sdk.svg)](https://badge.fury.io/py/dc-python-sdk)
[![Python versions](https://img.shields.io/pypi/pyversions/dc-python-sdk.svg)](https://pypi.org/project/dc-python-sdk/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Table of Contents

- [Installation](#installation)
- [Quick Start](#quick-start)
- [Error Handling](#error-handling)
- [Error Categories](#error-categories)
- [Best Practices](#best-practices)
- [Examples](#examples)
- [Development](#development)
- [Version Management](#version-management-and-release-process)
- [Contributing](#contributing)

## Installation

### Install from PyPI

```bash
pip install dc-python-sdk
```

### Install from Source

```bash
git clone https://github.com/data-connector/dc-python-sdk.git
cd dc-python-sdk
pip install -e .
```

### Requirements

- Python >= 3.6
- setuptools >= 42

## Quick Start

### Using the error classes

```python
from dc_sdk import errors

# Example: Handling authentication failure
try:
    # Your authentication logic here
    authenticate_user(credentials)
except Exception as e:
    # Chain the original exception so it stays available for logging
    raise errors.AuthenticationError("Invalid credentials provided. Please check your username and password.") from e

# Example: Handling missing objects
def get_available_objects():
    objects = fetch_objects_from_api()
    if not objects:
        raise errors.NoObjectsFoundError("No tables or objects found for this account. Please ensure your account has accessible data.")
    return objects
```

### Running the local HTTP server

Install the SDK and start the HTTP server that wraps your connector:

```bash
pip install dc-python-sdk

# starts a FastAPI server on port 8000
dc-sdk http
```

Then you can POST to the `/invoke` endpoint:

```bash
curl -X POST http://localhost:8000/invoke \
  -H "Content-Type: application/json" \
  -d '{
    "method": "get_objects",
    "credentials": { "api_key": "YOUR_KEY" },
    "params": {}
  }'
```
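
The same request can be sent from Python. The sketch below assumes the server started by `dc-sdk http` is listening on `localhost:8000` and accepts the JSON body shown in the curl example. `build_invoke_payload` and `invoke` are hypothetical helper names, not part of the SDK, and the response is assumed to be JSON; its shape is not documented here, so it is returned without interpretation.

```python
import json
import urllib.request

# Assumed local address, taken from the curl example.
INVOKE_URL = "http://localhost:8000/invoke"


def build_invoke_payload(method, credentials, params=None):
    """Build the JSON body used in the curl example (hypothetical helper)."""
    return {"method": method, "credentials": credentials, "params": params or {}}


def invoke(method, credentials, params=None, url=INVOKE_URL, timeout=30):
    """POST the payload to /invoke and return the decoded JSON response."""
    body = json.dumps(build_invoke_payload(method, credentials, params)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the local server to be running):
#   objects = invoke("get_objects", {"api_key": "YOUR_KEY"})
```

Keeping the payload builder separate from the network call makes the request body easy to check in a unit test without a running server.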

### Using the AWS Lambda handler

The SDK also exposes a Lambda-style handler in `dc_sdk.handler`:

```python
from dc_sdk.handler import handler

def lambda_handler(event, context):
    return handler(event, context)
```

## Error Handling

> **Reference**: [Error Handling Documentation](https://data-connector.atlassian.net/wiki/spaces/SDL/pages/113705073/Error+Handling+-+Please+Read)

Error handling is one of the most important ways we can provide users with clear, informative feedback—so they're not left wondering what some random "Error 500" means (because nothing says "fun" like debugging a vague server error at 2 AM).

Our system uses the `dc_sdk` library to gracefully throw errors. This ensures:

- **Users see helpful messages** instead of cryptic stack traces
- **Unhandled errors escalate** as server errors, which automatically generate a bug ticket for the engineering team
- **Errors can be tested, logged, and surfaced consistently** across all connectors
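
One way to picture the split between graceful and unhandled errors is the sketch below. It is not the SDK's actual handler: `ConnectorError` is a hypothetical base class, `AuthenticationError` is a local stand-in for the SDK class of the same name, and the response dictionaries are invented for illustration.

```python
class ConnectorError(Exception):
    """Hypothetical base class standing in for the SDK's graceful errors."""


class AuthenticationError(ConnectorError):
    """Local stand-in named after the SDK's AuthenticationError."""


def run_connector_method(func, *args, **kwargs):
    """Call a connector method and translate failures into a response dict."""
    try:
        return {"status": "ok", "result": func(*args, **kwargs)}
    except ConnectorError as exc:
        # Graceful error: the message was written for the user, so surface it.
        return {"status": "error", "type": type(exc).__name__, "message": str(exc)}
    except Exception as exc:
        # Unhandled error: show a generic message and keep details for engineering.
        return {
            "status": "server_error",
            "message": "Unexpected server error",
            "detail": repr(exc),
        }


def bad_login():
    """Simulate a connector method that fails gracefully."""
    raise AuthenticationError("Invalid API key. Please check your credentials.")


def crash():
    """Simulate a connector method with an unanticipated bug."""
    return 1 / 0
```

Calling `run_connector_method(bad_login)` yields the user-facing message, while `run_connector_method(crash)` falls through to the generic server error branch.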

## Error Categories

The SDK provides error classes organized into logical categories. Each error is designed to handle a specific failure scenario:

### Authentication Errors
- `AuthenticationError` - Invalid credentials, expired tokens
- `WhitelistError` - IP not whitelisted, connection refused

### Object & Field Errors  
- `NoObjectsFoundError` - No tables/objects available
- `GetObjectsError` - Failed to retrieve objects
- `NoFieldsFoundError` - Object exists but has no fields
- `GetFieldsError` - Cannot retrieve field information
- `BadFieldIDError` - Invalid field ID for object
- `BadObjectIDError` - Object ID doesn't exist

### Data Filtering & Mapping Errors
- `FilterDataTypeError` - Invalid data type for filtering
- `FieldDataTypeError` - Unsupported field data type
- `MappingError` - Data mapping failures

### Data Retrieval Errors
- `DataError` - Generic data retrieval failure
- `APIRequestError` - API returned error status
- `APITimeoutError` - Request timeout
- `APIPermissionError` - Insufficient API permissions
- `NoRowsFoundError` - Query successful but no data

### Data Loading Errors
- `LoadDataError` - Failed to load data to destination
- `NotADestinationError` - Connector is read-only
- `UpdateMethodNotSupportedError` - Invalid update method

### Implementation Errors
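- `NotImplementedError` - Required method not implemented (this shares its name with Python's built-in `NotImplementedError`, so refer to it as `errors.NotImplementedError`)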

For detailed examples and usage patterns, see the [Examples](#examples) section below.
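
As a rough illustration of choosing a category, the sketch below maps an upstream HTTP status code to one of the authentication or data retrieval errors. The mapping is an assumption about how a connector might classify responses, not behavior defined by the SDK. The classes are local stand-ins so the snippet runs without `dc_sdk` installed; in a connector you would raise the classes from `dc_sdk.errors` instead. `check_status` is a hypothetical helper name.

```python
class AuthenticationError(Exception):
    """Local stand-in for dc_sdk.errors.AuthenticationError."""


class APIPermissionError(Exception):
    """Local stand-in for dc_sdk.errors.APIPermissionError."""


class APITimeoutError(Exception):
    """Local stand-in for dc_sdk.errors.APITimeoutError."""


class APIRequestError(Exception):
    """Local stand-in for dc_sdk.errors.APIRequestError."""


def check_status(status_code, endpoint):
    """Raise a category-specific error for a failed upstream response."""
    if status_code < 400:
        return  # Success or redirect: nothing to report.
    if status_code == 401:
        raise AuthenticationError(
            f"The API rejected the credentials when calling {endpoint}."
        )
    if status_code == 403:
        raise APIPermissionError(
            f"The credentials do not have permission to access {endpoint}."
        )
    if status_code in (408, 504):
        raise APITimeoutError(f"The request to {endpoint} timed out.")
    # Anything else is reported as a generic API failure.
    raise APIRequestError(f"The API returned status {status_code} for {endpoint}.")
```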

## Best Practices

1. **Always raise the most specific error possible** (don't just raise a generic `Exception`)

2. **Add a helpful message**—this is surfaced directly to the user

3. **Differentiate between client-facing and internal errors**:
   - **Client-facing errors** (e.g., `AuthenticationError`, `WhitelistError`) provide actionable guidance
   - **Internal errors** (e.g., `GetObjectsError`) indicate issues within the connector or API

4. **Use descriptive error messages** that help users understand what went wrong and how to fix it

5. **Include context** in error messages when possible (e.g., which field, table, or operation failed)
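
A minimal sketch of practices 1, 2, and 5 together: a low-level failure is wrapped in a specific error whose message names the object and suggests a next step. `GetFieldsError` here is a local stand-in so the snippet runs on its own, and `fetch_fields` and its `fetch` argument are hypothetical names.

```python
class GetFieldsError(Exception):
    """Local stand-in for dc_sdk.errors.GetFieldsError."""


def fetch_fields(object_id, fetch):
    """Fetch fields for an object, wrapping low-level failures with context."""
    try:
        return fetch(object_id)
    except KeyError as exc:
        # Name the object and the operation so the user knows what failed.
        # "from exc" keeps the original exception available for logging.
        raise GetFieldsError(
            f"Could not retrieve fields for object '{object_id}'. "
            "Check that the object still exists and try again."
        ) from exc
```

With exception chaining, the user sees the friendly message while the original `KeyError` remains attached as `__cause__` for anyone reading the logs.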

## Examples

### Basic Usage

```python
from dc_sdk import errors

# Authentication example
def authenticate_user(api_key):
    if not api_key:
        raise errors.AuthenticationError("API key is required")
    
    if not validate_api_key(api_key):
        raise errors.AuthenticationError("Invalid API key. Please check your credentials.")

# Object validation example
def get_table_data(table_name):
    if table_name not in available_tables:
        available = ", ".join(available_tables.keys())
        raise errors.BadObjectIDError(f"Table '{table_name}' not found. Available: {available}")
    
    try:
        return fetch_table_data(table_name)
    except Exception as e:
        raise errors.DataError(f"Failed to retrieve data from '{table_name}': {e}") from e
```

### Advanced Error Handling

```python
from dc_sdk import errors

class DataConnector:
    def sync_data(self, source_table, destination_table, update_method="append"):
        # Validate update method
        supported_methods = ["append", "replace"]
        if update_method not in supported_methods:
            raise errors.UpdateMethodNotSupportedError(
                f"Update method '{update_method}' not supported. Use: {', '.join(supported_methods)}"
            )
        
        # Check if connector supports destinations
        if not self.is_destination:
            raise errors.NotADestinationError("This connector is read-only and cannot receive data")
        
        try:
            # Perform the sync
            result = self.perform_sync(source_table, destination_table, update_method)
            if not result.success:
                raise errors.LoadDataError(f"Sync failed: {result.error_message}")
            # Return the result so callers can inspect the outcome
            return result
        except TimeoutError as e:
            raise errors.APITimeoutError("Sync operation timed out. Please try again with smaller batches.") from e
        except PermissionError as e:
            raise errors.APIPermissionError("Insufficient permissions to write to destination table") from e
```

## Development

### Setting up Development Environment

```bash
# Clone the repository
git clone https://github.com/data-connector/dc-python-sdk.git
cd dc-python-sdk

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install the package with its "test" extra, plus the test runner
pip install -e ".[test]"
pip install pytest pytest-cov

# Run tests
pytest
```

### Version Management and Release Process

#### Semantic Versioning

This project follows [Semantic Versioning](https://semver.org/) (SemVer):

- **MAJOR** version (X.y.z): Breaking changes that are not backward compatible
- **MINOR** version (x.Y.z): New features that are backward compatible  
- **PATCH** version (x.y.Z): Bug fixes and minor improvements that are backward compatible

**Examples:**
- `1.4.3` → `1.4.4`: Bug fixes or minor improvements
- `1.4.3` → `1.5.0`: New features added (backward compatible)
- `1.4.3` → `2.0.0`: Breaking changes (not backward compatible)
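
The bump rules can be expressed as a short helper. This is an illustrative sketch only: `bump_version` is a hypothetical function, not part of the SDK or its release tooling, and it handles plain `MAJOR.MINOR.PATCH` strings without pre-release or build suffixes.

```python
def bump_version(version, part):
    """Return the next version after bumping the given part of MAJOR.MINOR.PATCH."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Breaking change: reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backward-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backward-compatible fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part '{part}'. Use: major, minor, patch")
```

For example, `bump_version("1.4.3", "minor")` returns `"1.5.0"`.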

**Current version:** `1.5.0`

#### 1. Update Version Number

Edit the version in `pyproject.toml`:

```toml
[project]
name = "dc-python-sdk"
version = "1.5.0"  # bump this
```

#### 2. Update Package Description (Optional)

While updating the version, you can also update the package description in `pyproject.toml`:

```toml
[project]
name = "dc-python-sdk"
version = "1.5.0"
description = "Data Connector Python SDK for building robust connectors with standardized error handling"
```

#### 3. Commit Changes

```bash
# Stage your changes
git add pyproject.toml README.md

# Commit with a descriptive message
git commit -m "Bump version to 1.4.4 and update documentation"

# Push to main branch
git push origin main
```

#### 4. Create GitHub Release

You have two options for creating a release:

**Option A: Using GitHub Web Interface**

1. Go to your repository on GitHub
2. Click on "Releases" in the right sidebar
3. Click "Create a new release"
4. Fill in the release details:
   - **Tag version**: `v1.4.4` (must match the version in `pyproject.toml`)
   - **Release title**: `v1.4.4 - Description of changes`
   - **Description**: Add release notes describing what changed
5. Click "Publish release"

**Option B: Using GitHub CLI**

```bash
# Install GitHub CLI if not already installed
# Windows: winget install GitHub.cli
# macOS: brew install gh
# Linux: See https://cli.github.com/

# Create and push a tag
git tag v1.4.4
git push origin v1.4.4

# Create the release
gh release create v1.4.4 \
  --title "v1.4.4 - Enhanced Error Handling Documentation" \
  --notes "
  ## What's Changed
  - Updated comprehensive README with installation instructions
  - Added detailed error handling documentation
  - Improved code examples and best practices
  - Enhanced development setup instructions
  
  ## Installation
  \`\`\`bash
  pip install dc-python-sdk==1.4.4
  \`\`\`
  "
```
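
Because the release tag is expected to match the packaged version, a small check before tagging can catch mismatches. The sketch below is an assumption about how such a check might look, not part of the project's tooling. `read_project_version` and `tag_matches_version` are hypothetical helpers, and the regular expression is a simplification that finds the first `version = "..."` line rather than parsing TOML properly (on Python 3.11+, `tomllib` would be more robust).

```python
import re


def read_project_version(pyproject_text):
    """Return the first `version = "..."` value found in pyproject.toml text."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("No version field found in pyproject.toml text")
    return match.group(1)


def tag_matches_version(tag, pyproject_text):
    """Check that a git tag such as 'v1.5.0' matches the packaged version."""
    return tag == f"v{read_project_version(pyproject_text)}"


# Usage with the real file (run from the repository root):
#   with open("pyproject.toml", encoding="utf-8") as handle:
#       print(tag_matches_version("v1.5.0", handle.read()))
```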

#### 5. Automated Publishing

Once you create a GitHub release, the automated workflow (`.github/workflows/publish.yml`) will:

1. ✅ Automatically build the package
2. ✅ Run tests (if configured)
3. ✅ Publish to PyPI using the stored `PYPI_API_TOKEN`

**Monitor the workflow:**
- Go to the "Actions" tab in your GitHub repository
- Watch the "Publish Python 🐍 package" workflow run
- Verify successful publication to PyPI

#### 6. Verify Release

```bash
# Check that the new version is available on PyPI
pip install dc-python-sdk==1.4.4

# Or upgrade to the latest version
pip install --upgrade dc-python-sdk

# Verify the package imports, then check the installed version
python -c "import dc_sdk; print('dc_sdk imported successfully')"
pip show dc-python-sdk
```
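
The installed version can also be read from Python. The sketch below uses `importlib.metadata`, which is in the standard library from Python 3.8 onward, so it will not work on the older interpreters the package metadata still allows. `installed_version` is a hypothetical helper name.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Usage: prints the installed version string, or None if not installed.
#   print(installed_version("dc-python-sdk"))
```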

#### Quick Release Commands

For common release scenarios, here are the complete command sequences:

**Patch Release (Bug fixes):**
```bash
# 1. Update version in pyproject.toml (1.5.0 → 1.5.1)
# 2. Commit and release
git add pyproject.toml
git commit -m "Bump version to 1.5.1 - Bug fixes and documentation updates"
git push origin main
git tag v1.5.1
git push origin v1.5.1
gh release create v1.5.1 --title "v1.5.1 - Bug Fixes" --notes "Bug fixes and minor improvements"
```

**Minor Release (New features):**
```bash
# 1. Update version in pyproject.toml (1.5.1 → 1.6.0)
# 2. Commit and release
git add pyproject.toml
git commit -m "Bump version to 1.6.0 - New error handling features"
git push origin main
git tag v1.6.0
git push origin v1.6.0
gh release create v1.6.0 --title "v1.6.0 - New Features" --notes "Added new error classes and improved documentation"
```

**Major Release (Breaking changes):**
```bash
# 1. Update version in pyproject.toml (1.6.0 → 2.0.0)
# 2. Commit and release
git add pyproject.toml
git commit -m "Bump version to 2.0.0 - Breaking changes to error API"
git push origin main
git tag v2.0.0
git push origin v2.0.0
gh release create v2.0.0 --title "v2.0.0 - Major Release" --notes "⚠️ Breaking changes: Updated error class signatures"
```

### Complete Release Checklist

- [ ] Update version in `pyproject.toml`
- [ ] Update any documentation or changelog
- [ ] Test changes locally
- [ ] Commit and push changes
- [ ] Create GitHub release with a tag that matches the version (e.g., `v1.4.4`)
- [ ] Monitor GitHub Actions workflow
- [ ] Verify package is published to PyPI
- [ ] Test installation of new version

### Manual Building and Publishing (Alternative)

If you need to manually build and publish (not recommended for production):

```bash
# Install the build and upload tools
python -m pip install --upgrade build twine

# Build the package
python -m build

# Upload to TestPyPI first (recommended for testing)
python -m twine upload --repository testpypi dist/*

# Then upload to PyPI (requires credentials)
python -m twine upload --repository pypi dist/*
```

### Project Structure

```
dc-python-sdk/
├── src/
│   └── dc_sdk/
│       ├── __init__.py
│       ├── errors.py          # All error classes
│       ├── cli.py             # dc-sdk CLI entrypoint
│       ├── server.py          # FastAPI HTTP server for connectors
│       ├── handler.py         # AWS Lambda handler
│       ├── loader.py          # Connector loader utilities
│       ├── mapping.py         # Mapping abstraction used by handler
│       ├── session.py         # In-memory session management for HTTP server
│       ├── types.py           # Shared type definitions
│       └── test_connector.py  # Test utilities
├── setup.cfg                  # Legacy setuptools configuration
├── pyproject.toml             # Project & build configuration
├── README.md                  # This file
├── LICENSE                    # MIT License
└── .github/
    └── workflows/
        └── publish.yml        # Automated PyPI publishing
```

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Add tests for your changes
5. Ensure all tests pass (`pytest`)
6. Commit your changes (`git commit -m 'Add amazing feature'`)
7. Push to the branch (`git push origin feature/amazing-feature`)
8. Open a Pull Request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Support

- **Bug Reports**: [GitHub Issues](https://github.com/data-connector/dc-python-sdk/issues)
- **Documentation**: [Error Handling Guide](https://data-connector.atlassian.net/wiki/spaces/SDL/pages/113705073/Error+Handling+-+Please+Read)
- **Email**: josh@dataconnector.com

---

## Summary

Use these errors to make the user's life easier (and to keep support tickets sane). The system is designed so that **graceful errors = happy users**, while **unhandled errors = bug tickets**.

Remember: Error handling isn't just about catching exceptions—it's about providing a great user experience even when things go wrong.
