Metadata-Version: 2.1
Name: vecx
Version: 0.32.9b1
Summary: Encrypted Vector Database for Secure and Fast ANN Searches
Home-page: https://vectorxdb.ai
Author: LaunchX Labs
Author-email: vineet@launchxlabs.ai
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.28.0
Requires-Dist: numpy>=2.2.4
Requires-Dist: msgpack>=1.1.0
Requires-Dist: cffi>=1.17.1

# VectorX - Encrypted Vector Database

VectorX is an encrypted vector database designed for maximum security and speed. Utilizing client-side encryption with private keys, VectorX ensures data confidentiality while enabling rapid Approximate Nearest Neighbor (ANN) searches within encrypted datasets. Leveraging a proprietary algorithm, VectorX provides unparalleled performance and security for applications requiring robust vector search capabilities in an encrypted environment.

## Key Features

- **Client-side Encryption**: Vectors are encrypted using private keys before being sent to the server
- **Fast ANN Searches**: Efficient similarity searches on encrypted vector data
- **Multiple Distance Metrics**: Support for cosine, L2, and inner product distance metrics
- **Metadata Support**: Attach and search with metadata and filters
- **High Performance**: Optimized for speed and efficiency with encrypted data
- **Hybrid Search**: Combine dense and sparse vector search for improved retrieval quality

## Installation

```bash
pip install vecx
```

## Quick Start
":]
\

```python
from vecx.vectorx import VectorX

# Initialize client with your API token
vx = VectorX(token="your-token-here")

# Generate a secure encryption key
encryption_key = vx.generate_key()


# Create a new index
vx.create_index(
    name="my_index",
    dimension=768,  # Your vector dimension
    key=encryption_key,  # Encryption key
    space_type="cosine"  # Distance metric (cosine, l2, ip)
)

# Get index reference
index = vx.get_index(name="my_index", key=encryption_key)

# Insert vectors
index.upsert([
    {
        "id": "doc1",
        "vector": [0.1, 0.2, 0.3, ...],  # Your vector data
        "meta": {"text": "Example document"}
        "filter":{"category": "reference"} # Optional filter
    }
])

# Query similar vectors
results = index.query(
    vector=[0.2, 0.3, 0.4, ...],  # Query vector
    top_k=10,
    filter={"category": {"eq":"reference"}}  # Optional filter
)

# Process results
for item in results:
    print(f"ID: {item['id']}, Similarity: {item['similarity']}")
    print(f"Metadata: {item['meta']}")
```

## Basic Usage

### Initializing the Client

```python
from vecx.vectorx import VectorX

# Production with specific region
vx = VectorX(token="your-token-here")
```

### Managing Indexes

```python
# List all indexes
indexes = vx.list_indexes()

# Create an index with custom parameters
vx.create_index(
    name="my_custom_index",
    dimension=768,
    key=encryption_key,
    space_type="cosine",
    M=16,             # Graph connectivity parameter (default = 16)
    ef_con=128,       # Construction-time parameter (default = 128)
    use_fp16=True     # Use half-precision for storage optimization (default = True)
)

# Delete an index
vx.delete_index("my_custom_index")
```

### Working with Vectors

```python
# Get index reference
index = vx.get_index(name="my_custom_index", key=encryption_key)

# Insert multiple vectors in a batch
index.upsert([
    {
        "id": "vec1",
        "vector": [...],  # Your vector
        "meta": {"title": "First document", "tags": ["important"]}
    },
    {
        "id": "vec2",
        "vector": [...],  # Another vector
        "meta": {"title": "second document", "tags": ["important"]}
        "filter": {"visibility": "public"}  # Optional filter values
    }
])

# Query with custom parameters
results = index.query(
    vector=[...],      # Query vector
    top_k=5,           # Number of results to return
    filter= {"visibility":{"eq":"public"}},   # Filter for matching
    ef=128,            # Runtime parameter for search quality
    include_vectors=True  # Include vector data in results
)

# Delete vectors
index.delete_vector("vec1")
index.delete_with_filter({"visibility":{"eq":"public"}})

# Get a specific vector
vector = index.get_vector("vec1")
```

## Hybrid Search

VectorX now supports hybrid search, combining the strengths of both dense and sparse vector search for enhanced retrieval quality.

### Creating a Hybrid Index

```python
# Create a hybrid index
vx.create_hybrid_index(
    name="my_hybrid_index",
    dimension=768,      # Dimension for dense vectors
    vocab_size=30522,   # Vocabulary size for sparse vectors (default = 30522)
    key=encryption_key,
    space_type="cosine" # Distance metric for dense vectors
)

# Get reference to the hybrid index
hybrid_index = vx.get_hybrid_index(name="my_hybrid_index", key=encryption_key)
```

### Working with Hybrid Vectors

```python
# Insert vectors with both dense and sparse components
hybrid_index.upsert([
    {
        "id": "doc1",
        "vector": [0.1, 0.2, ...],  # Dense vector component
        "sparse_vector": [          # Sparse vector component
            {"index": 5, "value": 0.5},
            {"index": 10, "value": 1.0},
            {"index": 15, "value": 1.5}
        ],
        "meta": {"title": "Hybrid document", "tags": ["important"]}
    }
])

# Perform hybrid search
results = hybrid_index.hybrid_search(
    dense_vector=[0.2, 0.3, ...],   # Dense query vector
    sparse_vector=[                 # Sparse query vector
        {"index": 5, "value": 0.4},
        {"index": 20, "value": 0.8}
    ],
    dense_top_k=10,      # Number of results from dense search
    sparse_top_k=10,     # Number of results from sparse search
    final_top_k=5,       # Number of final results after fusion
    k_rrf=1.0,           # RRF constant for ranking fusion
    include_vectors=True, # Include vectors in results
    filter={"tags": {"eq": "important"}}  # Optional filter
)

# Process hybrid search results
for item in results:
    print(f"ID: {item['id']}, RRF Score: {item['rrf_score']}")
    print(f"Dense Rank: {item['dense_rank']}, Sparse Rank: {item['sparse_rank']}")
    print(f"Metadata: {item['meta']}")

# Delete a vector from both indices
hybrid_index.delete_vector("doc1")

# Delete hybrid index when no longer needed
vx.delete_hybrid_index("my_hybrid_index")
```

## API Reference

### VectorX Class
- `__init__(token=None)`: Initialize with optional API token
- `set_token(token)`: Set API token
- `set_base_url(base_url)`: Set custom API endpoint
- `generate_key()`: Generate a secure encryption key
- `create_index(name, dimension, key, space_type, ...)`: Create a new index
- `create_hybrid_index(name, dimension, vocab_size, key, ...)`: Create a new hybrid index
- `list_indexes()`: List all indexes
- `delete_index(name)`: Delete an index
- `delete_hybrid_index(name)`: Delete a hybrid index
- `get_index(name, key)`: Get reference to an index
- `get_hybrid_index(name, key)`: Get reference to a hybrid index

### Index Class
- `upsert(input_array)`: Insert or update vectors
- `query(vector, top_k, filter, ef, include_vectors)`: Search for similar vectors
- `delete_vector(id)`: Delete a vector by ID
- `delete_with_filter(filter)`: Delete vectors matching a filter
- `get_vector(id)`: Get a specific vector
- `describe()`: Get index statistics and info

### HybridIndex Class
- `upsert(input_array)`: Insert or update hybrid vectors
- `hybrid_search(dense_vector, sparse_vector, ...)`: Perform hybrid search
- `delete_vector(id)`: Delete a vector by ID
- `delete_with_filter(filter)`: Delete vectors matching a filter
- `describe()`: Get hybrid index statistics and info

## Security Considerations

- **Key Management**: Store your encryption key securely. Loss of the key will result in permanent data loss.
- **Client-Side Encryption**: All sensitive data is encrypted before transmission.
