Metadata-Version: 2.1
Name: fastdf
Version: 0.1.3
Summary: A high-performance DataFrame implementation built on top of NumPy
Home-page: https://github.com/stwrn/fastdf
Author: stwrn
Author-email: stwrn@stwrn.com
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: numpy >=1.18.0
Requires-Dist: pandas >=1.0.0
Provides-Extra: dev
Requires-Dist: pytest >=6.0 ; extra == 'dev'
Requires-Dist: pytest-cov ; extra == 'dev'
Requires-Dist: black ; extra == 'dev'
Requires-Dist: isort ; extra == 'dev'

# FastDF: High-Performance DataFrame for Python

FastDF is a lightning-fast, memory-efficient DataFrame implementation built on top of NumPy, designed to overcome the performance limitations of pandas for basic data operations.

## 🚀 Key Features

- **Blazing Fast**: Up to 126x faster data access compared to pandas
- **Memory Efficient**: Optimized memory usage with NumPy 2D arrays
- **Pandas-Compatible**: Seamless integration with existing pandas-based projects
- **Minimalist**: Focuses on core functionality for maximum performance

## 🎯 Motivation

FastDF was born out of frustration with the sluggish performance of pandas, especially when dealing with large datasets. After exploring various alternatives that either didn't work as expected or introduced complex syntax changes, we realized that for many data analysis tasks, we only need a handful of core features:

- Named columns
- Efficient slicing
- Basic operations like `shift` and `any`

By leveraging the power of NumPy's 2D arrays and implementing only the essential features, FastDF achieves remarkable performance improvements without sacrificing ease of use.
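
To make the idea concrete, here is a minimal sketch of how named columns, slicing, `shift`, and `any` can be layered on a NumPy 2D array. The class `NamedArray` and its methods are hypothetical names chosen for illustration; they are not FastDF's actual implementation or API, and the sketch assumes all-float data and non-negative shifts.

```python
import numpy as np


class NamedArray:
    """Hypothetical illustration only; not FastDF's actual class."""

    def __init__(self, data, columns):
        self.data = np.asarray(data, dtype=float)
        if self.data.ndim != 2 or self.data.shape[1] != len(columns):
            raise ValueError("data must be 2D with one column per name")
        # Map each column name to its position in the 2D array.
        self.col_index = {name: i for i, name in enumerate(columns)}

    def column(self, name):
        # Basic slicing returns a view, so no data is copied.
        return self.data[:, self.col_index[name]]

    def rows(self, start, stop, name):
        # Positional, end-exclusive row slice of a single column.
        return self.data[start:stop, self.col_index[name]]

    def shift(self, name, periods=1):
        # Shift a column down by a non-negative number of rows,
        # padding the vacated top rows with NaN.
        col = self.column(name)
        out = np.full_like(col, np.nan)
        if periods < len(col):
            out[periods:] = col[:len(col) - periods]
        return out

    def any(self):
        # True if at least one element is non-zero.
        return bool(self.data.any())


na = NamedArray(np.arange(6.0).reshape(3, 2), ["A", "B"])
print(na.column("B"))
print(na.shift("A", 1))
print(na.any())
```

Because column lookup is a dictionary access followed by a basic NumPy slice, this kind of design avoids most of the per-access overhead of a general-purpose index.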

## ⚡ Performance

In our benchmarks, FastDF has shown:

- 40x faster data access compared to pandas
- Significantly faster slicing operations
- Reduced memory footprint
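
If you want to check numbers like these on your own machine, a rough timing sketch follows. It is not the project's benchmark: it compares pandas `.loc` scalar access with plain NumPy indexing as a stand-in for an array-backed frame, and the sizes and repeat counts are arbitrary assumptions. Results will vary with hardware and library versions.

```python
import timeit

import numpy as np
import pandas as pd

n_rows = 100_000
rng = np.random.default_rng(0)
pdf = pd.DataFrame({"A": rng.random(n_rows), "B": rng.random(n_rows)})

# Plain NumPy stand-in for an array-backed frame (not FastDF itself).
arr = pdf.to_numpy()
col_b = pdf.columns.get_loc("B")


def pandas_access():
    # Label-based scalar lookup through the pandas indexing machinery.
    return pdf.loc[1000, "B"]


def numpy_access():
    # Direct positional lookup in the underlying 2D array.
    return arr[1000, col_b]


# Both paths must return the same value before timing them.
assert pandas_access() == numpy_access()

repeats = 10_000
t_pandas = timeit.timeit(pandas_access, number=repeats)
t_numpy = timeit.timeit(numpy_access, number=repeats)
print(f"pandas .loc: {t_pandas / repeats * 1e6:.2f} us per access")
print(f"NumPy index: {t_numpy / repeats * 1e6:.2f} us per access")
```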

## 🛠 Installation

You can install FastDF using pip:

```bash
pip install fastdf
```

For the latest development version, install directly from GitHub:

```bash
pip install git+https://github.com/stwrn/fastdf.git
```

## 🚦 Quick Start

```python
from fastdf import fdf
import pandas as pd
import numpy as np

# Create a pandas DataFrame
pdf = pd.DataFrame({'A': np.random.rand(1000000), 'B': np.random.rand(1000000)})

# Convert to FastDF
fast_df = fdf.from_pandas(pdf)

# Use FastDF with familiar pandas-like syntax
print(fast_df.loc[0:5, 'A'])
print(fast_df['B'].shift(1))
print(fast_df.any())
```

## 🔄 Compatibility

FastDF is designed to be a drop-in replacement for basic pandas operations. You can easily convert your pandas DataFrame to FastDF and continue using the familiar syntax:

```python
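import numpy as np
import pandas as pd
from fastdf import fdf

# Sample pandas DataFrame standing in for your existing data
pandas_df = pd.DataFrame(np.random.rand(5000, 2), columns=['A', 'B'])

# Your existing pandas code
result = pandas_df.loc[1000:2000]['B']
print(f"Pandas result {result}")

# The same lookup after converting from pandas
fast_df = fdf.from_pandas(pandas_df)
result_fdf = fast_df.loc[1000:2000]['B']
print(f"FastDF result {result_fdf}")

# Building a FastDF directly from a NumPy array and column names
data = np.random.rand(1000, 5)
columns = ['A', 'B', 'C', 'D', 'E']
fast_df = fdf(data, columns)
print(f"FastDF {fast_df}")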
```

## 🤝 Contributing

We welcome contributions to FastDF! For bug reports and feature requests, please open an issue; for code contributions, please submit a pull request.

## 📜 License

FastDF is released under the MIT License. See the LICENSE file for more details.

## 🙏 Acknowledgements

Special thanks to the NumPy and pandas teams for their incredible work, which laid the foundation for this project.

---

FastDF is still in active development. We're excited to see how it can help accelerate your data analysis workflows!
