Metadata-Version: 2.1
Name: mockpipe
Version: 0.0.1
Summary: Dummy data generator focusing on customisability and maintained relationships for mocking data pipelines
Author: BenskiBoy
Project-URL: Source, https://github.com/BenskiBoy/mockpipe
Project-URL: Bug Tracker, https://github.com/BenskiBoy/mockpipe/issues
Project-URL: Changes, https://github.com/BenskiBoy/mockpipe/blob/master/CHANGELOG.md
Keywords: mocking data faker testing generator pipeline pipe
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: Utilities
Requires-Python: >=3.8
License-File: LICENSE
Requires-Dist: black==24.10.0
Requires-Dist: click==8.1.7
Requires-Dist: duckdb==1.0.0
Requires-Dist: Faker==26.0.0
Requires-Dist: faker-commerce==1.0.4
Requires-Dist: pytest==8.3.2
Requires-Dist: pytest-cov==5.0.0
Requires-Dist: PyYAML==6.0.1

# mockpipe
There are a lot of sample databases out there and lots of ways to generate dummy data (e.g. Faker, which this project uses), but I couldn't find much in the way of dynamically generating realistic data for the kinds of scenarios one might actually find coming out of an operational system's CDC feed.
This is an attempt to create a utility/library that can be used to set up such scenarios.

A set of sample tables can be defined in a YAML config, with dummy default values for any newly generated rows and a set of actions that are performed with a given frequency.

The dummy values actually invoke the Faker library to generate somewhat realistic entries, along with support for other data types that may refer to existing values within the table or other tables so that relationships can be maintained.
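
To make the idea of frequency-weighted actions and maintained relationships concrete, here is a minimal sketch. It is not mockpipe's API or config schema: the action names, weights, table layout and the `run` function are all hypothetical, and the standard library's `random` module stands in for Faker so the example has no dependencies.

```python
import random

# Hypothetical action names and relative frequencies (not mockpipe's schema).
ACTIONS = {"create_order": 0.6, "update_status": 0.3, "delete_order": 0.1}


def run(steps, seed=0):
    """Apply randomly chosen actions to two related in-memory tables."""
    rng = random.Random(seed)
    customers = [{"id": i, "name": f"customer_{i}"} for i in range(1, 4)]
    orders = []
    next_id = 1
    for _ in range(steps):
        # Pick one action, weighted by its configured frequency.
        action = rng.choices(list(ACTIONS), weights=list(ACTIONS.values()))[0]
        if action == "create_order":
            # A new row refers to an existing customer, so the relationship holds.
            customer = rng.choice(customers)
            orders.append(
                {"id": next_id, "customer_id": customer["id"], "status": "new"}
            )
            next_id += 1
        elif action == "update_status" and orders:
            rng.choice(orders)["status"] = "shipped"
        elif action == "delete_order" and orders:
            orders.remove(rng.choice(orders))
    return customers, orders


customers, orders = run(200)
```

Each step produces an insert, update or delete, which is roughly the shape of change stream a CDC feed would carry; every surviving order still points at a valid customer.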

Data is stored in a DuckDB database, so outputs persist between executions and remain available for any other analysis or queries you may want to run.
