Metadata-Version: 2.4
Name: datafaker_ai
Version: 0.1.3
Summary: Generate synthetic datasets from natural language using Gemini 1.5 Flash
Home-page: https://github.com/ahsanraza1457/deepfaker_ai
Author: Ahsan Raza
Author-email: your.email@example.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: pandas
Requires-Dist: python-dotenv
Requires-Dist: google-generativeai
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# 🧬 Gemini Dataset Generator

Generate synthetic datasets from natural language descriptions using Google's Gemini 1.5 Flash model. Ideal for data science, machine learning prototyping, and testing workflows with customizable, structured synthetic data.

---

## ✨ Features

- ⚡ Powered by Gemini 1.5 Flash (fast + cost-effective)
- 🧠 Natural language prompt → structured data
- 📦 Output as a `pandas` DataFrame, CSV, or JSON
- 🧪 Optional edge case injection for testing
- 🧾 Save datasets to local disk
- 🔐 Easy API key setup using `.env`
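
To make the edge case feature concrete, here is a minimal sketch of what injecting edge cases into generated rows could look like. The function name `inject_edge_cases`, its parameters, and the `EDGE_VALUES` list are hypothetical and chosen for illustration; this is not the package's actual implementation.

```python
import copy
import random

# Hypothetical edge-case values, chosen only for illustration.
EDGE_VALUES = [None, "", -1, 10**12]


def inject_edge_cases(rows, fraction=0.1, seed=None):
    """Return a copy of `rows` in which roughly `fraction` of the rows
    have one randomly chosen field replaced by an edge-case value."""
    rng = random.Random(seed)
    result = copy.deepcopy(rows)  # leave the caller's data untouched
    n_edits = min(len(result), round(len(result) * fraction))
    for index in rng.sample(range(len(result)), n_edits):
        row = result[index]
        if not row:
            continue  # nothing to replace in an empty row
        column = rng.choice(sorted(row))
        row[column] = rng.choice(EDGE_VALUES)
    return result
```

Passing a fixed `seed` keeps the injected values reproducible, which is usually what you want when the data feeds a test suite.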

---

## 📦 Installation

Install from PyPI:

```bash
pip install datafaker_ai
```

Or clone the repository and install its dependencies:

```bash
git clone https://github.com/ahsanraza1457/deepfaker_ai.git
cd deepfaker_ai
pip install -r requirements.txt
```

## 🔐 Setup

Create a `.env` file in the root directory of your project and add your Gemini API key:

```env
GEMINI_API_KEY=your_google_generativeai_api_key
```
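
Since `python-dotenv` is one of the declared dependencies, one plausible way to read this key at runtime is sketched below. The helper name `load_api_key` is hypothetical, and the package may load the key differently. The sketch falls back to plain environment variables if `python-dotenv` is not installed.

```python
import os


def load_api_key(env_var="GEMINI_API_KEY"):
    """Return the API key from the environment, loading `.env` first if possible."""
    try:
        # python-dotenv reads KEY=value pairs from a .env file into os.environ.
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # rely on variables already exported in the shell

    key = os.getenv(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; add it to your .env file or export it."
        )
    return key
```

Failing early with a clear message is friendlier than letting a missing key surface later as an authentication error from the API.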


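## 🚀 Usage

```python
from generator import generate_dataset

# Describe the columns in plain English; the result behaves like a pandas DataFrame.
df = generate_dataset(
    description="Customer name, email, age, and signup date",
    num_samples=50
)

print(df.head())
```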



### Save Output as CSV or JSON

```python
from generator import generate_dataset

generate_dataset(
    description="IoT device logs with timestamp, device_id, temperature",
    num_samples=100,
    save_as='csv'  # Options: 'csv', 'json', 'both'
)
```
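
For readers curious how a `save_as` switch like this might dispatch between formats, here is a standard-library sketch. The helper `save_records`, its `basename` parameter, and the output file names are assumptions made for illustration; the package's own file naming and internals are not documented here.

```python
import csv
import json
from pathlib import Path

VALID_FORMATS = {"csv", "json", "both"}


def save_records(records, basename, save_as="csv"):
    """Write a list of dicts to <basename>.csv and/or <basename>.json.

    Returns the list of paths that were written.
    """
    if save_as not in VALID_FORMATS:
        raise ValueError(f"save_as must be one of {sorted(VALID_FORMATS)}")

    written = []
    if save_as in ("csv", "both"):
        path = Path(f"{basename}.csv")
        # Column order follows the keys of the first record.
        fieldnames = list(records[0]) if records else []
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(records)
        written.append(path)
    if save_as in ("json", "both"):
        path = Path(f"{basename}.json")
        path.write_text(json.dumps(records, indent=2), encoding="utf-8")
        written.append(path)
    return written
```

Validating `save_as` before touching the disk means a typo such as `'xml'` fails without leaving partial output behind.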



## 🗂 Project Structure

```text
├── generator/
│   ├── __init__.py
│   ├── generator.py           # Main interface
│   ├── formatter.py           # Formats model output
│   ├── prompts.py             # Builds prompt from description
│   └── edge_case_handler.py   # Injects edge cases
│
├── .env
├── README.md
├── requirements.txt
└── setup.py / pyproject.toml
```
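
To illustrate the roles described for `prompts.py` and `formatter.py`, the sketch below builds a prompt from a description and parses a model reply into rows. The function names `build_prompt` and `parse_rows`, the prompt wording, and the assumption that the model replies with a JSON array (possibly wrapped in a Markdown code fence) are all hypothetical; the actual modules may work differently.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def build_prompt(description, num_samples):
    """Turn a plain-English description into an instruction for the model."""
    return (
        f"Generate {num_samples} rows of synthetic data with these fields: "
        f"{description}.\n"
        "Respond with a JSON array of objects and nothing else."
    )


def parse_rows(model_text):
    """Parse a model reply into a list of dicts, tolerating a code fence."""
    text = model_text.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        # Drop the opening fence line (with optional language tag)
        # and the closing fence line.
        text = "\n".join(text.splitlines()[1:-1])
    rows = json.loads(text)
    if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
        raise ValueError("expected a JSON array of objects")
    return rows
```

A list of dicts like the one `parse_rows` returns can be passed directly to `pandas.DataFrame(rows)` to obtain tabular output.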
