Metadata-Version: 2.4
Name: fileutils-dir
Version: 0.9.0
Summary: Small utilities for listing files in directories
Author: Jatavallabhula Sarat Anirudh
License-Expression: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown

# fileutils-dir: Fluent Python API for Declarative File Discovery and Filtering

fileutils-dir is a lightweight, zero-dependency Python library designed for efficient filesystem traversal and file selection. It provides a chainable interface to filter files by extension, name patterns, semantic categories, and directory depth without modifying the underlying filesystem.

It is intended as a feeder for data pipelines, automation scripts, and application code that need precise control over which files enter a batch-processing step.

## Installation

Install the package via pip:

```bash
pip install fileutils-dir
```

## Design Philosophy

The library follows a declarative approach to file selection:

1.  **Single Entry Point**: All operations begin with the `in_dir()` function.
2.  **Fluent Interface**: Methods are chainable, allowing complex queries to be built incrementally.
3.  **Lazy Evaluation**: Selection logic is stored and only executed when a terminal method (`list()` or `count()`) is invoked.
4.  **Non-Destructive**: The library strictly performs read-only operations on the filesystem structure. It does not create, delete, or modify files.
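
The lazy, chainable pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the `LazyQuery` class and its `where()` method are hypothetical names invented for this sketch. Chain methods only record state and return `self`; the filesystem is read only when `list()` or `count()` is called.

```python
from pathlib import Path


class LazyQuery:
    """Hypothetical stand-in for a lazily evaluated, chainable file query."""

    def __init__(self, root="."):
        self._root = Path(root)
        self._predicates = []  # stored now, applied only at evaluation time
        self._recursive = False

    def recursive(self):
        # Records the traversal mode; nothing is read from disk here.
        self._recursive = True
        return self

    def where(self, predicate):
        # Stores a callable taking a Path and returning True/False.
        self._predicates.append(predicate)
        return self

    def _iter(self):
        # The only place that touches the filesystem (read-only).
        walker = self._root.rglob("*") if self._recursive else self._root.iterdir()
        for path in walker:
            if path.is_file() and all(pred(path) for pred in self._predicates):
                yield path

    def list(self):
        # Terminal method: runs the traversal and returns sorted path strings.
        return sorted(str(p) for p in self._iter())

    def count(self):
        # Terminal method: runs the traversal and counts the matches.
        return sum(1 for _ in self._iter())
```

With this sketch, `LazyQuery("src").recursive().where(lambda p: p.suffix == ".py")` builds a query without reading anything from disk; the traversal runs when `.list()` or `.count()` is called on it.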

## Selection Logic and Composability

Selectors are monotonic and composable: each filter applied later in the chain refines (narrows) the candidate set established by the methods before it.

- **Inclusive Filters** (`include_ext`, `include_type`): Restrict the result set to items matching the specified criteria.
- **Exclusive Filters** (`exclude_ext`, `exclude_type`): Remove items matching the specified criteria from the result set.
- **Precedence**: If a file matches both an inclusive and an exclusive filter, the exclusion rule takes precedence.
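
The precedence rule can be expressed as a small predicate. This is a sketch under stated assumptions, not the library's code: `is_selected` is a hypothetical helper, and it assumes that an empty include set means "no inclusive restriction".

```python
def is_selected(ext, include=frozenset(), exclude=frozenset()):
    """Hypothetical helper: decide whether an extension survives the filters."""
    # Exclusion is checked first, so it wins when both filters match.
    if ext in exclude:
        return False
    # Assumption: with no inclusive filter, every non-excluded item passes.
    return not include or ext in include


# A .png file matches both the inclusive and the exclusive filter.
png_selected = is_selected(".png", include={".png", ".jpg"}, exclude={".png"})
jpg_selected = is_selected(".jpg", include={".png", ".jpg"}, exclude={".png"})
```

Here `png_selected` is `False` because the exclusion takes precedence, while `jpg_selected` is `True`.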

## API Documentation

### Initializer

#### `in_dir(*paths)`
Initializes a `DirQuery` object. If no paths are provided, it defaults to the current working directory (`"."`). Accepts multiple path arguments to query across several root directories.

### Selection Methods

#### `.name(pattern: str)`
Filters results using a glob-style name pattern (e.g., `"*.py"`, `"test_*"`).
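
To illustrate how glob-style name patterns behave, the standard library's `fnmatch` module can be used. Whether the library matches names with `fnmatch` internally, and whether its matching is case-sensitive, is an assumption; the file names below are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file names used only for illustration.
names = ["test_utils.py", "utils.py", "test_data.json", "README.md"]

# "*" matches any run of characters, so "*.py" selects by suffix
# and "test_*" selects by prefix.
py_files = [n for n in names if fnmatchcase(n, "*.py")]
test_files = [n for n in names if fnmatchcase(n, "test_*")]
```

`fnmatchcase` compares case-sensitively on every platform, which keeps the example deterministic.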

#### `.include_ext(*exts: str)`
Specifies file extensions to include in the result set. Extensions are case-insensitive and can be provided with or without the leading dot.
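
One plausible way to make extension arguments case-insensitive and dot-agnostic is to normalize them before comparison. The `normalize_ext` helper below is hypothetical and only sketches the idea; the library's actual normalization may differ.

```python
def normalize_ext(ext: str) -> str:
    """Hypothetical helper: lowercase an extension and ensure a leading dot."""
    ext = ext.strip().lower()
    return ext if ext.startswith(".") else "." + ext


# "py", ".py", and ".PY" all normalize to the same value.
normalized = {normalize_ext(e) for e in ("py", ".py", ".PY")}
```

Under this scheme, `include_ext("py")` and `include_ext(".PY")` would describe the same filter.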

#### `.exclude_ext(*exts: str)`
Specifies file extensions to exclude from the result set.

#### `.include_type(*types: str)`
Narrows the selection to one or more semantic categories (e.g., `"image"`, `"code"`, `"data"`). The Semantic Classification section lists every category and its extensions.

#### `.exclude_type(*types: str)`
Removes specific semantic categories from the selection.

### Traversal and Mode Methods

#### `.recursive()`
Enables recursive traversal through all subdirectories.

#### `.dirs()`
Configures the query to return directory paths instead of file paths.

#### `.show_hidden()`
Includes hidden files and directories (those starting with a dot) in the results.
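
The dot-prefix convention can be checked with `pathlib`. The sketch below is an assumption about how such a check might look, not the library's code: `is_hidden` is a hypothetical helper, and it treats an item as hidden if any path component below the root starts with a dot. Whether the library also hides the contents of hidden directories is not stated in this document.

```python
from pathlib import PurePosixPath


def is_hidden(path: str, root: str) -> bool:
    """Hypothetical helper: True if any component below root starts with a dot."""
    relative = PurePosixPath(path).relative_to(root)
    return any(part.startswith(".") for part in relative.parts)


hidden_file = is_hidden("proj/.env", "proj")
inside_hidden_dir = is_hidden("proj/.git/config", "proj")
regular_file = is_hidden("proj/src/app.py", "proj")
```

`PurePosixPath` is used so the example behaves the same on every platform; it performs no filesystem access.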

### Terminal Methods

#### `.list() -> list[str]`
Executes the defined query and returns the matched paths (files, or directories when `.dirs()` is set) as a list of strings. Paths may be absolute or relative.

#### `.count() -> int`
Executes the defined query and returns the total number of matched items.

## Semantic Classification

The library supports high-level semantic filtering, mapping common file categories to their respective extensions:

| Category | Associated Extensions |
| :------- | :-------------------- |
| `image` | .jpg, .jpeg, .png, .webp, .bmp, .gif, .tiff |
| `text` | .txt, .md, .rst, .log |
| `pdf` | .pdf |
| `doc` | .doc, .docx, .odt |
| `sheet` | .xls, .xlsx, .ods, .csv |
| `presentation` | .ppt, .pptx, .odp |
| `code` | .py, .js, .ts, .java, .c, .cpp, .h, .go, .rs, .rb, .php, .sh |
| `data` | .json, .yaml, .yml, .xml, .toml |
| `audio` | .mp3, .wav, .flac, .ogg, .aac, .m4a |
| `video` | .mp4, .mkv, .avi, .mov, .webm |
| `archive` | .zip, .tar, .gz, .bz2, .7z, .rar |
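
The table above can be modeled as a plain dictionary with a reverse lookup from extension to category. This is an illustrative sketch: the `CATEGORIES` and `classify` names are hypothetical, and the library's internal data structure may differ. Only the category names and extensions are taken from the table.

```python
from pathlib import PurePath

# Category-to-extension mapping copied from the table above.
CATEGORIES = {
    "image": {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".gif", ".tiff"},
    "text": {".txt", ".md", ".rst", ".log"},
    "pdf": {".pdf"},
    "doc": {".doc", ".docx", ".odt"},
    "sheet": {".xls", ".xlsx", ".ods", ".csv"},
    "presentation": {".ppt", ".pptx", ".odp"},
    "code": {".py", ".js", ".ts", ".java", ".c", ".cpp", ".h",
             ".go", ".rs", ".rb", ".php", ".sh"},
    "data": {".json", ".yaml", ".yml", ".xml", ".toml"},
    "audio": {".mp3", ".wav", ".flac", ".ogg", ".aac", ".m4a"},
    "video": {".mp4", ".mkv", ".avi", ".mov", ".webm"},
    "archive": {".zip", ".tar", ".gz", ".bz2", ".7z", ".rar"},
}

# Reverse index: each extension belongs to exactly one category in the table.
EXT_TO_CATEGORY = {ext: cat for cat, exts in CATEGORIES.items() for ext in exts}


def classify(filename: str):
    """Hypothetical helper: return the category for a file name, or None."""
    # Only the final suffix is considered, compared case-insensitively.
    return EXT_TO_CATEGORY.get(PurePath(filename).suffix.lower())
```

Because only the last suffix is inspected in this sketch, a name such as `backup.tar.gz` is classified through `.gz`, and a file without an extension maps to `None`.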

## Examples

### Discovering Python Source Files Recursively
```python
from fileutils import in_dir

source_files = (
    in_dir("src")
    .recursive()
    .include_type("code")
    .include_ext("py")
    .list()
)
```

### Counting Non-PNG Image Files
```python
from fileutils import in_dir

image_count = (
    in_dir("assets")
    .include_type("image")
    .exclude_ext("png")
    .count()
)
```

### Retrieving Subdirectories
```python
from fileutils import in_dir

folders = in_dir().dirs().list()
```

## Technical Specifications

- **Python Version**: Requires Python 3.9 or higher.
- **Operating System**: Platform-independent (Windows, macOS, Linux).
- **Dependencies**: Standard library only (pathlib).
- **License**: MIT.
