PandasIterator
flowtask.components.PandasIterator
Bases: IteratorBase
Overview
This component converts the input data to a pandas DataFrame and iterates over it, processing each row.
.. table:: Properties
   :widths: auto

   +--------------+----------+---------------------------------------------------------------------+
   | Name         | Required | Summary                                                             |
   +==============+==========+=====================================================================+
   | columns      | Yes      | Names of the columns to extract.                                    |
   +--------------+----------+---------------------------------------------------------------------+
   | vars         | Yes      | Mapping that organizes the column names by id.                      |
   +--------------+----------+---------------------------------------------------------------------+
   | parallelize  | No       | If True, the iterator processes rows in parallel. Default is False. |
   +--------------+----------+---------------------------------------------------------------------+
   | num_threads  | No       | Number of threads to use when parallelize is True. Default is 10.   |
   +--------------+----------+---------------------------------------------------------------------+

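The interplay of ``parallelize`` and ``num_threads`` can be pictured with a thread pool. What follows is only a minimal sketch under assumptions, not the component's actual implementation: ``process_row`` and ``iterate_rows`` are hypothetical names, and plain dictionaries stand in for DataFrame rows.

```python
from concurrent.futures import ThreadPoolExecutor


def process_row(row):
    # Hypothetical per-row job: build an identifier from two columns.
    return f"{row['orgid']}/{row['formid']}"


def iterate_rows(rows, parallelize=False, num_threads=10):
    # Sequential path: process rows one at a time, in order.
    if not parallelize:
        return [process_row(row) for row in rows]
    # Parallel path: spread the rows over a pool of worker threads.
    # Executor.map yields results in the same order as the input.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(process_row, rows))


# Invented sample rows, for illustration only.
rows = [{"formid": 10, "orgid": 1}, {"formid": 11, "orgid": 2}]
sequential = iterate_rows(rows)
parallel = iterate_rows(rows, parallelize=True, num_threads=4)
```

In this sketch both paths return the same results in the same order. Threads mainly help when the per-row job waits on I/O rather than on CPU-bound work.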
Returns
-------
This component returns the processed pandas DataFrame after iterating
through the rows and applying the specified jobs.
Example:
```yaml
PandasIterator:
  columns:
    - formid
    - orgid
  vars:
    form: '{orgid}/{formid}'
```
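One way to read the ``vars`` entry above is as a template whose ``{column}`` placeholders are filled from each row. The sketch below illustrates that reading with Python's ``str.format``. It is an assumption about the behavior, not the component's confirmed implementation: ``expand_vars`` is a hypothetical helper, and the sample rows are invented.

```python
def expand_vars(row, templates):
    # Fill each template's {column} placeholders with this row's values.
    return {name: template.format(**row) for name, template in templates.items()}


columns = ["formid", "orgid"]
templates = {"form": "{orgid}/{formid}"}

# Invented sample rows; with pandas these could come from
# df.to_dict(orient="records").
rows = [
    {"formid": 10, "orgid": 1, "other": "x"},
    {"formid": 11, "orgid": 2, "other": "y"},
]

expanded = []
for row in rows:
    # Keep only the configured columns before expanding the templates.
    selected = {name: row[name] for name in columns}
    expanded.append(expand_vars(selected, templates))
```

Under these assumptions, the first row yields ``{'form': '1/10'}`` and the second ``{'form': '2/11'}``.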