Metadata-Version: 2.1
Name: vcfpy
Version: 0.13.4
Summary: Python 3 VCF library with good support for both reading and writing
Home-page: https://github.com/bihealth/vcfpy
Author: Manuel Holtgrewe
Author-email: manuel.holtgrewe@bihealth.de
License: MIT license
Keywords: vcfpy
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
License-File: LICENSE
License-File: AUTHORS.rst

=====
VCFPy
=====


.. image:: https://img.shields.io/pypi/v/vcfpy.svg
        :target: https://pypi.python.org/pypi/vcfpy

.. image:: https://img.shields.io/conda/dn/bioconda/vcfpy.svg?label=Bioconda
        :target: https://bioconda.github.io/recipes/vcfpy/README.html

.. image:: https://img.shields.io/travis/bihealth/vcfpy.svg
        :target: https://travis-ci.org/bihealth/vcfpy

.. image:: https://readthedocs.org/projects/vcfpy/badge/?version=latest
        :target: https://vcfpy.readthedocs.io/en/latest/?badge=latest
        :alt: Documentation Status

.. image:: https://api.codacy.com/project/badge/Grade/cfe741307ec34e8fb90dfe37e84a2519
        :target: https://www.codacy.com/app/manuel-holtgrewe/vcfpy?utm_source=github.com&utm_medium=referral&utm_content=bihealth/vcfpy&utm_campaign=Badge_Grade
        :alt: Codacy Analysis

.. image:: https://api.codacy.com/project/badge/Coverage/cfe741307ec34e8fb90dfe37e84a2519
        :alt: Codacy Coverage
        :target: https://www.codacy.com/app/manuel-holtgrewe/vcfpy?utm_source=github.com&utm_medium=referral&utm_content=bihealth/vcfpy&utm_campaign=Badge_Coverage

.. image:: http://joss.theoj.org/papers/edae85d90ea8a49843dbaaa109e47cba/status.svg
        :alt: Publication in The Journal of Open Source Software
        :target: http://joss.theoj.org/papers/10.21105/joss.00085

Python 3 VCF library with good support for both reading and writing

* Free software: MIT license
* Documentation: https://vcfpy.readthedocs.io.


Features
--------

- Support for reading and writing VCF v4.3
- Interface to ``INFO`` and ``FORMAT`` fields is based on ``OrderedDict``, which allows for easier modification than in PyVCF (I also find this more pythonic)
- Read (and jump in) and write BGZF files just using ``vcfpy``
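
As a rough illustration of why a dictionary-based interface is convenient, the following standard-library sketch parses an ``INFO`` string into an ``OrderedDict``, modifies it, and serializes it again.
It does not use ``vcfpy`` and is not its implementation; ``parse_info`` and ``format_info`` are hypothetical helper names, and type conversion and escaping are deliberately ignored.

.. code-block:: python

    from collections import OrderedDict


    def parse_info(info_str):
        """Parse a raw INFO column into an OrderedDict (values kept as strings)."""
        info = OrderedDict()
        if info_str == ".":  # "." marks an empty INFO column
            return info
        for entry in info_str.split(";"):
            if "=" in entry:
                key, value = entry.split("=", 1)
                info[key] = value.split(",")  # comma-separated values become a list
            else:
                info[entry] = True  # flags carry no value
        return info


    def format_info(info):
        """Serialize the mapping back into an INFO column string."""
        if not info:
            return "."
        parts = []
        for key, value in info.items():
            if value is True:
                parts.append(key)
            else:
                parts.append("{}={}".format(key, ",".join(value)))
        return ";".join(parts)


    info = parse_info("DP=14;AF=0.5,0.25;DB")
    info["DP"] = ["20"]      # update an existing key in place
    del info["DB"]           # remove a flag
    info["SOMATIC"] = True   # append a new flag; insertion order is preserved
    print(format_info(info))  # prints DP=20;AF=0.5,0.25;SOMATIC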

Why another VCF parser for Python?
----------------------------------

I've been using PyVCF with quite some success in the past.
However, the main limitation of PyVCF shows up when you want to modify the per-sample genotype information.
There are some issues in the tracker of PyVCF but none of them can really be considered solved.
I spent several hours trying to solve these problems within PyVCF, but this either never got far or headed towards a complete rewrite...

For this reason, VCFPy was born and here it is!

What's the State?
-----------------

VCFPy is the result of two full days of development plus some maintenance work since then.
I'm using it in several projects but it is not as battle-tested as PyVCF.

Why Python 3 Only?
------------------

As I'm only using Python 3 code, I see no advantage in carrying around support for legacy Python 2 and maintaining it.
At a later point when VCFPy is known to be stable, Python 2 support might be added if someone contributes a pull request.


=======
History
=======

v0.13.4 (2022-04-13)
--------------------

- Switching to GitHub Actions for CI
- Fix INFO flag raises TypeError (#146)

v0.13.3 (2020-09-14)
--------------------

- Adding ``Record.update_calls``.
- Making ``Record.{format,calls}`` use list when empty

v0.13.2 (2020-08-20)
--------------------

- Adding ``Call.set_genotype()``.

v0.13.1 (2020-08-20)
--------------------

- Fixed ``Call.ploidy``.
- Fixed ``Call.is_variant``.

v0.13.0 (2020-07-10)
--------------------

* Fixing bug in case ``GT`` describes only one allele.
* Proper escaping of colon and semicolon (or the lack of escaping) in ``INFO`` and ``FORMAT``.
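
For context, the VCF v4.3 specification reserves percent-encoded forms for characters with special meaning, such as ``%3A`` for a colon and ``%3B`` for a semicolon.
The sketch below shows that general scheme using only the standard library.
It is not the ``vcfpy`` implementation, the names ``escape`` and ``unescape`` are hypothetical, and which characters need escaping in which column is left out here.

.. code-block:: python

    import re

    # Percent-encodings for characters with special meaning in VCF v4.3
    ESCAPES = {
        "%": "%25",
        ":": "%3A",
        ";": "%3B",
        "=": "%3D",
        ",": "%2C",
        "\r": "%0D",
        "\n": "%0A",
        "\t": "%09",
    }
    UNESCAPES = {code: char for char, code in ESCAPES.items()}
    # One alternation over all codes, so decoding happens in a single pass
    _CODE_RE = re.compile("|".join(re.escape(code) for code in UNESCAPES))


    def escape(value):
        """Replace each special character by its percent-encoded form."""
        return "".join(ESCAPES.get(char, char) for char in value)


    def unescape(value):
        """Decode percent-encoded forms back to the original characters."""
        return _CODE_RE.sub(lambda match: UNESCAPES[match.group(0)], value)


    print(escape("a:b;c=50%"))  # prints a%3Ab%3Bc%3D50%25

Working character by character when escaping and in a single regular-expression pass when unescaping avoids double-decoding, so ``unescape("%253A")`` yields ``%3A`` rather than a colon.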

v0.12.2 (2020-04-29)
--------------------

* Fixing bug in case ``GT`` describes only one allele.

v0.12.1 (2019-03-08)
--------------------

* Not warning on ``PASS`` filter if not defined in header.

v0.12.0 (2019-01-29)
--------------------

* Fixing tests for Python >=3.6
* Fixing CI, improving tox integration.
* Applying ``black`` formatting.
* Replacing Makefile with more minimal one.
* Removing some linting errors from flake8.
* Adding support for reading VCF without ``FORMAT`` or any sample column.
* Adding support for writing headers and records without ``FORMAT`` and any sample columns.

v0.11.2 (2018-04-16)
--------------------

* Removing ``pip`` module from ``setup.py`` which is not recommended anyway.

v0.11.1 (2018-03-06)
--------------------

* Working around problem in HTSJDK output with incomplete ``FORMAT`` fields (#127).
  Writing out ``.`` instead of keeping trailing empty records empty.

v0.11.0 (2017-11-22)
--------------------

* The field ``FORMAT/FT`` is now expected to be a semicolon-separated string.
  Internally, we will handle it as a list.
* Switching from warning helper utility code to Python ``warnings`` module.
* Return ``str`` in case of problems with parsing value.

v0.10.0 (2017-02-27)
--------------------

* Extending API to allow for reading subsets of records.
  (Writing for sample subsets or reordered samples is possible through using the appropriate ``names`` list in the ``SamplesInfos`` for the ``Writer``).
* Deep-copying header lines and samples infos on ``Writer`` construction
* Using ``samples`` attribute from ``Header`` in ``Reader`` and ``Writer`` instead of passing explicitly

0.9.0 (2017-02-26)
------------------

* Restructuring of requirements.txt files
* Fixing parsing of no-call ``GT`` fields

0.8.1 (2017-02-08)
------------------

* PEP8 style adjustments
* Using versioneer for versioning
* Using ``requirements*.txt`` files now from setup.py
* Fixing dependency on cyordereddict to be for Python <3.6 instead of <3.5
* Jumping by samtools coordinate string now also allowed
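
A samtools coordinate string has the form ``CHROM[:BEGIN[-END]]``, with 1-based inclusive positions that may contain thousands separators.
The following standard-library sketch shows one way to parse such a string.
``parse_region`` is a hypothetical helper and not part of the ``vcfpy`` API, and contig names that themselves contain a colon are not handled.

.. code-block:: python

    import re

    # CHROM, optionally followed by :BEGIN and optionally -END
    REGION_RE = re.compile(
        r"^(?P<chrom>[^:]+)(?::(?P<begin>[\d,]+)(?:-(?P<end>[\d,]+))?)?$"
    )


    def _to_int(text):
        """Convert '1,000' style numbers to int; pass None through."""
        return None if text is None else int(text.replace(",", ""))


    def parse_region(region):
        """Return (chrom, begin, end); begin and end are None when omitted."""
        match = REGION_RE.match(region)
        if not match:
            raise ValueError("Invalid region string: {!r}".format(region))
        return (
            match.group("chrom"),
            _to_int(match.group("begin")),
            _to_int(match.group("end")),
        )


    print(parse_region("chr1:1,000-2,000"))  # prints ('chr1', 1000, 2000)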

0.8.0 (2016-10-31)
------------------

* Adding ``Header.has_header_line`` for querying existence of header line
* ``Header.add_*_line`` return a ``bool`` now indicating any conflicts
* Construction of Writer uses samples within header and no extra parameter (breaks API)

0.7.0 (2016-09-25)
------------------

* Smaller improvements and fixes to documentation
* Adding Codacy coverage and static code analysis results to README
* Various smaller code cleanup triggered by Codacy results
* Adding ``__eq__``, ``__neq__`` and ``__hash__`` to data types (where applicable)

0.6.0 (2016-09-25)
------------------

* Refining implementation for breakend and symbolic allele class
* Removing ``record.SV_CODES``
* Refactoring parser module a bit to make the code cleaner
* Fixing small typos and problems in documentation

0.5.0 (2016-09-24)
------------------

* Deactivating warnings on record parsing by default because of performance
* Adding validation for ``INFO`` and ``FORMAT`` fields on reading (#8)
* Adding predefined ``INFO`` and ``FORMAT`` fields to ``vcfpy.header`` (#32)

0.4.1 (2016-09-22)
------------------

* Initially enabling codeclimate

0.4.0 (2016-09-22)
------------------

* Exporting constants for encoding variant types
* Exporting genotype constants ``HOM_REF``, ``HOM_ALT``, ``HET``
* Implementing ``Call.is_phased``, ``Call.is_het``, ``Call.is_variant``, ``Call.is_hom_ref``, ``Call.is_hom_alt``
* Removing ``Call.phased`` (breaks API, next release is 0.4.0)
* Adding tests, fixing bugs for methods of ``Call``
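
To make the genotype categories concrete, the sketch below classifies a raw ``GT`` string using only the standard library.
It is an illustration and not the ``vcfpy`` implementation: the constant values are placeholders, and ``split_gt``, ``is_phased``, and ``gt_type`` are hypothetical function names.

.. code-block:: python

    import re

    # Placeholder values for illustration only
    HOM_REF, HET, HOM_ALT = "HOM_REF", "HET", "HOM_ALT"


    def split_gt(gt):
        """Split a GT string into allele indices; '.' (no-call) becomes None."""
        return [None if a == "." else int(a) for a in re.split(r"[/|]", gt)]


    def is_phased(gt):
        """A pipe separator marks a phased genotype."""
        return "|" in gt


    def gt_type(gt):
        """Return HOM_REF, HET, or HOM_ALT, or None if any allele is a no-call."""
        alleles = split_gt(gt)
        if None in alleles:
            return None
        if len(set(alleles)) > 1:
            return HET
        return HOM_REF if alleles[0] == 0 else HOM_ALT


    for gt in ("0/0", "0|1", "1/1", "./.", "1"):
        print(gt, gt_type(gt), is_phased(gt))

The last input, ``"1"``, describes a single allele and is classified here as ``HOM_ALT``; whether that is the desired label for haploid calls is a design choice of this sketch.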

0.3.1 (2016-09-21)
------------------

* Work around ``FORMAT/FT`` being a string; this is done so in the Delly output

0.3.0 (2016-09-21)
------------------

* ``Reader`` and ``Writer`` can now be used as context manager (with ``with``)
* Including license in documentation, including Biopython license
* Adding support for writing bgzf files (taken from Biopython)
* Adding support for parsing arrays in header lines
* Removing ``example-4.1-bnd.vcf`` example file because v4.1 tumor derival lacks ``ID`` field
* Adding ``AltAlleleHeaderLine``, ``MetaHeaderLine``, ``PedigreeHeaderLine``, and ``SampleHeaderLine``
* Renaming ``SimpleHeaderFile`` to ``SimpleHeaderLine``
* Warn on missing ``FILTER`` entries on parsing
* Reordered parameters in ``from_stream`` and ``from_file`` (#18)
* Renamed ``from_file`` to ``from_stream`` (#18)
* Renamed ``Reader.jump_to`` to ``Reader.fetch``
* Adding ``header_without_lines`` function
* Generally extending API to make it easier to use
* Upgrading dependencies, enabling pyup-bot
* Greatly extending documentation

0.2.1 (2016-09-19)
------------------

* First release on PyPI


