Metadata-Version: 2.4
Name: bkdetect
Version: 0.1.0
Summary: Library for document similarity search
Author-email: Your Name <your@email.com>
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: nltk
Requires-Dist: scikit-learn
Requires-Dist: beautifulsoup4
Requires-Dist: pdfminer.six
Requires-Dist: python-docx
Requires-Dist: pdfplumber
Requires-Dist: numpy
Requires-Dist: scipy
Dynamic: license-file

# bkDetect - Document Similarity Finder

A library that finds source documents by content similarity, using TF-IDF weighting and cosine similarity.

Features:
- File formats: PDF, DOCX, TXT, HTML, CSV
- Languages: Russian and English
- Configurable text processing pipeline

Install with: `pip install bkdetect`
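
To illustrate the TF-IDF and cosine similarity approach named above, here is a minimal standard-library sketch. It is not bkdetect's API, which this README does not document; the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace only.
    return text.lower().split()


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an assumed variant).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    # Return (document index, score) pairs, most similar first.
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(tokenize(query))
    # Terms never seen in the corpus carry no weight and are dropped.
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    return sorted(enumerate(scores), key=lambda p: p[1], reverse=True)


docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
for index, score in rank("cat on a mat", docs):
    print(index, round(score, 3))
```

Documents that share no terms with the query score 0.0, so a naive tokenizer like this one treats "cat" and "cats" as unrelated; stemming or lemmatization in the text processing pipeline is what would close that gap.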
