Metadata-Version: 2.1
Name: binhash
Version: 0.2.1
Summary: Algorithm to compress sparse binary data
Home-page: https://github.com/KarthikRevanuru/binhash
Author: Karthik Revanuru, Raghav Kulkarni, Rameshwar Pratap
Author-email: karthik.revanuru@outlook.com
License: MIT
Description: # Compression Techniques for Sparse Binary Data
        
        ## Prerequisites
        * Python 2.7 or higher
        * [NumPy](http://numpy.org)
        * [scikit-learn](https://scikit-learn.org/stable/)
        * Standard-library modules: `pickle`, `random`, `re`
        
        ## Usage
        ```
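        from BinHash import hasher

        # Folder containing the corpus documents
        corpus = 'path_to_the_folder_containing_documents'
        d = 10000
        k = 500
        # Build a hasher from the corpus with parameters d and k
        myhasher = hasher(corpus, d, k)

        # Hash a single piece of text
        sample_text = "this is a sample text"
        sample_hash = myhasher.hash_text(sample_text)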
        ```
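        
        The usage above does not show how a hash is computed. As a rough illustration of one way sparse binary data can be compressed from `d` to `k` dimensions, the sketch below maps each input position to a random bucket and ORs the bits that collide. This is an assumption made for illustration only and is not necessarily what `hasher` does; `make_bucket_map` and `compress` are hypothetical names, not part of the `BinHash` API.
        
        ```python
        import random


        def make_bucket_map(d, k, seed=0):
            """Assign each of the d input positions to one of k buckets at random."""
            rng = random.Random(seed)
            return [rng.randrange(k) for _ in range(d)]


        def compress(active_indices, bucket_map, k):
            """Build a k-bit sketch: a bucket is 1 if any active index maps to it."""
            sketch = [0] * k
            for i in active_indices:
                sketch[bucket_map[i]] = 1
            return sketch


        # Hypothetical sizes, reusing the d and k values from the usage example.
        d, k = 10000, 500
        bucket_map = make_bucket_map(d, k)

        # Two sparse binary vectors, each given as the set of positions that are 1.
        a = compress({3, 17, 256, 9001}, bucket_map, k)
        b = compress({3, 17, 512, 9001}, bucket_map, k)

        # Count the buckets that are set in both sketches.
        shared = sum(x & y for x, y in zip(a, b))
        ```
        
        Because both sketches are built with the same bucket map, any position set in both inputs is also set in both sketches, so the sketches can be compared directly.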
        
        ## Citation
        Please cite these papers in your publications if they help your research.
        ```
        @inproceedings{DBLP:conf/pakdd/PratapSK18,
          author    = {Rameshwar Pratap and
                       Ishan Sohony and
                       Raghav Kulkarni},
          title     = {Efficient Compression Technique for Sparse Sets},
          booktitle = {Advances in Knowledge Discovery and Data Mining - 22nd Pacific-Asia
                       Conference, {PAKDD} 2018, Melbourne, VIC, Australia, June 3-6, 2018,
                       Proceedings, Part {III}},
          pages     = {164--176},
          year      = {2018},
          crossref  = {DBLP:conf/pakdd/2018-3},
          url       = {https://doi.org/10.1007/978-3-319-93040-4\_14},
          doi       = {10.1007/978-3-319-93040-4\_14},
          timestamp = {Tue, 19 Jun 2018 09:13:55 +0200},
          biburl    = {https://dblp.org/rec/bib/conf/pakdd/PratapSK18},
          bibsource = {dblp computer science bibliography, https://dblp.org}
        }
        
        
        @inproceedings{compression,
          author    = {Rameshwar Pratap and
                       Raghav Kulkarni and
                       Ishan Sohony},
          title     = {Efficient Dimensionality Reduction for Sparse Binary Data},
          booktitle = {IEEE International Conference on BIG DATA, Accepted},
          year      = {2018}
        }
        ```
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
