Metadata-Version: 2.1
Name: snlp
Version: 0.0.2
Summary: Statistical NLP
Home-page: https://github.com/meghdadFar/snlp
Author: meghdadFar
Author-email: meghdad.farahmand@gmail.com
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: pandas (==1.0.3)
Requires-Dist: scikit-learn (==0.22.2.post1)
Requires-Dist: matplotlib (==3.2.1)
Requires-Dist: scipy (==1.4.1)
Requires-Dist: torch (==1.4.0)
Requires-Dist: torchtext (==0.5.0)
Requires-Dist: nltk (==3.5)
Requires-Dist: tqdm (==4.45.0)
Requires-Dist: fasttext (==0.9.1)

# snlp

[![HitCount](http://hits.dwyl.com/meghdadFar/snlp.svg)](http://hits.dwyl.com/meghdadFar/snlp)

Statistical NLP (SNLP): a practical package of statistical natural language processing tools. SNLP relies on the statistical and distributional properties of natural language, so most of its functionality is unsupervised.

# Features
- Identifying multiword expressions (collocations) in a corpus. Useful for terminology and keyphrase extraction, and can improve text classification.
- Identifying statistically redundant words for filtering. Usually improves document classification.
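
To make the collocation feature more concrete, here is a minimal sketch of one common statistical approach: ranking adjacent word pairs by pointwise mutual information (PMI). It is an illustration only and does not use or describe the snlp API. The function name `bigram_pmi`, the `min_count` parameter, and the toy corpus are all assumptions made for this example.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration; not part of snlp.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs get unreliable PMI estimates, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs are the strongest collocation candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy corpus (made up for this example).
corpus = ("new york is big . i love new york . "
          "the city is big . i love the city .").split()
ranked = bigram_pmi(corpus)
for pair, score in ranked:
    print(pair, round(score, 3))
```

Pairs whose words co-occur more often than their individual frequencies predict, such as `("new", "york")` here, score higher than pairs that include a very frequent token such as `"."`.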

## Upcoming Features
- Anomaly detection.
- Identifying non-compositional compounds. Can be used for tasks such as profanity/hate-speech detection and linguistic analysis of a corpus.

# Usage



