Metadata-Version: 2.1
Name: transly
Version: 0.1.2
Summary: Pronunciation and Transliteration module trained on CMU pronouncing dictionary, IIT Bombay and IIT Kharagpur text corpora
Home-page: https://github.com/gitnik17/transly
Author: Nikhil Kothari
Author-email: gitnik17@gmail.com
License: Apache License 2.0
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Requires-Dist: pandas
Requires-Dist: keras (==2.3.1)
Requires-Dist: setuptools
Requires-Dist: tensorflow (==1.15.0)

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
   :target: https://github.com/psf/black
   :alt: Code style

.. image:: https://img.shields.io/badge/License-Apache%202.0-blue.svg
   :target: https://opensource.org/licenses/Apache-2.0
   :alt: License

.. image:: https://img.shields.io/badge/Maintained%3F-yes-green.svg
   :target: https://github.com/gitnik17/transly/graphs/commit-activity
   :alt: Maintenance

.. image:: https://img.shields.io/badge/python-3.above-blue.svg
   :target: https://img.shields.io/badge/python-3.above-blue.svg
   :alt: versions


Transly
=======
Transly is a sequence-to-sequence bidirectional LSTM encoder-decoder model with Bahdanau attention,
trained on the
`CMU pronouncing dictionary`_, the `IIT Bombay English-Hindi Parallel Corpus`_
and the `IIT Kharagpur transliteration corpus`_.

.. _CMU pronouncing dictionary: http://www.speech.cs.cmu.edu/cgi-bin/cmudict
.. _IIT Bombay English-Hindi Parallel Corpus: http://www.cfilt.iitb.ac.in/iitb_parallel/
.. _IIT Kharagpur transliteration corpus: https://cse.iitkgp.ac.in/resgrp/cnerg/qa/fire13translit/index.html

The *pronunciation module* in Transly can predict the pronunciation of any given word *(in an American accent, of course!)*.

Take a word from any language, transliterate it into English letters (all capitals), and you are good to go.
Be it new or old, seen or unseen, sensible or nonsensical, *Transly can catch 'em all!*

Another module in Transly is the *transliteration module*.
It currently supports Hindi to English and English to Hindi transliterations.

Pre-trained models are in the respective ``trained_models`` folders. New models can also be trained on custom data.

Installation
============
Use the package manager `pip`_ to install transly:

.. _pip: https://pypi.org/project/transly/

.. code-block:: sh

    pip install transly


Usage
=====

Pronunciation
==============
Using the pre-trained pronunciation model:

.. code-block:: python

    import transly.pronunciation as tp

    # let's try a Hindi word, transliterated into capital English letters
    # the predicted pronunciation uses an American accent
    QUERY = 'MAKAAN'
    a = tp.load_model(model_path='cmu')
    a.infer(QUERY, separator=" ")
    # returns 'M AH0 K AA1 N'

    # use infer_batch function to infer batches
    # use beamsearch function to perform a beam search
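
When a word-to-pronunciation mapping is wanted for a handful of words, a plain loop over ``infer`` is enough. The sketch below relies only on the ``infer(word, separator=...)`` call shown above; ``pronounce_all`` is a hypothetical helper, not part of transly, and ``infer_batch`` may well be the faster route for large inputs.

.. code-block:: python

    def pronounce_all(model, words, separator=" "):
        """Map each word to the model's predicted pronunciation.

        `model` is any object exposing infer(word, separator=...), such as
        the one returned by tp.load_model(model_path='cmu').
        """
        # Upper-case each query, following the all-capitals convention.
        return {
            word: model.infer(word.upper(), separator=separator)
            for word in words
        }


    # Usage, assuming transly and its pre-trained 'cmu' model are installed:
    #   import transly.pronunciation as tp
    #   model = tp.load_model(model_path='cmu')
    #   pronounce_all(model, ['makaan', 'nikhil'])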

**Training a new model on custom data**

.. code-block:: python

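    from transly.seq2seq.config import SConfig
    from transly.seq2seq.version0 import Seq2Seq

    # hypothetical placeholder values; replace them with your own
    training_data_path = 'path/to/training_data.csv'
    model_path = 'path/to/model_dir/'
    model_file_name = 'model.h5'

    config = SConfig(
        training_data_path=training_data_path,
        input_mode='character_level',
        output_mode='word_level',
    )
    s2s = Seq2Seq(config)
    s2s.fit()
    s2s.save_model(path_to_model=model_path, model_file_name=model_file_name)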


The training data file should be a CSV with two columns: the input and the output.

========  ===============
  Input     Output
========  ===============
   AA           AA1
 AABERG     AA1 B ER0 G
 AACHEN     AA1 K AH0 N
AACHENER  AA1 K AH0 N ER0
========  ===============
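
A file in this shape can be written with Python's standard ``csv`` module. The sketch below is illustrative only: the file name ``pronunciation_train.csv`` is hypothetical, and the ``Input``/``Output`` header row simply mirrors the table above. Whether transly's loader expects a header row, or these exact column names, is an assumption to verify before training.

.. code-block:: python

    import csv

    # Rows taken from the table above: (input word, output phonemes).
    rows = [
        ("AA", "AA1"),
        ("AABERG", "AA1 B ER0 G"),
        ("AACHEN", "AA1 K AH0 N"),
        ("AACHENER", "AA1 K AH0 N ER0"),
    ]

    # Hypothetical file name; pass this path as training_data_path.
    training_data_path = "pronunciation_train.csv"

    with open(training_data_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Assumption: a header row matching the table's column titles.
        writer.writerow(["Input", "Output"])
        writer.writerows(rows)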

Transliteration
===============
Hindi to English
----------------
Using the pre-trained model:

.. code-block:: python

    import transly.transliteration as tl

    QUERY = 'निखिल'
    a = tl.load_model(model_path='hi2en')
    a.infer(QUERY)
    # returns 'NIKHIL'

    # use infer_batch function to infer batches
    # use beamsearch function to perform a beam search


English to Hindi
----------------
Using the pre-trained model:

.. code-block:: python

    import transly.transliteration as tl

    QUERY = 'NIKHIL'
    a = tl.load_model(model_path='en2hi')
    a.infer(QUERY)
    # returns 'निखिल'

    # use infer_batch function to infer batches
    # use beamsearch function to perform a beam search


**Training a new model on custom data**

.. code-block:: python

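    from transly.seq2seq.config import SConfig
    from transly.seq2seq.version0 import Seq2Seq

    # hypothetical placeholder values; replace them with your own
    training_data_path = 'path/to/training_data.csv'
    model_path = 'path/to/model_dir/'
    model_file_name = 'model.h5'

    config = SConfig(training_data_path=training_data_path)
    s2s = Seq2Seq(config)
    s2s.fit()
    s2s.save_model(path_to_model=model_path, model_file_name=model_file_name)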


License
=======
The Python code in this module is distributed under the Apache License 2.0.

