Metadata-Version: 1.1
Name: log2seq
Version: 0.0.3
Summary: A tool to parse syslog-like messages into word sequences
Home-page: https://github.com/cpflat/log2seq/
Author: Satoru Kobayashi
Author-email: sat@nii.ac.jp
License: BSD 3-Clause "New" or "Revised" License
Description: log2seq
        =======
        
        log2seq is a Python package that helps parse syslog-like messages into
        word sequences that are more suitable for further automated analysis.
        It applies a customizable, ordered sequence of rules based on regular
        expressions.
        
        Introduction
        ------------
        
        In log analysis, you may sometimes encounter log messages in the
        following format:
        
        ::
        
           Jan  1 12:34:56 host-device1 system[12345]: host 2001:0db8:1234::1 (interface:eth0) disconnected
        
        This message cannot be split well with ``str.split`` or ``re.split``,
        because ``:`` is not used consistently.
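        
        The problem can be reproduced with the standard library alone. The
        snippet below is only an illustration and does not use log2seq; the
        delimiter pattern is an assumption chosen for this example.
        
        ::
        
           import re
        
           mes = ("Jan  1 12:34:56 host-device1 system[12345]: "
                  "host 2001:0db8:1234::1 (interface:eth0) disconnected")
        
           # Splitting on whitespace leaves "system[12345]:" and
           # "(interface:eth0)" as single, unseparated words.
           by_space = mes.split()
        
           # Adding ":" and brackets to the delimiters separates those words,
           # but it also breaks the timestamp and the IPv6 address apart.
           by_symbol = [w for w in re.split(r"[\s:\[\]()]+", mes) if w]
        
           print(by_space)
           print(by_symbol)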
        
        By default, log2seq processes this message in multiple steps:
        
        1. Process the message header (i.e., timestamp and source hostname)
        2. Split the message body into a word sequence by standard symbol
           strings (e.g., spaces and brackets)
        3. Fix words that should not be split later (e.g., IPv6 addresses)
        4. Split words by inconsistent symbol strings (e.g., ``:``)
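        
        The four steps above can be sketched with the standard ``re`` module.
        This is a minimal illustration of the idea, not the implementation
        used by log2seq: the ``parse`` function and the ``HEADER``,
        ``STANDARD_SYMBOLS`` and ``IPV6`` patterns are hypothetical names
        written for this example, and the IPv6 check is deliberately
        simplified.
        
        ::
        
           import re
        
           mes = ("Jan  1 12:34:56 host-device1 system[12345]: "
                  "host 2001:0db8:1234::1 (interface:eth0) disconnected")
        
           # Hypothetical patterns, written only for this illustration.
           HEADER = re.compile(
               r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
               r"(?P<host>\S+)\s+(?P<body>.*)$"
           )
           STANDARD_SYMBOLS = re.compile(r"[\s\[\]()]+")
           # Simplified check: hex groups joined by ":", at least two colons.
           IPV6 = re.compile(r"^[0-9A-Fa-f]{0,4}(:[0-9A-Fa-f]{0,4}){2,7}$")
        
        
           def parse(line):
               # Step 1: separate the header (timestamp, hostname) from the body.
               m = HEADER.match(line)
               if m is None:
                   raise ValueError("unexpected header format")
        
               # Step 2: split the body by spaces and brackets.
               words = [w for w in STANDARD_SYMBOLS.split(m.group("body")) if w]
        
               result = []
               for w in words:
                   if IPV6.match(w):
                       # Step 3: keep words that must not be split further.
                       result.append(w)
                   else:
                       # Step 4: split the remaining words by ":".
                       result.extend(p for p in w.split(":") if p)
        
               return {
                   "timestamp": m.group("timestamp"),
                   "host": m.group("host"),
                   "words": result,
               }
        
        
           print(parse(mes)["words"])
        
        Checking for protected words before the final split is what keeps the
        ``:`` inside the address while the ``:`` after ``system[12345]`` and
        inside ``interface:eth0`` is treated as a separator.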
        
        The following sample code parses the message above:
        
        ::
        
           mes = "Jan  1 12:34:56 host-device1 system[12345]: host 2001:0db8:1234::1 (interface:eth0) disconnected"
        
           import log2seq
           rules = log2seq.load_from_script("./default_parser.py")
           parser = log2seq.init_parser(rules)
        
           d = parser.process_line(mes)
           print(d["words"])
        
        It outputs the following sequence:
        
        ::
        
           ['system', '12345', 'host', '2001:0db8:1234::1', 'interface', 'eth0', 'disconnected']
        
        As you can see, the ``:`` characters in the IPv6 address are
        preserved, while the other ``:`` characters are treated as separators
        and removed.
        
        To customize the parsing rules, see ``log2seq/default_script.py``.
        
        log2seq also accepts rules written in configparser format (see
        ``log2seq/data/sample.conf``).
        
        Code
        ----
        
        The source code is available at https://github.com/cpflat/log2seq
        
        License
        -------
        
        3-Clause BSD license
        
        Author
        ------
        
        Satoru Kobayashi
        
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
