Cheshire3: normalizer Module

| SimpleNormalizer | Base normalizer. | ||
| DataExistsNormalizer | Return '1' if any data exists, otherwise '0' | ||
| TermExistsNormalizer | Un-stoplist anonymizing normalizer. | ||
| CaseNormalizer | Reduce text to lower case | ||
| ReverseNormalizer | Reverse string (e.g. for left truncation) | ||
| SpaceNormalizer | Reduce multiple whitespace to single space character | ||
| ArticleNormalizer | Remove leading English articles (the, a, an) | ||
| NumericEntityNormalizer | Replace characters matching regular expression with the equivalent numeric character entity | ||
| RegexpNormalizer | Either strip, replace or keep data which matches a given regular expression | ||
| PossessiveNormalizer | Remove trailing 's or s' from words | ||
| IntNormalizer | Turn a string into an integer | ||
| StringIntNormalizer | Turn an integer into a zero-padded string, 12 characters long | ||
| StoplistNormalizer | Remove words that match a stopword list | ||
| PhraseStemNormalizer | Use a Snowball stemmer to stem multiple words in a phrase (e.g. from PosPhraseNormalizer) | ||
| StemNormalizer | Use a Snowball stemmer to stem the terms | ||
| DateStringNormalizer | Turn a Date object into ISO 8601 format | ||
| RangeNormalizer | Should normalise ranges... | ||
| KeywordNormalizer | Split a string into keywords, keeping proximity information. | ||
| ExactExpansionNormalizer | |||
| WordExpansionNormalizer | |||
| DiacriticNormalizer | Slow implementation of Unicode 4.0 character decomposition. | ||
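
As a rough illustration of what a few of the transformations listed above do to a string, the sketch below reimplements them as plain Python functions. It does not use or reproduce the Cheshire3 classes: every function name here (`lower_case`, `collapse_whitespace`, `strip_leading_article`, `strip_possessive`, `reverse`, `zero_pad_int`, `decompose`, `normalize`) is hypothetical, and details such as the possessive handling are assumptions drawn only from the one-line descriptions in the table.

```python
import re
import unicodedata


def lower_case(text):
    """Reduce text to lower case (cf. CaseNormalizer)."""
    return text.lower()


def collapse_whitespace(text):
    """Reduce runs of whitespace to a single space (cf. SpaceNormalizer)."""
    return re.sub(r"\s+", " ", text)


def strip_leading_article(text):
    """Remove a leading English article: the, a, an (cf. ArticleNormalizer)."""
    return re.sub(r"^(the|an|a)\s+", "", text, flags=re.IGNORECASE)


def strip_possessive(text):
    """Drop a trailing 's, or the apostrophe of a trailing s', from each word.

    Assumption: the source only says "Remove trailing 's or s'"; keeping the
    plural "s" in the s' case is a guess made for this sketch.
    """
    return re.sub(r"(?:'s|(?<=s)')(?=\s|$)", "", text)


def reverse(text):
    """Reverse the string, e.g. to support left truncation (cf. ReverseNormalizer)."""
    return text[::-1]


def zero_pad_int(value, width=12):
    """Render an integer as a zero-padded string (cf. StringIntNormalizer)."""
    return f"{value:0{width}d}"


def decompose(text):
    """Canonical Unicode decomposition via the standard library.

    Assumption: this uses NFD from whatever Unicode version ships with Python,
    not the Unicode 4.0 tables mentioned for DiacriticNormalizer.
    """
    return unicodedata.normalize("NFD", text)


def normalize(text, steps=(collapse_whitespace, lower_case,
                           strip_leading_article, strip_possessive)):
    """Apply each normalization step in order and return the result."""
    for step in steps:
        text = step(text)
    return text


print(normalize("The   Dog's  Bone"))  # dog bone
print(zero_pad_int(42))                # 000000000042
```

Chaining the steps as a simple sequence mirrors the way the table presents each normalizer as one small, single-purpose transformation; the order used in `normalize` is only one plausible choice.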
| Generated by Epydoc 3.0alpha2 on Wed Aug 9 18:09:56 2006 | http://epydoc.sf.net |