Metadata-Version: 2.4
Name: cltk
Version: 2.0.3
Summary: The Classical Language Toolkit
License-File: LICENSE
Keywords: nlp,ai,nltk,latin,greek,sanskrit,hebrew,arabic,chinese
Author: Kyle P. Johnson
Author-email: kyle@kyle-p-johnson.com
Requires-Python: >=3.13
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Provides-Extra: ollama
Provides-Extra: openai
Provides-Extra: stanza
Requires-Dist: colorama (>=0.4.6,<0.5.0)
Requires-Dist: httpx (<0.29) ; extra == "ollama"
Requires-Dist: numpy (>=2.3.2,<3.0.0)
Requires-Dist: ollama (>=0.3.0,<1.0.0) ; extra == "ollama"
Requires-Dist: openai (>=1.101.0,<2.0.0) ; extra == "openai"
Requires-Dist: pydantic (>=2.11.7,<3.0.0)
Requires-Dist: python-dotenv (>=1.1.1,<2.0.0)
Requires-Dist: stanza (>=1.8.2,<2.0.0) ; extra == "stanza"
Requires-Dist: tqdm (>=4.67.1,<5.0.0)
Project-URL: Documentation, https://docs.cltk.org
Project-URL: Homepage, http://cltk.org
Project-URL: Repository, https://github.com/cltk/cltk
Description-Content-Type: text/markdown

The Classical Language Toolkit (CLTK) is a Python library offering natural language processing (NLP) for pre-modern languages.

## Installation

For the CLTK's latest version:

```bash
pip install cltk
```

Optional extras

- GenAI (OpenAI-backed annotation):

```bash
pip install "cltk[openai]"
```

- Stanza (discriminative NLP backends powered by Stanford Stanza):

```bash
pip install "cltk[stanza]"
```

You can combine extras, for example:

```bash
pip install "cltk[openai,stanza]"

# or include local LLM support as well
pip install "cltk[openai,stanza,ollama]"
```

- Local LLMs via Ollama:

Install the optional extra and ensure an Ollama server is running locally:

```bash
pip install "cltk[ollama]"
```

By default, when using `backend='ollama'`, CLTK uses the model `llama3.1:8b`. To choose a different local model, pass the `model` parameter to `NLP(...)`, e.g. `qwen2.5:14b`, `gemma2:27b`, `llama3.1:70b`, or any Ollama model string.

### Choosing a model

- OpenAI backend (GenAI in the cloud):

```python
from cltk.nlp import NLP

# Default model is "gpt-5-mini" when backend='openai'
nlp = NLP('lati1261', backend='openai')

# Choose a specific model
nlp_big = NLP('lati1261', backend='openai', model='gpt-5')

# Requires OPENAI_API_KEY to be set in the environment
# (e.g., via a .env file or shell env var)
```

- Ollama backend (local LLMs):

```python
from cltk.nlp import NLP

# Default model is "llama3.1:8b" when backend='ollama'
nlp_local = NLP('lati1261', backend='ollama')

# Choose a specific local model (any installed/pullable Ollama model)
nlp_qwen = NLP('lati1261', backend='ollama', model='qwen2.5:14b')

# To use the hosted Ollama Cloud, set OLLAMA_CLOUD_API_KEY
# and choose backend='ollama-cloud'. The same model strings apply.
```

For more information, see [Installation docs](https://docs.cltk.org/en/latest/installation.html) or, to install from source, [Development](https://docs.cltk.org/en/latest/development.html).

Pre-1.0 software remains available on the [branch v0.1.x](https://github.com/cltk/cltk/tree/v0.1.x) and docs at <https://legacy.cltk.org>. Install it with `pip install "cltk<1.0"`.

## Documentation

Documentation at <https://docs.cltk.org>.

## Citation

When using the CLTK, please cite [the following publication](https://aclanthology.org/2021.acl-demo.3), including the DOI:

Johnson, Kyle P., Patrick J. Burns, John Stewart, Todd Cook, Clément Besnier, and William J. B. Mattingly. "The Classical Language Toolkit: An NLP Framework for Pre-Modern Languages." In *Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations*, pp. 20-29. 2021. 10.18653/v1/2021.acl-demo.3

The complete BibTeX entry:

```bibtex
@inproceedings{johnson-etal-2021-classical,
    title = "The {C}lassical {L}anguage {T}oolkit: {A}n {NLP} Framework for Pre-Modern Languages",
    author = "Johnson, Kyle P.  and
      Burns, Patrick J.  and
      Stewart, John  and
      Cook, Todd  and
      Besnier, Cl{\'e}ment  and
      Mattingly, William J. B.",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-demo.3",
    doi = "10.18653/v1/2021.acl-demo.3",
    pages = "20--29",
    abstract = "This paper announces version 1.0 of the Classical Language Toolkit (CLTK), an NLP framework for pre-modern languages. The vast majority of NLP, its algorithms and software, is created with assumptions particular to living languages, thus neglecting certain important characteristics of largely non-spoken historical languages. Further, scholars of pre-modern languages often have different goals than those of living-language researchers. To fill this void, the CLTK adapts ideas from several leading NLP frameworks to create a novel software architecture that satisfies the unique needs of pre-modern languages and their researchers. Its centerpiece is a modular processing pipeline that balances the competing demands of algorithmic diversity with pre-configured defaults. The CLTK currently provides pipelines, including models, for almost 20 languages.",
}
```

## License

Copyright (c) 2014–present Kyle P. Johnson under the [MIT License](https://github.com/cltk/cltk/blob/master/LICENSE).

