Metadata-Version: 2.1
Name: ontogpt
Version: 0.3.14
Summary: OntoGPT
License: BSD-3
Author: Chris Mungall
Author-email: cjmungall@lbl.gov
Requires-Python: >=3.9, !=2.7.*, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*, !=3.7.*, !=3.8.*
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Provides-Extra: docs
Provides-Extra: gpt4all
Provides-Extra: huggingface
Provides-Extra: recipes
Provides-Extra: textract
Provides-Extra: web
Requires-Dist: Jinja2 (>=3.1.2) ; extra == "web"
Requires-Dist: aiohttp (>=3.8.4)
Requires-Dist: beautifulsoup4 (>=4.11.1)
Requires-Dist: bioc (>=2.0.post5)
Requires-Dist: cachier (>=2.1.0)
Requires-Dist: click (>=8.1.3)
Requires-Dist: dpath (>=2.1.6,<3.0.0)
Requires-Dist: fastapi (>=0.88.0) ; extra == "web"
Requires-Dist: frontend (>=0.0.3,<0.0.4)
Requires-Dist: gilda (>=1.0.0)
Requires-Dist: gpt4 (>=0.0.1)
Requires-Dist: huggingface_hub[huggingface] (>=0.15.1) ; extra == "huggingface"
Requires-Dist: inflect (>=6.0.2)
Requires-Dist: inflection (>=0.5.1)
Requires-Dist: linkml (>=1.5.7,<2.0.0)
Requires-Dist: linkml-owl (>=0.3.0,<0.4.0)
Requires-Dist: llm (>=0.8)
Requires-Dist: llm-gpt4all[gpt4all] (>=0.2) ; extra == "gpt4all"
Requires-Dist: myst-parser[docs] (>=0.18.1) ; extra == "docs"
Requires-Dist: nlpcloud (>=1.0.39)
Requires-Dist: oaklib (>=0.5.28)
Requires-Dist: openai (>=1.10.0,<2.0.0)
Requires-Dist: pydantic (>=2.4.0)
Requires-Dist: recipe-scrapers[recipes] (>=14.35.0) ; extra == "recipes"
Requires-Dist: requests (>=2.31.0,<3.0.0)
Requires-Dist: requests-cache (>=1.2.0)
Requires-Dist: ruamel.yaml (>=0.17.31)
Requires-Dist: sphinx-autodoc-typehints[docs] (>=1.19.4) ; extra == "docs"
Requires-Dist: sphinx-click[docs] (>=4.3.0) ; extra == "docs"
Requires-Dist: sphinx-rtd-theme[docs] (>=1.0.0) ; extra == "docs"
Requires-Dist: sphinx[docs] (>=5.3.0) ; extra == "docs"
Requires-Dist: streamlit (>=1.22.0)
Requires-Dist: textract[textract] ; extra == "textract"
Requires-Dist: tiktoken (>=0.7.0,<0.8.0)
Requires-Dist: toml (>=0.10.2,<0.11.0)
Requires-Dist: urllib3 (<2)
Requires-Dist: uvicorn (>=0.20.0) ; extra == "web"
Requires-Dist: wikipedia (>=1.4.0)
Requires-Dist: wikipedia-api (>=0.5.8)
Description-Content-Type: text/markdown

# OntoGPT

[![DOI](https://zenodo.org/badge/13996/monarch-initiative/ontogpt.svg)](https://zenodo.org/badge/latestdoi/13996/monarch-initiative/ontogpt)
![PyPI](https://img.shields.io/pypi/v/ontogpt)

## Introduction

_OntoGPT_ is a Python package for extracting structured information from text with large language models (LLMs), _instruction prompts_, and ontology-based grounding.

[For more details, please see the full documentation.](https://monarch-initiative.github.io/ontogpt/)

## Quick Start

OntoGPT runs on the command line, though there is also a minimal web app interface (see the Web Application section below).

1. Ensure you have Python 3.9 or greater installed.
2. Install with `pip`:

    ```bash
    pip install ontogpt
    ```

3. Set your OpenAI API key:

    ```bash
    runoak set-apikey -e openai <your openai api key>
    ```

4. See the list of all OntoGPT commands:

    ```bash
    ontogpt --help
    ```

5. Try a simple example of information extraction:

    ```bash
    echo "One treatment for high blood pressure is carvedilol." > example.txt
    ontogpt extract -i example.txt -t drug
    ```

    OntoGPT will retrieve the necessary ontologies and write results to the command line. The output lists each extracted object under the heading `extracted_object`.
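Because the `extracted_object` block is plain YAML, the results are easy to post-process in Python. A minimal sketch using PyYAML (OntoGPT itself depends on `ruamel.yaml`, which works similarly); note that the exact field names depend on the extraction template, and the sample document and ontology IDs below are illustrative only:

```python
# Parse YAML output like that produced by `ontogpt extract` and pull out
# the extracted_object block. The sample document below is illustrative;
# real field names vary by template.
import yaml  # PyYAML

sample_output = """
extracted_object:
  label: carvedilol treatment example
  treatments:
    - drug: CHEBI:3441
      condition: MONDO:0005044
"""

doc = yaml.safe_load(sample_output)
obj = doc["extracted_object"]

# Each treatment pairs a grounded drug ID with a grounded condition ID.
for treatment in obj["treatments"]:
    print(treatment["drug"], "->", treatment["condition"])
```

In practice you would read the YAML from a file written with `ontogpt extract ... > results.yaml` rather than from an inline string.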

## Web Application

There is a bare-bones web application for running OntoGPT and viewing results.

First, install the additional web dependencies with `pip`:

```bash
pip install "ontogpt[web]"
```

Then run this command to start the web application:

```bash
web-ontogpt
```

NOTE: We do not recommend hosting this webapp publicly without authentication.

## Evaluations

OntoGPT's functions have been evaluated on test data. Please see the full documentation for details on these evaluations and how to reproduce them.

## Related Projects

* [TALISMAN](https://github.com/monarch-initiative/talisman/), a tool for generating summaries of functions enriched within a gene set. TALISMAN uses OntoGPT to work with LLMs.

## Tutorials and Presentations

* Presentation: "Staying grounded: assembling structured biological knowledge with help from large language models" - presented by Harry Caufield as part of the AgBioData Consortium webinar series (September 2023)
  * [Slides](https://docs.google.com/presentation/d/1rMQVWaMju-ucYFif5nx4Xv3bNX2SVI_w89iBIT1bkV4/edit?usp=sharing)
  * [Video](https://www.youtube.com/watch?v=z38lI6WyBsY)
* Presentation: "Transforming unstructured biomedical texts with large language models" - presented by Harry Caufield as part of the BOSC track at ISMB/ECCB 2023 (July 2023)
  * [Slides](https://docs.google.com/presentation/d/1LsOTKi-rXYczL9vUTHB1NDkaEqdA9u3ZFC5ANa0x1VU/edit?usp=sharing)
  * [Video](https://www.youtube.com/watch?v=a34Yjz5xPp4)
* Presentation: "OntoGPT: A framework for working with ontologies and large language models" - talk by Chris Mungall at Joint Food Ontology Workgroup (May 2023)
  * [Slides](https://docs.google.com/presentation/d/1CosJJe8SqwyALyx85GWkw9eOT43B4HwDlAY2CmkmJgU/edit)
  * [Video](https://www.youtube.com/watch?v=rt3wobA9hEs&t=1955s)

## Citation

The information extraction approach used in OntoGPT, SPIRES, is described further in: Caufield JH, Hegde H, Emonet V, Harris NL, Joachimiak MP, Matentzoglu N, et al. Structured prompt interrogation and recursive extraction of semantics (SPIRES): A method for populating knowledge bases using zero-shot learning. _Bioinformatics_, Volume 40, Issue 3, March 2024, btae104, [https://doi.org/10.1093/bioinformatics/btae104](https://doi.org/10.1093/bioinformatics/btae104).

## Acknowledgements

This project is part of the [Monarch Initiative](https://monarchinitiative.org/). We also gratefully acknowledge [Bosch Research](https://www.bosch.com/research) for their support of this research project.

