Metadata-Version: 2.1
Name: ALLMDEV
Version: 1.2.1
Summary: A simple and efficient python library for fast inference of GGUF Large Language Models.
Author: All Advance AI
Author-email: allmdev@allaai.com
Maintainer: Soham Ghadge
Maintainer-email: soham.ghadge@allaai.com
Keywords: GGUF,GGUF Large Language Model,GGUF Large Language Models,GGUF Large Language Modeling,GGUF Large Language Modeling Library
Description-Content-Type: text/markdown
Requires-Dist: Flask
Requires-Dist: click
Requires-Dist: llama-index
Requires-Dist: llama-cpp-python
Requires-Dist: aiohttp
Requires-Dist: llama-index-llms-llama-cpp
Requires-Dist: huggingface-hub

# ALLM

ALLM is a Python library for fast inference of Large Language Models (LLMs) stored in GGUF, the model file format used by llama.cpp, on both CPU and GPU. It provides a convenient interface for loading pre-trained GGUF models and running inference with them. It is aimed at applications where quick response times matter, such as chatbots and text generation.

## Features

- **Efficient Inference**: ALLM leverages the power of GGUF models to provide fast and accurate inference.
- **CPU and GPU Support**: The library is optimized for both CPU and GPU, allowing you to choose the best hardware for your application.
- **Simple Interface**: With straightforward command-line support, you can load a model and run inference with a single command.
- **Flexible Configuration**: Customize inference settings such as temperature and model path to suit your needs.

## Installation

You can install ALLM using pip:

```bash
pip install ALLMDEV
```

## Usage

You can start inference with the `allm-run` command. It takes the model name or path as its main argument, plus three optional arguments: temperature, max new tokens, and additional model kwargs.

```bash
allm-run --name model_name_or_path
```
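
The optional temperature argument controls how random the sampled output is. As a general illustration of that idea (a minimal sketch of temperature-scaled softmax, not ALLM's internal code; the function name and the example scores are hypothetical), lower temperatures concentrate probability on the top-scoring token, while higher temperatures flatten the distribution:

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
print("temperature 0.5:", sharp)
print("temperature 2.0:", flat)
```

In this sketch the first token receives a larger share of the probability at temperature 0.5 than at 2.0, which is why low temperatures tend to give more deterministic text.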

## API

After initialising or downloading a model, you can start the inference API with the `allm-serve` command. By default the API server listens on 127.0.0.1:5000. To run it on a different host or port, edit the apiconfig.txt file in your model directory.

```bash
allm-serve
```
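
To confirm that the server is up before sending requests, a plain TCP check against the default address is enough. The helper below is a hypothetical sketch that uses only the Python standard library. It assumes the default 127.0.0.1:5000 address described above and makes no assumption about the API routes, which are not documented here.

```python
import socket


def is_server_listening(host="127.0.0.1", port=5000, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if is_server_listening():
        print("A server is accepting connections on 127.0.0.1:5000")
    else:
        print("Nothing is listening on 127.0.0.1:5000; is allm-serve running?")
```

If you changed the host or port in apiconfig.txt, pass the same values to `is_server_listening`.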


## Supported Model names
Llama2, llama, llama2_chat, Llama_chat, Mistral, Mistral_instruct

