Metadata-Version: 2.4
Name: cortex-llm
Version: 1.0.18
Summary: GPU-Accelerated LLM Terminal for Apple Silicon
Home-page: https://github.com/faisalmumtaz/Cortex
Author: Cortex Development Team
License-Expression: MIT
Project-URL: Homepage, https://github.com/faisalmumtaz/Cortex
Project-URL: Bug Tracker, https://github.com/faisalmumtaz/Cortex/issues
Project-URL: Documentation, https://github.com/faisalmumtaz/Cortex/wiki
Keywords: llm,gpu,metal,mps,apple-silicon,ai,machine-learning,terminal,mlx,pytorch
Platform: darwin
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: MacOS
Classifier: Environment :: Console
Classifier: Environment :: GPU
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.1.0
Requires-Dist: mlx>=0.30.4
Requires-Dist: mlx-lm>=0.30.5
Requires-Dist: transformers>=4.36.0
Requires-Dist: safetensors>=0.4.0
Requires-Dist: huggingface-hub>=0.19.0
Requires-Dist: accelerate>=0.25.0
Requires-Dist: llama-cpp-python>=0.2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: pydantic>=2.5.0
Requires-Dist: rich>=13.0.0
Requires-Dist: psutil>=5.9.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: packaging>=23.0
Requires-Dist: requests>=2.31.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.8.0; extra == "dev"
Provides-Extra: optional
Requires-Dist: sentencepiece>=0.1.99; extra == "optional"
Requires-Dist: auto-gptq>=0.7.0; extra == "optional"
Requires-Dist: autoawq>=0.2.0; extra == "optional"
Requires-Dist: bitsandbytes>=0.41.0; extra == "optional"
Requires-Dist: optimum>=1.16.0; extra == "optional"
Requires-Dist: torchvision>=0.16.0; extra == "optional"
Requires-Dist: torchaudio>=2.1.0; extra == "optional"
Dynamic: home-page
Dynamic: license-file
Dynamic: platform
Dynamic: requires-python

# Cortex

GPU-accelerated local LLMs on Apple Silicon, built for the terminal.

![Cortex preview](docs/assets/cortex-llm.png)

Cortex is a fast, native CLI for running and fine-tuning LLMs on Apple Silicon using MLX and Metal. It automatically detects chat templates, supports multiple model formats, and keeps your workflow inside the terminal.

## Highlights

- Apple Silicon GPU acceleration via MLX (primary) and PyTorch MPS
- Multi-format model support: MLX, GGUF, SafeTensors, PyTorch, GPTQ, AWQ
- Built-in LoRA fine-tuning wizard
- Chat template auto-detection (ChatML, Llama, Alpaca, Gemma, Reasoning)
- Conversation history with autosave and export

## Quick Start

```bash
pipx install cortex-llm
cortex
```

Inside Cortex:

- `/download` to fetch a model from Hugging Face
- `/model` to load or manage models
- `/status` to confirm GPU acceleration and current settings

## Installation

### Option A: pipx (recommended)

```bash
pipx install cortex-llm
```

### Option B: from source

```bash
git clone https://github.com/faisalmumtaz/Cortex.git
cd Cortex
./install.sh
```

The installer checks Apple Silicon compatibility, creates a venv, installs dependencies from `pyproject.toml`, and sets up the `cortex` command.

## Requirements

- Apple Silicon Mac (M1/M2/M3/M4)
- macOS 13.3+
- Python 3.11+
- 16GB+ unified memory (24GB+ recommended for larger models)
- Xcode Command Line Tools
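
You can sanity-check the main prerequisites from the shell before installing. This is an illustrative sketch, not a script shipped with Cortex:

```shell
# Quick prerequisite check for Cortex (illustrative; not part of the tool).
arch="$(uname -m)"   # Apple Silicon Macs report arm64; Intel Macs report x86_64
py_ok="$(python3 -c 'import sys; print("yes" if sys.version_info >= (3, 11) else "no")')"

echo "architecture: $arch"
echo "python >= 3.11: $py_ok"

# macOS version (13.3+ required); sw_vers exists only on macOS.
if command -v sw_vers >/dev/null 2>&1; then
  echo "macos: $(sw_vers -productVersion)"
fi
```

On an M-series Mac, `uname -m` prints `arm64`; Intel Macs report `x86_64` and are not supported.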

## Model Support

Cortex supports the following model formats:

- **MLX** (recommended)
- **GGUF** (llama.cpp + Metal)
- **SafeTensors**
- **PyTorch** (Transformers + MPS)
- **GPTQ** / **AWQ** quantized models

## Advanced Features

- **Dynamic quantization fallback** for PyTorch/SafeTensors models that do not fit GPU memory (INT8 preferred, INT4 fallback)
  - `docs/dynamic-quantization.md`
- **MLX conversion with quantization recipes** (4/5/8-bit, mixed precision) for speed vs quality control
  - `docs/mlx-acceleration.md`
- **LoRA fine-tuning wizard** for local adapters (`/finetune`)
  - `docs/fine-tuning.md`
- **Template registry and auto-detection** for chat formatting (ChatML, Llama, Alpaca, Gemma, Reasoning)
  - `docs/template-registry.md`
- **Inference engine details** and backend behavior
  - `docs/inference-engine.md`
- **Tooling (experimental, WIP)** for repo-scoped read/search and optional file edits with explicit confirmation
  - `docs/cli.md`

**Important (Work in Progress):** Tooling is experimental and actively evolving. Behavior, output format, and available actions may change; tool calls can fail; and the UI presentation may be adjusted. Try tooling on non-critical work first, and always review any proposed file changes before approving them.

## Configuration

Cortex reads `config.yaml` from the current working directory. For tuning GPU memory limits, quantization defaults, and inference parameters, see:

- `docs/configuration.md`
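
As a rough illustration only, a `config.yaml` might look like the fragment below. The key names here are assumptions made for this sketch, not the actual schema; `docs/configuration.md` is the authoritative reference:

```yaml
# Hypothetical config.yaml fragment; the real keys are documented in
# docs/configuration.md and may differ from what is shown here.
gpu:
  memory_limit_gb: 12      # assumed key: cap on GPU memory usage
inference:
  temperature: 0.7         # assumed key: sampling temperature
  max_tokens: 2048         # assumed key: generation length limit
quantization:
  default_bits: 4          # assumed key: default quantization level
```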

## Documentation

Start here:

- `docs/installation.md`
- `docs/cli.md`
- `docs/model-management.md`
- `docs/troubleshooting.md`

Advanced topics:

- `docs/mlx-acceleration.md`
- `docs/inference-engine.md`
- `docs/dynamic-quantization.md`
- `docs/template-registry.md`
- `docs/fine-tuning.md`
- `docs/development.md`

## Contributing

Contributions are welcome. See `docs/development.md` for setup and workflow.

## License

MIT License. See `LICENSE`.

---

Note: Cortex requires Apple Silicon. Intel Macs are not supported.
