Metadata-Version: 2.4
Name: gemcli
Version: 2.0.3
Summary: GemCLI: Autonomous AI coding agent with local web UI, Telegram remote control, and unlimited Gemini access — all from your terminal
Author: 89P13
License: MIT
Project-URL: Homepage, https://pypi.org/project/gemcli/
Keywords: gemini,ai,cli,terminal,coding-assistant,automation,autonomous-agent,web-ui,telegram-bot,code-generation,system-commands,image-generation,git-integration
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: Topic :: Software Development :: Code Generators
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: OS Independent
Classifier: Environment :: Console
Classifier: Environment :: Web Environment
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: gemini-webapi>=2.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: questionary>=2.0.0
Requires-Dist: prompt-toolkit>=3.0.0
Requires-Dist: requests>=2.28.0
Requires-Dist: websocket-client>=1.6.0
Requires-Dist: pywin32>=306; sys_platform == "win32"
Requires-Dist: pycryptodome>=3.20.0
Requires-Dist: selenium>=4.15.0
Provides-Extra: telegram
Requires-Dist: python-telegram-bot>=20.0; extra == "telegram"
Provides-Extra: cookies
Requires-Dist: browser_cookie3>=0.19.0; extra == "cookies"
Provides-Extra: all
Requires-Dist: python-telegram-bot>=20.0; extra == "all"
Requires-Dist: browser_cookie3>=0.19.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: browser_cookie3>=0.19.0; extra == "dev"
Dynamic: license-file

# GemCLI v2.1 — Autonomous AI Agent for Your Terminal

[![Python](https://img.shields.io/badge/python-3.8%2B-3776ab?style=flat-square&logo=python&logoColor=white)](https://python.org)
[![PyPI](https://img.shields.io/pypi/v/gemcli?style=flat-square&color=4f8ff7)](https://pypi.org/project/gemcli/)
[![License: MIT](https://img.shields.io/badge/license-MIT-green?style=flat-square)](LICENSE)
[![Built with Rich](https://img.shields.io/badge/built%20with-Rich-purple?style=flat-square)](https://github.com/Textualize/rich)
[![Visitors](https://visitor-badge.laobi.icu/badge?page_id=Aniketh78.Gemini-Terminal-Tool-GEM-CLI&left_text=%20)](https://github.com/Aniketh78/Gemini-Terminal-Tool-GEM-CLI)

GemCLI is an autonomous AI agent built on Google's consumer Gemini. It provides unlimited code generation, file operations, system automation, **autonomous browser control**, a local web interface, and Telegram remote control, all from your terminal. No API key. No server. Everything stays on your machine.

---

## What's New in v2.1

**v2.1 introduces autonomous browser control.** GemCLI can now open a browser, navigate websites, fill forms, click buttons, and complete web tasks — all driven by AI.

| Feature | Details |
|:--------|:--------|
| **🌐 Browser Control** | AI-driven browser automation — navigate sites, fill forms, do quizzes, click buttons, manage tabs. Uses Playwright with persistent login sessions |
| **Local Web UI** | A bundled browser-based interface served on `localhost` — dark/light mode, real-time session sync, analytics dashboard, session management |
| **Modular Architecture** | Refactored from a single 3,500-line file into a clean `gemcli/` package with 16 focused modules |
| **Unified Agent Mode** | Chat, System Agent, and AutoBot merged into a single intelligent agent that reads, writes, and executes autonomously |
| **Undo System** | Every file change is backed up. Run `/undo` to revert instantly — no data loss |
| **Session Management** | Persistent conversation history with titles, search, export, and analytics per session |
| **Cookie Auto-Capture** | Selenium-based automatic browser cookie extraction — no manual DevTools needed |
| **Telegram Bridge** | Full remote control via Telegram bot — send commands, receive live output on your phone |
| **Inline Diff Preview** | See exactly what changed before any file is modified, right in the terminal |

---

## Architecture

```
gemcli/
├── cli.py            Entry point, session selection, authentication flow
├── agent.py          Unified chat loop — handles all user interactions
├── autobot.py        Autonomous JSON action executor (cmd, create_file, read_file, etc.)
├── backup.py         File backup and undo system
├── browser.py        🌐 Autonomous browser agent — Playwright-based web control
├── client.py         Gemini client initialization
├── config.py         Centralized theme, preferences, and console
├── context.py        Workspace context builder for AI prompts
├── cookies.py        Cookie management — manual, saved, and auto-capture via Selenium
├── git_ops.py        Git integration — status, commit, push, branch selection
├── history.py        Session history — logging, listing, search, export, analytics
├── mark_page.js      DOM element annotator for browser agent
├── remote_monitor.py Telegram bridge monitor loop for the terminal
├── settings.py       Interactive settings menu — theme, git, telegram, image config
├── ui.py             Rich-based terminal UI — banner, spinner, animated status
├── web.py            Local HTTP server — REST API + static file serving
├── workspace.py      File search and workspace scanning utilities
└── web_assets/
    ├── index.html    Single-page web interface shell
    ├── styles.css    Full design system — dark/light themes, responsive layout
    └── app.js        Client-side logic — sessions, chat, analytics, theme toggle
```

---

## Installation

```bash
pip install gemcli
```

### Optional Extras

```bash
# Browser control (autonomous web agent)
pip install playwright
playwright install chromium

# Telegram remote control
pip install "gemcli[telegram]"

# Auto cookie capture from browser
pip install "gemcli[cookies]"

# Everything (quotes keep shells such as zsh from expanding the brackets)
pip install "gemcli[all]"
```

### Requirements

- Python 3.8+
- A Google account signed into [gemini.google.com](https://gemini.google.com)
- Git (optional — for `/commit`, `/push`, `/status`)

---

## Quick Start

```bash
gemcli
```

1. **Connect to Gemini** — select this from the main menu
2. **Authenticate** — paste cookies manually, or use auto-capture (Selenium grabs them from your browser)
3. **Pick a session** — start a new chat or continue a previous one
4. **Start building** — ask GemCLI to create files, run commands, explain code, or generate images

---

## Authentication

GemCLI uses browser cookie authentication. **No API key required.** Your cookies are stored locally in `.gemini_cookies.json` and are sent only to Google's Gemini servers.

### Option A: Auto-Capture (Easiest)

GemCLI can extract cookies automatically using Selenium:

1. Run `gemcli` → Connect to Gemini
2. Select **"Auto-capture from browser"** when prompted
3. A browser window opens, you log in, and cookies are captured automatically

### Option B: Manual

1. Visit [gemini.google.com](https://gemini.google.com) and log in
2. Open DevTools (`F12`) → **Application** tab → **Cookies** → `https://gemini.google.com`
3. Copy `__Secure-1PSID` (required) and `__Secure-1PSIDTS` (recommended)
4. Paste into GemCLI when prompted
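
Before pasting, it can help to check that both cookies are present. The snippet below is a minimal sketch, not GemCLI's own validation: the function names are hypothetical, and it assumes the cookies are held as a flat name-to-value mapping (the real layout of `.gemini_cookies.json` is not documented here). Only the two cookie names come from this README.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple

REQUIRED = "__Secure-1PSID"       # required, per the steps above
RECOMMENDED = "__Secure-1PSIDTS"  # recommended, per the steps above


def load_cookies(path: Path) -> Dict[str, str]:
    """Read a JSON object of cookie names to values (assumed file layout)."""
    return json.loads(path.read_text(encoding="utf-8"))


def check_cookies(cookies: Dict[str, str]) -> Tuple[bool, List[str]]:
    """Return (usable, messages) for a cookie mapping."""
    if not cookies.get(REQUIRED, "").strip():
        return False, [f"{REQUIRED} is missing"]
    messages: List[str] = []
    if not cookies.get(RECOMMENDED, "").strip():
        messages.append(f"{RECOMMENDED} is recommended but missing")
    return True, messages
```

A mapping that has only `__Secure-1PSID` is reported as usable with one message; a mapping without it is rejected.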

### Privacy & Security

| Question | Answer |
|:---------|:-------|
| Where are cookies stored? | Locally, in `.gemini_cookies.json` in your working directory |
| Do cookies leave my machine? | **Only to Google's Gemini servers**, never to a third party |
| Is there a remote server? | **No** — 100% client-side |
| Should I share my cookies? | **Never** — treat them like passwords |

---

## Slash Commands

All commands work inside any session:

| Command | Description |
|:--------|:------------|
| `/help` | Show all available commands |
| `/exit` | End the session and return to session list |
| `/clear` | Clear the terminal screen |
| `/undo [N]` | Revert the last N file change-sets (default: 1) |
| `/history` | View file modification timeline |
| `/recall <keyword>` | Search past conversations by keyword |
| `/export` | Export the current session to a Markdown file |
| `/summary` | Generate an AI-powered session recap |
| `/analytics` | Show token usage, request counts, and backup stats |
| `/status` | Display git repository status |
| `/commit` | Create a commit with an AI-generated message |
| `/push` | Push commits to the remote repository |
| `/view <path>` | Read and display a file with syntax highlighting |
| `/image <prompt>` | Generate an AI image from a text description |

---

## The Agent

GemCLI v2.1 uses a unified autonomous agent. Every message you send goes through the same intelligent pipeline:

1. **Understand** — the agent reads your prompt and decides what to do
2. **Act** — it can run shell commands, create/edit files, read files, search the workspace, open URLs, or **control a browser**
3. **Iterate** — it loops through actions until the task is complete (up to 50 steps)
4. **Report** — it gives you a summary of what it did

The agent responds in a structured JSON format internally, but you only see clean human-readable output. All file changes are backed up automatically — run `/undo` at any time.
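
The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not GemCLI's actual code: the JSON shape (`{"action": ..., "path": ..., "content": ...}`), the `run_agent` function, and the `ask_model` callback are hypothetical. Only the action names and the 50-step cap come from this README.

```python
import json
from pathlib import Path
from typing import Callable, List

MAX_STEPS = 50  # step cap stated in this README


def run_agent(ask_model: Callable[[List[str]], str], workdir: Path) -> str:
    """Ask the model for one JSON action at a time and execute it until `done`."""
    transcript: List[str] = []  # results fed back to the model on each turn
    for _ in range(MAX_STEPS):
        action = json.loads(ask_model(transcript))
        kind = action.get("action")
        if kind == "done":
            return action.get("summary", "")
        if kind == "create_file":
            target = workdir / action["path"]
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(action["content"], encoding="utf-8")
            result = f"wrote {action['path']}"
        elif kind == "read_file":
            result = (workdir / action["path"]).read_text(encoding="utf-8")
        else:
            # Other actions (cmd, search, browser_*) are left out of this sketch.
            result = f"unsupported action: {kind}"
        transcript.append(result)
    return "stopped: step limit reached"
```

Passing the model in as a callback keeps the loop testable: a scripted function that returns canned JSON strings can stand in for Gemini.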

### What the Agent Can Do

| Action | Description |
|:-------|:------------|
| `cmd` | Execute any shell command (non-destructive) |
| `create_file` | Create or overwrite a file (with automatic backup) |
| `read_file` | Read file contents into context |
| `search` | Search the workspace by filename pattern |
| `cd` | Change the working directory |
| `open` | Open a URL in the browser or launch a file |
| `browser_navigate` | 🌐 Open a URL in the AI-controlled browser |
| `browser_interact` | 🌐 Click, type, or scroll on page elements |
| `browser_list_tabs` | 🌐 List all open browser tabs |
| `browser_switch_tab` | 🌐 Switch to a specific tab |
| `browser_key_press` | 🌐 Press keyboard keys (Enter, Tab, Escape, etc.) |
| `done` | Signal task completion |

---

## Web Interface

GemCLI includes a local web UI served on `localhost`. No external dependencies — it's bundled as static HTML/CSS/JS inside the Python package.

### Launching

From the CLI session menu, select **"Open Web Version"**. A browser tab opens automatically.

Or, if the server is already running, visit: `http://127.0.0.1:8765`
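
For readers curious how a bundled local server of this kind can work, here is a minimal sketch using only the standard library. It is not GemCLI's `web.py`: the `/api/sessions` route, the `SESSIONS` placeholder data, and the `start_server` helper are all hypothetical. Only the `127.0.0.1:8765` address comes from this README.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

SESSIONS = [{"id": "demo", "title": "Example session"}]  # placeholder data


class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # One hypothetical JSON endpoint; everything else is a 404.
        if self.path == "/api/sessions":
            status, ctype = 200, "application/json"
            body = json.dumps(SESSIONS).encode("utf-8")
        else:
            status, ctype = 404, "text/plain"
            body = b"not found"
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the terminal quiet


def start_server(port: int = 8765) -> ThreadingHTTPServer:
    """Bind to loopback only, so the UI is not reachable from other machines."""
    server = ThreadingHTTPServer(("127.0.0.1", port), ApiHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Binding to `127.0.0.1` rather than `0.0.0.0` is what keeps such a server local to your machine.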

### Features

- **Session sidebar** — browse, switch, and delete sessions
- **Real-time sync** — messages from CLI sessions appear in the web UI automatically (polled every 2.5s)
- **Dark / Light mode** — toggle with the theme button, preference saved to `localStorage`
- **Analytics dashboard** — token usage, request counts, backup stats per session and overall
- **Remote Control tab** — monitor Telegram bridge events
- **Help tab** — quick reference for all slash commands
- **Chat interface** — send messages, see typing indicators, clean bubble layout
- **Session deletion** — remove sessions from the sidebar without affecting analytics data

### Design

The web UI is set in Space Grotesk, DM Sans, and JetBrains Mono, with a blue accent palette and a fully responsive layout. It uses no frameworks and no build step: pure HTML, CSS, and vanilla JavaScript.

---

## 🌐 Browser Control

GemCLI can autonomously control a browser to interact with websites — fill forms, click buttons, navigate pages, do quizzes, sign up for services, and more.

### How It Works

1. **You ask** — "Go to GitHub and star this repo" or "Fill out the Google Form at this link"
2. **GemCLI launches a browser** — a visible Chromium window opens (powered by Playwright)
3. **The AI reads the page** — it extracts all interactive elements (buttons, inputs, links) as numbered items
4. **The AI acts** — it decides which element to click or type into, and executes the action
5. **Repeat** — after each action, the page state is re-read and the AI decides the next step
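
Step 3 is the core idea: turn a page into a short numbered list the model can refer to. The sketch below illustrates that idea on a static HTML string with Python's standard `html.parser`. GemCLI itself annotates the live page through `mark_page.js` in Playwright, so this is a simplified stand-in; the tag set, the label rules, and the function names are assumptions.

```python
from html.parser import HTMLParser
from typing import Dict, List, Union

INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}  # assumed set

Element = Dict[str, Union[int, str]]


class InteractiveElementIndexer(HTMLParser):
    """Collect interactive elements in document order and number them."""

    def __init__(self) -> None:
        super().__init__()
        self.elements: List[Element] = []

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # Pick the first attribute that gives the model a usable label.
        label = (attr_map.get("aria-label") or attr_map.get("name")
                 or attr_map.get("href") or "")
        self.elements.append(
            {"index": len(self.elements), "tag": tag, "label": label}
        )


def index_page(html: str) -> List[Element]:
    parser = InteractiveElementIndexer()
    parser.feed(html)
    parser.close()
    return parser.elements
```

The model can then answer with something like "click element 2", and the caller maps that index back to a real element.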

### Setup

```bash
pip install playwright
playwright install chromium
```

### Features

- **Persistent logins** — cookies are saved in `%APPDATA%/gemcli/browser_profile/` (Windows) or `~/.config/gemcli/browser_profile/` (Mac/Linux). Log in once, stay logged in.
- **DOM-based intelligence** — the AI reads actual HTML elements, not screenshots. Fast, reliable, and works with any site.
- **Invisible to users** — no overlays or visual clutter on the page. The AI reads the DOM silently.
- **Tab management** — the AI can list tabs, switch between them, and open new ones.
- **Keyboard control** — press Enter, Tab, Escape, or any key.
- **No conflicts** — runs in its own Chromium instance, doesn't interfere with your personal browser.
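
The profile locations listed above can be resolved per platform with a few lines. This is a sketch, not GemCLI's code: `browser_profile_dir` is a hypothetical helper, and the `AppData/Roaming` fallback used when `%APPDATA%` is unset is an assumption.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def browser_profile_dir(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Return the persistent browser profile directory for the given platform."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    if platform == "win32":
        # %APPDATA% on Windows; the fallback path is an assumption.
        base = Path(env.get("APPDATA", str(home / "AppData" / "Roaming")))
    else:
        base = home / ".config"
    return base / "gemcli" / "browser_profile"
```

With Playwright, a directory like this would typically be passed as `user_data_dir` to `chromium.launch_persistent_context(...)`, which is how logins survive between runs. Whether GemCLI calls it exactly that way is not stated here.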

### Example Prompts

```
"Open YouTube and search for lofi music"
"Go to my Gmail and read the first email"
"Navigate to this Google Form and fill it out with my name Aniketh"
"Open Amazon and add wireless earbuds to my cart"
"Go to GitHub and create a new repository called test-project"
```

---

## Telegram Remote Control

Control GemCLI from your phone via a Telegram bot.

### Setup

1. Create a bot via [@BotFather](https://t.me/BotFather) on Telegram
2. Get your Chat ID from [@userinfobot](https://t.me/userinfobot)
3. In GemCLI: main menu → **Settings** → **Telegram Remote Control**
4. Enter your bot token and chat ID
5. Enable the bridge

### Usage

Once connected, you can:

- Send text messages to the bot — they're executed as GemCLI prompts
- Receive real-time output from command execution
- Monitor task progress from anywhere
- View bridge events in the web UI's Remote tab

Install the Telegram extra: `pip install "gemcli[telegram]"`
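
Because bot messages are executed as prompts, the chat ID from setup matters: only your chat should be able to drive the agent. The sketch below shows one way to filter incoming updates. It is an assumption about how such a check could look, not GemCLI's bridge code; `prompt_from_update` is a hypothetical helper. The `message.chat.id` and `message.text` fields follow the Telegram Bot API `Update` object.

```python
from typing import Optional


def prompt_from_update(update: dict, allowed_chat_id: int) -> Optional[str]:
    """Return the message text if it comes from the configured chat, else None."""
    message = update.get("message") or {}
    chat_id = (message.get("chat") or {}).get("id")
    text = message.get("text")
    if chat_id != allowed_chat_id or not text:
        return None  # wrong chat, or a non-text update such as a photo
    return text.strip() or None
```

Updates from any other chat are dropped before they reach the agent.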

---

## Undo System

Every file modification made by the agent is backed up before changes are applied.

```
/undo        # Revert the last change-set
/undo 3      # Revert the last 3 change-sets
/history     # View all recorded change-sets
```

Backups are stored in `.gemcli/backups/` with timestamped directories. Each backup contains the original file contents so you can always roll back.
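
A backup-then-restore scheme of this kind can be sketched as follows. This is not GemCLI's `backup.py`: the function names and the timestamp format are assumptions, and only the `.gemcli/backups/` location and the timestamped-directory layout come from this README.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional

BACKUP_DIR = Path(".gemcli") / "backups"  # location named in this README


def backup_file(root: Path, rel_path: str) -> Optional[Path]:
    """Copy root/rel_path into a new timestamped directory before it is changed."""
    source = root / rel_path
    if not source.is_file():
        return None  # a brand-new file has no original to save
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")  # sorts chronologically
    dest = root / BACKUP_DIR / stamp / rel_path
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, dest)
    return dest


def undo_last(root: Path) -> bool:
    """Restore all files from the newest backup directory, then delete it."""
    backups = root / BACKUP_DIR
    if not backups.is_dir():
        return False
    snapshots = sorted(p for p in backups.iterdir() if p.is_dir())
    if not snapshots:
        return False
    latest = snapshots[-1]
    for saved in latest.rglob("*"):
        if saved.is_file():
            target = root / saved.relative_to(latest)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(saved, target)
    shutil.rmtree(latest)
    return True
```

Calling `undo_last` N times corresponds to `/undo N`. Note that this sketch only restores overwritten files; it does not delete files that the agent created from scratch.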

---

## Configuration

Access from the main menu under **Settings**:

| Setting | Description |
|:--------|:------------|
| **Theme** | Cyan, Pink, Gold, Green, Purple, or White color schemes for the terminal UI |
| **Git** | Toggle git integration, commit timing (immediate / on-exit), auto-push, branch selection |
| **Image** | Set the save directory for AI-generated images |
| **Telegram** | Bot token, chat ID, workspace path, enable/disable bridge |

Settings are stored in `.gemcli/settings.json`.

---

## Session History

All conversations are persisted as JSONL files in `.gemcli/history/`.

- Each session gets a unique ID and an auto-generated title
- Messages are logged with timestamps and roles (`user`, `ai`, `system`)
- Use `/recall <keyword>` to search across all sessions
- Use `/export` to save the current session as a Markdown document
- Use `/summary` to get an AI-generated TL;DR of the session
- Delete sessions from the web UI sidebar — analytics data is preserved
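
JSONL makes both logging and search simple: one JSON object per line, appended as the conversation goes. The sketch below shows the idea. It is not GemCLI's `history.py`: the record keys (`ts`, `role`, `text`), the one-file-per-session naming, and the function names are assumptions. The directory and the three roles come from this README.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List

HISTORY_DIR = Path(".gemcli") / "history"  # location named in this README
ROLES = {"user", "ai", "system"}


def log_message(root: Path, session_id: str, role: str, text: str) -> None:
    """Append one timestamped message to the session's JSONL file."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    path = root / HISTORY_DIR / f"{session_id}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {"ts": datetime.now(timezone.utc).isoformat(), "role": role, "text": text}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def recall(root: Path, keyword: str) -> List[dict]:
    """Case-insensitive keyword search across every session file."""
    needle = keyword.lower()
    hits: List[dict] = []
    for path in sorted((root / HISTORY_DIR).glob("*.jsonl")):
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                if not line.strip():
                    continue
                record = json.loads(line)
                if needle in record["text"].lower():
                    hits.append({"session": path.stem, **record})
    return hits
```

Appending line by line means a crash loses at most the message being written, and a search never has to parse a whole session at once.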

---

## Project Structure

```
Gemini-Terminal-Tool-GEM-CLI/
├── gemcli/                  Core Python package (16 modules)
│   ├── web_assets/          Bundled web UI (HTML + CSS + JS)
│   └── ...
├── services/
│   └── telegram_bridge.py   Telegram bot integration
├── pyproject.toml           Package metadata and dependencies
├── LICENSE                  MIT License
└── README.md                This file
```

---

## Upgrading from v1.x

v2.1 is a breaking change in architecture but **not in user experience**. The same commands work, the same cookie auth works, and your existing `.gemcli/` history directory is fully compatible.

Key differences:

- The old multi-mode selection (Chat / System Agent / AutoBot / Image Gen) is gone — replaced by a single unified agent that handles everything
- `gemcli-ui` entry point has been removed
- `textual` and `python-telegram-bot` are now optional dependencies
- The web UI is new and runs alongside the CLI
- **New in v2.1**: Autonomous browser control via Playwright

```bash
pip install --upgrade gemcli
```

---

## Uninstall

```bash
pip uninstall gemcli
```

To also remove local data:

```bash
# Remove session history, backups, and settings
rm -rf .gemcli/

# Remove saved cookies
rm -f .gemini_cookies.json
```

---

## License

MIT License — see [LICENSE](LICENSE) for details.

---

## Links

- **PyPI**: [pypi.org/project/gemcli](https://pypi.org/project/gemcli/)

---

Made by [89P13](https://github.com/Aniketh78)
