Metadata-Version: 2.4
Name: onecode-cli
Version: 0.1.4
Summary: Multi-agent codebase evaluation, analysis, and refactoring in natural language.
Author-email: Shoaib Rahman <shoaibeee@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/shoaibur/OneCode
Project-URL: Repository, https://github.com/shoaibur/OneCode.git
Project-URL: Issues, https://github.com/shoaibur/OneCode/issues
Keywords: codebase,evaluation,analysis,llm,agents,code-generation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: openai>=1.0.0
Requires-Dist: anthropic>=0.7.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: faiss-cpu>=1.7.0
Requires-Dist: numpy<2.0.0,>=1.24.0
Requires-Dist: ragas>=0.1.0
Requires-Dist: pandas>=1.5.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"

# OneCode

**Talk to your code. Let AI understand, refactor, and improve it.**

Have a codebase that needs analysis? Want to refactor without getting lost in the details? OneCode is your AI assistant for code.

Point OneCode at any project folder, and ask it anything:
- 🤔 "What does this authentication flow do?"
- ✏️ "Add input validation to the login function"
- 🔍 "Where is the database connection code?"
- 🧪 "Write a test for this function and run it"
- 📁 "Move all test files to a tests/ directory"

No complex commands. No context switching. Just natural conversation.

<!-- System Design Diagram
**System Design** [Edit Image](https://excalidraw.com/#json=zWksWYwZIGlGCuJ5yZLZ6,jt0P-JqopZFhDAvkvTYzYg)
<img width="948" height="802" alt="image" src="https://github.com/user-attachments/assets/fb59ff93-9c8c-4419-98a4-3ca7b7902a10" />
-->

---

## Why OneCode?

| Problem | Solution |
|---------|----------|
| Understanding unfamiliar codebases | Semantic search + AI analysis |
| Time-consuming refactoring | AI-powered code modification |
| Manual testing & debugging | Automated test generation & self-correction |
| Context switching between tools | Single natural language interface |
| Code maintenance at scale | Intelligent file operations & git integration |

---

<!-- Project Structure (for contributors)
## Project structure

```
OneCode/
├── main.py                  Entry point — CLI loop and agent registration
├── config.py                Env vars + shared OpenAI client (max_retries=3)
├── requirements.txt
│
├── kg/                      Knowledge graph layer
│   ├── graph.py             Vertex table + relationship table + adjacency index
│   ├── store.py             FAISS vector store with embedding cache
│   ├── embedder.py          OpenAI text-embedding-3-small wrapper
│   ├── builder.py           Full build + incremental refresh
│   └── persistence.py       Save/load KG to .onecode/ cache
│
├── agents/
│   ├── supervisor.py        Orchestrator — tool-calling loop, streaming, history
│   ├── reader.py            Semantic code retrieval via FAISS + 1-hop expansion
│   ├── coder.py             Write / modify code with diff preview + confirmation
│   ├── executor.py          Run shell commands, return stdout/stderr/exit code
│   ├── search.py            Exact / regex text search across files
│   ├── summarizer.py        Condense long content with LLM
│   ├── renamer.py           Rename / move files and directories
│   ├── deleter.py           Delete files and directories
│   └── git_agent.py         Git operations via natural language
│
└── utils/
    ├── file_parser.py       Walk codebase, read supported file types
    └── retry.py             OpenAI error handling wrapper
```
-->


## Installation

**From PyPI:**
```bash
pip install onecode-cli
```

---

## Setup

**1. Configure environment**

Provide API keys using one of two methods:

**Method A: Create `.env` file**
Add API keys to `.env` in your project or home directory. `OPENAI_API_KEY` is always required (used for embeddings):
```
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...   # only needed for Claude models
```

**Method B: Export environment variables**
```bash
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...   # only needed for Claude models
```
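
Either way, the keys end up in the process environment at startup. Since OneCode depends on `python-dotenv`, `.env` loading typically follows the standard pattern below; this is a minimal sketch of that pattern, not OneCode's exact startup code:

```python
# Minimal sketch of the standard python-dotenv pattern; this is assumed,
# not OneCode's exact startup code.
import os
from dotenv import load_dotenv

load_dotenv()  # merges .env values into os.environ (existing vars win by default)

openai_key = os.environ["OPENAI_API_KEY"]            # always required (embeddings)
anthropic_key = os.environ.get("ANTHROPIC_API_KEY")  # only needed for Claude models
```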

---

## How to run

After installation, the `onecode` command is available globally:

```bash
# Default model (claude-sonnet-4-6) with explicit path
onecode ~/path/to/project

# Use current directory (default if no path specified)
onecode

# Specify a different model
onecode --model gpt-4o

# From within the codebase directory (same as above)
cd ~/myproject
onecode
```

**Verify the installation:**
```bash
onecode --help
```

**First run** — output looks like this:
```
$ onecode ~/myproject
OneCode - Codebase Analyzer
----------------------------------------
Model:    claude-sonnet-4-6
Indexing: /Users/you/myproject
Ready:    42 nodes (class:12, file:18, function:12) | 42 embeddings

Type a question or task (or 'exit' to quit).
----------------------------------------

You: 
```

**Subsequent runs** load the knowledge graph from the `.onecode/` cache instead of rebuilding it from scratch, so indexing completes faster; the banner output is otherwise the same as on the first run.
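
The cache tracks file modification times for change detection, so only files that changed since the last run need re-indexing. A hedged sketch of how such a check can work (the manifest name and layout here are assumptions):

```python
# Hedged sketch of mtime-based change detection against a cached manifest;
# the manifest file name and layout are assumptions, not OneCode internals.
import json
from pathlib import Path

def changed_files(root: Path, manifest: dict[str, float]) -> list[str]:
    """Return cached paths whose mtime differs or that no longer exist."""
    stale = []
    for rel_path, cached_mtime in manifest.items():
        full = root / rel_path
        if not full.exists() or full.stat().st_mtime != cached_mtime:
            stale.append(rel_path)
    return stale

manifest = json.loads(Path(".onecode/manifest.json").read_text())
print(changed_files(Path("."), manifest))
```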

---

## Development

### Setup for development

```bash
pip install -e ".[dev]"
```

This installs OneCode in editable mode along with the dev dependencies (pytest, black, ruff).

### Run tests

```bash
pytest tests/              # Run all tests
pytest tests/ -v           # Verbose output with details
pytest tests/ -v --tb=short  # With short error tracebacks
```

### Test coverage

The test suite validates the evaluation system:

- **Query parsing** (6 tests) — Natural language query parsing for target module, metrics, and sample count
  - Extracts agent names and metric aliases correctly
  - Handles sample count keywords (quick=2, comprehensive=10)

- **Target selection** (4 tests) — Module matching and filtering logic
  - Exact filename matching (prevents false positives)
  - Substring fallback for flexibility
  - "codebase" target returns all modules

- **Metrics filtering** (3 tests) — Metrics display selection
  - Shows only requested metrics when specified
  - Shows all metrics when none specified
  - Gracefully handles missing metric values
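
As an illustration, the sample-count keyword mapping these tests exercise (quick=2, comprehensive=10, default 5) can be expressed as a small parser; `parse_sample_count` is a hypothetical name, not OneCode's actual API:

```python
# Hypothetical sketch of the sample-count parsing the tests describe; the
# mapping (quick=2, comprehensive=10, default 5) comes from these docs, but
# parse_sample_count is not OneCode's actual API.
SAMPLE_KEYWORDS = {"quick": 2, "comprehensive": 10}
DEFAULT_SAMPLES = 5

def parse_sample_count(query: str) -> int:
    words = query.lower().split()
    for keyword, count in SAMPLE_KEYWORDS.items():
        if keyword in words:
            return count
    return DEFAULT_SAMPLES

assert parse_sample_count("quick evaluation of reader agent") == 2
assert parse_sample_count("comprehensive evaluation of all modules") == 10
assert parse_sample_count("evaluate the codebase") == 5
```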

---

## Example queries

### Understand the codebase
```
You: what does this codebase do?
You: explain the authentication flow
You: what classes exist and what are their responsibilities?
You: how does the database connection work?
You: analyze the modules in this codebase
You: what agents/modules are in this project?
```

### Find specific code
```
You: search for all calls to connect_db
You: search for TODO comments
You: where is the retry logic implemented?
You: find all async functions
```

### Write and modify code
```
You: add input validation to the login function
You: write a utility function that paginates a list and add it to utils.py
You: refactor the parse_config function to handle missing keys gracefully
```

### Write, run, and self-correct
```
You: create a function that reverses a string, write a test for it, and run the test
You: add a health check endpoint and run the server to verify it starts
You: write a script that counts lines of code per file and run it
```

### File management
```
You: rename src/helpers.py to src/utils.py
You: delete the tmp/ directory
You: move all test files into a tests/ directory
```

### Git operations
```
You: show git status
You: show the diff of uncommitted changes
You: commit all staged files with message "add retry logic"
You: show the last 5 commits
```

### Evaluate code quality with RAGAS metrics
```
You: evaluate the codebase
You: quick evaluation of reader agent focusing on faithfulness
You: what is the accuracy of the coder agent?
You: comprehensive evaluation of all modules
You: evaluate the code writing agent
You: which agent has the best answer relevancy?
```

OneCode can evaluate any module using RAGAS metrics:
- **Faithfulness** — How faithful is the output to the context
- **Hallucination** — How much the output diverges from the context (1 - faithfulness)
- **Answer Relevancy** — How relevant the output is to the input
- **Context Precision** — How much of the context is relevant
- **Context Recall** — How much of relevant context was retrieved
- **Answer Accuracy** — How accurate the answer is compared to ground truth
- **Response Groundedness** — How grounded the response is in retrieved context
- **Agent Goal Accuracy** — How accurately the agent achieved its goal
- **Tool Call F1** — Precision and recall of tool calls made by agents
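
These metrics map onto the standard `ragas` evaluation API. A minimal sketch of that pattern with illustrative data (standard ragas 0.1.x usage, not OneCode's internal evaluation code; requires `OPENAI_API_KEY`):

```python
# Minimal ragas usage sketch (standard ragas 0.1.x API with illustrative
# data; not OneCode's internal evaluation code). Requires OPENAI_API_KEY.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

data = Dataset.from_dict({
    "question": ["Where is the retry logic implemented?"],
    "answer": ["Retry handling wraps the OpenAI calls in utils/retry.py."],
    "contexts": [["utils/retry.py: OpenAI error handling wrapper."]],
})

result = evaluate(data, metrics=[faithfulness, answer_relevancy])
print(result)  # per-metric scores in [0, 1]
```

Hallucination then follows arithmetically: a faithfulness score of 0.95 implies a hallucination score of 1 - 0.95 = 0.05.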

**Intelligent Agent Selection:** Describe what an agent does and OneCode will find it.
- `"evaluate the code modifier"` → coder_agent
- `"which agent searches for patterns?"` → reader_agent or search_agent
- `"evaluate the file manager"` → renamer_agent or deleter_agent

**Evaluation Speed:**
- Use `quick` for fast evaluation (2 samples)
- Use `comprehensive` for detailed analysis (10 samples)
- Default: 5 samples

**Note:** Evaluation uses `gpt-4o-mini` for reliable metrics computation, independent of your chosen model.

<!-- Workflow and Architecture Details (for developers)

### Combined workflow
```
You: find where user authentication is handled, add rate limiting to it, run the tests, and commit if they pass
```

## How the pipeline works

```
User query
    │
    ▼
Supervisor (LLM, streaming, conversation history)
    │
    ├── reader_agent     → FAISS semantic search over KG + 1-hop neighbour expansion
    ├── search_agent     → Regex/literal search, returns file:line:content
    ├── coder_agent      → LLM generates code → diff preview → y/N → write to disk
    ├── executor_agent   → subprocess → stdout/stderr/exit code
    │       └── on error: supervisor calls coder to fix → re-runs executor
    ├── renamer_agent    → LLM extracts paths → y/N → shutil.move
    ├── deleter_agent    → preview → y/N → unlink / rmtree
    ├── git_agent        → LLM generates git command → runs (confirms write ops)
    └── summarizer_agent → LLM condenses long output
    │
    ▼
Final answer (streamed to terminal)

After any file change → KG refreshed → cache saved to .onecode/
```

## User confirmation

Any operation that modifies or deletes files shows a preview and waits for `[y/N]`:

- **coder_agent** — shows a unified diff for modifications, line preview for new files
- **deleter_agent** — shows file path or directory size before deleting
- **renamer_agent** — shows from/to paths before moving
- **git_agent** — shows the generated git command before write operations

## Cache

The knowledge graph is persisted to `<codebase>/.onecode/` (excluded from git). It stores:
- `vertices.json` — all nodes (files, classes, functions)
- `relationships.json` — edges (CONTAINS, IMPORTS)
- `embeddings.json` — FAISS embedding vectors per node
- `manifest.json` — file mtimes for change detection

The cache is updated automatically after every write, delete, or rename operation.

-->
