Metadata-Version: 2.4
Name: srcodex
Version: 0.2.0
Summary: Semantic code explorer with AI-powered search and analysis
Author-email: Jonathan Antoun <jonathan.antoun@amd.com>
License: MIT
Project-URL: Homepage, https://github.com/Jonathan03ant/srcodex
Project-URL: Repository, https://github.com/Jonathan03ant/srcodex
Project-URL: Issues, https://github.com/Jonathan03ant/srcodex/issues
Keywords: code-search,semantic-analysis,ai,llm,code-exploration
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.1.0
Requires-Dist: textual[syntax]>=0.47.0
Requires-Dist: fastapi>=0.100.0
Requires-Dist: uvicorn>=0.23.0
Requires-Dist: anthropic>=0.7.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: tqdm>=4.66.0
Dynamic: license-file

# srcodex

**Semantic code explorer with AI-powered search and analysis**

srcodex builds a semantic graph of your codebase and provides AI-powered exploration through natural language queries. Think of it as an intelligent code search tool that understands relationships, call graphs, and architecture.

## Why srcodex?

Unlike generic code assistants (Claude CLI, GitHub Copilot, etc.) that read entire files to answer questions, srcodex uses a **semantic graph database** to understand your code:

| Question | Generic Assistant | srcodex |
|----------|------------------|---------|
| "Who calls function X?" | Grep entire codebase (20K tokens) | `get_callers('X')` (200 tokens) |
| "Show call chain A→B" | Read multiple files, manual tracing | Graph query (500 tokens) |
| "Find all ioctls" | Grep + read matches (15K tokens) | Database search (300 tokens) |
| "Explain module Y" | Read 10+ files (30K tokens) | Aggregate query (2K tokens) |

**Result:** roughly 90% fewer tokens per question, instant relationship queries, and capabilities that file-based tools cannot offer (call chains, data flow analysis, architecture visualization).

## Features

- **Semantic Indexing**: Builds a persistent graph of symbols, functions, types, and their relationships
- **AI-Powered Search**: Ask questions in natural language about your code
- **Call Graph Analysis**: Trace function calls, dependencies, and execution paths
- **Terminal UI**: Beautiful terminal interface with file browser and AI chat
- **Multi-Language**: Supports C, C++, Python, and more
- **Fast**: SQLite-backed graph queries with intelligent caching
- **Portable**: `.srcodex/` directory makes indexed projects shareable

## Installation

```bash
pip install srcodex
```

## Quick Start

```bash
# Index your codebase (first time)
cd /path/to/your/project
srcodex

# Output:
# No .srcodex/ found. Index this directory? (y/n) y
# [Indexing happens...]
# [TUI launches]

# Next time - instant launch
srcodex
```

## Usage

Once indexed, use the TUI to:
- Browse files and symbols
- Search across your codebase
- Chat with AI about your code architecture
- Trace call chains and dependencies

### Example AI Queries

```
"What does the init_system function do?"
"Show me all functions that call malloc"
"Trace the execution path from main to shutdown"
"What structs are defined in config.h?"
```

## Configuration

Copy `.env.example` to `.env` and configure your API key:

```bash
# Public Anthropic API
ANTHROPIC_API_KEY=sk-ant-your-key-here

# Or enterprise gateway (if applicable)
AMD_LLM_API_KEY=your-subscription-key
```
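srcodex reads the key from the environment at startup. The sketch below shows the kind of fallback lookup involved, preferring the public Anthropic key over the enterprise one; the function name `resolve_api_key` is illustrative and not part of srcodex's actual API:

```python
import os


def resolve_api_key() -> str:
    """Return the first configured API key, preferring the public Anthropic API.

    Illustrative only: srcodex's real lookup order and variable handling
    may differ from this sketch.
    """
    for var in ("ANTHROPIC_API_KEY", "AMD_LLM_API_KEY"):
        key = os.environ.get(var)
        if key:
            return key
    raise RuntimeError("No API key configured; set ANTHROPIC_API_KEY in .env")


if __name__ == "__main__":
    os.environ.setdefault("ANTHROPIC_API_KEY", "sk-ant-example")  # demo value
    print(resolve_api_key())
```

Keeping the lookup in one place makes it easy to support additional gateways later without touching call sites.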

## Requirements

- Python 3.9+
- Universal CTags (`brew install universal-ctags` or `apt install universal-ctags`)
- Cscope (optional, for call graph analysis)
- Claude API key (Anthropic or enterprise gateway)

## How It Works

1. **Indexing**: Extracts symbols, relationships, and metadata using CTags and Cscope
2. **Graph Building**: Creates semantic graph with typed edges (CALLS, INCLUDES, ACCESSES)
3. **AI Integration**: Claude queries the graph using specialized tools (not reading full files)
4. **Token Efficiency**: **99%+ reduction** in tokens vs. traditional code assistants
   - **Breakthrough caching architecture**: 25-100 tokens per query after initial cache build
   - Aggressive parallel tool batching (20-40 tools per iteration)
   - 3-iteration cache strategy: iterations 1-3 cached, iteration 4 answers with cached data
   - Semantic graph queries instead of file reads (10-100x more efficient)
   - **Real example**: 500 input tokens vs 60,000+ for traditional file-based approaches
   - Cache persists across queries, so subsequent questions cost nearly nothing
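To make the graph-query step concrete, here is a minimal sketch of how a `CALLS`-edge lookup against a SQLite-backed symbol graph can work. The schema (`symbols`, `edges` tables) and the `get_callers` helper are assumptions for illustration; srcodex's actual `project.db` layout and tool names may differ:

```python
import sqlite3

# Illustrative schema only -- srcodex's real project.db layout may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, file TEXT);
    CREATE TABLE edges (src INTEGER, dst INTEGER, type TEXT);  -- CALLS, INCLUDES, ACCESSES
    INSERT INTO symbols VALUES (1, 'main', 'function', 'main.c'),
                               (2, 'init_system', 'function', 'init.c'),
                               (3, 'malloc', 'function', 'stdlib.h');
    INSERT INTO edges VALUES (1, 2, 'CALLS'), (2, 3, 'CALLS');
""")


def get_callers(name: str) -> list[str]:
    """Return names of functions that call `name` via a single CALLS edge."""
    rows = conn.execute("""
        SELECT caller.name FROM edges
        JOIN symbols caller ON caller.id = edges.src
        JOIN symbols callee ON callee.id = edges.dst
        WHERE edges.type = 'CALLS' AND callee.name = ?
    """, (name,)).fetchall()
    return [r[0] for r in rows]


print(get_callers("malloc"))  # → ['init_system']
```

A query like this returns a handful of rows instead of whole files, which is where the token savings described above come from.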

## Project Structure

After indexing, your project will have:

```
your-project/
├── .srcodex/
│   ├── metadata.json       # Project stats
│   ├── config.toml         # Indexing config
│   ├── data/
│   │   └── project.db      # Semantic graph
│   └── logs/               # Debug logs
└── [your source files...]
```

## Development

```bash
# Clone repository
git clone https://github.com/Jonathan03ant/srcodex.git
cd srcodex

# Install in development mode
pip install -e .

# Run tests
pytest
```

## License

MIT License; see the LICENSE file for details.

## Contributing

Contributions welcome! Please open an issue or pull request.

## Links

- [GitHub Repository](https://github.com/Jonathan03ant/srcodex)
- [Issue Tracker](https://github.com/Jonathan03ant/srcodex/issues)
- [Documentation](https://github.com/Jonathan03ant/srcodex/wiki)
