Metadata-Version: 2.4
Name: scantool
Version: 0.11.7
Summary: MCP server for multi-language code analysis with structure extraction, metadata parsing, and search capabilities across Python, JavaScript, TypeScript, Rust, Go, C/C++, Java, PHP, C#, Ruby, Zig, HTML, CSS, SCSS, Markdown, Swift, plain text, and images
Project-URL: Homepage, https://github.com/mariusei/file-scanner-mcp
Project-URL: Repository, https://github.com/mariusei/file-scanner-mcp
Project-URL: Issues, https://github.com/mariusei/file-scanner-mcp/issues
Author: Marius Berge Eide
License: MIT
License-File: LICENSE
Keywords: code-analysis,file-scanner,mcp,structure-extraction,tree-sitter
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Requires-Dist: fastmcp>=2.12.4
Requires-Dist: numpy>=1.24.0
Requires-Dist: pglast>=7.7
Requires-Dist: pillow>=10.0.0
Requires-Dist: tree-sitter-c-sharp>=0.23.0
Requires-Dist: tree-sitter-cpp>=0.23.0
Requires-Dist: tree-sitter-css>=0.23.0
Requires-Dist: tree-sitter-go>=0.23.3
Requires-Dist: tree-sitter-html>=0.23.0
Requires-Dist: tree-sitter-java>=0.23.2
Requires-Dist: tree-sitter-javascript>=0.23.1
Requires-Dist: tree-sitter-markdown>=0.3.2
Requires-Dist: tree-sitter-php>=0.23.4
Requires-Dist: tree-sitter-python>=0.23.6
Requires-Dist: tree-sitter-ruby>=0.23.0
Requires-Dist: tree-sitter-rust>=0.23.2
Requires-Dist: tree-sitter-scss>=1.0.0
Requires-Dist: tree-sitter-sql>=0.3.11
Requires-Dist: tree-sitter-swift>=0.0.1
Requires-Dist: tree-sitter-typescript>=0.23.2
Requires-Dist: tree-sitter-zig>=1.1.2
Requires-Dist: tree-sitter>=0.23.2
Requires-Dist: uvicorn>=0.32.1
Description-Content-Type: text/markdown

```
  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
  ░                                                                          ░
  ░   ███████╗ ██████╗ █████╗ ███╗   ██╗████████╗ ██████╗  ██████╗ ██╗       ░
  ░   ██╔════╝██╔════╝██╔══██╗████╗  ██║╚══██╔══╝██╔═══██╗██╔═══██╗██║       ░
  ░   ███████╗██║     ███████║██╔██╗ ██║   ██║   ██║   ██║██║   ██║██║       ░
  ░   ╚════██║██║     ██╔══██║██║╚██╗██║   ██║   ██║   ██║██║   ██║██║       ░
  ░   ███████║╚██████╗██║  ██║██║ ╚████║   ██║   ╚██████╔╝╚██████╔╝███████╗  ░
  ░   ╚══════╝ ╚═════╝╚═╝  ╚═╝╚═╝  ╚═══╝   ╚═╝    ╚═════╝  ╚═════╝ ╚══════╝  ░
  ░                                                                          ░
  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░

                ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓

                ╔═══════════════════════════════════════════╗
                ║  ▶ [▓▓▓▓▓▓▓▓▓░░] Scanning codebase...     ║
                ║                                           ║
                ║  ╭───────────────────────────────────╮    ║
                ║  │ ✓ Classes    ▓▓▓▓▓▓▓▓▓▓ 100%      │    ║
                ║  │ ✓ Functions  ▓▓▓▓▓▓▓▓▓▓ 100%      │    ║
                ║  │ ✓ Metadata   ▓▓▓▓▓▓▓▓▓▓ 100%      │    ║
                ║  ╰───────────────────────────────────╯    ║
                ║                                           ║
                ║    tree-sitter powered  •  MCP ready      ║
                ╚═══════════════════════════════════════════╝

                ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
```

# Scantool - File Scanner MCP

MCP server for analyzing source code structure across multiple languages. Extracts classes, functions, methods, and metadata (signatures, decorators, docstrings) with precise line numbers.

## Features

### Multi-language Support
Python, JavaScript, TypeScript, Rust, Go, C/C++, Java, PHP, C#, Ruby, Zig, Swift, HTML, CSS, SCSS, SQL (PostgreSQL, MySQL, SQLite), Markdown, Plain Text, Images

### Structure Extraction
- Classes, methods, functions, imports
- Function signatures with type annotations
- Decorators and attributes
- Docstrings and JSDoc comments
- Precise line numbers (from-to ranges)
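
To make "structure with line ranges" concrete, here is a minimal sketch that collects classes and functions with their from-to ranges. It uses Python's standard `ast` module rather than tree-sitter, handles Python only, and is not scantool's implementation; the `outline` helper and `SAMPLE` input are hypothetical names chosen for illustration.

```python
import ast

def outline(source: str) -> list[tuple[str, str, int, int]]:
    """Return (kind, name, first_line, last_line) for each class and function."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            found.append(("class", node.name, node.lineno, node.end_lineno))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Methods are reported as plain functions in this sketch.
            found.append(("function", node.name, node.lineno, node.end_lineno))
    # Sort by starting line so the outline follows file order.
    return sorted(found, key=lambda item: item[2])

SAMPLE = '''class Greeter:
    def hello(self, name: str) -> str:
        return f"hi {name}"

def main():
    print(Greeter().hello("world"))
'''

for kind, name, start, end in outline(SAMPLE):
    print(f"{kind}: {name} ({start}-{end})")
```

Line numbers are 1-indexed and inclusive, matching the range style used in the example outputs in this README.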

### Analysis Tools
- **preview_directory**: Intelligent codebase analysis with entry points, import graph, call graph, and hot functions (5-10s)
- **scan_file**: Detailed file structure with signatures and metadata
- **scan_directory**: Compact directory tree with inline function/class names
- **search_structures**: Filter by type, name pattern, decorator, or complexity
- **list_directories**: Directory tree (folders only)

### Output Formats
- Tree format with box-drawing characters (├─, └─, │)
- JSON format for programmatic use
- Configurable display options
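
As an illustration of the tree format, the sketch below renders a nested structure with the same box-drawing characters. The `label`/`children` dictionary layout is a hypothetical schema invented for this example and is not scantool's actual JSON output.

```python
def render(node: dict, prefix: str = "") -> list[str]:
    """Render the children of `node` as box-drawing tree lines."""
    lines = []
    children = node.get("children", [])
    for index, child in enumerate(children):
        last = index == len(children) - 1
        lines.append(f"{prefix}{'└─' if last else '├─'} {child['label']}")
        # Keep a vertical bar running while more siblings follow at this level.
        lines.extend(render(child, prefix + ("   " if last else "│  ")))
    return lines

tree = {
    "label": "example.py (1-57)",
    "children": [
        {
            "label": "class: DatabaseManager (8-26)",
            "children": [
                {"label": "method: __init__ (11-13)"},
                {"label": "method: query (24-26)"},
            ],
        },
        {"label": "function: main () (53-57)"},
    ],
}

print(tree["label"])
print("\n".join(render(tree)))
```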

## Installation

### Install with uvx

```bash
# From GitHub
uvx --from git+https://github.com/mariusei/file-scanner-mcp scantool

# Or from PyPI
uvx scantool
```

### Install from Smithery

https://smithery.ai/server/@mariusei/file-scanner-mcp

### Install from Source

```bash
git clone https://github.com/mariusei/file-scanner-mcp.git
cd file-scanner-mcp
uv sync
uv run scantool
```

## Configuration

Add the following to the Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "scantool": {
      "command": "uvx",
      "args": ["scantool"]
    }
  }
}
```

Or if installed from source:

```json
{
  "mcpServers": {
    "scantool": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/file-scanner-mcp", "scantool"]
    }
  }
}
```

Restart Claude Desktop after configuration.
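
If the config file already lists other servers, the entry can be merged in rather than pasted by hand. The following is a minimal sketch of that merge; `add_scantool_server` is a hypothetical helper, not part of scantool, and it assumes the file is either absent or contains valid JSON.

```python
import json
import tempfile
from pathlib import Path

def add_scantool_server(config_path: Path) -> dict:
    """Add the scantool entry to an MCP client config, keeping other servers."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    servers = config.setdefault("mcpServers", {})
    servers["scantool"] = {"command": "uvx", "args": ["scantool"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config

# Demonstration against a throwaway file instead of the real config.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "x"}}}', encoding="utf-8")
    merged = add_scantool_server(path)

print(sorted(merged["mcpServers"]))
```

To apply it to a real setup, pass the config path for your platform (the macOS location is given above) and back up the file first.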

## Usage

### preview_directory - Code analysis (primary tool)

Analyzes codebase structure including entry points, import graph, call graph, and hot functions.

```python
preview_directory(
    directory=".",
    depth="deep",             # "quick", "normal", or "deep" (default: "deep")
    max_files=10000,          # Safety limit (default: 10000)
    max_entries=20,           # Entries per section (default: 20)
    respect_gitignore=True    # Honor .gitignore (default: True)
)
```

**Depth levels:**
- `"quick"`: Metadata only (0.5s) - file counts, sizes, types
- `"normal"`: Architecture analysis (2-5s) - imports, entry points, clusters
- `"deep"`: Full analysis (5-10s) - includes hot functions and call graph (default)

**Example output (depth="deep"):**

```
📂 project/

━━━ ENTRY POINTS ━━━
  main.py:main() @1
  backend/application.py:Flask app @15
  frontend/index.ts:export default

━━━ CORE FILES (by centrality) ━━━
  backend/database.py: imports 0, used by 15 files
  backend/auth.py: imports 1, used by 8 files
  shared/utils.py: imports 2, used by 12 files

━━━ ARCHITECTURE ━━━
  Entry Points: 25 files
  Core Logic: 68 files
  Plugins: 15 files
  Tests: 42 files

━━━ HOT FUNCTIONS (most called) ━━━
  get_database() (function): called by 41, calls 1 @backend/database.py
  authenticate() (function): called by 23, calls 5 @backend/auth.py
  validate_input() (function): called by 15, calls 2 @shared/utils.py

Analysis: 486 files in 4.82s (layer1+layer2)
```
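
The "used by" and "called by" figures above are counts of incoming edges in the import graph and call graph. A minimal sketch of that ranking is shown below; the edge list and the `rank_by_incoming` helper are hypothetical, and scantool's own analysis may weigh or filter edges differently.

```python
from collections import Counter

def rank_by_incoming(edges: list[tuple[str, str]], top: int = 3) -> list[tuple[str, int]]:
    """Rank targets by how many distinct sources point at them."""
    # set() drops repeated (source, target) pairs so each caller counts once.
    incoming = Counter(target for _, target in set(edges))
    # Highest count first; ties are broken alphabetically for stable output.
    return sorted(incoming.items(), key=lambda kv: (-kv[1], kv[0]))[:top]

calls = [
    ("main", "get_database"),
    ("authenticate", "get_database"),
    ("main", "authenticate"),
    ("handler", "get_database"),
    ("handler", "validate_input"),
    ("main", "get_database"),  # duplicate call site, counted once
]

for name, callers in rank_by_incoming(calls):
    print(f"{name}(): called by {callers}")
```

The same function applied to (importer, imported file) pairs gives a "used by N files" ranking.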

**Use cases:**
- First-time codebase exploration
- Understanding multi-component projects (frontend/backend/database)
- Finding critical functions (hot spots)
- Identifying entry points

### scan_file - Detailed file analysis

```python
scan_file(
    file_path="path/to/file.py",
    show_signatures=True,      # Include function signatures with types
    show_decorators=True,      # Include @decorator annotations
    show_docstrings=True,      # Include first line of docstrings
    show_complexity=False,     # Show complexity metrics
    output_format="tree"       # "tree" or "json"
)
```

**Example output:**

```
example.py (1-57)
├─ file-info: 1.4KB modified: 2 hours ago
├─ imports: import statements (3-5)
├─ class: DatabaseManager (8-26)
│    "Manages database connections and queries."
│  ├─ method: __init__ (self, connection_string: str) (11-13)
│  ├─ method: connect (self) (15-17)
│  │    "Establish database connection."
│  └─ method: query (self, sql: str) -> list (24-26)
│       "Execute a SQL query."
└─ function: main () (53-57)
     "Main entry point."
```

### scan_file_content - Analyze content directly

Scans content passed as a string, so no file path is required. Useful for remote files, API responses, or in-memory content.

```python
scan_file_content(
    content="def hello(): pass\n\nclass MyClass:\n    pass",
    filename="example.py",     # Extension determines parser
    show_signatures=True,
    show_decorators=True,
    show_docstrings=True,
    show_complexity=False,
    output_format="tree"
)
```

### scan_directory - Compact overview

Shows directory tree with inline class/function names.

```python
scan_directory(
    directory="./src",
    pattern="**/*",                 # Glob pattern
    max_files=None,                 # File limit
    respect_gitignore=True,         # Honor .gitignore
    exclude_patterns=None,          # Additional exclusions
    output_format="tree"            # "tree" or "json"
)
```

**Example output:**

```
src/ (22 files, 15 classes, 127 functions, 89 methods)
├─ scanners/
│  ├─ python_scanner.py (1-329) [11.9KB, 2 hours ago] - PythonScanner
│  ├─ typescript_scanner.py (1-505) [18.9KB, 1 day ago] - TypeScriptScanner
│  └─ rust_scanner.py (1-481) [17.6KB, 3 days ago] - RustScanner
├─ scanner.py (1-232) [8.8KB, 5 mins ago] - FileScanner
└─ server.py (1-353) [12.2KB, just now] - scan_file, scan_directory, ...
```

**Pattern examples:**

```python
# Specific file types
scan_directory("./src", pattern="**/*.py")

# Multiple types
scan_directory("./src", pattern="**/*.{py,ts,js}")

# Shallow scan (1 level deep)
scan_directory(".", pattern="*/*")

# Exclude directories
scan_directory(".", exclude_patterns=["tests/**", "docs/**"])
```
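
To see how a recursive pattern differs from a shallow one, the sketch below runs `**/*.py` and `*/*` through Python's standard `pathlib` on a temporary directory. This only illustrates general glob semantics; scantool's own matching may differ in details, and `pathlib` does not expand braces, so the `{py,ts,js}` form is not shown.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ["top.py", "pkg/mod.py", "pkg/sub/deep.py", "pkg/notes.md"]:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("", encoding="utf-8")

    # "**/*.py" descends into every subdirectory.
    recursive = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.py"))
    # "*/*" matches entries exactly one directory below the root.
    shallow = sorted(p.relative_to(root).as_posix() for p in root.glob("*/*"))

print(recursive)
print(shallow)
```

Note that the shallow pattern skips `top.py` (it sits at the root, not one level down) and, in `pathlib`, also returns the `pkg/sub` directory itself.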

### search_structures - Find and filter

```python
# Find test functions
search_structures(
    directory="./tests",
    type_filter="function",
    name_pattern="^test_"
)

# Find classes ending in "Manager"
search_structures(
    directory="./src",
    type_filter="class",
    name_pattern=".*Manager$"
)

# Find functions with @staticmethod
search_structures(
    directory="./src",
    has_decorator="@staticmethod"
)

# Find complex functions (>100 lines)
search_structures(
    directory="./src",
    type_filter="function",
    min_complexity=100
)
```

### list_directories - Folder structure

Shows directory tree without files.

```python
list_directories(
    directory=".",
    max_depth=3,              # Maximum depth (default: 3)
    respect_gitignore=True    # Honor .gitignore (default: True)
)
```

**Example output:**

```
/Users/user/project/
├─ src/
│  ├─ components/
│  ├─ services/
│  └─ utils/
├─ tests/
│  ├─ unit/
│  └─ integration/
└─ docs/
```

## Supported Languages

| Extension | Language | Extracted Elements |
|-----------|----------|-------------------|
| `.py`, `.pyw` | Python | classes, methods, functions, imports, decorators, docstrings |
| `.js`, `.jsx`, `.mjs`, `.cjs` | JavaScript | classes, methods, functions, imports, JSDoc comments |
| `.ts`, `.tsx`, `.mts`, `.cts` | TypeScript | classes, methods, functions, imports, type annotations, JSDoc |
| `.rs` | Rust | structs, enums, traits, impl blocks, functions, use statements |
| `.go` | Go | types, structs, interfaces, functions, methods, imports |
| `.c`, `.h` | C | functions, structs, enums, includes |
| `.cpp`, `.hpp`, `.cc`, `.hh` | C++ | classes, functions, namespaces, templates, includes |
| `.java` | Java | classes, methods, interfaces, enums, annotations, imports |
| `.php` | PHP | classes, methods, functions, traits, interfaces, namespaces |
| `.cs` | C# | classes, methods, properties, structs, enums, namespaces |
| `.rb` | Ruby | modules, classes, methods, singleton methods |
| `.sql` | SQL | tables, views, functions, procedures, indexes, columns |
| `.md` | Markdown | headings (h1-h6), code blocks with hierarchy |
| `.txt` | Plain Text | sections, paragraphs |
| `.png`, `.jpg`, `.gif`, `.webp` | Images | format, dimensions, colors, content type |

Output for every file automatically includes metadata (size, modified date, permissions).

## Use Cases

### Code Navigation
- Structural overview of unfamiliar codebases
- File organization understanding
- Navigation using precise line ranges

### Refactoring
- Identify class and function boundaries for safe splitting
- Find implementations of specific patterns
- Locate functions above complexity thresholds

### Code Review
- Generate structural diffs
- Find functions with specific decorators
- Identify test coverage gaps

### Documentation
- Auto-generate table of contents with line numbers
- Extract API signatures
- Feed structured data to analysis tools (JSON output)

### AI Code Assistance
- Primary exploration tool (replaces ls/grep/find workflows)
- Partition large files intelligently for LLM context windows
- Extract code sections with exact boundaries
- Search patterns across codebases
- Reduce token usage: get structure first, read content only when needed
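
The "structure first, content later" workflow comes down to reading only the reported line range. A minimal sketch, assuming ranges are 1-indexed and inclusive as in the example outputs; `extract_lines` is a hypothetical helper, not a scantool function.

```python
def extract_lines(text: str, start: int, end: int) -> str:
    """Return lines start..end of text (1-indexed, inclusive)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    lines = text.splitlines(keepends=True)
    # Slicing past the end of the file simply returns what exists.
    return "".join(lines[start - 1:end])

source = "import os\n\ndef main():\n    print(os.getcwd())\n\nmain()\n"
print(extract_lines(source, 3, 4))
```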

## Architecture

```
scantool/
├── scanner.py       # Core scanning logic using tree-sitter
├── formatter.py     # Tree formatting with box-drawing characters
├── server.py        # FastMCP server implementation
├── code_map.py      # Code analysis orchestrator
├── analyzers/       # Language-specific analyzers
│   ├── base.py
│   ├── python_analyzer.py
│   ├── typescript_analyzer.py
│   ├── go_analyzer.py
│   └── skip_patterns.py
└── scanners/        # Language-specific scanners
    ├── base.py
    ├── python_scanner.py
    ├── typescript_scanner.py
    └── ...
```

## Testing

```bash
# Run all tests
uv run pytest

# Run specific tests
uv run pytest tests/analyzers/
uv run pytest tests/python/
uv run pytest tests/typescript/

# Run with coverage
uv run pytest --cov=src/scantool

# Run with verbose output
uv run pytest -v
```

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for details on adding language support.

## License

MIT License - see [LICENSE](LICENSE) file for details.

## Dependencies

- [FastMCP](https://github.com/jlowin/fastmcp) - MCP server framework
- [tree-sitter](https://tree-sitter.github.io/) - Parsing library
- [uv](https://github.com/astral-sh/uv) - Python package and project manager (used to install and run scantool)

## Known Limitations

### MCP Tool Response Size Limit

Claude Desktop enforces a 25,000 token limit on MCP tool responses.

**Built-in mitigations:**
- `scan_directory()` uses compact inline format
- Respects `.gitignore` by default (excludes node_modules, .venv, etc.)
- Shows file metadata with relative timestamps

**Manual controls:**
- Use `pattern` to limit scope: `"**/*.py"` (Python files only) or `"*/*"` (shallow, one level deep)
- Use `max_files` to cap number of files processed
- Use `exclude_patterns` for additional exclusions
- Scan specific subdirectories instead of entire codebase

**For large codebases:**
```python
# Scan specific areas
scan_directory("./src", pattern="**/*.py")
scan_directory("./tests", pattern="**/*.py")
```
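
To judge whether a saved response is near the limit, a rough size check can help. The sketch below assumes about four characters per token, which is a crude rule of thumb and not a real tokenizer; the helper names are hypothetical.

```python
TOKEN_LIMIT = 25_000   # limit quoted in this README
CHARS_PER_TOKEN = 4    # ASSUMPTION: rough heuristic, actual tokenization varies

def estimated_tokens(text: str) -> int:
    """Estimate the token count from the character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division

def fits_limit(text: str, limit: int = TOKEN_LIMIT) -> bool:
    """Return True if the estimate is within the limit."""
    return estimated_tokens(text) <= limit

sample = "src/ (22 files, 15 classes, 127 functions, 89 methods)\n" * 100
print(estimated_tokens(sample), fits_limit(sample))
```

Output heavy in box-drawing characters or code may tokenize less efficiently than this estimate suggests, so leave a margin.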

### Agent Delegation

When using Claude Code, a request to "explore the codebase" may be delegated to the Explore agent, which doesn't have access to MCP tools. Be explicit, for example "use scantool to scan the codebase", so that the MCP tool is called directly.

## Support

- [GitHub Issues](https://github.com/mariusei/file-scanner-mcp/issues)
- [GitHub Discussions](https://github.com/mariusei/file-scanner-mcp/discussions)
