Metadata-Version: 2.4
Name: icsf-cli
Version: 0.1.5
Summary: ICSF – Intelligent Code Security & Fixing Platform (CLI)
Author-email: Ramu Venkatesan <Ramu.Venkatesan@infoservices.com>
License: MIT
Project-URL: Homepage, https://github.com/icsf-testing/icsf-poc
Project-URL: Source, https://github.com/icsf-testing/icsf-poc
Project-URL: Issues, https://github.com/icsf-testing/icsf-poc/issues
Keywords: security,cli,github,java,maven,icsf
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: fastapi<1.0.0,>=0.104.1
Requires-Dist: uvicorn[standard]<0.32.0,>=0.24.0
Requires-Dist: httpx<1,>=0.28.1
Requires-Dist: pydantic<3.0.0,>=2.10.0
Requires-Dist: pydantic[email]<3.0.0,>=2.10.0
Requires-Dist: python-dotenv==1.0.0
Requires-Dist: pandas<3.0.0,>=2.0.0
Requires-Dist: python-multipart<1.0.0,>=0.0.6
Requires-Dist: boto3<2.0.0,>=1.34.0
Requires-Dist: botocore<2.0.0,>=1.34.0
Requires-Dist: pyyaml<7.0.0,>=6.0.1
Requires-Dist: PyGithub<3.0.0,>=2.1.1
Requires-Dist: GitPython<4.0.0,>=3.1.40
Requires-Dist: numpy<3.0.0,>=1.26.0
Requires-Dist: streamlit<2.0.0,>=1.28.0
Requires-Dist: starlette<1.0.0,>=0.27.0

# ICSF – Intelligent Code Security & Fixing Platform

ICSF is a full-stack, AI-powered platform that automates the discovery, analysis, and remediation of security vulnerabilities in Java/Maven codebases. It combines a multi-agent cognitive fixing pipeline with an autonomous self-healing testing framework (**Atlas**) to deliver a closed-loop system: from vulnerability report → verified, PR-ready fix.

---

## Table of Contents

- [Architecture Overview](#-architecture-overview)
- [End-to-End Application Flow](#-end-to-end-application-flow)
- [Backend Deep Dive](#-backend-deep-dive)
  - [FastAPI Main (`main.py`)](#1-fastapi-main-mainpy--1636-lines)
  - [Configuration & Credentials](#2-configuration--credentials)
  - [Pydantic Data Models](#3-pydantic-data-models-modelsagent_modelspy)
  - [Services Layer](#4-services-layer-services)
  - [Agents Layer (Cognitive Fixing Loop)](#5-agents-layer-agents--cognitive-fixing-loop)
  - [Atlas Subsystem (Self-Healing Testing)](#6-atlas-subsystem-atlas--self-healing-testing-framework)
- [Frontend Deep Dive](#-frontend-deep-dive)
- [RAG (Retrieval-Augmented Generation)](#-rag-retrieval-augmented-generation)
- [AI / LLM Integration](#-ai--llm-integration)
- [Input Requirements](#-input-requirements)
- [Technical Stack](#-technical-stack)
- [Getting Started](#-getting-started)
- [Project Structure](#-project-structure)
- [API Reference](#-api-reference)

---

## 🏗️ Architecture Overview

ICSF follows a layered, modular architecture:

```
┌─────────────────────────────────────────────────────────────────────┐
│                    Frontend (Streamlit – 5676 lines)                │
│   Premium dark-mode dashboard · Real-time progress · Lineage graph │
├─────────────────────────────────────────────────────────────────────┤
│                   Backend (FastAPI – 1636 lines)                    │
│         REST API · WebSocket/SSE · Request ID middleware            │
├────────────┬──────────────────────┬─────────────────────────────────┤
│  Services  │       Agents         │      Atlas Subsystem            │
│ (14 files) │  (Cognitive Loop)    │  (Self-Healing Testing)         │
│            │  5 agents + helpers  │  14 sub-packages                │
├────────────┼──────────────────────┼─────────────────────────────────┤
│            │   AWS Bedrock (LLM)  │  SQLite RAG Store               │
│            │  Claude 3.5 Sonnet   │  Titan Embeddings               │
│            │  Titan Embeddings    │  Cosine Similarity Search       │
└────────────┴──────────────────────┴─────────────────────────────────┘
```

### Key Design Principles

| Principle | Implementation |
|---|---|
| **Single AI Provider** | All LLM/embedding calls route through AWS Bedrock only |
| **Multi-Agent Pipeline** | 5 specialized agents, each with a single responsibility |
| **Self-Healing** | Atlas auto-repairs build failures and test regressions |
| **Cross-Repo Awareness** | Dependency analysis spans across multiple repositories |
| **Cost Control** | `CostGuardService` enforces per-run budget limits |
| **Resilience** | Retry with exponential backoff + circuit breakers |

---

## 🔄 End-to-End Application Flow

```mermaid
flowchart TD
    %% Entry & Configuration
    U((User)) -->|1. Setup| UI[Streamlit Dashboard]
    UI -->|2. Upload CSV| BE[FastAPI Backend]

    subgraph "Phase I: Discovery & Mapping"
        direction TB
        BE -->|3. Fetch Projects| GH[GitHub API]
        GH -->|4. List Repos| REPO[(Repository Store)]
        VS[Vulnerability Mapper] -->|5. Map Assets| FT[(Local Workspace)]
    end

    BE --> VS

    subgraph "Phase II: Quality Baseline – Atlas"
        direction LR
        BASE[Lightweight Baseline] -->|6. Verify Build| COV[Capture Coverage]
    end

    VS --> BASE

    subgraph "Phase III: Impact Analysis"
        direction TB
        DS[Dependency Service] -->|7. Map Blast Radius| MAP[Call Tree & Usage]
    end

    COV --> DS

    subgraph "Phase IV: Cognitive Fixing Loop"
        direction TB
        CC[Code Context Agent] --> FS[Fix Strategy Agent]
        FS --> CF[Code Fixer Agent]
        CF --> SV[Safety Validator Agent]
    end

    MAP -->|8. Start Fix| CC
    CC <-->|Rich Context| DS
    CF -->|9. Apply Fix| FT

    subgraph "Phase V: Self-Healing Pipeline – Atlas"
        direction TB
        BM[BuildMechanic] --> TH[TestHealer]
        TH --> TG[AI Test Generator]
        TG --> VR[Validation Report]
    end

    SV -->|10. Final Verify| BM
    BM -->|Self-Heal Build| FT
    FT --> TH

    subgraph "Phase VI: Delivery & Sync"
        direction TB
        VR --> PR[Batch PR Manager]
        PR -->|11. Create Sync PR| GH
        PR -->|12. Update UI| UI
    end

    %% RAG Knowledge Loop
    TG -.->|Save Success Patterns| RAG[(SQLite RAG Store)]
    RAG -.->|Context Enrichment| CC
```

### Phase-by-Phase Walkthrough

| Phase | What Happens | Key Service |
|---|---|---|
| **I. Discovery** | Upload CSV → fetch repos from GitHub API → match vulnerability file paths to repositories using intelligent path normalization | `VulnerabilityService`, `GitHubService` |
| **II. Baseline** | Run `mvn compile test` on the unmodified code to establish Ground Truth coverage & build health | `AtlasService.run_baseline_only()` |
| **III. Impact Analysis** | Parse Java files, build global dependency graph, find cross-repo callers of the vulnerable method | `DependencyService` |
| **IV. Cognitive Fixing** | 4-agent pipeline: Analyze context → Plan strategy → Generate fix → Validate safety | `FixOrchestrator` + 4 Agents |
| **V. Self-Healing** | BuildMechanic auto-repairs compilation; TestHealer fixes broken tests; AI generates new security-targeted tests | Atlas pipeline |
| **VI. Delivery** | Aggregate all fixes into a single PR with rich markdown body, push to GitHub | `BatchPRService`, `PRManagerService` |

---

## 🔧 Backend Deep Dive

### 1. FastAPI Main (`main.py`) — 1636 lines

The central orchestration hub. Defines the REST API, middleware, and all endpoint routes.

#### Startup & Middleware

| Component | Purpose |
|---|---|
| `_startup_validation()` | Smoke-checks AWS + GitHub credentials on boot |
| `RequestIDMiddleware` | Injects a UUID `X-Request-ID` header into every request for log correlation |
| CORS middleware | Configurable via `ALLOWED_ORIGINS` env var |

#### Pydantic Request/Response Models (inline)

| Model | Fields | Used By |
|---|---|---|
| `GitHubRepoRequest` | `username`, `email`, `token` | `POST /api/github/repos` |
| `Repository` | `id`, `name`, `full_name`, `clone_url`, `language`, etc. | All repo endpoints |
| `RepositoriesResponse` | `username`, `total_repos`, `repositories[]` | Repo listing |
| `TestingRequest` | `repo_url`, `repo_path`, `fixed_files`, `create_pr`, `vulnerability`, etc. | Testing pipeline |
| `Vulnerability` | `file_name`, `line_no` | Vulnerability mapping |
| `MappedVulnerability` | `repo: Repository`, `vulnerabilities[]` | Mapping results |

#### API Endpoints

| Method | Route | Description |
|---|---|---|
| `GET` | `/` | Root welcome |
| `GET` | `/api/health` | Health check for Docker/LB probes |
| `GET` | `/api/credentials/github` | Retrieve loaded GitHub credentials |
| `GET` | `/api/credentials/verify` | Debug credential loading |
| `POST` | `/api/github/repos` | Fetch repos (POST with body) |
| `GET` | `/api/github/repos` | Fetch repos (GET with query params) |
| `POST` | `/api/vulnerabilities/map` | Upload CSV + map vulnerabilities to repos |
| `POST` | `/api/dependencies/analyze` | Analyze dependencies for a single vulnerability |
| `POST` | `/api/dependencies/batch-analyze` | Batch dependency analysis for multiple vulnerabilities |
| `POST` | `/api/fix/orchestrate` | Run the full multi-agent fixing pipeline |
| `POST` | `/api/pr/create` | Create a single PR with fixed code |
| `POST` | `/api/testing/start` | Start async testing pipeline job |
| `GET` | `/api/testing/job/{job_id}` | Poll job status |
| `GET` | `/api/testing/stream/{job_id}` | SSE event stream for real-time progress |
| `GET` | `/api/testing/runs` | List recent pipeline runs |
| `POST` | `/api/testing/run` | Legacy sync testing endpoint |
| `POST` | `/api/fix/batch` | Batch fix multiple vulnerabilities |
| `POST` | `/api/pr/merge` | Merge PR with conflict resolution |
| `POST` | `/api/pr/check-mergeability` | Check PR mergeability |
| `POST` | `/api/pr/create-batch` | Create single aggregated PR for all fixes |

---

### 2. Configuration & Credentials

#### `config.py` — The Config Class

| Attribute | Source | Default |
|---|---|---|
| `AWS_ACCESS_KEY_ID` | `.env` | — |
| `AWS_SECRET_ACCESS_KEY` | `.env` | — |
| `AWS_REGION` | `.env` | `us-east-1` |
| `AWS_SESSION_TOKEN` | `.env` | `None` |
| `BEDROCK_MODEL_ID` | `.env` | `anthropic.claude-3-5-sonnet-20240620-v1:0` |
| `BEDROCK_EMBED_MODEL_ID` | `.env` | `amazon.titan-embed-text-v1` |

**Key Methods:**

- `get_github_credentials(force_reload=False)` — Reads `credentials.yaml` for GitHub PAT, username, email
- `validate_bedrock_credentials()` — Returns `(is_valid, error_msg)` tuple
- `get_bedrock_config()` — Returns dict with `access_key`, `secret_key`, `region`

#### `credentials.yaml`

```yaml
github:
  token: ghp_xxxxx
  username: your-username
  email: your-email@example.com
```

---

### 3. Pydantic Data Models (`models/agent_models.py`)

These 10 models define the complete data flow through the multi-agent pipeline:

```mermaid
flowchart LR
    VFR[VulnerabilityFixRequest] --> VA[VulnerabilityAnalysis]
    VA --> CC[CodeContext]
    CC --> FS[FixStrategy]
    FS --> CF[CodeFix]
    CF --> SV[SafetyValidation]
    SV --> FE[FixExplanation]
    VFR --> FOR[FixOrchestrationResult]
    VA --> FOR
    CC --> FOR
    FS --> FOR
    CF --> FOR
    SV --> FOR
    FE --> FOR
```

| Model | Role | Key Fields |
|---|---|---|
| `VulnerabilityFixRequest` | Input to pipeline | `vulnerability_type`, `file_path`, `line_number`, `repo_path` |
| `VulnerabilityAnalysis` | Agent 1 output | `severity`, `security_impact`, `root_causes`, `fix_category` |
| `CodeContext` | Agent 2 output | `code_snippet`, `class_name`, `dependent_files_intra/inter`, `data_flow` |
| `FixStrategy` | Agent 3 output | `fix_approach`, `code_changes_plan`, `files_to_modify_primary/secondary` |
| `CodeFix` | Agent 4 output | `fixed_code` (Dict[path→code]), `diff`, `change_summary`, `reasoning` |
| `SafetyValidation` | Agent 5 output | `validation_status`, `correctness_score`, `breaking_changes`, `issues_found` |
| `FixExplanation` | Agent 6 output | `vulnerability_summary`, `fix_explanation`, `markdown_report` |
| `FixOrchestrationResult` | Complete result | Aggregates all agent outputs + `overall_status`, `errors` |
| `VulnerabilitySeverity` | Enum | `CRITICAL`, `HIGH`, `MEDIUM`, `LOW`, `INFO` |
| `ValidationResult` | Enum | `APPROVED`, `REJECTED`, `NEEDS_REVIEW` |

---

### 4. Services Layer (`services/`)

The services layer contains 14 files providing the core business logic.

#### 4.1 `bedrock_service.py` — AWS Bedrock LLM Wrapper (439 lines)

The primary AI gateway used by the **Agents** layer.

| Method | Description |
|---|---|
| `invoke_claude(prompt, model_id, max_tokens, temperature, system_prompt)` | Synchronous Claude invocation via Bedrock `invoke_model` API |
| `ainvoke_claude(...)` | Async wrapper using `asyncio.to_thread` |
| `invoke_llama(prompt, ...)` | Llama 3 70B invocation (different payload format) |
| `ainvoke_llama(...)` | Async Llama wrapper |
| `embed_text(text, embed_model_id)` | Generate embeddings via Amazon Titan Embed |
| `invoke_model(model_id, prompt, ...)` | Generic dispatcher — auto-selects Claude/Llama based on model ID |
| `test_connection()` | Smoke test with simple prompt |

**Supported Model Constants:**

| Constant | Model ID |
|---|---|
| `CLAUDE_3_5_SONNET` | `anthropic.claude-3-5-sonnet-20240620-v1:0` |
| `CLAUDE_3_SONNET` | `anthropic.claude-3-sonnet-20240229-v1:0` |
| `LLAMA_3_70B` | `meta.llama3-70b-instruct-v1:0` |

#### 4.2 `github_service.py` — GitHub API Client (455 lines)

| Method | Description |
|---|---|
| `verify_token_and_get_user(username)` | Validate PAT + retrieve user info |
| `get_user_by_username(username)` | Public API user lookup |
| `get_username_from_email(email)` | Reverse email → username lookup |
| `get_user_organizations()` | List authenticated user's orgs |
| `get_organization_repositories(org_name)` | List all repos in an org (paginated) |
| `get_all_repositories(username, include_private, include_orgs)` | Aggregated repo fetch (user + org repos) |
| `get_repository_details(owner, repo_name)` | Single repo metadata |
| `get_repository_file_tree(owner, repo_name, branch)` | Recursive file tree via Git Tree API |

#### 4.3 `vulnerability_service.py` — CSV Parser & Repo Mapper (833 lines)

Parses vulnerability reports from Fortify, Checkmarx, SonarQube, Snyk, etc.

| Method | Description |
|---|---|
| `parse_csv_file(file_content, filename)` | Parse CSV into DataFrame; auto-detects column names |
| `extract_repo_name_from_url(url)` | Handles HTTPS, SSH, `.git` suffix URLs |
| `normalize_repo_identifier(repo_name, repo_url)` | Lowercase normalization for matching |
| `normalize_file_path(file_path)` | Cross-platform path normalization |
| `get_path_variations(file_path)` | Generates multiple path format variations for fuzzy matching |
| `match_file_in_repo(file_name, repo_files)` | Intelligent file matching with early-exit optimization |
| `clone_repository_and_get_files(repo_url, clone_dir)` | Git clone + file tree extraction |
| `map_vulnerabilities_to_repos(df, repositories, repo_files_map, clone_repos)` | Core mapping: CSV rows → repository + file matches |

#### 4.4 `dependency_service.py` — Java Dependency Graph Engine (2037 lines)

The largest service file. Performs static analysis of Java source code and Maven POM files.

| Method | Description |
|---|---|
| `parse_java_file(file_path)` | Extracts package, imports, classes, methods, interfaces, method calls via regex/AST parsing |
| `parse_pom_xml(pom_path)` | Extracts `groupId`, `artifactId`, `version`, dependencies, parent POM |
| `find_java_files(repo_path)` | Recursive `.java` file discovery |
| `find_pom_files(repo_path)` | Recursive `pom.xml` file discovery |
| `build_global_dependency_graph(all_repos, artifact_index)` | Builds both intra-repo and inter-repo dependency edges. Node identity = `(repo_name, file_path)` |
| `build_intra_repo_dependencies(repo_path)` | File-to-file dependencies within a single repo (import-graph) |
| `find_maven_artifact_for_file(file_path, repo_path)` | Map a `.java` file to its Maven artifact coordinates |
| `find_cross_repo_dependent_files(...)` | **Inter-repo blast radius**: finds files in other repos that depend on the vulnerable file |
| `_build_cross_repo_dependency_chains(...)` | Transitive dependency chain traversal across repos (up to `max_depth=5`) |
| `build_maven_artifact_index(all_repos)` | Maps `(groupId, artifactId)` → repository metadata |

#### 4.5 `fix_orchestrator.py` — Multi-Agent Pipeline Controller (564 lines)

Coordinates the sequential agent execution:

```
Agent 2 (Code Context) → Agent 3 (Fix Strategy) → Agent 4 (Code Fix) → Agent 5 (Safety Validator)
```

| Method | Description |
|---|---|
| `orchestrate_fix(request, stop_at_agent, validate_fix, max_validation_retries, all_repositories_info)` | Main entry point. Runs agents 2→5 sequentially, with optional validation loop |
| `get_orchestration_status(result)` | Human-readable status summary |
| `_create_skeleton_analysis(request)` | Generates a default `VulnerabilityAnalysis` from request data |

Supports `stop_at_agent` for incremental testing (e.g., run only agents 2-3).

#### 4.6 `batch_fix_service.py` — Batch Vulnerability Processing (753 lines)

Processes multiple vulnerabilities in sequence or with controlled concurrency.

| Method | Description |
|---|---|
| `_process_vulnerability_fix(current_idx, vuln_idx, vuln, ...)` | Process a single vulnerability with logging |
| `_run_testing_agent(repo_path, repo_name, phase, fixed_files)` | Run Atlas testing (baseline or validation phase) |
| `fix_single_vulnerability(vulnerability, repo_path, ...)` | Single fix with full orchestration |
| `fix_batch_vulnerabilities(vulnerabilities, repo_path, ..., max_concurrent, auto_create_pr, run_tests_after_fix)` | Main batch entry point. Runs baseline → sequential fixes → validation → optional PR |

Workflow: **Baseline → Fix each vulnerability → Run Atlas validation → Create aggregated PR**

#### 4.7 `pr_manager_service.py` — Git & PR Operations (1359 lines)

Complete Git workflow management.

| Method | Description |
|---|---|
| `_run_git_command(repo_path, command, timeout)` | Safe subprocess wrapper for git commands |
| `create_branch(repo_path, branch_name, base_branch)` | Create and checkout new branch |
| `commit_changes(repo_path, files_to_commit, commit_message, author_name, author_email)` | Stage + commit with configurable author |
| `push_branch(repo_path, branch_name, remote)` | Git push to remote |
| `create_pull_request(owner, repo, title, body, head_branch, base_branch)` | GitHub API PR creation |
| `_validate_compilation(repo_path, files_modified)` | Best-effort Maven/Gradle compilation check |
| `_clean_code_before_validation(code)` | Removes markdown artifacts, separator lines from LLM output |
| `_validate_java_code(code, file_path)` | Basic Java structure validation (package, class, brace matching) |
| `apply_fixed_code(repo_path, files_modified, fixed_code_map)` | Write fixed code to files with validation |
| `create_pr_for_fix(repo_path, repo_owner, ..., include_all_repo_changes)` | Complete workflow: apply → branch → commit → push → PR |

#### 4.8 `batch_pr_service.py` — Aggregated PR Creation (429 lines)

| Method | Description |
|---|---|
| `_extract_files_and_code(fix_result)` | Parse fix result into `files_modified` + `fixed_code_map` |
| `create_single_pr(fix_result, ...)` | Create PR for one vulnerability |
| `create_batch_prs(successful_fixes, ...)` | One PR per vulnerability |
| `create_single_batch_pr(successful_fixes, ..., test_results)` | **Single aggregated PR** combining all fixes + test results |

#### 4.9 `atlas_service.py` — Testing Pipeline Façade (430 lines)

Bridges the backend API to the Atlas subsystem.

| Method | Description |
|---|---|
| `_check_required_tools()` | Validates `git`, `mvn`, `java` are on PATH |
| `run_testing_pipeline(repo_url, create_pr, job_id)` | Full pipeline on remote repo (clone → test → coverage → PR) |
| `run_testing_pipeline_local(repo_path, repo_url, fixed_files, ...)` | Full pipeline on already-cloned local repo |
| `run_baseline_only(repo_path, repo_url)` | Lightweight: build + existing tests + coverage — NO AI |

#### 4.10 `fix_validator_service.py` — Post-Fix Validation (277 lines)

| Method | Description |
|---|---|
| `validate_fix(repo_path, files_modified)` | Run Maven build + tests on the fixed repo. Uses `BuildMechanic` for auto-repair |
| `get_validation_feedback(validation_result)` | Generate feedback string for retry loop |

#### 4.11 `job_manager.py` — Async Job & SSE Streaming (86 lines)

| Method | Description |
|---|---|
| `create_job()` | Create UUID-identified job with `asyncio.Queue` |
| `update_job(job_id, status, message, progress)` | Update status + push to SSE queue |
| `end_job(job_id)` | Signal `[DONE]` to SSE stream |
| `stream_job_events(job_id)` | Async generator for `StreamingResponse` |

#### 4.12 `run_history.py` — SQLite Run Persistence (115 lines)

| Method | Description |
|---|---|
| `create_run(repo_url, repo_path)` | Insert new run record |
| `update_run(run_id, status, result_data, error_msg, cost)` | Update with test/coverage/regression/quality gate reports |
| `get_recent_runs(limit)` | Fetch recent runs with JSON report parsing |

Schema: `pipeline_runs(run_id, repo_url, repo_path, status, start_time, end_time, total_cost, test_report, coverage_report, regression_report, quality_gate_report, error_message)`

#### 4.13 `cost_guard.py` — LLM Cost Limiter (50 lines)

| Method | Description |
|---|---|
| `start_run(run_id)` | Initialize per-run cost tracking |
| `add_cost(run_id, prompt_tokens, completion_tokens, model_id)` | Accumulate cost; returns `False` if budget exceeded |
| `get_run_cost(run_id)` | Query accumulated cost |

Pricing: Claude 3.5 Sonnet — $0.003/1K prompt tokens, $0.015/1K completion tokens. Default budget: **$5.00/run**.

---

### 5. Agents Layer (`agents/`) — Cognitive Fixing Loop

#### 5.1 `code_context_agent.py` — Blast Radius Mapper (642 lines)

**Purpose:** Understand the full context around a vulnerability — local code, dependent files, data flow.

| Method | Logic |
|---|---|
| `_read_file_with_context(file_path, line_number, context_lines)` | Extract code snippet + surrounding context. Includes class/method even if vulnerability is in imports |
| `_extract_class_and_method(file_path, line_number, code_content)` | Regex-based Java class/method extraction |
| `_analyze_data_flow_and_usage(code_snippet, vulnerability_type, ...)` | **LLM call**: Analyze how user input flows from source → vulnerable sink |
| `_analyze_method_usage_in_dependents(vulnerable_class, ..., dependent_files)` | Check if the fix will break dependent files by analyzing their imports and usage |
| `_discover_other_repositories(current_repo_path)` | Scan `temp_cloned_repos/` directory for cross-repo analysis |
| `analyze(request, vulnerability_analysis, all_repositories_info)` | Main entry: reads file, finds dependents (intra + inter-repo), runs LLM data flow analysis |

#### 5.2 `fix_strategy_agent.py` — Surgical Planner (633 lines)

**Purpose:** Design a backward-compatible fix plan.

| Method | Logic |
|---|---|
| `_get_available_java_files(repo_path, max_files)` | Inventory of Java files for validation |
| `_analyze_file_imports_and_usage(repo_path, file_path)` | Static import analysis to find related files |
| `_build_strategy_prompt(request, analysis, context)` | Constructs a detailed LLM prompt with vulnerability info, dependents, constraint rules |
| `_parse_strategy_response(response_content)` | JSON extraction from LLM response |
| `analyze(request, vulnerability_analysis, code_context)` | **LLM call**: Generate fix strategy; categorizes files as Primary (logic change) or Secondary (impacted usage) |

**Key Decision Logic:**
- If a method is called by 50+ files → force **backward-compatible fix** (overloaded method, not breaking change)
- Uses `FrameworkDetector` for framework-specific recommendations (Spring Security, Jakarta, etc.)

#### 5.3 `code_fix_agent.py` — Multi-File Code Generator (1071 lines)

**Purpose:** Generate actual fixed Java code across multiple files.

| Method | Logic |
|---|---|
| `_read_file(file_path)` | Read source file |
| `_find_nearest_pom_xml(repo_path, file_rel_path)` | Walk up directories to find `pom.xml` |
| `_project_allows_spring_security(repo_path, file_rel_path, original_code)` | Check if Spring Security dependencies exist before generating SS code |
| `_dependency_constraints_text(repo_path, ...)` | Generate constraint text for LLM prompt |
| `_postprocess_for_project_dependencies(code, ...)` | **Deterministic safety net**: strip Spring Security constructs if project doesn't include it |
| `_generate_diff(original, fixed, file_path)` | Unified diff generation |
| `_clean_generated_code(code, file_path)` | Aggressive cleanup: removes markdown, `<thinking>` blocks, ensures valid Java |
| `_generate_fixed_code(original_code, request, ...)` | **LLM call**: Generate complete fixed file with prompt-chain reasoning |
| `fix_code(request, ..., fix_strategy)` | Main entry: fixes ALL files in `files_to_modify_primary`, runs post-processing |

Uses `ImportManager.add_missing_imports()` and `SyntaxValidator.validate()` for post-processing.

#### 5.4 `safety_validator_agent.py` — Logic Gate (371 lines)

**Purpose:** Verify the fix is correct, introduces no regressions.

| Method | Logic |
|---|---|
| `_format_fixed_code(fixed_code)` | Format code dict for display |
| `_format_dependent_files_for_validation(code_context)` | Format dependent files context |
| `_build_validation_prompt(request, ..., code_fix)` | Comprehensive validation prompt |
| `_parse_validation_response(response_content)` | Extract structured validation data |
| `_normalize_validation_data(parsed)` | Ensure correct types for downstream consumption |
| `validate(request, ..., code_fix)` | **LLM call**: Returns `APPROVED`/`REJECTED`/`NEEDS_REVIEW` with `correctness_score` (0-1) |

#### 5.5 `codebase_analysis_agent.py` — Repository Intelligence (594 lines)

**Purpose:** Deep structural analysis of the codebase (similar to AI coding assistants).

| Method | Logic |
|---|---|
| `analyze_codebase_structure(repo_path, focus_file)` | Full repo analysis with in-memory cache (TTL=300s) |
| `find_dependent_files(repo_path, target_file, max_depth)` | Find all files depending on target file |
| `analyze_code_flow(repo_path, file_path, line_number)` | Data flow analysis around a specific line |
| `_analyze_architecture(repo_path, java_files)` | Detect project layers (Controller, Service, DAO, etc.) |
| `_build_dependency_graph(repo_path, java_files)` | Build import-based dependency graph |
| `_detect_patterns(repo_path, java_files)` | Detect design patterns (Singleton, Factory, Builder, Observer) |
| `_parse_java_file(file_path)` | Extract package, imports, classes (regex-based) |

#### 5.6 `agent_improvements.py` — Helper Utilities (368 lines)

Four static helper classes:

| Class | Purpose |
|---|---|
| `ImportManager` | Auto-detect and add missing Java imports (maps common security classes to their import statements) |
| `SyntaxValidator` | Basic Java syntax validation (brace matching, package declaration, class structure) |
| `FrameworkDetector` | Detect frameworks in `pom.xml` (Spring Boot, Spring Security, JPA, Jackson, etc.) with framework-specific fix recommendations |
| `ContextEnhancer` | Extract full method/class definitions from source code for enhanced prompt context |

---

### 6. Atlas Subsystem (`atlas/`) — Self-Healing Testing Framework

Atlas is a comprehensive, autonomous testing and quality assurance pipeline with 14 sub-packages.

#### 6.1 `orchestrator/run_pipeline.py` — Pipeline Core (1412 lines)

The brain of Atlas. Orchestrates the entire testing lifecycle.

| Function | Description |
|---|---|
| `run_full_pipeline(repo_url, ...)` | Clone remote repo → full pipeline |
| `run_full_pipeline_local(repo_path, ...)` | Full pipeline on local repo |
| `run_baseline_only(repo_path, ...)` | Lightweight: build + test + coverage only |
| `_run_baseline_phase(repo_path, ...)` | Build (with restricted auto-fix) + existing tests + JaCoCo coverage  |
| `_run_validation_phase(repo_path, ...)` | Diff-aware test generation, healing, regression detection (630+ lines) |
| `_run_full_pipeline_core(...)` | Core pipeline: baseline → validation → quality gate → PR |
| `evaluate_quality_gate(coverage, unit, min_coverage_pct, max_failures)` | Pass/fail decision on release readiness |
| `calculate_regression_report(state_mgr, ...)` | Compare current vs baseline to detect regressions/improvements |
| `_calculate_usage(llm)` | Compute estimated cost from Bedrock token metrics |
| `run_organization_pipeline(org_url, ...)` | Scan entire GitHub org: run pipeline on each Java/Maven repo |

#### 6.2 `agents/build_mechanic.py` — Build Failure Auto-Repair (1133 lines)

The SRE agent. Diagnoses and fixes compilation failures.

| Method | Description |
|---|---|
| `analyze(stdout, stderr)` | Parse Maven build output → `BuildDiagnosis` (root cause, confidence, hints) |
| `generate_fix(diagnosis, workspace_path, ...)` | **LLM call**: Generate concrete fix (file patches, POM changes, config files) |

**Domain Expertise:**
- Spring Security 6 migration patterns (`WebSecurityConfigurerAdapter` → lambda DSL)
- Deprecated API detection and deletion
- Missing dependency resolution (maps class names → Maven coordinates)
- `COMMON_TEST_HINTS` dictionary: 30+ patterns mapping class names to imports
- Test assertion guidelines (status codes, JSON paths, mock strategies)

#### 6.3 `agents/test_healer.py` — Test Failure Doctor (151 lines)

| Method | Description |
|---|---|
| `heal(failed_tests, workspace_path)` | Group failures by class → **LLM call**: generate fixed test file → `AgentFix` |
| `_find_test_file(root, classname)` | Locate `.java` test file by class name |

Processes top 10 failures, max 3 classes, max 5 failures per class.

#### 6.4 `rag/store.py` — SQLite Vector RAG Store (210 lines)

Lightweight persistent RAG store for test pattern learning.

| Method | Description |
|---|---|
| `upsert(id, kind, embedding, text, metadata)` | Insert/update with normalized float32 embedding blob |
| `query(embedding, top_k, kind, kinds, score_threshold, include_expired)` | Cosine similarity search via dot product |
| `get_by_id(id)` | Direct ID lookup |
| `count(kind)` | Count entries by kind |
| `evict_expired()` | TTL-based cleanup (default 30 days) |

**Schema:** `rag_items(id TEXT PK, kind TEXT, created_at INT, embedding BLOB, metadata_json TEXT, text TEXT)`
**Indexes:** `kind`, `created_at`

#### 6.5 `llm/bedrock.py` — Atlas Bedrock Client (163 lines)

Dedicated Bedrock client for the Atlas subsystem.

| Method | Description |
|---|---|
| `embed_text(text)` | Titan Embeddings: `inputText` → embedding vector |
| `generate_text(system, user, max_tokens)` | Claude Messages API via Bedrock `invoke_model` |

Tracks `total_input_tokens`, `total_output_tokens`, `total_embedding_tokens` for cost calculation.
**Security:** Permanent credentials (`AKIA*`) do NOT use session tokens; temporary (`ASIA*`) require them.

#### 6.6 `generation/java_unit_test_generator.py` — RAG-Enhanced Test Gen (441 lines)

| Method | Description |
|---|---|
| `generate_minimal_tests_for_repo(target_count, preferred_classes, ...)` | Discover main classes → prioritize by scoring → generate tests |
| `_generate_single_test(src, repo_path, ...)` | **LLM + RAG call**: Check fingerprint → query RAG for similar patterns → generate JUnit 5 test |
| `_set_fingerprint(class_key, sha, test_path)` | Store source hash in RAG for idempotent re-runs |

**Scoring heuristic for class prioritization:**
- +10 if in preferred classes list
- +5 for service/controller/repository classes
- +3 for `@RestController`/`@Service`/`@Repository` annotations
- −2 for test/config/model classes

Uses `RepoContractRegistry` for constructor/method signature validation in generated tests.

#### 6.7 `build/` — Build Infrastructure (5 files)

| File | Purpose |
|---|---|
| `maven.py` | Maven command runner (`mvn compile`, `mvn test`, etc.) with subprocess management |
| `jacoco_injector.py` | Inject JaCoCo Maven plugin into `pom.xml` for code coverage |
| `spring_test_injector.py` | Inject `spring-boot-starter-test` dependency |
| `failsafe_injector.py` | Inject Maven Failsafe plugin for integration tests |
| `dependency_governance.py` | Enforce dependency version governance (BOM alignment, conflict resolution) |
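
To make the injector idea concrete, here is a minimal sketch of adding the JaCoCo plugin to a `pom.xml` with `xml.etree.ElementTree`. It is not the code in `jacoco_injector.py`: the function name, the plugin version `0.8.11`, and the choice of binding `report` to the `test` phase are assumptions, and ElementTree discards comments and original formatting, which a production injector would probably want to preserve.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"
ET.register_namespace("", POM_NS)  # serialize without an ns0: prefix


def _tag(name):
    """Qualify a tag name with the Maven POM namespace."""
    return f"{{{POM_NS}}}{name}"


def _child(parent, name):
    """Return the named child element, creating it when it is missing."""
    node = parent.find(_tag(name))
    if node is None:
        node = ET.SubElement(parent, _tag(name))
    return node


def inject_jacoco(pom_xml, version="0.8.11"):
    """Return pom_xml with the JaCoCo plugin added; unchanged if already present."""
    root = ET.fromstring(pom_xml)
    plugins = _child(_child(root, "build"), "plugins")
    for plugin in plugins.findall(_tag("plugin")):
        if plugin.findtext(_tag("artifactId")) == "jacoco-maven-plugin":
            return pom_xml  # idempotent: nothing to do
    plugin = ET.SubElement(plugins, _tag("plugin"))
    for name, value in (
        ("groupId", "org.jacoco"),
        ("artifactId", "jacoco-maven-plugin"),
        ("version", version),
    ):
        ET.SubElement(plugin, _tag(name)).text = value
    executions = ET.SubElement(plugin, _tag("executions"))
    for goal, phase in (("prepare-agent", None), ("report", "test")):
        execution = ET.SubElement(executions, _tag("execution"))
        if phase:
            ET.SubElement(execution, _tag("phase")).text = phase
        goals = ET.SubElement(execution, _tag("goals"))
        ET.SubElement(goals, _tag("goal")).text = goal
    return ET.tostring(root, encoding="unicode")
```

The early return makes repeated runs safe, which matters when the same repository passes through the pipeline more than once.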

#### 6.8 `core/` — Core Infrastructure (5 files)

| File | Purpose |
|---|---|
| `config.py` | Atlas-specific configuration (data dirs, model IDs, etc.) |
| `logging.py` | `RunLogger` class for structured pipeline logging |
| `state.py` | `PipelineStateManager` — manages baseline/validation state persistence |
| `shell.py` | Safe shell command execution with timeout |
| `resilience.py` | **Retry with exponential backoff** (configurable attempts, jitter) + **Circuit Breaker** pattern (CLOSED/OPEN/HALF-OPEN states) + **Rate Limiter** |
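
The retry and circuit-breaker patterns named for `resilience.py` can be sketched as follows. Parameter names, defaults, and the jitter formula are assumptions made for illustration, and the rate limiter is omitted. The `sleep` and `clock` arguments exist only so the behaviour can be exercised without real waiting.

```python
import random
import time


def retry(fn, attempts=3, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call fn until it succeeds, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # add jitter


class CircuitBreaker:
    """Reject calls after repeated failures, then probe again after a timeout."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    @property
    def state(self):
        if self.opened_at is None:
            return "CLOSED"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "HALF-OPEN"
        return "OPEN"

    def call(self, fn):
        if self.state == "OPEN":
            raise RuntimeError("circuit open; call rejected")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed probe in HALF-OPEN reopens the circuit immediately.
            if self.state == "HALF-OPEN" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result
```

Wrapping a Bedrock or Maven call as `breaker.call(lambda: retry(do_call))` would combine both: short outages are retried, while a sustained outage trips the breaker and fails fast.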

#### 6.9 `analysis/` — Code Analysis (3 files)

| File | Purpose |
|---|---|
| `java_maven.py` | Java project analysis: `detect_repo_facts()`, `count_existing_tests()`, `find_domain_models()` |
| `contract_service.py` | `RepoContractRegistry`: extract class constructors, method signatures for test generation validation |
| `diff_analyzer.py` | Analyze git diffs to identify functional changes for targeted test generation |

#### 6.10 `reporting/` — Test Reporting (2 files)

| File | Purpose |
|---|---|
| `models.py` | Report dataclasses: `TestReport`, `CoverageReport`, `BreakageReport`, `GenerationReport`, `RegressionReport`, `QualityGateReport`, `UsageReport`, `FullRunReport` |
| `parsers.py` | Parse Surefire XML reports, JaCoCo CSV coverage data, classify test failures |
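
As an illustration of what such parsers involve, the sketch below reads the summary counts from a Surefire `testsuite` XML document and computes line coverage from a JaCoCo CSV report. The function names and return shapes are hypothetical; only the attribute names (`tests`, `failures`, `errors`, `skipped`) and the CSV columns (`LINE_MISSED`, `LINE_COVERED`) come from the report formats themselves.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_surefire(xml_text):
    """Return suite counts and the names of failing test cases."""
    suite = ET.fromstring(xml_text)
    result = {
        key: int(suite.get(key, 0))
        for key in ("tests", "failures", "errors", "skipped")
    }
    failed = []
    for case in suite.iter("testcase"):
        # A <failure> is an assertion failure; an <error> is an unexpected exception.
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(f"{case.get('classname')}.{case.get('name')}")
    result["failed_cases"] = failed
    return result


def parse_jacoco_csv(csv_text):
    """Return overall line coverage as a percentage."""
    missed = covered = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        missed += int(row["LINE_MISSED"])
        covered += int(row["LINE_COVERED"])
    total = missed + covered
    return 100.0 * covered / total if total else 0.0
```

Distinguishing `<failure>` from `<error>` elements is one plausible basis for classifying test failures, since the first usually points at behaviour changes and the second at setup or compilation-adjacent problems.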

#### 6.11 `gitops/` — GitHub Integration (3 files)

| File | Purpose |
|---|---|
| `github_pr.py` | Create PRs for Atlas-generated tests |
| `github_issues.py` | Create GitHub issues for persistent test failures |
| `github_org.py` | List repos in a GitHub organization for org-wide scanning |

#### 6.12 `repo/` — Repository Management (2 files)

| File | Purpose |
|---|---|
| `cloner.py` | `RepoCloner`: Clone repos with token authentication |
| `history.py` | Run history tracking for the Atlas pipeline |

---

## 🎨 Frontend Deep Dive

**Technology:** Streamlit (5676 lines, single `app.py` + utility modules)

### UI Components

The frontend is a dark-mode dashboard styled with glassmorphism, gradient headers, and micro-animations. Key CSS tokens:

- Background: `#0f172a` (dark slate), Secondary: `#1e293b`
- Accent: `linear-gradient(135deg, #3b82f6, #2dd4bf)` (blue → teal)
- Font: Inter (body), JetBrains Mono (code)

### Core Functions

| Function | Lines | Purpose |
|---|---|---|
| `main()` | 45 | Entry point: mode selector (Vulnerability Workflow vs Repository Explorer) |
| `display_vulnerability_workflow(api_url)` | ~600 | Streamlined flow: Upload → Map → Test → Fix → Verify |
| `display_repositories(data)` | ~2400 | Full repository explorer with vulnerability cards, dep trees, fix controls |
| `process_active_batch_fix(selected_repo_id, ...)` | ~1040 | Real-time batch fix processing with progress bars |
| `display_lineage_graph(result, repo_name, vuln_idx)` | ~275 | NetworkX-based dependency graph visualization |
| `fetch_repositories(api_base_url, ...)` | 30 | Call backend to fetch GitHub repos |
| `map_vulnerabilities(api_url, repositories_data, csv_file)` | 28 | Upload CSV and map vulnerabilities |
| `run_testing_agent(api_url, repo_url)` | 70 | SSE streaming of Atlas pipeline progress |
| `batch_fix_vulnerabilities(api_url, vulnerabilities, ...)` | ~210 | Call batch fix endpoint with progress callbacks |
| `display_lineage_graph._extract_paths(items)` | 10 | Extract file paths from dependent files list |
| `display_setup_progress(current_step)` | ~160 | Animated 3-step progress tracker (Upload → Fetch → Map) |
| `display_run_history(api_url)` | 70 | Fetch and display pipeline run history table |

### Frontend Utility Modules

| File | Purpose |
|---|---|
| `src/vulnerability_ui.py` (52KB) | Advanced vulnerability display: cards, severity badges, fix result rendering |
| `src/lineage.py` (10KB) | Lineage graph data transformations |
| `utils/atlas_report_comprehensive.py` (21KB) | Comprehensive Atlas report rendering |
| `utils/integrate_render.py` (4KB) | Report integration helpers |

---

## 🧠 RAG (Retrieval-Augmented Generation)

ICSF uses a custom RAG implementation, built on SQLite and Titan embeddings, so that the test generator can learn from test patterns produced in earlier runs.

### Architecture

```
                    ┌──────────────────┐
                    │   Titan Embed    │
                    │   (Bedrock)      │
                    └────────┬─────────┘
                             │ embedding vector
                    ┌────────▼─────────┐
                    │  SqliteVectorRag │
                    │     Store        │
                    │  (cosine search) │
                    └────────┬─────────┘
                             │ similar patterns
                    ┌────────▼─────────┐
                    │  Test Generator  │
                    │  (LLM prompt)    │
                    └──────────────────┘
```

### How RAG is Used

1. **Fingerprint Check**: Before generating a test, hash the source file → query RAG for existing fingerprint → skip if unchanged
2. **Pattern Retrieval**: Query RAG store for similar test patterns (`kind=test_pattern`) with cosine similarity ≥ 0.25
3. **Context Injection**: Retrieved patterns are injected into the LLM prompt as examples
4. **Pattern Storage**: After successful test generation, store the pattern in RAG for future use
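
The four steps above can be sketched with plain functions. Everything here is illustrative: the function names, the dictionary standing in for the RAG store's fingerprint records, the use of SHA-256, and the prompt wording are assumptions rather than the generator's actual code.

```python
import hashlib


def source_fingerprint(source_text):
    """Hash the source so an unchanged file can be recognised on the next run."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


def needs_generation(class_key, source_text, fingerprints):
    """Step 1: skip generation when the stored fingerprint matches."""
    return fingerprints.get(class_key) != source_fingerprint(source_text)


def build_prompt(source_text, hits, threshold=0.25):
    """Steps 2 and 3: keep patterns above the threshold and inject them as examples."""
    examples = [text for score, text in hits if score >= threshold]
    parts = []
    if examples:
        parts.append("Example tests from similar classes:\n" + "\n\n".join(examples))
    parts.append("Write a JUnit 5 test for this class:\n" + source_text)
    return "\n\n".join(parts)


def record_fingerprint(class_key, source_text, fingerprints):
    """Step 4 companion: remember the hash once generation has succeeded."""
    fingerprints[class_key] = source_fingerprint(source_text)
```

Recording the fingerprint only after a successful generation means a failed run is retried next time instead of being skipped as "unchanged".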

### RAG Store Configuration

| Setting | Value |
|---|---|
| **Database** | SQLite (`data/atlas_rag.db`) |
| **Embedding Model** | Amazon Titan Embed Text v1 |
| **Embedding Dimension** | 1536 (float32) |
| **Similarity Metric** | Cosine (via dot product on normalized vectors) |
| **TTL** | 30 days (auto-eviction of stale entries) |
| **Score Threshold** | 0.25 minimum cosine similarity |

---

## 🤖 AI / LLM Integration

### Models Used

| Model | Use Case | Provider |
|---|---|---|
| **Claude 3.5 Sonnet** | All reasoning: code analysis, fix generation, strategy planning, safety validation, build repair, test healing | AWS Bedrock |
| **Amazon Titan Embed Text v1** | Text embeddings for RAG store | AWS Bedrock |
| **Llama 3 70B** (optional) | Alternative generation model | AWS Bedrock |

### LLM Call Sites

| Component | # of LLM Calls | Purpose |
|---|---|---|
| `CodeContextAgent` | 1 | Data flow analysis |
| `FixStrategyAgent` | 1 | Fix strategy planning |
| `CodeFixAgent` | 1 per file | Code generation |
| `SafetyValidatorAgent` | 1 | Fix validation |
| `BuildMechanic` | 1–3 per build failure | Build error diagnosis + fix |
| `TestHealer` | 1 per test class | Test repair |
| `JavaUnitTestGenerator` | 1 per source class | Test generation |
| Total per vulnerability | ~6–12 | Depending on file count and failure iterations |

### Cost Management

- `CostGuardService` tracks cost per run with **$5.00 default budget**
- Pricing model: Claude 3.5 Sonnet @ $0.003/1K input, $0.015/1K output
- `_calculate_usage()` in the pipeline reports total tokens + estimated cost
- `BedrockClient` tracks `total_input_tokens`, `total_output_tokens`, `total_embedding_tokens`
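
The arithmetic behind the budget check is straightforward. The sketch below uses the prices and default budget listed above; the names `estimate_cost` and `BudgetGuard` are hypothetical and are not the interface of `CostGuardService`. Embedding tokens are left out because no price for them is stated here.

```python
INPUT_PRICE_PER_1K = 0.003   # USD per 1K input tokens (Claude 3.5 Sonnet)
OUTPUT_PRICE_PER_1K = 0.015  # USD per 1K output tokens
DEFAULT_BUDGET_USD = 5.00


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated USD cost for a number of input and output tokens."""
    return (
        input_tokens / 1000 * INPUT_PRICE_PER_1K
        + output_tokens / 1000 * OUTPUT_PRICE_PER_1K
    )


class BudgetGuard:
    """Accumulate spend for one run and stop when the budget is exceeded."""

    def __init__(self, budget_usd=DEFAULT_BUDGET_USD):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def record(self, input_tokens, output_tokens):
        self.spent_usd += estimate_cost(input_tokens, output_tokens)
        if self.spent_usd > self.budget_usd:
            raise RuntimeError(
                f"budget exceeded: ${self.spent_usd:.2f} of ${self.budget_usd:.2f}"
            )
        return self.spent_usd
```

At these rates a call with 10,000 input tokens and 2,000 output tokens costs about $0.06, so the default budget covers on the order of eighty such calls.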

---

## 📥 Input Requirements

### 1. Security Vulnerability Report (CSV)

Supported scanners: **Fortify**, **Checkmarx**, **SonarQube**, **Snyk**

| Required Column | Example |
|---|---|
| `vulnerability_type` or `category` | Cross-Site Scripting |
| `file_name` or `file_path` | `src/main/java/com/example/Controller.java` |
| `line_no` or `line_number` | `42` |
| `severity` | Critical / High / Medium / Low |
| `description` | User input is rendered without encoding |
| `recommendation` | Use OWASP encoder for output encoding |
| `repo_name` or `link` | `my-app` or `https://github.com/org/my-app` |
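
Because each column accepts two spellings, a loader has to map them to one name before further processing. A minimal sketch of that normalization follows; the canonical names, the case-insensitive header match, and the function name are assumptions, and the real parser in `vulnerability_service.py` may behave differently.

```python
import csv
import io

# Canonical name -> accepted header spellings (canonical names are hypothetical).
ALIASES = {
    "vulnerability_type": ("vulnerability_type", "category"),
    "file_path": ("file_name", "file_path"),
    "line_number": ("line_no", "line_number"),
    "severity": ("severity",),
    "description": ("description",),
    "recommendation": ("recommendation",),
    "repo": ("repo_name", "link"),
}


def normalize_report(csv_text):
    """Parse a scanner CSV and return rows keyed by canonical column names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = {h.strip().lower(): h for h in (reader.fieldnames or [])}
    mapping = {}
    for canonical, options in ALIASES.items():
        found = next((headers[o] for o in options if o in headers), None)
        if found is None:
            raise ValueError(f"missing column: expected one of {options}")
        mapping[canonical] = found
    rows = []
    for row in reader:
        item = {name: (row[src] or "").strip() for name, src in mapping.items()}
        item["line_number"] = int(item["line_number"])
        rows.append(item)
    return rows
```

Failing early on a missing column gives a clearer error than discovering the gap later, when a vulnerability cannot be mapped to a file.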

### 2. Version Control Credentials

- **GitHub PAT**: Requires `repo` and `read:user` scopes
- Stored in `backend/credentials.yaml`

### 3. AI Model Access (AWS Bedrock)

- **AWS credentials**: `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` in `.env`
- **Region**: `us-east-1` (default) or any Bedrock-enabled region
- **Model access**: Must have Claude 3.5 Sonnet + Titan Embeddings enabled in your AWS account

### 4. Build Environment

- **Java JDK 17+** on PATH
- **Maven** on PATH
- **Git** on PATH

---

## 🛠️ Technical Stack

| Layer | Technology | Version |
|---|---|---|
| **Language** | Python | 3.10+ |
| **Backend Framework** | FastAPI | ≥0.104 |
| **Frontend Framework** | Streamlit | ≥1.28 |
| **LLM Provider** | AWS Bedrock (Boto3) | ≥1.34 |
| **Embedding Model** | Amazon Titan Embed Text v1 | — |
| **Reasoning Model** | Claude 3.5 Sonnet | — |
| **Database** | SQLite | (stdlib) |
| **HTTP Client** | httpx | ≥0.28 |
| **Data Processing** | pandas | ≥2.0 |
| **Version Control** | GitPython + GitHub API | ≥3.1 |
| **Graph Analysis** | NetworkX | ≥3.0 |
| **Validation** | Pydantic | ≥2.10 |
| **Containerization** | Docker Compose | file format 3.8 |
| **Build Tools** | Maven, JDK 17+ | — |

---

## 🚀 Getting Started

### Prerequisites

- **Git**, **Java JDK 17+**, and **Maven** installed and on PATH
- **Python 3.10+**
- **AWS credentials** with Bedrock access (Claude 3.5 Sonnet + Titan Embeddings enabled)
- **GitHub PAT** with `repo` and `read:user` scopes

### Environment Setup

1. **Create `.env`** in `backend/`:

```env
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20240620-v1:0
BEDROCK_EMBED_MODEL_ID=amazon.titan-embed-text-v1
```

2. **Create `credentials.yaml`** in `backend/`:

```yaml
github:
  token: ghp_your_personal_access_token
  username: your-github-username
  email: your-email@example.com
```

### Run with Docker (Recommended)

```bash
docker-compose up --build
```

- **Backend**: http://localhost:8000
- **Frontend**: http://localhost:8501
- The backend container has a 4 GB memory limit; the frontend has 1 GB
- Health checks are configured for both services

### Manual Installation

**Backend:**
```bash
cd backend
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

**Frontend:**
```bash
cd frontend
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
streamlit run app.py --server.port 8501
```

---

## 📂 Project Structure

```
ICSF/
├── backend/
│   ├── main.py                          # FastAPI entrypoint (1636 lines, 20+ endpoints)
│   ├── config.py                        # Config class (AWS, Bedrock, GitHub credentials)
│   ├── credentials.yaml                 # GitHub PAT + user info
│   ├── logging_config.py               # Global logging configuration
│   ├── .env                             # AWS credentials (not committed)
│   │
│   ├── models/
│   │   └── agent_models.py              # 10 Pydantic models for pipeline data flow
│   │
│   ├── services/                        # 14 service files
│   │   ├── bedrock_service.py           # AWS Bedrock LLM wrapper (Claude, Llama, Titan)
│   │   ├── github_service.py            # GitHub API client (repos, orgs, file trees)
│   │   ├── vulnerability_service.py     # CSV parsing & repo mapping (833 lines)
│   │   ├── dependency_service.py        # Java dependency graph engine (2037 lines)
│   │   ├── fix_orchestrator.py          # Multi-agent pipeline controller
│   │   ├── batch_fix_service.py         # Batch vulnerability processing
│   │   ├── pr_manager_service.py        # Git operations & PR creation (1359 lines)
│   │   ├── batch_pr_service.py          # Aggregated PR creation
│   │   ├── atlas_service.py             # Testing pipeline façade
│   │   ├── fix_validator_service.py     # Post-fix build/test validation
│   │   ├── job_manager.py               # Async job & SSE streaming
│   │   ├── run_history.py               # SQLite run persistence
│   │   └── cost_guard.py                # LLM cost limiter ($5/run default)
│   │
│   ├── agents/                          # 7 agent files (Cognitive Fixing Loop)
│   │   ├── code_context_agent.py        # Blast radius mapper (642 lines)
│   │   ├── fix_strategy_agent.py        # Surgical planner (633 lines)
│   │   ├── code_fix_agent.py            # Multi-file code generator (1071 lines)
│   │   ├── safety_validator_agent.py    # Logic gate validator (371 lines)
│   │   ├── codebase_analysis_agent.py   # Repository intelligence (594 lines)
│   │   └── agent_improvements.py        # Helpers: ImportManager, SyntaxValidator, etc.
│   │
│   ├── atlas/                           # Self-Healing Testing Framework
│   │   ├── orchestrator/
│   │   │   └── run_pipeline.py          # Pipeline core (1412 lines)
│   │   ├── agents/
│   │   │   ├── build_mechanic.py        # Build failure auto-repair (1133 lines)
│   │   │   ├── test_healer.py           # Test failure doctor (151 lines)
│   │   │   └── models.py               # Agent data models
│   │   ├── rag/
│   │   │   └── store.py                 # SQLite vector RAG store (210 lines)
│   │   ├── llm/
│   │   │   └── bedrock.py               # Atlas Bedrock client (163 lines)
│   │   ├── generation/
│   │   │   └── java_unit_test_generator.py  # RAG-enhanced test gen (441 lines)
│   │   ├── build/
│   │   │   ├── maven.py                 # Maven command runner
│   │   │   ├── jacoco_injector.py       # JaCoCo coverage plugin injection
│   │   │   ├── spring_test_injector.py  # Spring test dependency injection
│   │   │   ├── failsafe_injector.py     # Failsafe plugin injection
│   │   │   └── dependency_governance.py # Dependency version governance
│   │   ├── core/
│   │   │   ├── config.py                # Atlas configuration
│   │   │   ├── logging.py              # RunLogger
│   │   │   ├── state.py                # Pipeline state manager
│   │   │   ├── shell.py                # Safe shell execution
│   │   │   └── resilience.py           # Retry, circuit breaker, rate limiter
│   │   ├── analysis/
│   │   │   ├── java_maven.py           # Java project fact detection
│   │   │   ├── contract_service.py     # Constructor/method signature registry
│   │   │   └── diff_analyzer.py        # Git diff → functional change detection
│   │   ├── reporting/
│   │   │   ├── models.py               # Report dataclasses
│   │   │   └── parsers.py              # Surefire XML & JaCoCo CSV parsers
│   │   ├── gitops/
│   │   │   ├── github_pr.py            # PR creation for generated tests
│   │   │   ├── github_issues.py        # Issue creation for failures
│   │   │   └── github_org.py           # Organization repo listing
│   │   └── repo/
│   │       ├── cloner.py               # Repository cloning
│   │       └── history.py              # Run history tracking
│   │
│   ├── scripts/                         # Utility & test scripts
│   │   ├── test_bedrock_connection.py
│   │   ├── test_cross_repo_dependencies.py
│   │   ├── test_dependency_analysis.py
│   │   ├── test_orchestrator.py
│   │   ├── analyze_all_matched_files.py
│   │   └── visualize_dependency_mapping.py
│   │
│   └── data/                            # SQLite databases & logs
│       ├── runs.db                      # Pipeline run history
│       └── atlas_rag.db                 # RAG vector store
│
├── frontend/
│   ├── app.py                           # Streamlit UI (5676 lines)
│   ├── src/
│   │   ├── vulnerability_ui.py          # Vulnerability display components
│   │   └── lineage.py                   # Lineage graph data transforms
│   ├── utils/
│   │   ├── atlas_report_comprehensive.py # Atlas report rendering
│   │   └── integrate_render.py          # Report integration helpers
│   ├── requirements.txt
│   └── Dockerfile
│
├── docker-compose.yml                   # Multi-container setup
├── start_frontend.bat                   # Windows frontend launcher
└── start_frontend.sh                    # Linux/Mac frontend launcher
```

---

## 📡 API Reference

### Health & Credentials

| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check (Docker/LB probes) |
| `/api/credentials/github` | GET | Retrieve loaded GitHub credentials |
| `/api/credentials/verify` | GET | Debug credential loading |

### Repository Management

| Endpoint | Method | Description |
|---|---|---|
| `/api/github/repos` | POST/GET | Fetch GitHub repositories |

### Vulnerability Management

| Endpoint | Method | Description |
|---|---|---|
| `/api/vulnerabilities/map` | POST | Upload CSV + map vulnerabilities |
| `/api/dependencies/analyze` | POST | Single vulnerability dependency analysis |
| `/api/dependencies/batch-analyze` | POST | Batch dependency analysis |

### Fix Operations

| Endpoint | Method | Description |
|---|---|---|
| `/api/fix/orchestrate` | POST | Full multi-agent fix pipeline |
| `/api/fix/batch` | POST | Batch fix multiple vulnerabilities |

### Testing Pipeline

| Endpoint | Method | Description |
|---|---|---|
| `/api/testing/start` | POST | Start async testing job |
| `/api/testing/job/{job_id}` | GET | Poll job status |
| `/api/testing/stream/{job_id}` | GET | SSE event stream |
| `/api/testing/runs` | GET | Pipeline run history |
| `/api/testing/run` | POST | Legacy sync testing |
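
A client consuming `/api/testing/stream/{job_id}` needs to split the response into server-sent events. The helper below is a sketch of that parsing step only. It assumes each event carries a JSON object in its `data:` field, which this README does not specify, and the `stage` key in the usage comment is invented for illustration.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded data payload of each event in an SSE line stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # The SSE format allows one optional space after the colon.
            buffer.append(line[5:].removeprefix(" "))
        elif line == "" and buffer:
            # A blank line terminates the event; multi-line data is joined.
            yield json.loads("\n".join(buffer))
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored here.


# Possible use with httpx (requires a running backend):
#
#   with httpx.stream("GET", f"{api_url}/api/testing/stream/{job_id}", timeout=None) as r:
#       for event in iter_sse_events(r.iter_lines()):
#           print(event.get("stage"))
```

Polling `/api/testing/job/{job_id}` remains a simpler fallback for clients that cannot hold a streaming connection open.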

### Pull Request Management

| Endpoint | Method | Description |
|---|---|---|
| `/api/pr/create` | POST | Create single PR |
| `/api/pr/create-batch` | POST | Create aggregated PR |
| `/api/pr/merge` | POST | Merge with conflict resolution |
| `/api/pr/check-mergeability` | POST | Check PR mergeability |

---

*ICSF — Making code security intelligent, automated, and reliable.*
