HEDGEHOG Pipeline Report

Generated: {{ generated_at }}

Run path: {{ data.metadata.run_path }}

Version: {{ data.metadata.hedgehog_version }}

{% if data.metadata.stage_audit_notebook %}

Audit notebook: {{ data.metadata.stage_audit_notebook.name }}

{% endif %}
{{ data.summary.initial_molecules }}
Initial Molecules
{{ data.summary.final_molecules }}
Final Molecules
{{ data.summary.retention_percent }}
Retention Rate
{{ data.summary.stages_completed }}/{{ data.summary.stages_enabled }}
Stages Completed
{% if data.metadata.stage_audit_notebook %}

Stage Audit Notebook

Open {{ data.metadata.stage_audit_notebook.name }} to inspect passed and dropped molecules in mols2grid, compare them against descriptor or docking thresholds, and document why specific filters are justified.

{% endif %} {% if data.summary.stage_statuses %}

Stage Execution Summary

Stage Status
{% for stage in data.summary.stage_statuses %}
{{ stage.name }} {% if stage.completed %} ✓ COMPLETED {% elif stage.enabled %} ✗ FAILED {% else %} − DISABLED {% endif %}
{% endfor %}
{% endif %} {% if data.weighted_scores.models %}

Generator Reality Assessment

{% if plots.weighted_score_components %}
{{ plots.weighted_score_components | safe }}
{% else %}

No generator reality component data available

{% endif %}
{% for model_name, model in data.weighted_scores.models.items() %} {% set overall = (model.overall | default(0)) | float %} {% set grade = model.grade | default("N/A") %} {% set confidence = model.confidence | default("N/A") %} {% set candidate_pool = model.candidate_pool_quality | default({}) %} {% set candidate_pool_overall = candidate_pool.overall | default(none) %}

{{ model_name }}

{{ "%.1f"|format(overall) }}
Generator Reality /100
{% if candidate_pool_overall is not none %}
{{ "%.1f"|format((candidate_pool_overall | default(0)) | float) }}
Final Candidate Pool Quality
{% endif %}
{{ grade }}
Grade
{{ confidence }}
Confidence

Bottlenecks: {% if model.bottlenecks %} {% for bottleneck in model.bottlenecks %} {{ bottleneck }}{% if not loop.last %}, {% endif %} {% endfor %} {% else %} None {% endif %}

{% if model.warnings %}

Warnings: {% for warning in model.warnings %} {{ warning }}{% if not loop.last %}, {% endif %} {% endfor %}

{% endif %}

Generator Component Evidence

{% if model.components %}
Component Score Weight Evidence
{% for component_name, component in model.components.items() %}
{{ component_name }} {% if component.available and component.score is not none %} {{ "%.1f"|format((component.score | default(0)) | float) }} {% else %} — {% endif %} {{ "%.0f"|format(((component.weight | default(0)) | float) * 100) }}% {{ component.evidence_text | default("-") }}
{% endfor %}
{% else %}

No component evidence available.

{% endif %}
{% endfor %}
{% endif %}

Pipeline Flow

{% if data.available_models %}
{% for model in data.available_models %} {% endfor %}
{% endif %} {% if plots.sankey %}
{{ plots.sankey | safe }}
{% else %}

No data available

{% endif %}
{% if data.models %}

Model Comparison

{% if plots.model_comparison %}
{{ plots.model_comparison | safe }}
{% endif %}
Model Initial Final Retention
{% for model in data.models %}
{{ model.model_name }} {{ model.initial }} {{ model.final }} {% set retention = (model.final / model.initial * 100) if model.initial > 0 else 0 %} {{ "%.1f"|format(retention) }}%
{% endfor %}
{% if plots.model_losses %}

Molecule Losses by Stage

{{ plots.model_losses | safe }}
{% endif %}
{% endif %} {% if data.descriptors.distributions or data.descriptors_detailed.raw_data %}

Descriptor Analysis

{% set descriptor_filter_summary = data.descriptors_detailed.filter_summary or {} %} {% if descriptor_filter_summary.by_filter %}

Descriptor Filter Failures

{{ descriptor_filter_summary.total_molecules }}
Checked
{{ descriptor_filter_summary.failed_molecules }}
Rejected
{{ descriptor_filter_summary.passed_molecules }}
Passed
{{ "%.1f"|format(descriptor_filter_summary.failure_rate) }}%
Rejection Rate

Counts are non-exclusive: one molecule can fail multiple descriptor filters.

Descriptor Rejected % of Checked Threshold
{% for item in descriptor_filter_summary.by_filter %}
{{ item.descriptor }} {{ item.failed }} {{ "%.1f"|format(item.failed_pct) }}% {{ item.threshold or "Not configured" }}
{% endfor %}
{% if descriptor_filter_summary.top_combinations %}

Top Failure Combinations

Failed Filters Molecules % of Checked
{% for item in descriptor_filter_summary.top_combinations[:5] %}
{{ item.filters }} {{ item.failed }} {{ "%.1f"|format(item.failed_pct) }}%
{% endfor %}
{% endif %} {% endif %} {% if data.available_models and plots.descriptors_data %}
{% for model in data.available_models %} {% endfor %}
{% endif %} {% if plots.descriptors_data %}
Mean Median Filter Threshold Out of Range
{% else %} {% if plots.descriptors_violin %}

Descriptor Distributions by Model

{{ plots.descriptors_violin | safe }}
{% endif %} {% endif %} {% if plots.descriptors_table %}

Mean Values by Model

{{ plots.descriptors_table | safe }} {% endif %}
{% endif %} {% if data.filters.by_filter or data.filters.totals or data.filters_detailed.by_filter or plots.filters_data or data.common_alert_diagnostics %}

Structural Filters Analysis

{% if data.filters.totals %}
{{ data.filters.totals|length }}
Filters Applied
{{ data.filters.totals.values()|sum }}
Total Rejected
{% if data.filters_detailed.filter_metrics %} {% for filter_name, metrics in data.filters_detailed.filter_metrics.items() %} {% if metrics.avg_demerit_score %}
{{ "%.1f"|format(metrics.avg_demerit_score) }}
Avg Demerit (Lilly)
{% endif %} {% if metrics.avg_sas %}
{{ "%.2f"|format(metrics.avg_sas) }}
Avg SAS
{% endif %} {% endfor %} {% endif %}
{% endif %} {% if plots.filters_data %}
Filter Descriptions:
NIBR: Novartis structural alerts; flags reactive warheads, toxic motifs, and metabolically labile groups that cause safety failures in preclinical studies
bredt: Bredt's rule; rejects molecules with impossible bridgehead double bonds, which cannot exist or be synthesized
common_alerts: PAINS, toxicophores, skin sensitizers, and frequent hitters; removes promiscuous binders and false positives from biochemical assays
lilly: Eli Lilly MedChem rules; demerit scoring of 275+ structural features linked to poor ADMET, toxicity, or synthetic infeasibility
molcomplexity: Synthetic accessibility (SAS), QED drug-likeness, and Bertz complexity; ensures practical synthesizability and drug-like properties
molgraph_stats: Topological constraints on ring count, heavy atoms, and connectivity; excludes chemically unreasonable structures
protecting_groups: Detects residual protecting groups (Boc, Fmoc, Cbz, etc.) that should be removed before biological testing
ring_infraction: Flags strained ring systems and small heterocycles below the minimum size, which are chemically unstable or synthetically challenging
stereo_center: Limits the number of stereocenters and undefined stereocenters; excessive chirality makes synthesis difficult and costly
halogenicity: Limits halogen atom counts (F, Cl, Br); over-halogenated molecules often show poor metabolic stability and increased toxicity
symmetry: Rejects highly symmetric molecules (off by default); many real drugs are symmetric, so use cautiously
{% endif %} {% if data.common_alert_diagnostics %} {% set ca = data.common_alert_diagnostics %}

Common Alert Diagnostics

{{ ca.overview.input_molecules }}
Input Molecules
{{ ca.overview.passed }}
Passed
{{ ca.overview.failed }}
Failed
{{ ca.overview.pass_rate }}%
Pass Rate
{{ ca.overview.mean_ruleset_alerts_failed }}
Mean Rulesets / Failed
{{ ca.overview.max_ruleset_alerts }}
Max Rulesets / Molecule
{% endif %}
{% endif %} {% if data.synthesis.distributions or data.synthesis_detailed.score_distributions %}

Synthesis Analysis

{% if data.available_models and data.synthesis_detailed.by_model %}
{% for model in data.available_models %} {% endfor %}
{% endif %}
{% if plots.synthesis_aligned_dist %} {% endif %} {% if data.retrosynthesis and data.retrosynthesis.summary %} {% endif %}

Synthesis Scores

Mean Threshold Out of Range
{% for item in plots.synthesis_score_meta or [] %}

{{ item.label }}

{% endfor %}
{% if plots.synthesis_aligned_dist %}

Aligned Synthesis Scorer Comparison

All scores below are normalized to the 0 to 1 range, with higher values meaning easier to synthesize. Raw score columns in the CSV outputs are unchanged.

{{ plots.synthesis_aligned_dist | safe }}
{% if plots.synthesis_aligned_corr %} {{ plots.synthesis_aligned_corr | safe }} {% else %}
No correlation data available
{% endif %}
{% if plots.synthesis_aligned_route %} {{ plots.synthesis_aligned_route | safe }} {% else %}
No route comparison data available
{% endif %}
{% if plots.synthesis_aligned_score_meta %}
Scorer Raw Direction Normalization Aligned Mean
{% for item in plots.synthesis_aligned_score_meta %}
{{ item.label }} {{ item.raw_direction }} {{ item.normalization }} {{ "%.3f"|format(item.mean) }}
{% endfor %}
{% endif %}
{% endif %} {% if data.retrosynthesis and data.retrosynthesis.summary %}

AiZynthFinder Retrosynthesis

{% if data.synthesis_detailed.summary.pct_solved is defined %}
{{ "%.1f"|format(data.synthesis_detailed.summary.pct_solved) }}%
Routes Found
{% endif %} {% if data.synthesis_detailed.summary.avg_search_time %}
{{ "%.1f"|format(data.synthesis_detailed.summary.avg_search_time) }}s
Avg Search Time
{% endif %} {% if data.retrosynthesis.summary.avg_route_score is defined %}
{{ "%.2f"|format(data.retrosynthesis.summary.avg_route_score) }}
Avg Route Score
{% endif %} {% if data.retrosynthesis.summary.avg_steps is defined %}
{{ "%.1f"|format(data.retrosynthesis.summary.avg_steps) }}
Avg Steps
{% endif %} {% if data.retrosynthesis.summary.avg_precursors is defined %}
{{ "%.1f"|format(data.retrosynthesis.summary.avg_precursors) }}
Avg Precursors
{% endif %}

Route Status

{% if plots.synthesis_solved_pie %}{{ plots.synthesis_solved_pie | safe }}{% endif %}
{% if plots.synthesis_time_box %}

Search Time by Model

{{ plots.synthesis_time_box | safe }}
{% endif %}

Route Score Distribution

{% if plots.retrosynthesis_route_score_hist %} {{ plots.retrosynthesis_route_score_hist | safe }} {% else %}
No route score data available
{% endif %}

Synthesis Steps

{% if plots.retrosynthesis_steps_hist %} {{ plots.retrosynthesis_steps_hist | safe }} {% else %}
No steps data available
{% endif %}
{% endif %}
{% endif %} {% if data.docking.gnina or data.docking.smina or data.docking_detailed.gnina.raw_data or data.docking_detailed.smina.raw_data %}

Docking Results

{% if data.available_models and (data.docking_detailed.gnina.by_model or data.docking_detailed.smina.by_model) %}
{% for model in data.available_models %} {% endfor %}
{% endif %}
{% if data.docking_detailed.gnina.summary %}
{{ "%.2f"|format(data.docking_detailed.gnina.summary.avg_affinity) }}
Avg Affinity (GNINA)
{{ "%.2f"|format(data.docking_detailed.gnina.summary.best_affinity) }}
Best Affinity (GNINA)
{{ data.docking_detailed.gnina.summary.count }}
Molecules Docked
{% elif data.docking_detailed.smina.summary %}
{{ "%.2f"|format(data.docking_detailed.smina.summary.avg_affinity) }}
Avg Affinity (SMINA)
{{ "%.2f"|format(data.docking_detailed.smina.summary.best_affinity) }}
Best Affinity (SMINA)
{{ data.docking_detailed.smina.summary.count }}
Molecules Docked
{% endif %}
{% if data.docking_detailed.gnina.raw_data or plots.docking_gnina %}

GNINA Results

{% if plots.docking_gnina_affinity_hist %}{{ plots.docking_gnina_affinity_hist | safe }} {% elif plots.docking_gnina %}{{ plots.docking_gnina | safe }}{% endif %}
{% if plots.docking_gnina_top_molecules %}{{ plots.docking_gnina_top_molecules | safe }}{% endif %}
{% if plots.docking_gnina_affinity_box %}

Binding Affinity by Model

{{ plots.docking_gnina_affinity_box | safe }}
{% endif %}
{% endif %} {% if data.docking_detailed.smina.raw_data or plots.docking_smina %}

SMINA Results

{% if plots.docking_smina_affinity_hist %}{{ plots.docking_smina_affinity_hist | safe }} {% elif plots.docking_smina %}{{ plots.docking_smina | safe }}{% endif %}
{% if plots.docking_smina_top_molecules %}{{ plots.docking_smina_top_molecules | safe }}{% endif %}
{% if plots.docking_smina_affinity_box %}

Binding Affinity by Model

{{ plots.docking_smina_affinity_box | safe }}
{% endif %}
{% endif %}
{% endif %} {% if data.docking_filters_detailed and data.docking_filters_detailed.per_filter %}

Docking Filters

{% if data.available_models and data.docking_filters_detailed.by_model %}
{% for model in data.available_models %} {% endfor %}
{% endif %}
{{ data.docking_filters_detailed.total_poses }}
Total Poses
{{ data.docking_filters_detailed.passed_poses }}
Passed Poses
{{ data.docking_filters_detailed.pass_rate }}%
Pass Rate
{{ data.docking_filters_detailed.unique_molecules_passed }}
Unique Molecules
{{ data.docking_filters_detailed.aggregation_mode }}
Aggregation Mode
{% if plots.docking_filters_pass_fail %}

Per-Filter Results

{{ plots.docking_filters_pass_fail | safe }}
{% endif %} {% if plots.docking_filters_metric_hists %}

Metric Distributions

Filter Threshold
{{ plots.docking_filters_metric_hists | safe }}
{% endif %} {% if plots.docking_filters_by_model %}

Pass Rate by Model

{{ plots.docking_filters_by_model | safe }}
{% endif %} {% set interaction_summary = data.docking_filters_detailed.interaction_summary %} {% if interaction_summary and (interaction_summary.total_events or interaction_summary.top_residues or interaction_summary.type_distribution or interaction_summary.matrix) %}

Interaction Profile

{{ interaction_summary.total_events or 0 }}
Interaction Events
{{ interaction_summary.poses_with_interactions or 0 }}
Poses With Interactions
{{ interaction_summary.unique_residues or 0 }}
Unique Residues
{{ interaction_summary.unique_interaction_types or 0 }}
Interaction Types
{% if plots.docking_filters_interactions_top_residues %}

Top Residues

{{ plots.docking_filters_interactions_top_residues | safe }}
{% endif %} {% if plots.docking_filters_interactions_type_distribution %}

Interaction Type Distribution

{{ plots.docking_filters_interactions_type_distribution | safe }}
{% endif %} {% if plots.docking_filters_interactions_matrix %}

Residue x Interaction Type Matrix

{{ plots.docking_filters_interactions_matrix | safe }}
{% endif %} {% endif %}
{% endif %} {% if data.descriptors_final_detailed and data.descriptors_final_detailed.raw_data %}

Final Descriptors

{% if data.descriptors_final.summary %}
{% for desc_name, desc_stats in data.descriptors_final.summary.items() %}
{{ "%.2f"|format(desc_stats.mean) }}
{{ desc_name }} (mean)
{% endfor %}
{% endif %} {% if plots.descriptors_comparison_data %}

Initial vs Final Descriptor Comparison

Initial Final Initial Mean Final Mean
{% endif %}
{% endif %} {% if data.moleval and data.moleval.by_stage %}

Generative Metrics (MolEval)

Distribution-quality metrics computed at key pipeline stages, showing how molecular diversity, uniqueness, and filter pass rates evolve through the filtering funnel.

Detailed Values

Metric {% for stage in data.moleval.stages %} {{ stage }} {% endfor %}
{% for metric in data.moleval.metrics %}
{{ metric }} {% for stage in data.moleval.stages %} {% set val = data.moleval.by_stage[stage][metric] if metric in data.moleval.by_stage.get(stage, {}) else None %} {% if val is not none %}{{ "%.4f"|format(val) }}{% else %}—{% endif %} {% endfor %}
{% endfor %}
{% endif %}