Metadata-Version: 2.4
Name: codeboost
Version: 0.2.0
Summary: External reasoning framework for LLMs. Cognitive drift + fieldmap knowledge injection. Qwen 7B +8.5% on HumanEval, zero training.
Project-URL: Homepage, https://github.com/jkdkr2439/codeboost
Project-URL: Repository, https://github.com/jkdkr2439/codeboost
Project-URL: Issues, https://github.com/jkdkr2439/codeboost/issues
Author-email: "Kevin T.N" <jkdkr2439@gmail.com>
License: MIT
License-File: LICENSE
Keywords: ai-augmentation,code-generation,coding,context,fieldmap,knowledge-graph,llm,reasoning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Requires-Dist: pydantic>=2.0
Provides-Extra: all
Requires-Dist: fastapi>=0.111; extra == 'all'
Requires-Dist: rich>=13.0; extra == 'all'
Requires-Dist: typer>=0.9; extra == 'all'
Requires-Dist: uvicorn>=0.30; extra == 'all'
Provides-Extra: cli
Requires-Dist: rich>=13.0; extra == 'cli'
Requires-Dist: typer>=0.9; extra == 'cli'
Provides-Extra: dev
Requires-Dist: httpx>=0.27; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Provides-Extra: server
Requires-Dist: fastapi>=0.111; extra == 'server'
Requires-Dist: uvicorn>=0.30; extra == 'server'
Description-Content-Type: text/markdown

# codeboost

**External reasoning framework for LLMs. Zero training, zero fine-tuning. `pip install` and go.**

Injects structured coding knowledge into any LLM prompt. Weak models gain the most — up to +19.5% on HumanEval.

## Install

```bash
pip install codeboost
```

## Quick Start

```python
from codeboost import query

result = query("Find all pairs in sorted array that sum to target")
print(result.context)  # inject this into your LLM prompt
```

## Inject into any LLM

```python
from openai import OpenAI
from codeboost import get_context

client = OpenAI()
context = get_context("binary search on rotated array")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": f"Use this knowledge:\n{context}"},
        {"role": "user", "content": "Implement binary search on rotated sorted array"},
    ],
)
```

Works with **any LLM**: OpenAI, Anthropic, Ollama, Groq, DeepSeek, Gemini, Mistral...
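
The only integration point is the context string, so the same pattern works for local runtimes too. A minimal sketch: a provider-agnostic message builder, plus an example call against Ollama's OpenAI-compatible endpoint (the local URL, API-key placeholder, and model name are illustrative assumptions; adjust for your setup):

```python
# Provider-agnostic: prepend codeboost context as a system message.
def build_messages(context: str, task: str) -> list:
    return [
        {"role": "system", "content": f"Use this knowledge:\n{context}"},
        {"role": "user", "content": task},
    ]

# Example against Ollama's OpenAI-compatible endpoint (assumes a running
# local Ollama server and a pulled model; both are illustrative):
# from openai import OpenAI
# from codeboost import get_context
# client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
# client.chat.completions.create(
#     model="qwen2.5-coder:7b",
#     messages=build_messages(get_context("two sum"), "Implement two sum"),
# )
```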

## How It Works

Five-stage pipeline:

1. **Gates** — classify task into IPOD phase (Input/Process/Output/Data), detect constraints and edge cases
2. **Drift** — 6-field cognitive analysis (Logic, Spatial, Pattern, Transform, Compose, Metaphor)
3. **Fieldmap** — query 5,766-node knowledge graph with boosted keywords
4. **Describer** — generate a natural-language approach description (small models follow prose better than terse labels)
5. **Compact** — optional 89% compression for tiny context windows

```
Task → Gates → Drift → Fieldmap → Describer → Context string → LLM
```
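
The stages compose as a simple function chain. A conceptual sketch in plain Python (stage names follow the list above, but the function bodies are illustrative stand-ins, not codeboost's actual implementation):

```python
from dataclasses import dataclass

@dataclass
class Result:
    context: str

def gates(task):             # 1. classify IPOD phase, constraints, edge cases
    return {"task": task, "phase": "Process", "constraints": []}

def drift(analysis):         # 2. six-field cognitive analysis
    analysis["fields"] = ["Logic", "Pattern"]
    return analysis

def fieldmap(analysis):      # 3. knowledge-graph lookup with boosted keywords
    analysis["nodes"] = [f"node:{f.lower()}" for f in analysis["fields"]]
    return analysis

def describer(analysis):     # 4. natural-language approach description
    return f"Approach for {analysis['task']!r} via {', '.join(analysis['nodes'])}"

def compact(text, enabled=False):   # 5. optional compression for tiny contexts
    return text[:80] if enabled else text

def pipeline(task, compact_mode=False):
    return Result(context=compact(describer(fieldmap(drift(gates(task)))), compact_mode))
```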

## Benchmark Results (HumanEval, pass@1)

All models: Qwen 2.5 Coder family. Full 164 problems. Tests executed, not string-matched.

| Model | Raw | + Codeboost | Delta | Wins / Losses |
|-------|:---:|:---:|:---:|:---:|
| 1.5B | 29.9% | **49.4%** | **+19.5%** | 42W / 10L |
| 3B | 72.6% | **76.8%** | +4.3% | 14W / 7L |
| 7B | 64.6% | **73.2%** | +8.5% | 19W / 5L |
| 14B | 62.2% | **73.8%** | **+11.6%** | 22W / 3L |

**7B + codeboost (73.2%) ≈ 14B + codeboost (73.8%)**, and it beats 14B raw (62.2%) outright: comparable accuracy at half the VRAM.

### Hardness Analysis

Codeboost helps most on hard problems and rarely hurts easy ones.

| Model | Hard (raw fail) | Boost rescues | Easy (raw pass) | Boost damages |
|-------|:---:|:---:|:---:|:---:|
| 7B | 58 problems | **19 solved (32.8%)** | 106 problems | 5 broken (4.7%) |
| 14B | 62 problems | **22 solved (35.5%)** | 102 problems | 3 broken (2.9%) |

On 14B the rescue rate outstrips the damage rate by roughly **12:1**: codeboost rescues 35.5% of hard problems while breaking only 2.9% of easy ones.
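
The percentages follow directly from the problem counts; a quick sanity check on the 14B row:

```python
hard, rescued = 62, 22    # 14B: problems the raw model fails; boost rescues
easy, broken = 102, 3     # 14B: problems the raw model passes; boost breaks

rescue_rate = rescued / hard * 100   # 35.5%
damage_rate = broken / easy * 100    # 2.9%

print(round(rescue_rate, 1), round(damage_rate, 1), round(rescue_rate / damage_rate, 1))
# 35.5 2.9 12.1
```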

### Ablation (7B)

| Condition | pass@1 | Delta |
|-----------|:---:|:---:|
| Raw (no codeboost) | 64.6% | — |
| Fieldmap only (no drift) | 70.7% | +6.1% |
| Full (fieldmap + drift + describer) | 73.2% | +8.5% |
| Compact mode | 68.3% | +3.7% |

## Feedback Loop

Track what works, avoid what doesn't:

```python
from codeboost import query, feedback

result = query("two sum problem")
# ... use result.context with your LLM ...

feedback("two sum problem", "positive")   # next query reuses good patterns
feedback("two sum problem", "negative")   # next query avoids bad keywords
```
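
One way to close the loop automatically is to grade each generated solution against unit tests and report the outcome. The helper below is a hypothetical sketch (only `feedback`'s two-argument signature is taken from the API above); in real use you would pass `codeboost.feedback` as `send_feedback`:

```python
# Hypothetical helper: run the model's generated code against test callables,
# then report "positive" or "negative" via the supplied feedback function.
def grade_and_report(task, solution_src, tests, send_feedback):
    ns = {}
    try:
        exec(solution_src, ns)           # load the generated solution
        for check in tests:
            assert check(ns)             # each test inspects the namespace
    except Exception:
        send_feedback(task, "negative")  # avoid these keywords next time
        return False
    send_feedback(task, "positive")      # reuse these patterns next time
    return True
```

Pass `codeboost.feedback` as `send_feedback` to wire this into the real loop.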

## CLI

```bash
pip install "codeboost[cli]"

codeboost query "sliding window maximum"
codeboost stats
codeboost serve --port 8000
```

## API Server

```bash
pip install "codeboost[server]"
codeboost serve

# POST http://localhost:8000/v1/query
# GET  http://localhost:8000/v1/stats
# GET  http://localhost:8000/health
```

## Who Is This For

- **Developers running local models** (Ollama, vLLM) who want better results without upgrading hardware
- **Startups** self-hosting inference — cut model size in half, keep accuracy
- **Offline / air-gapped environments** — no API calls, no internet needed
- **Batch processing** — generating 10K solutions with a small model + codeboost can cost an order of magnitude less than a hosted API

## License

MIT
