Metadata-Version: 2.4
Name: bmodels
Version: 1.0.0
Summary: High-performance YOLO inference library with automatic GPU optimization, temporal smoothing, and flexible output modes for images and videos
Author-email: Bulut Müftüoğlu <bulutmuf@criai.art>
License: Apache-2.0
Project-URL: Homepage, https://github.com/bulutmuf/bmodels
Project-URL: Documentation, https://github.com/bulutmuf/bmodels#readme
Project-URL: Repository, https://github.com/bulutmuf/bmodels
Project-URL: Bug Tracker, https://github.com/bulutmuf/bmodels/issues
Keywords: yolo,object-detection,computer-vision,deep-learning,inference,tracking,gpu-optimization,temporal-smoothing,aircraft-detection,real-time-detection
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: ultralytics>=8.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: opencv-python>=4.5.3
Requires-Dist: numpy>=1.21.0
Requires-Dist: torch>=2.0.0
Provides-Extra: gpu-nvidia
Requires-Dist: onnxruntime-gpu>=1.15.0; extra == "gpu-nvidia"
Provides-Extra: gpu-amd
Requires-Dist: onnxruntime-directml>=1.15.0; extra == "gpu-amd"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Dynamic: license-file

# bmodels

High-performance YOLO inference library supporting multiple detection models with automatic GPU optimization, temporal smoothing, and flexible output modes for both images and videos.

## Overview

bmodels is a production-ready inference engine for YOLO-based object detection models. It automatically optimizes performance based on your hardware while maintaining high accuracy through advanced temporal smoothing algorithms for video processing. Models are automatically downloaded and cached on first use.

The library currently features the [bplane-small](https://github.com/bulutmuf/bplane-small) model, a specialized YOLO architecture achieving 80% mAP@50 for military aircraft detection at 1024x1024 resolution. Additional models will be added to support diverse detection tasks beyond aviation.

## Key Features

**Multiple Model Support**
Extensible architecture supporting various YOLO-based detection models. Currently includes [bplane-small](https://github.com/bulutmuf/bplane-small) for military aircraft recognition (F-16, F-18, F-35, C-130, F-15, J-20, EF2000, Rafale, A-10, C-2, and general aircraft classification). Future releases will expand to additional detection domains.

**Automatic GPU Optimization**
Detects your GPU (CUDA for NVIDIA, DirectML for AMD/Intel, or CPU fallback) and configures inference parameters automatically. GPU acceleration packages are installed on-demand when needed.

**Temporal Smoothing for Video**
An advanced algorithm stabilizes classifications across video frames. Objects maintain consistent classifications even during brief occlusions or challenging viewing angles, eliminating flickering between similar classes. This is particularly effective for the bplane-small model during high-G maneuvers and extreme banking angles.

**Image and Video Processing**
Process single images, image batches, video files, or real-time streams. Export results to files, numpy arrays, or bytes for API integration. Flexible workflows for different use cases.

**Professional Rendering**
Customizable visualization with anti-aliased text, shadow effects, and transparency. Production-quality annotated outputs with class names, track IDs, and confidence scores.

**Zero Configuration**
Works immediately with sensible defaults. Advanced users can fine-tune detection thresholds, tracking parameters, and visual styling.

## Installation

Install the library with pip and start detecting immediately; GPU acceleration is configured automatically. The library is licensed under Apache 2.0.
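A minimal installation looks like the following. The optional extras shown (`gpu-nvidia`, `gpu-amd`, `dev`) match the extras declared in this package's metadata:

```bash
# Base installation
pip install bmodels

# Optional GPU acceleration extras (ONNX Runtime backends)
pip install "bmodels[gpu-nvidia]"   # NVIDIA GPUs via onnxruntime-gpu
pip install "bmodels[gpu-amd]"      # AMD/Intel GPUs via onnxruntime-directml

# Development tools (pytest, black, flake8, mypy)
pip install "bmodels[dev]"
```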

## Quick Start

Process a video with automatic optimization:

```python
from bmodels import Load, process_video

model = Load("bplane-small-v1")
process_video(model, "input.mp4", "output.mp4")
```

The library automatically detects your GPU, selects optimal parameters, applies temporal smoothing, and renders professional annotations.

## Usage Examples

### Basic Usage


**Quick Start - Image Processing**

```python
from bmodels import Load, process_image

model = Load("bplane-small-v1")
rendered, detection = process_image(model, "aircraft.jpg", "output.jpg")
print(f"Detected {len(detection)} objects")
```

**Model Information**

```python
# Get model metadata and available classes
info = model.get_info()
print(f"Model: {info['model_name']}")
print(f"Device: {info['device']}")
print(f"Classes: {info['classes']}")
print(f"Total classes: {info['num_classes']}")
```

### Image Processing

**Single Image Detection**

```python
from bmodels import process_image

# Automatic detection and rendering
rendered, detection = process_image(
    model, 
    "aircraft.jpg", 
    "output.jpg",
    log_level="verbose"  # Options: "silent", "normal", "verbose"
)
```

**Detection Without Rendering**

```python
from bmodels import detect_image, draw_detections

# Detect only, no automatic rendering
image, detection, class_names = detect_image(model, "aircraft.jpg")

# Manual rendering with custom style
rendered = draw_detections(
    image,
    detection.boxes,
    detection.scores,
    detection.class_ids,
    track_ids=None,
    class_names=class_names,
    box_color=(0, 255, 255),  # Yellow
    box_thickness=3,
    text_color=(0, 0, 0),  # Black
    text_scale=0.8,
)
```

**Batch Image Processing**

```python
from bmodels import process_images

# Process multiple images at once
results = process_images(
    model,
    sources=["img1.jpg", "img2.jpg", "img3.jpg"],
    output_dir="detections",
    log_level="normal"
)

for rendered, detection in results:
    print(f"Detected {len(detection)} objects")
```

### Video Processing

**Basic Video Processing**

```python
from bmodels import process_video

# Process with automatic settings
process_video(model, "input.mp4", "output.mp4")
```

**Custom Styling**

```python
# Customize visual appearance
process_video(
    model, 
    "input.mp4", 
    "output.mp4",
    box_color=(80, 80, 80),      # Gray boxes
    box_thickness=2,
    text_color=(200, 200, 200),  # Light gray text
    text_scale=0.6,
    text_thickness=1,
    show_conf=True,
    show_class=True,
    show_track_id=True,
)
```

**Stream Processing (No File Output)**

```python
from bmodels import process_frames

# Process frames without saving
for frame, detection, class_names in process_frames(model, "input.mp4"):
    print(f"Frame {detection.frame_idx}: {len(detection)} objects")
    
    for i in range(len(detection)):
        class_name = class_names[detection.class_ids[i]]
        confidence = detection.scores[i]
        track_id = detection.track_ids[i] if detection.track_ids is not None else -1
        print(f"  {class_name} (ID:{track_id}): {confidence*100:.1f}%")
```

**Custom Frame Rendering**

```python
from bmodels import process_frames, draw_detections
import cv2

# Full control over rendering
for frame, detection, class_names in process_frames(model, "input.mp4"):
    if len(detection) > 0:
        rendered = draw_detections(
            frame,
            detection.boxes,
            detection.scores,
            detection.class_ids,
            detection.track_ids,
            class_names,
            box_color=(255, 0, 255),  # Magenta
            text_color=(0, 255, 255),  # Cyan
        )
        cv2.imshow("Custom Render", rendered)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

cv2.destroyAllWindows()
```

**Export Video to Array**

```python
from bmodels import process_to_array

# Get entire video as numpy array
frames, detections = process_to_array(model, "input.mp4", render=True)
print(f"Video shape: {frames.shape}")
print(f"Total frames: {len(frames)}")
print(f"Total detections: {sum(len(d) for d in detections)}")
```

**Export Video to Bytes**

```python
from bmodels import process_to_bytes

# Encode to bytes for API upload
video_bytes = process_to_bytes(model, "input.mp4")

# Upload to API
import requests
response = requests.post(
    "https://api.example.com/upload",
    files={"video": ("output.mp4", video_bytes, "video/mp4")}
)
```

### Camera/Webcam Processing

**Real-Time Camera Detection**

```python
from bmodels import process_camera

# Process webcam with live preview
process_camera(
    model,
    camera_id=0,           # 0 for default webcam
    output="webcam.mp4",   # Optional: save to file
    display=True,          # Show live preview
    show_fps=True,         # Display FPS counter
    max_frames=None,       # None for unlimited
)
```

**Camera Without Display**

```python
# Process camera without preview (headless)
process_camera(
    model,
    camera_id=0,
    output="recording.mp4",
    display=False,
    max_frames=300,  # Record 300 frames
    log_level="normal"
)
```

### Filtering and Export

**Filter Detections by Confidence**

```python
from bmodels import detect_image

image, detection, class_names = detect_image(model, "aircraft.jpg")

# Filter by confidence threshold
high_conf = detection.filter(min_conf=0.8)
medium_conf = detection.filter(min_conf=0.5)

print(f"Total: {len(detection)}")
print(f"High confidence (>80%): {len(high_conf)}")
print(f"Medium confidence (>50%): {len(medium_conf)}")
```

**Filter by Class**

```python
# Get available classes
info = model.get_info()
print(f"Available classes: {info['classes']}")

# Filter specific classes
image, detection, class_names = detect_image(model, "aircraft.jpg")
fighters_only = detection.filter(classes=[0, 1, 2])  # F-16, F-18, F-35
print(f"Fighter aircraft: {len(fighters_only)}")
```

**Process Video with Class Filter**

```python
# Detect only specific classes in video
process_video(
    model,
    "input.mp4",
    "output.mp4",
    classes=[0, 2, 5],  # Only F-16, F-35, etc.
    log_level="normal"
)
```

**Export Detections to JSON**

```python
from bmodels import process_frames, export_to_json

# Collect detections from video
detections = []
for frame, detection, class_names in process_frames(model, "input.mp4"):
    detections.append(detection)

# Export to JSON with class names
export_to_json(
    detections, 
    "results.json", 
    class_names=class_names,
    pretty=True
)
```

**Export Detections to CSV**

```python
from bmodels import export_to_csv

# Export to CSV format
export_to_csv(detections, "results.csv", class_names=class_names)
```

**Export to YOLO/COCO Format**

```python
from bmodels import export_to_txt

# Export in YOLO format (normalized xywh)
export_to_txt(
    detections, 
    "results.txt", 
    format="yolo",
    img_width=1920,
    img_height=1080
)

# Export in COCO format (xyxy)
export_to_txt(detections, "results_coco.txt", format="coco")
```

### Performance Optimization

**Benchmark Your Hardware**

```python
from bmodels import benchmark

# Test performance on your system
results = benchmark(model, "test.mp4", max_frames=100)

print(f"Average FPS: {results['avg_fps']}")
print(f"Frame time: {results['avg_frame_time_ms']}ms ± {results['std_frame_time_ms']}ms")
print(f"Detections per frame: {results['avg_detections_per_frame']}")
print(f"Hardware: {results['hardware']}")
print(f"GPU Type: {results['gpu_type']}")
```

**Frame Skipping for Speed**

```python
# Process every other frame (2x faster)
process_video(
    model,
    "input.mp4",
    "output.mp4",
    skip_frames=1,  # Skip 1 frame between each processed frame
)

# Process every 3rd frame (3x faster)
process_video(
    model,
    "input.mp4",
    "output.mp4",
    skip_frames=2,  # Skip 2 frames between each processed frame
)
```

**Silent Mode for Production**

```python
# No console output except final result
model = Load("bplane-small-v1", log_level="silent")
process_video(model, "input.mp4", "output.mp4", log_level="silent")
# Output: "DONE Processed 464 frames in 32.3s (14.4 fps) -> output.mp4"
```

### Advanced Configuration

**Manual Parameter Override**

```python
# Override automatic GPU optimization
process_video(
    model,
    "input.mp4",
    "output.mp4",
    imgsz=416,              # Inference resolution
    conf=0.35,              # Confidence threshold
    iou=0.6,                # NMS IOU threshold
    tracker="bytetrack",    # Tracker algorithm
    temporal_smoothing=True,
    stability_frames=20,    # Frames for stable classification
    tolerance_frames=8,     # Tolerance for brief changes
)
```

**Disable Temporal Smoothing**

```python
# Process without temporal smoothing
process_video(
    model,
    "input.mp4",
    "output.mp4",
    temporal_smoothing=False,  # Faster but may flicker
)
```

**Custom Tracker Selection**

```python
from bmodels import AVAILABLE_TRACKERS

print(f"Available trackers: {AVAILABLE_TRACKERS}")

# Use specific tracker
process_video(
    model,
    "input.mp4",
    "output.mp4",
    tracker="botsort",  # Options: bytetrack, botsort, ocsort, sort, strongsort
)
```

**Verbose Logging**

```python
# Detailed per-frame logging
process_video(
    model,
    "input.mp4",
    "output.mp4",
    log_level="verbose"  # Shows every frame's detections
)
```

## Performance

Performance scales automatically based on hardware:

- **High-end CUDA GPU (8GB+)**: 640px inference, BoT-SORT tracker, FP16 precision
- **Mid-range CUDA GPU (4-8GB)**: 512px inference, ByteTrack tracker, FP16 precision  
- **DirectML GPU (AMD/Intel)**: 320px inference, ByteTrack tracker, optimized for integrated graphics
- **CPU**: 256px inference, aggressive NMS, minimal overhead

Typical performance on AMD Radeon 540 (DirectML): 14-15 FPS at 320px resolution with full tracking.

Advanced users can override any parameter including inference resolution, confidence thresholds, NMS settings, tracker selection, temporal smoothing parameters, and visual styling options for complete control over the detection pipeline.

## Temporal Smoothing

For video processing, temporal smoothing prevents classification flickering by requiring consistent predictions over multiple frames:

- An object must maintain the same classification for 25 frames before it is considered stable
- Brief misclassifications lasting fewer than 10 frames are ignored
- Confidence scores remain dynamic and reflect current detection quality

This ensures stable classifications during challenging viewing angles or partial occlusions, addressing natural variation in detection confidence during dynamic scenes.
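The mechanism can be illustrated with a small standalone sketch. This is an illustration of the idea only, not bmodels' internal implementation; track and class IDs are plain integers, and the default thresholds mirror the values described above:

```python
class TemporalSmoother:
    """Per-track class smoothing: report a stable class per track, and only
    switch it once a disagreeing class has persisted long enough."""

    def __init__(self, stability_frames=25, tolerance_frames=10):
        self.stability_frames = stability_frames  # frames to confirm a new track's class
        self.tolerance_frames = tolerance_frames  # frames before accepting a class change
        self.stable = {}       # track_id -> confirmed class id
        self.candidate = {}    # track_id -> (class id, consecutive-frame run length)

    def update(self, track_id, class_id):
        """Feed one frame's raw classification; return the smoothed class."""
        cand, run = self.candidate.get(track_id, (class_id, 0))
        run = run + 1 if class_id == cand else 1
        cand = class_id

        if track_id not in self.stable:
            # New track: wait until one class holds for stability_frames.
            if run >= self.stability_frames:
                self.stable[track_id] = cand
                self.candidate.pop(track_id, None)
            else:
                self.candidate[track_id] = (cand, run)
            return cand  # best current guess until confirmed

        if class_id == self.stable[track_id]:
            # Agreement with the stable class resets any pending change.
            self.candidate.pop(track_id, None)
        elif run >= self.tolerance_frames:
            # Disagreement persisted long enough: accept the new class.
            self.stable[track_id] = cand
            self.candidate.pop(track_id, None)
        else:
            # Brief misclassification: remember it, but keep reporting stable.
            self.candidate[track_id] = (cand, run)
        return self.stable[track_id]
```

With small thresholds for demonstration, a single-frame blip is suppressed while a sustained change eventually wins through.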

## Current Models

**bplane-small** - Military Aircraft Detection
- **Categories**: Air Superiority & Multi-Role (F-15, F-16, F-18, F-35, J-20, EF2000, Rafale), Close Air Support (A-10), Airlift & Utility (C-130, C-2), General Aircraft (fallback)
- **Performance**: 80% mAP@50 at 1024x1024 resolution
- **Strengths**: High reliability for transport aircraft and distinct fighter silhouettes
- **Repository**: [bplane-small](https://github.com/bulutmuf/bplane-small)

Additional models will be added to support diverse detection tasks.

## Contributing

Contributions welcome. Open an issue to discuss proposed changes before submitting pull requests.

## Support

For issues, questions, or feature requests, open an issue on GitHub.
