Metadata-Version: 2.4
Name: montaigne
Version: 0.8.3
Summary: Media processing toolkit for presentation localization
Author: Yann Debray
License: MIT
Project-URL: Homepage, https://github.com/yanndebray/montaigne
Project-URL: Documentation, https://github.com/yanndebray/montaigne#readme
Project-URL: Repository, https://github.com/yanndebray/montaigne
Keywords: gemini,ai,localization,pdf,audio,tts,translation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Scientific/Engineering :: Image Processing
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: google-genai>=1.0.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: pymupdf>=1.24.0
Requires-Dist: python-pptx>=0.6.21
Requires-Dist: Pillow>=10.0.0
Requires-Dist: tqdm>=4.65.0
Requires-Dist: requests>=2.28.0
Requires-Dist: elevenlabs>=2.29.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Provides-Extra: webapp
Requires-Dist: streamlit>=1.30.0; extra == "webapp"
Provides-Extra: cloud
Requires-Dist: fastapi>=0.100.0; extra == "cloud"
Requires-Dist: uvicorn[standard]>=0.20.0; extra == "cloud"
Requires-Dist: google-cloud-storage>=2.10.0; extra == "cloud"
Requires-Dist: python-multipart>=0.0.6; extra == "cloud"
Provides-Extra: all
Requires-Dist: montaigne[cloud,dev,webapp]; extra == "all"
Dynamic: license-file

# Montaigne

Media processing toolkit for presentation localization using Google Gemini AI.

## Features

- **PDF Extraction**: Convert PDF pages to images
- **Script Generation**: Generate voiceover scripts from slides using AI
- **Image Translation**: Translate text in images to any language
- **Audio Generation**: Generate voiceover audio from scripts using TTS
- **PowerPoint Generation**: Create PPTX from PDF or images with speaker notes
- **Video Generation**: Combine slides and audio into MP4 videos

## Installation

### Using pip

```bash
pip install montaigne
```

### Using uv

```bash
uv pip install montaigne
```

### Using uvx (no installation required)

```bash
uvx --from montaigne essai setup
uvx --from montaigne essai script --input presentation.pdf
```

## Setup

1. Get a Gemini API key from [Google AI Studio](https://aistudio.google.com/)
2. Create a `.env` file:
   ```
   GEMINI_API_KEY=your-api-key
   ```
3. Verify setup:
   ```bash
   essai setup
   ```
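
`essai setup` verifies the configuration from the command line. For scripts that need to confirm the key is present before calling the CLI, a minimal standard-library sketch might look like the following. The helper names `read_env_file` and `find_gemini_key` are hypothetical and not part of montaigne, and the parsing only handles plain `KEY=value` lines; the package itself depends on `python-dotenv`, which covers more cases.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=value lines, skipping blank lines and # comments.

    Hypothetical helper for illustration; not montaigne's own loader.
    """
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_gemini_key(env_path=".env"):
    """Return the key from the process environment, else from the .env file, else None."""
    return os.environ.get("GEMINI_API_KEY") or read_env_file(env_path).get("GEMINI_API_KEY")
```

A caller can treat a `None` result as "not configured" and stop early with a clear message instead of failing partway through a run.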

## Usage

### Extract PDF to Images

```bash
essai pdf presentation.pdf
essai pdf presentation.pdf --dpi 200 --format jpg
```

### Generate Voiceover Script from Slides

```bash
essai script --input presentation.pdf
essai script --input slides_images/ --context "AI workshop"
essai script --input presentation.pdf --output custom_script.md
```

Options:
- `--input, -i`: PDF file or folder of slide images
- `--output, -o`: Output markdown file path
- `--context, -c`: Additional context to guide script generation. Use this to specify the topic, target audience, desired tone, or script length (e.g., "Brief scripts, 2-3 sentences per slide" or "Detailed technical explanations for developers")

### Generate Audio from Script

```bash
essai audio --script voiceover.md
essai audio --script voiceover.md --voice Kore
```

Available voices: `Puck`, `Charon`, `Kore`, `Fenrir`, `Aoede`, `Orus`
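
When the command is driven from a script, it can help to reject an unknown voice name before any audio is generated. The sketch below builds the argument list for `essai audio` and checks the voice against the names listed above. The function `audio_command` is a hypothetical helper, and exact, case-sensitive matching is an assumption rather than documented behavior.

```python
# Voice names as listed in this README.
VOICES = ("Puck", "Charon", "Kore", "Fenrir", "Aoede", "Orus")


def audio_command(script_path, voice=None):
    """Build the argument list for `essai audio` (hypothetical helper).

    Raises ValueError if a voice is given that is not in VOICES.
    """
    cmd = ["essai", "audio", "--script", str(script_path)]
    if voice is not None:
        if voice not in VOICES:
            raise ValueError(f"Unknown voice {voice!r}; expected one of {', '.join(VOICES)}")
        cmd += ["--voice", voice]
    return cmd


# To run it (requires montaigne to be installed):
#   import subprocess
#   subprocess.run(audio_command("voiceover.md", "Kore"), check=True)
```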

### Translate Images

```bash
essai images --input slides/
essai images --input image.png --lang Spanish
```

### Create PowerPoint from PDF or Images

```bash
essai ppt --input presentation.pdf
essai ppt --input slides/ --script voiceover.md
essai ppt --input presentation.pdf --keep-images
```

This creates a `.pptx` file with one slide per PDF page or image. If a voiceover script is provided with `--script`, its narration is added as speaker notes.

### Generate Video from Slides

```bash
essai video --pdf presentation.pdf
essai video --images slides/ --audio audio/
```

### Full Localization Pipeline

```bash
essai localize --pdf presentation.pdf --script voiceover.md --lang French
```

This will:
1. Extract PDF pages to images
2. Translate all images to the target language
3. Generate audio for all slides
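
The example above passes a single `--lang` value. Assuming one language per invocation (an assumption, not something this README confirms), localizing into several languages means repeating the command. A minimal sketch that builds those commands from the flags shown above, using the hypothetical helpers `localize_commands` and `run_all`:

```python
import subprocess


def localize_commands(pdf_path, script_path, languages):
    """Build one `essai localize` argument list per target language."""
    return [
        ["essai", "localize", "--pdf", str(pdf_path),
         "--script", str(script_path), "--lang", lang]
        for lang in languages
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Example (requires montaigne to be installed):
#   run_all(localize_commands("presentation.pdf", "voiceover.md", ["French", "Spanish"]))
```

Building the argument lists separately from running them makes it easy to print or review the commands first.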

## Claude Command: Presentation Video Workflow

Command file: `montaigne/.claude/commands/presentation-video.md`

Invoke with:
```
/presentation-video $PDF_PATH=path/to/deck.pdf $CONTEXT_FOLDER=path/to/context $MAX_DURATION=60 $VOICE=Orus
```

The command automates the complete workflow:
- Extract context from `.docx`, `.pdf`, and `.md` files
- Analyze presentation slides
- Generate voiceover script with configurable max duration per slide
- Generate TTS audio
- Create final video
- Optional localization to other languages

## Voiceover Script Format

Scripts should follow this markdown format:

```markdown
## SLIDE 1: Title
**[Duration: ~45 seconds]**

Your narration text for slide 1 goes here.

---

## SLIDE 2: Next Topic
**[Duration: ~60 seconds]**

Narration for slide 2.
```
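
To check a script against this format before generating audio, the structure can be parsed with a few regular expressions. The following is a minimal sketch, not montaigne's own parser: the `Slide` class and `parse_script` function are hypothetical names, and the patterns assume headings and duration lines are written exactly as in the example above.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Patterns assume the exact layout shown in the format example.
SLIDE_HEADING = re.compile(r"^##\s+SLIDE\s+(\d+):\s*(.*)$")
DURATION = re.compile(r"^\*\*\[Duration:\s*~?(\d+)\s*seconds?\]\*\*$")


@dataclass
class Slide:
    number: int
    title: str
    duration_seconds: Optional[int] = None
    lines: List[str] = field(default_factory=list)

    @property
    def narration(self):
        """Narration text with leading and trailing blank lines removed."""
        return "\n".join(self.lines).strip()


def parse_script(text):
    """Split a voiceover script into Slide objects (illustrative only)."""
    slides = []
    for raw in text.splitlines():
        line = raw.strip()
        heading = SLIDE_HEADING.match(line)
        if heading:
            slides.append(Slide(int(heading.group(1)), heading.group(2).strip()))
            continue
        if not slides or line == "---":
            continue  # skip any preamble and the separators between slides
        duration = DURATION.match(line)
        if duration and slides[-1].duration_seconds is None:
            slides[-1].duration_seconds = int(duration.group(1))
            continue
        slides[-1].lines.append(line)
    return slides
```

Summing `duration_seconds` across the parsed slides gives a rough total running time, and a slide with empty `narration` points to a section that still needs text.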

## Demo

See the `demo/hamlet/` folder for a complete example with:
- Sample PDF presentation
- Voiceover script
- Image asset

```bash
cd demo/hamlet
essai localize --lang French
```

## Requirements

- Python 3.10+
- Google Gemini API key
- ffmpeg (for video generation)
- Python dependencies (installed automatically with the package): `google-genai`, `python-dotenv`, `pymupdf`, `python-pptx`, `Pillow`, `tqdm`, `requests`, `elevenlabs`
