📦 Installation Guide

System Requirements

  • Python 3.8 or higher
  • Required Dependencies:
      • poppler-utils (for PDF processing)
      • libreoffice (for DOCX/ODT support)
      • tesseract-ocr (for OCR functionality)
      • libreoffice-headless (for server environments)
  • Disk Space: ~500MB (more for large document processing)
  • Memory: 2GB minimum (4GB recommended for large documents)
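
The Python version and disk-space requirements above can be checked before installing anything. The following is a minimal sketch and is not part of Redoc: the function names are hypothetical, and the thresholds are simply copied from the list above. Memory is not checked because the standard library has no portable call for it.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)               # "Python 3.8 or higher"
MIN_FREE_BYTES = 500 * 1024 ** 2  # "~500MB" of disk space


def check_python(version_info=sys.version_info):
    """Return True if the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def check_disk(path=".", required=MIN_FREE_BYTES):
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    print("Python version OK:", check_python())
    print("Disk space OK:    ", check_disk())
```

Run it from the directory where documents will be processed, since free space is measured on the filesystem that holds the given path.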

Installing System Dependencies

Ubuntu/Debian

sudo apt update
sudo apt install -y \
    poppler-utils \
    libreoffice \
    tesseract-ocr \
    tesseract-ocr-eng \
    libreoffice-headless

macOS (using Homebrew)

brew install poppler tesseract tesseract-lang libreoffice

Windows

  1. Download and install Poppler for Windows
  2. Download and install Tesseract OCR
  3. Download and install LibreOffice
  4. Add all installation directories to your system PATH
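
Once the tools are installed, a quick way to confirm they are reachable on PATH on any platform is a lookup with `shutil.which`. This is a sketch under assumptions: the executable names `pdftohtml`, `tesseract`, and `libreoffice` come from the troubleshooting command later in this guide, `soffice` is included as an alternative LibreOffice launcher name, and the function name is hypothetical.

```python
import shutil

# Dependency name -> executable names to try, in order.
REQUIRED_TOOLS = {
    "poppler": ["pdftohtml"],
    "tesseract": ["tesseract"],
    "libreoffice": ["libreoffice", "soffice"],
}


def find_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Map each dependency to the first executable found on PATH, or None."""
    found = {}
    for name, candidates in tools.items():
        found[name] = next((p for p in map(which, candidates) if p), None)
    return found


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name:12} {path or 'NOT FOUND on PATH'}")
```

Any entry reported as not found usually means the installation directory from step 4 has not been added to PATH, or the terminal was opened before PATH was changed.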

Installing Redoc

# Basic installation
pip install redoc

# Install with all optional dependencies
pip install "redoc[all]"

# Or install specific components
pip install "redoc[cli]"       # Command line interface
pip install "redoc[server]"     # Web server and API
pip install "redoc[ai]"         # AI features (requires Ollama)
pip install "redoc[ocr]"        # OCR capabilities (Tesseract)
pip install "redoc[templates]"  # Pre-built templates

Docker Installation

# Pull the latest image
docker pull text2doc/redoc:latest

# Run a conversion
docker run -v "$(pwd)":/data text2doc/redoc convert input.pdf output.html

# Start the web interface
docker run -p 8000:8000 -v "$(pwd)/templates":/app/templates text2doc/redoc serve

Development Installation

# Clone the repository
git clone https://github.com/text2doc/redoc.git
cd redoc

# Install in editable (development) mode with the dev extras
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
pytest

Verifying Installation

# Check Redoc installation
redoc --version

# Verify basic functionality
redoc convert --help

# Run self-tests
redoc test
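
The same check can be made from Python, which helps when the `redoc` command is not on PATH but the package may still be installed in the active environment. This is a minimal sketch that assumes the distribution name is `redoc`, as in the pip commands above; the helper name is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="redoc"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"redoc {found} is installed")
    else:
        print("redoc is not installed in this Python environment")
```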

Configuration

Redoc can be configured using environment variables or a configuration file:

Environment Variables

export REDOC_LOG_LEVEL=INFO
export REDOC_WORKERS=4
export REDOC_TEMP_DIR=/tmp/redoc
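
To illustrate how variables like these are typically consumed, the sketch below reads them with fallbacks and validates the worker count. It is not Redoc's actual loader: the function name is hypothetical, and the fallback values are just the example values shown above, not documented defaults.

```python
import os

# Fallbacks mirror the example values above (an assumption, not documented defaults).
FALLBACKS = {
    "REDOC_LOG_LEVEL": "INFO",
    "REDOC_WORKERS": "4",
    "REDOC_TEMP_DIR": "/tmp/redoc",
}


def read_settings(environ=os.environ):
    """Build a settings dict from REDOC_* variables, validating the worker count."""
    raw = {key: environ.get(key, default) for key, default in FALLBACKS.items()}
    workers = int(raw["REDOC_WORKERS"])  # raises ValueError on non-numeric input
    if workers < 1:
        raise ValueError("REDOC_WORKERS must be at least 1")
    return {
        "log_level": raw["REDOC_LOG_LEVEL"].upper(),
        "workers": workers,
        "temp_dir": raw["REDOC_TEMP_DIR"],
    }


if __name__ == "__main__":
    print(read_settings())
```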

Configuration File

Create ~/.config/redoc/config.yaml:

log_level: INFO
workers: 4
temp_dir: /tmp/redoc
default_output_format: pdf

templates:
  search_paths:
    - ~/.config/redoc/templates
    - /usr/local/share/redoc/templates
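
A list of search paths like the one above is usually resolved in order, with the first match winning. The sketch below shows that idea only; whether Redoc uses first-match-wins ordering is an assumption, and both `find_template` and the template name `invoice.html` are hypothetical.

```python
from pathlib import Path

# Copied from templates.search_paths in the example configuration.
SEARCH_PATHS = [
    "~/.config/redoc/templates",
    "/usr/local/share/redoc/templates",
]


def find_template(name, search_paths=SEARCH_PATHS):
    """Return the first existing file called `name` in the search paths, else None."""
    for raw in search_paths:
        candidate = Path(raw).expanduser() / name  # expand "~" to the home directory
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_template("invoice.html") or "template not found")
```

Under this ordering, a template placed in the per-user directory would shadow a system-wide template with the same file name.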

Troubleshooting

Common Issues

  1. Missing Dependencies

    # Check for missing system dependencies
    which pdftohtml tesseract libreoffice
    

  2. Permission Errors

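    # Fix permission issues for temporary files
    # Note: 777 makes the directory world-writable; on shared machines,
    # prefer 700 with the directory owned by the user running Redoc
    chmod 777 /tmp/redoc  # Or your configured temp directory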
    

  3. Docker Issues

    # Check if Docker is running
    docker ps
    
    # Check container logs
    docker logs <container_id>
    

  4. OCR Problems

    • Ensure Tesseract language packs are installed
    • Check image quality and resolution
    • Try the --preprocess image option for better OCR results
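
For the language-pack check, Tesseract can report what it has installed via `tesseract --list-langs`. The following sketch wraps that command; the function names are hypothetical, and the parser assumes that language codes never contain spaces, so it skips the header line and any warning text.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the "List of available languages ..." header, and warnings.
        if not line or " " in line:
            continue
        langs.append(line)
    return langs


def installed_langs():
    """Return installed language codes, or None if tesseract is not on PATH."""
    try:
        proc = subprocess.run(
            ["tesseract", "--list-langs"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None
    # Some Tesseract versions print the list to stderr, so read both streams.
    return parse_langs(proc.stdout + "\n" + proc.stderr)


if __name__ == "__main__":
    langs = installed_langs()
    print("tesseract not found" if langs is None else f"installed: {langs}")
```

If the language of your documents is missing from the list, install the matching pack (for example `tesseract-ocr-eng` on Ubuntu/Debian, as in the install command earlier).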

Next Steps