Saturday, October 19, 2024

Install Stirling PDF on Linux-Based Operating Systems


Scanned documents and image-only PDFs are hard to edit or search because their text is stored as pixels rather than characters. Stirling PDF is an open-source tool that uses Optical Character Recognition (OCR) to convert such documents into editable, searchable files. It ships as a Docker image, so it can be deployed and managed the same way across different environments, which makes it a practical choice for anyone who works with documents frequently.

Key Features

  • OCR Capabilities: Stirling PDF excels at recognizing text from scanned documents and images, enabling users to extract and edit text from previously non-editable PDFs.
  • Multi-Language Support: Extra language training data can be added, so Stirling PDF can process documents in many languages.
  • User-Friendly Interface: The web interface is designed for simplicity, so its features can be used without a steep learning curve.
  • Customization: Users can mount external directories for training data and settings, tailoring the service to meet specific needs.
  • Security Features: Stirling PDF includes options for user authentication, adding an extra layer of protection for sensitive documents.

Getting Started with Stirling PDF

Setting up Stirling PDF is a breeze thanks to its Docker containerization. Follow these simple steps to get started:

Install Docker: If you haven't already, install Docker along with the Docker Compose plugin, which the commands in the following steps rely on.
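
Before continuing, it can help to confirm that Docker and the Compose plugin are actually available. The snippet below is a minimal sketch; check_cmd is a hypothetical helper name introduced here for illustration, not part of Docker or Stirling PDF.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1 found"
  else
    echo "missing: $1"
    return 1
  fi
}

check_cmd docker || echo "Install Docker before continuing."
# Compose v2 runs as a Docker CLI plugin; this prints its version if installed.
docker compose version 2>/dev/null || echo "Docker Compose plugin not available."
```

If either check reports a problem, fix the Docker installation first, since every later step depends on it.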

Create Directories:

mkdir -p docker/stirling-pdf
cd docker/stirling-pdf
mkdir -p {trainingData,extraConfigs,logs}
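
The trainingData directory is where extra OCR language files go. As a rough sketch, a Tesseract language file could be fetched into it as shown below, run from inside docker/stirling-pdf. The language code deu (German) is only an example, and the download URL assumes the public tesseract-ocr/tessdata repository on GitHub with a main branch; verify both before relying on them.

```shell
# Assumptions: "deu" is an example Tesseract language code, and the URL
# points at the public tesseract-ocr/tessdata repository on GitHub.
LANG_CODE="deu"
DEST="trainingData/${LANG_CODE}.traineddata"
TESSDATA_URL="https://github.com/tesseract-ocr/tessdata/raw/main/${LANG_CODE}.traineddata"

mkdir -p trainingData
# -f fails on HTTP errors, -L follows redirects, -o writes to the given file.
if curl -fL --max-time 120 -o "$DEST" "$TESSDATA_URL"; then
  echo "saved $DEST"
else
  echo "download failed; check the language code and URL"
fi
```

Tesseract language files use three-letter codes such as deu or fra, so the file name will not necessarily match a locale-style code like en_US.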

Create a docker-compose.yml file using your preferred text editor (sudo is not needed, since the directory belongs to your user):

nano docker-compose.yml

Copy and paste the following configuration into the file:

    services:
      stirling-pdf:   # service name is arbitrary; any valid name works
        image: frooodle/s-pdf:latest
        ports:
          - '8080:8080'   # change the left side port if 8080 is already in use on your host machine
        volumes:
          - ./trainingData:/usr/share/tessdata   # required for extra OCR languages
          - ./extraConfigs:/configs
        environment:
          - LANGS=en_US    # change this to your preferred language code
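
YAML is sensitive to indentation, so it is worth validating the file before starting anything. The following is a small sketch that uses the docker compose config subcommand with its --quiet flag, which only validates; the RESULT variable is just an illustrative name.

```shell
# Run from the directory that contains docker-compose.yml.
# --quiet validates the file and prints nothing on success.
if docker compose config --quiet; then
  RESULT="valid"
else
  # This branch is also reached when Docker itself is not installed.
  RESULT="invalid"
fi
echo "docker-compose.yml is $RESULT"
```

When the file is reported as invalid, the error message printed by docker compose usually names the offending key or line.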

Run the following command to start the service:

docker compose up -d && docker compose logs -f

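Access the Application: Open your web browser and navigate to the IP address of your host followed by the port number, for example http://<host-ip>:8080 (use the left-side port from docker-compose.yml if you changed it).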
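
To check from the command line whether the web interface is answering, a quick request with curl is usually enough. This is a minimal sketch; APP_HOST and APP_PORT are illustrative variable names, and the defaults assume the service runs on the same machine with the 8080 port mapping.

```shell
# Illustrative defaults: adjust to your host and the left-side port
# from docker-compose.yml.
APP_HOST="${APP_HOST:-localhost}"
APP_PORT="${APP_PORT:-8080}"
APP_URL="http://${APP_HOST}:${APP_PORT}"

# -s hides progress output, -o discards the body,
# -w prints only the HTTP status code.
STATUS="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$APP_URL" || true)"
echo "GET $APP_URL returned HTTP status: ${STATUS:-none}"
```

A 2xx or 3xx status suggests the interface is up. A status of 000 or none means nothing answered, in which case the container logs are the first place to look.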

