Scanned documents, images, and many PDFs store their text as pixels, which makes them hard to edit or search. Stirling PDF is an open-source tool that uses Optical Character Recognition (OCR) to turn these documents into editable, searchable formats. It is packaged as a Docker container, so it is easy to deploy and manage across different environments, which makes it a handy tool for anyone who works with documents frequently.
Key Features
- OCR Capabilities: Stirling PDF excels at recognizing text from scanned documents and images, enabling users to extract and edit text from previously non-editable PDFs.
- Multi-Language Support: With the flexibility to add additional language training data, Stirling PDF caters to a global audience by processing documents in various languages.
- User-Friendly Interface: Designed for simplicity, Stirling PDF allows users to navigate its features intuitively, enhancing the overall user experience without a steep learning curve.
- Customization: Users can mount external directories for training data and settings, tailoring the service to meet specific needs.
- Security Features: Stirling PDF includes options for user authentication, adding an extra layer of protection for sensitive documents.
Getting Started with Stirling PDF
Setting up Stirling PDF is a breeze thanks to its Docker containerization. Follow these simple steps to get started:
Install Docker: If you haven't already, install Docker along with the Docker Compose plugin, which provides the docker compose command used below.
Create Directories:
mkdir -p docker/stirling-pdf
cd docker/stirling-pdf
mkdir -p {trainingData,extraConfigs,logs}
Create a docker-compose.yml file using your preferred text editor (sudo is not needed here, since you own the directory you just created):
nano docker-compose.yml
Copy and paste the following configuration into the file:
services:
  stirling-pdf:
    image: frooodle/s-pdf:latest
    ports:
      - '8080:8080' # change the left side port if 8080 is already in use on your host machine
    volumes:
      - ./trainingData:/usr/share/tessdata # required for extra OCR languages
      - ./extraConfigs:/configs
    environment:
      - DOCKER_ENABLE_SECURITY=false
      - INSTALL_BOOK_AND_ADVANCED_HTML_OPS=false
      - LANGS=en_US # change this to your preferred language code
Run the following command to start the service:
docker compose up -d && docker compose logs -f
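The container can take a little while to start answering requests after docker compose up returns. If you would rather script the wait than watch the logs, a small helper along these lines may be useful. It is only a sketch: the function name wait_for_url, the default of 30 attempts, and the 2-second delay are illustrative choices rather than anything Stirling PDF provides, and it assumes curl is installed on the host.

```shell
# Poll a URL until it answers or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# (hypothetical helper; name and defaults are assumptions)
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "ready after $i attempt(s): $url"
      return 0
    fi
    # Pause between attempts, but not after the last one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not reachable after $attempts attempt(s): $url" >&2
  return 1
}
```

For example, if you start the stack with docker compose up -d on its own, running wait_for_url http://localhost:8080 afterwards returns as soon as the web interface responds, assuming you kept the 8080 port mapping from the compose file.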
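Access the Application: Open your web browser and navigate to the IP address of your host followed by the port number. The address below is only an example, so replace 192.168.10.23 with your own host's IP, or use localhost if the browser runs on the same machine: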
http://192.168.10.23:8080
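Adding an OCR language: the trainingData directory mounted in the compose file is where extra language files go. The sketch below shows one way to fetch them. It assumes the files come from the public tesseract-ocr/tessdata repository on GitHub and that curl is available; the function names tessdata_url and fetch_language are made up for this example. Because a bind mount replaces whatever the image ships in that directory, it may be worth fetching eng alongside any other language you add.

```shell
# Build the download URL for one Tesseract language file
# (assumes the tesseract-ocr/tessdata repository layout).
tessdata_url() {
  echo "https://github.com/tesseract-ocr/tessdata/raw/main/$1.traineddata"
}

# Download a language file into the training data directory,
# skipping the download if the file is already there.
# Usage: fetch_language LANG [DEST_DIR]
fetch_language() {
  lang="$1"
  dest="${2:-./trainingData}"
  mkdir -p "$dest"
  if [ -f "$dest/$lang.traineddata" ]; then
    echo "already present: $dest/$lang.traineddata"
    return 0
  fi
  # -f: fail on HTTP errors, -L: follow GitHub's redirect to the raw file
  curl -fL -o "$dest/$lang.traineddata" "$(tessdata_url "$lang")"
}
```

Run from the docker/stirling-pdf directory, fetch_language deu would place deu.traineddata in ./trainingData. Restarting the service afterwards with docker compose restart stirling-pdf should let the container pick up the new file.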