Sha256: 127c8c2cdb9b12c5533a460799aeac724b8ab2b08cccccdc0352ee29e5b3bd62

Contents?: true

Size: 1.93 KB

Versions: 4

Compression:

Stored size: 1.93 KB

Contents

require 'mini_magick'

module OcrFile
  module ImageEngines
    class ImageMagick
      # TODO:
      # Conversion of image types
      # Rotation and detection of skew

      attr_reader :image_path, :image, :temp_path, :save_file_path, :config

      def initialize(image_path:, temp_path:, save_file_path:, config:)
        @image_path = image_path
        @config = config
        @save_file_path = save_file_path

        @temp_path = temp_path

        # Custom tmpdir support is expected in a MiniMagick release after 4.11.0
        # https://github.com/minimagick/minimagick/pull/541
        # MiniMagick.configure do |config|
        #   # cli_version  graphicsmagick?  imagemagick7?  imagemagick? version
        #   config.tmpdir = File.join(Dir.tmpdir, @temp_path)
        # end

        @image = MiniMagick::Image.open(image_path)
      end

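      # Applies each configured effect in order, then writes the result.
      # Returns the original path untouched when preprocessing is disabled.
      def convert!
        return @image_path unless @config[:image_preprocess]

        @config[:effects].each do |effect|
          # public_send keeps config values from reaching private methods
          public_send(effect.to_sym)
        end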

        save!
      end

      def save!
        image.write(@save_file_path)
        @save_file_path
      end

      # Effects
      # http://www.imagemagick.org/script/command-line-options.php
      def bw
        @image.alpha('off')
        @image.auto_threshold("otsu")
      end

      def enhance
        @image.enhance
      end

      def norm
        @image.equalize
      end

      # Most likely not going to be configurable, because
      # these are aggressive parameters used to optimise OCR results,
      # not the final appearance of the PDFs
      def sharpen
        @image.sharpen('0x4') # radiusXsigma
      end

      # https://github.com/ImageMagick/ImageMagick/discussions/4145
      def remove_shadow
        @image.negate
        @image.lat('20x20+10%') # widthxheight+offset%
        @image.negate
      end

      def deskew
        @image.deskew('40%') # threshold recommended in the docs
      end

      def despeckle
        @image.despeckle
      end
    end
  end
end
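
The effect loop in convert! can be tried without ImageMagick installed. The following is a minimal sketch, not part of the gem: RecordingImage and EffectPipeline are hypothetical names, RecordingImage is a stand-in for MiniMagick::Image that only records the requested operations, and the ALLOWED_EFFECTS allow-list is an assumption about one way to restrict which names from the config may be dispatched.

```ruby
# Hypothetical stand-in for MiniMagick::Image: records each requested
# operation instead of shelling out to ImageMagick.
class RecordingImage
  attr_reader :operations

  def initialize
    @operations = []
  end

  %i[sharpen deskew despeckle].each do |op|
    define_method(op) do |*args|
      @operations << [op, *args]
      self
    end
  end
end

# Hypothetical pipeline mirroring the convert! loop, with an allow-list
# so that only known effect names from the config are dispatched.
class EffectPipeline
  ALLOWED_EFFECTS = %i[sharpen deskew despeckle].freeze

  def initialize(image, config)
    @image = image
    @config = config
  end

  # Returns the list of effects applied, in order.
  def run
    return [] unless @config[:image_preprocess]

    @config[:effects].map do |effect|
      name = effect.to_sym
      raise ArgumentError, "unknown effect: #{effect}" unless ALLOWED_EFFECTS.include?(name)

      public_send(name)
      name
    end
  end

  def sharpen
    @image.sharpen('0x4') # same radius x sigma value as the class above
  end

  def deskew
    @image.deskew('40%') # same threshold as the class above
  end

  def despeckle
    @image.despeckle
  end
end

image = RecordingImage.new
config = { image_preprocess: true, effects: %w[deskew sharpen] }
applied = EffectPipeline.new(image, config).run
puts applied.inspect
puts image.operations.inspect
```

Effects run in the order they appear in config[:effects], so the order of that list matters. In this sketch an unrecognised name raises ArgumentError instead of being sent to the object.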

Version data entries

4 entries across 4 versions & 1 rubygems

Version Path
ocr-file-0.0.7 lib/ocr-file/image_engines/image_magick.rb
ocr-file-0.0.6 lib/ocr-file/image_engines/image_magick.rb
ocr-file-0.0.4 lib/ocr-file/image_engines/image_magick.rb
ocr-file-0.0.3 lib/ocr-file/image_engines/image_magick.rb