Sha256: 8c4774ea6898bad91e11284eb61dc56072faa73f06ac08c32eea04c679f3016d

Contents?: true

Size: 1.69 KB

Versions: 2

Compression:

Stored size: 1.69 KB

Contents

# ## Show Character Bounding Boxes
#
# This examples shows how to process the contents of a page. It finds all
# characters on a page and surrounds them with their bounding box. Additionally,
# all consecutive text runs are also surrounded by box.
#
# The code provides two ways of generating the boxes. The commented part of
# `ShowTextProcessor#show_text` uses a polyline since some characters may be
# transforemd (rotated or skewed). The un-commented part uses rectangles which
# is faster and correct in most cases.
#
# Usage:
# : `ruby show_char_boxes.rb INPUT.PDF`
#

require 'hexapdf'

class ShowTextProcessor < HexaPDF::Content::Processor

  def initialize(page)
    super()
    @canvas = page.canvas(type: :overlay)
  end

  def show_text(str)
    boxes = decode_text_with_positioning(str)
    return if boxes.string.empty?

    @canvas.line_width = 1
    @canvas.stroke_color(224, 0, 0)
    # Polyline for transformed characters
    #boxes.each {|box| @canvas.polyline(*box.points).close_subpath.stroke}
    # Using rectangles is faster but not 100% correct
    boxes.each do |box|
      x, y = *box.lower_left
      tx, ty = *box.upper_right
      @canvas.rectangle(x, y, tx - x, ty - y).stroke
    end

    @canvas.line_width = 0.5
    @canvas.stroke_color(0, 224, 0)
    @canvas.polyline(*boxes.lower_left, *boxes.lower_right,
                     *boxes.upper_right, *boxes.upper_left).close_subpath.stroke
  end
  alias :show_text_with_positioning :show_text

end

doc = HexaPDF::Document.open(ARGV.shift)
doc.pages.each_with_index do |page, index|
  puts "Processing page #{index + 1}"
  processor = ShowTextProcessor.new(page)
  page.process_contents(processor)
end
doc.write('show_char_boxes.pdf', optimize: true)

Version data entries

2 entries across 2 versions & 1 rubygems

Version Path
hexapdf-0.3.0 examples/show_char_bboxes.rb
hexapdf-0.2.0 examples/show_char_bboxes.rb