Sha256: 7355d9f1dfd01c09014134bee82c9d8d895d0353d5084a34f39542bf51a5bcb9

Size: 791 Bytes

Versions: 11

Compression:

Stored size: 791 Bytes

Contents

class FormatParser::PDFParser
  include FormatParser::IOUtils
  # The first 9 bytes of a PDF should contain a header matching this pattern
  # (for example "%PDF-1.7"), according to:
  #
  #  https://stackoverflow.com/questions/3108201/detect-if-pdf-file-is-correct-header-pdf
  #
  # There are exceptions to this rule, which are not handled for now.
  #
  PDF_MARKER = /%PDF-[12]\.[0-8]/
  PDF_CONTENT_TYPE = 'application/pdf'

  # Cheap pre-check on the filename: true for a .pdf or .ai extension (case-insensitive).
  def likely_match?(filename)
    filename =~ /\.(pdf|ai)$/i
  end

  def call(io)
    io = FormatParser::IOConstraint.new(io)

    # Read the header bytes; a failed read raises InvalidRead, which is rescued below.
    header = safe_read(io, 9)
    return unless header =~ PDF_MARKER

    FormatParser::Document.new(format: :pdf, content_type: PDF_CONTENT_TYPE)
  rescue FormatParser::IOUtils::InvalidRead
    nil
  end

  FormatParser.register_parser new, natures: :document, formats: :pdf, priority: 3
end
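
The header check above can be tried without the gem. The following is a minimal sketch, not part of format_parser: `PDF_MARKER_SKETCH` and `pdf_header?` are hypothetical names introduced here for illustration, and the short-read handling assumes that an incomplete 9-byte read should count as "not a PDF", in the same spirit as the parser returning nil when the read fails.

```ruby
require 'stringio'

# Hypothetical standalone copy of the marker pattern (an assumption for
# illustration; the parser itself uses FormatParser::PDFParser::PDF_MARKER).
PDF_MARKER_SKETCH = /%PDF-[12]\.[0-8]/

# Hypothetical helper: returns true when the first 9 bytes of +io+ contain
# a PDF header such as "%PDF-1.7", false otherwise.
def pdf_header?(io)
  header = io.read(9)
  # Treat EOF or a short read as "not a PDF" instead of raising.
  return false if header.nil? || header.bytesize < 9

  header.match?(PDF_MARKER_SKETCH)
end

pdf_header?(StringIO.new("%PDF-1.7\nrest of file"))  # header present
pdf_header?(StringIO.new("GIF89a and more bytes"))   # different format
pdf_header?(StringIO.new("%PDF-1.4"))                # only 8 bytes available
```

Because the pattern is not anchored, a header that starts one byte into the file still matches within the 9 bytes read, which is the same behavior as the unanchored pattern in the parser.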

Version data entries

11 entries across 11 versions & 1 rubygem

Version Path
format_parser-2.10.0 lib/parsers/pdf_parser.rb
format_parser-2.9.0 lib/parsers/pdf_parser.rb
format_parser-2.8.0 lib/parsers/pdf_parser.rb
format_parser-2.7.2 lib/parsers/pdf_parser.rb
format_parser-2.7.1 lib/parsers/pdf_parser.rb
format_parser-2.7.0 lib/parsers/pdf_parser.rb
format_parser-2.6.0 lib/parsers/pdf_parser.rb
format_parser-2.5.0 lib/parsers/pdf_parser.rb
format_parser-2.4.5 lib/parsers/pdf_parser.rb
format_parser-2.4.4 lib/parsers/pdf_parser.rb
format_parser-2.4.3 lib/parsers/pdf_parser.rb