RubygemsResearch

Sha256: 4b634200f4570aab39c86ff94a426d2f02b4122ab94f1295df599dc362e622ac

Contents?: true

Size: 716 Bytes

Versions: 8

Compression:

Stored size: 716 Bytes

class FormatParser::PDFParser
  include FormatParser::IOUtils
  # First 9 bytes of a PDF should be in this format, according to:
  #
  #  https://stackoverflow.com/questions/3108201/detect-if-pdf-file-is-correct-header-pdf
  #
  # There are however exceptions, which are left out for now.
  #
  PDF_MARKER = /%PDF-1\.[0-8]{1}/
  PDF_CONTENT_TYPE = 'application/pdf'

  def likely_match?(filename)
    filename =~ /\.(pdf|ai)$/i
  end

  def call(io)
    io = FormatParser::IOConstraint.new(io)

    return unless safe_read(io, 9) =~ PDF_MARKER

    FormatParser::Document.new(format: :pdf, content_type: PDF_CONTENT_TYPE)
  end

  FormatParser.register_parser new, natures: :document, formats: :pdf, priority: 1
end
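
The two checks in the parser above (a filename hint and a header sniff) can be tried in isolation. The following is a minimal standalone sketch, not the gem's own code: `pdf_filename?` and `pdf_header?` are hypothetical helper names, the regular expressions are copied from the listing, and a plain `io.read(9)` stands in for `safe_read`, whose exact behavior is not shown here.

```ruby
require 'stringio'

# Hypothetical helper: mirrors the filename pattern used by likely_match?.
def pdf_filename?(filename)
  filename.match?(/\.(pdf|ai)$/i)
end

# Hypothetical helper: reads the first 9 bytes and looks for the version
# marker, using the same pattern as PDF_MARKER in the listing.
# Plain IO#read is an assumption standing in for safe_read.
def pdf_header?(io)
  header = io.read(9)
  return false if header.nil? # read(9) returns nil at EOF, e.g. an empty file

  header.match?(/%PDF-1\.[0-8]{1}/)
end

puts pdf_filename?('report.PDF')                 # true
puts pdf_filename?('notes.txt')                  # false
puts pdf_header?(StringIO.new("%PDF-1.4\n%..."))  # true
puts pdf_header?(StringIO.new("%PDF-2.0\n"))      # false
puts pdf_header?(StringIO.new("\n%PDF-1.4"))      # true
```

The pattern is not anchored to the start of the string, so a marker preceded by one stray byte still fits in the 9-byte window and matches, as the last call shows. A `%PDF-2.0` header is rejected because the pattern only accepts versions 1.0 through 1.8.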

Version data entries

8 entries across 8 versions and 1 rubygem

Version	Path
format_parser-1.2.1	lib/parsers/pdf_parser.rb
format_parser-1.2.0	lib/parsers/pdf_parser.rb
format_parser-1.1.0	lib/parsers/pdf_parser.rb
format_parser-1.0.0	lib/parsers/pdf_parser.rb
format_parser-0.29.1	lib/parsers/pdf_parser.rb
format_parser-0.29.0	lib/parsers/pdf_parser.rb
format_parser-0.28.0	lib/parsers/pdf_parser.rb
format_parser-0.27.0	lib/parsers/pdf_parser.rb