Sha256: e453d47ea45acb93ebf96ebee94dd3d53a0d3e70e2c2e92816122328e2a85534

Contents?: true

Size: 1.03 KB

Versions: 2

Stored size: 1.03 KB

Contents

module Asciidoctor
module Pdf
module Sanitizer
  BuiltInEntityChars = {
    '&lt;' => '<',
    '&gt;' => '>',
    '&amp;' => '&'
  }
  BuiltInEntityCharRx = /(?:#{BuiltInEntityChars.keys * '|'})/
  BuiltInEntityCharOrTagRx = /(?:#{BuiltInEntityChars.keys * '|'}|<)/
  NumericCharRefRx = /&#(\d{2,6});/
  XmlSanitizeRx = /<[^>]+>/
  SegmentPcdataRx = /(?:(&[a-z]+;|<[^>]+>)|([^&<]+))/

  # Strip leading and trailing whitespace, remove XML tags, squeeze runs of
  # spaces, and resolve decimal numeric character references and the built-in
  # entities (&lt;, &gt;, &amp;) in the specified string.
  #
  # FIXME move to a module so we can mix it in elsewhere
  # FIXME add option to control escaping entities, or a filter mechanism in general
  def sanitize string
    string.strip
        .gsub(XmlSanitizeRx, '')
        .tr_s(' ', ' ')
        .gsub(NumericCharRefRx) { [$1.to_i].pack('U*') }
        .gsub(BuiltInEntityCharRx, BuiltInEntityChars)
  end

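  # Upcase only the character data in the specified string, leaving XML tags
  # and named entities (e.g., &amp;) untouched.
  def upcase_pcdata string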
    if BuiltInEntityCharOrTagRx =~ string
      string.gsub(SegmentPcdataRx) { $2 ? $2.upcase : $1 }
    else
      string.upcase
    end
  end
end
end
end
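
A minimal usage sketch of the two methods listed above. So that it runs without the asciidoctor-pdf gem installed, it copies the constants and method bodies into a stand-in module; the names SanitizerSketch and TitleHelper are hypothetical and are not part of the gem.

```ruby
# Stand-in copy of the listing above (hypothetical name, not the gem's module).
module SanitizerSketch
  BuiltInEntityChars = { '&lt;' => '<', '&gt;' => '>', '&amp;' => '&' }
  BuiltInEntityCharRx = /(?:#{BuiltInEntityChars.keys * '|'})/
  BuiltInEntityCharOrTagRx = /(?:#{BuiltInEntityChars.keys * '|'}|<)/
  NumericCharRefRx = /&#(\d{2,6});/
  XmlSanitizeRx = /<[^>]+>/
  SegmentPcdataRx = /(?:(&[a-z]+;|<[^>]+>)|([^&<]+))/

  def sanitize string
    string.strip
        .gsub(XmlSanitizeRx, '')
        .tr_s(' ', ' ')
        .gsub(NumericCharRefRx) { [$1.to_i].pack('U*') }
        .gsub(BuiltInEntityCharRx, BuiltInEntityChars)
  end

  def upcase_pcdata string
    if BuiltInEntityCharOrTagRx =~ string
      string.gsub(SegmentPcdataRx) { $2 ? $2.upcase : $1 }
    else
      string.upcase
    end
  end
end

# Hypothetical host class; the methods are instance methods, so the module
# has to be mixed in before they can be called.
class TitleHelper
  include SanitizerSketch
end

helper = TitleHelper.new

# Tags are removed, runs of spaces are squeezed, and &#169; and &amp; are resolved.
puts helper.sanitize('  <strong>Fish &amp; Chips</strong>   &#169; 2015 ')
# prints "Fish & Chips © 2015"

# Tags are stripped before entities are resolved, so an escaped tag survives
# as literal text.
puts helper.sanitize('&lt;b&gt;')
# prints "<b>"

# Only the character data is upcased; the tag and the entity are preserved.
puts helper.upcase_pcdata('<em>fish</em> &amp; chips')
# prints "<em>FISH</em> &amp; CHIPS"
```

Note the order of operations in sanitize: because XML tags are removed before the built-in entities are resolved, the output can still contain literal < and > characters that came from escaped input.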

Version data entries

2 entries across 2 versions & 1 rubygem

Version Path
asciidoctor-pdf-1.5.0.alpha.13 lib/asciidoctor-pdf/sanitizer.rb
asciidoctor-pdf-1.5.0.alpha.12 lib/asciidoctor-pdf/sanitizer.rb