Sha256: bf87a046ce9067e1b9c651d318cb5b3501f70da9c14203d4a9843f059a3765f9

Contents?: true

Size: 452 Bytes

Versions: 4

Compression:

Stored size: 452 Bytes

Contents

require_relative './text_splitter'

module Baran
  class CharacterTextSplitter < TextSplitter
    attr_accessor :separator

    def initialize(chunk_size: 1024, chunk_overlap: 64, separator: nil)
      super(chunk_size: chunk_size, chunk_overlap: chunk_overlap)
      @separator = separator || "\n\n"
    end

    def splitted(text)
      splits = separator.empty? ? text.chars : text.split(separator)
      merged(splits, @separator)
    end
  end
end
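
A minimal sketch of the split step above, written as a standalone helper so it runs without the gem. The name `split_on` is hypothetical and not part of baran. It mirrors only the separator handling in `initialize` and `splitted`. The `merged` call comes from the parent `TextSplitter` class, which this file does not show, so it is not reproduced here.

```ruby
# Hypothetical helper (not part of baran): reproduces only the separator
# default and the split expression from CharacterTextSplitter.
def split_on(text, separator = nil)
  # Same fallback as the constructor: only nil is replaced by "\n\n".
  # An empty string is truthy in Ruby, so it is kept as-is.
  separator ||= "\n\n"

  # An empty separator splits into individual characters; otherwise
  # the text is split on every occurrence of the separator.
  separator.empty? ? text.chars : text.split(separator)
end

p split_on("first paragraph\n\nsecond paragraph")
# => ["first paragraph", "second paragraph"]
p split_on("abc", "")
# => ["a", "b", "c"]
p split_on("a,b,c", ",")
# => ["a", "b", "c"]
```

Passing `separator: ""` to the constructor therefore selects per-character splitting, while omitting it (or passing `nil`) selects paragraph splitting on blank lines.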

Version data entries

4 entries across 4 versions and 1 rubygem

Version Path
baran-0.1.8 lib/baran/character_text_splitter.rb
baran-0.1.7 lib/baran/character_text_splitter.rb
baran-0.1.6 lib/baran/character_text_splitter.rb
baran-0.1.5 lib/baran/character_text_splitter.rb