Sha256: 656383d6b573e9ccd49c9c02e7bd4b1e7585a542144bb5ee747f29cf31b2656e

Contents?: true

Size: 1.2 KB

Versions: 4

Stored size: 1.2 KB

Contents

module ETL #:nodoc:
  module Processor #:nodoc:
    # Row processor that checks whether or not the row has already passed 
    # through the ETL processor, using the key fields provided as the keys
    # to check.
    class CheckUniqueProcessor < ETL::Processor::RowProcessor

      # The keys to check
      attr_accessor :keys
      
      # Initialize the processor
      # Configuration options:
      # * <tt>:keys</tt>: An array of keys to check against
      def initialize(control, configuration)
        super
        @keys = configuration[:keys]
      end

      # A Hash of keys that have already been processed.
      def compound_key_constraints
        @compound_key_constraints ||= {}
      end
      
      # Process the row. This implementation returns the row only if its
      # key combination has not already been seen; otherwise it returns nil.
      #
      # An error will be raised if the row doesn't include the keys.
      def process(row)
        ensure_columns_available_in_row!(row, keys, 'for unicity check')
        
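        # Use the array of key values as the hash key, so that values
        # containing '|' cannot collide with one another.
        key = keys.collect { |k| row[k] }
        unless compound_key_constraints[key]
          compound_key_constraints[key] = 1
          return row
        end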
      end
    end
  end
end
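
Usage sketch

The following is a minimal, standalone sketch of the same de-duplication idea, written so that it runs without the ETL gem. The class name `UniqueRowFilter`, the `ArgumentError` it raises, and the sample rows are all hypothetical and are not part of the gem. The sketch uses the array of key values directly as the lookup key and keeps the seen keys in a `Set`.

```ruby
require 'set'

# Hypothetical stand-in for the processor above; it does not depend on
# the ETL gem and is not the gem's API.
class UniqueRowFilter
  def initialize(keys)
    @keys = keys
    @seen = Set.new
  end

  # Returns the row the first time its key combination is seen,
  # and nil for every later row with the same combination.
  def process(row)
    missing = @keys.reject { |k| row.key?(k) }
    raise ArgumentError, "row is missing keys: #{missing.inspect}" unless missing.empty?

    key = @keys.map { |k| row[k] }
    # Set#add? returns nil when the element is already present.
    @seen.add?(key) ? row : nil
  end
end

# Sample rows (made up for illustration).
rows = [
  { id: 1, region: 'east', amount: 10 },
  { id: 1, region: 'east', amount: 25 }, # same (id, region) as the first row
  { id: 1, region: 'west', amount: 7 }
]

filter = UniqueRowFilter.new([:id, :region])
unique_rows = rows.map { |row| filter.process(row) }.compact
```

Only the first row for each `(id, region)` combination survives, so `unique_rows` holds the first and third sample rows. A caller that needs to keep the last occurrence instead would have to buffer rows, which this single-pass approach does not do.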

Version data entries

4 entries across 4 versions & 3 rubygems

Version Path
activewarehouse-etl-1.0.0 lib/etl/processor/check_unique_processor.rb
activewarehouse-etl-1.0.0.rc1 lib/etl/processor/check_unique_processor.rb
etl-0.9.5.rc1 lib/etl/processor/check_unique_processor.rb
activewarehouse-etl-sgonyea-0.9.6 lib/etl/processor/check_unique_processor.rb