Sha256: 4f4ed4daba4d42150974a317b5dae46c5e2ce65edf79246c39842233da155805
Contents?: true
Size: 1.36 KB
Versions: 2
Compression:
Stored size: 1.36 KB
Contents
Feature: Deduplicating data in a file The `deduplicate` keyword can be used to remove duplicate records from a file. Only the first occurence of each duplicate record is kept. Parameters: - The source file to be deduplicated. This parameter is mandatory. - into: The target file name. If no target file is specified then the source file is overwritten with the result of the deduplication. - using: The fields to be used to determine whether or not a record is a duplicate. If not specified then all fields of the source file are used. Scenario: Single file transformation Given a file named "command_script.rb" with: """ file :items do field :rownum field :item_id field :item_name end file :unique_items do field :item_id field :item_name end deduplicate :items, into: :unique_items, using: :item_id """ And a file named "items.csv" with: """ rownum,item_id,item_name 1,Item1,Item name 1 2,Item1,Item name 1 3,Item2,Item name 2 4,Item2,Item name 2 5,Item3,Item name 3 """ When I run `forge command_script.rb` Then the exit status should be 0 And a file named "unique_items.csv" should exist And the file "unique_items.csv" should contain exactly: """ item_id,item_name Item1,Item name 1 Item2,Item name 2 Item3,Item name 3 """
Version data entries
2 entries across 2 versions & 1 rubygems
Version | Path |
---|---|
data_forge-0.1.1 | features/deduplication.feature |
data_forge-0.1 | features/deduplication.feature |