CHANGELOG in activewarehouse-etl-0.8.4 vs CHANGELOG in activewarehouse-etl-0.9.0

- old
+ new

@@ -1,33 +1,42 @@
 0.1.0 - Dec 6, 2006
 * Initial release

 0.2.0 - Dec 7, 2006
 * Added an XML parser for source parsing
-* Added support for compound key constraints in destinations via the :unique => [] option
+* Added support for compound key constraints in destinations via the
+  :unique => [] option
 * Added ability to declare explicit columns in bulk import
 * Added support for generators in destinations
-* Added a SurrogateKeyGenerator for cases where the database doesn't support auto generated surrogate keys
+* Added a SurrogateKeyGenerator for cases where the database doesn't support
+  auto generated surrogate keys

 0.3.0 - Dec 19, 2006
 * Added support for calculated values in virtual fields with Proc

 0.4.0 - Jan 11, 2006
-* Added :skip_lines option to file source configurations, which can be used to skip the first n lines in the source data file
-* Added better error handling in delimited parser - an error is now raised if the expected and actual field lengths do not match
-* Added :truncate option for database destination. Set to true to truncate before importing data.
-* Added support for :unique => [] option and virtual fields for the database destination
+* Added :skip_lines option to file source configurations, which can be used
+  to skip the first n lines in the source data file
+* Added better error handling in delimited parser - an error is now raised
+  if the expected and actual field lengths do not match
+* Added :truncate option for database destination. Set to true to truncate
+  before importing data.
+* Added support for :unique => [] option and virtual fields for the database
+  destination

 0.5.0 - Feb 17, 2007
-* Changed require_gem to gem and added alias to allow for older versions of rubygems.
-* Added support for Hash in the source configuration where :name => :parser_name defines the parser to use and
-  :options => {} defines options to pass to the parser.
+* Changed require_gem to gem and added alias to allow for older versions of
+  rubygems.
+* Added support for Hash in the source configuration where :name => :parser_name
+  defines the parser to use and :options => {} defines options to pass to the
+  parser.
 * Added support for passing a custom Parser class in the source configuration.
 * Removed the need to include Enumerable in each parser implementation.
 * Added new date_to_string and string_to_date transformers.
 * Implemented foreign_key_lookup transform including an ActiveRecordResolver.
-* Added real time activity logging which is called when the etl bin script is invoked.
+* Added real time activity logging which is called when the etl bin script is
+  invoked.
 * Improved error handling.
 * Default logger level is now WARN.

 0.5.1 - Feb 18, 2007
 * Fixed up truncate processor.
@@ -38,96 +47,111 @@
 * Fixed problem with transform error handling.

 0.6.0 - Mar 8, 2007
 * Fixed missing method problem in validate in Control class.
 * Removed control validation for now (source could be code in the control file).
-* Transform interface now defined as taking 3 arguments, the field name, field value and the row. This
-  is not backwards compatible.
+* Transform interface now defined as taking 3 arguments, the field name, field
+  value and the row. This is not backwards compatible.
 * Added HierarchyLookupTransform.
-* Added DefaultTransform which will return a specified value if the initial value is blank.
+* Added DefaultTransform which will return a specified value if the initial
+  value is blank.
 * Added row-level processing.
-* Added HierarchyExploderProcessor which takes a single hierarchy row and explodes it to multiple rows
-  as used in a hierarchy bridge.
-* Added ApacheCombinedLogParser which parses Apache Combined Log format, including parsing of the
+* Added HierarchyExploderProcessor which takes a single hierarchy row and
+  explodes it to multiple rows as used in a hierarchy bridge.
+* Added ApacheCombinedLogParser which parses Apache Combined Log format,
+  including parsing of the
   user agent string and the URI, returning a Hash.
-* Fixed bug in SAX parser so that attributes are now set when the start_element event is received.
-* Added an HttpTools module which provides some parsing methods (for user agent and URI).
-* Database source now uses its own class for establishing an ActiveRecord connection.
+* Fixed bug in SAX parser so that attributes are now set when the start_element
+  event is received.
+* Added an HttpTools module which provides some parsing methods (for user agent
+  and URI).
+* Database source now uses its own class for establishing an ActiveRecord
+  connection.
 * Log files are now timestamped.
 * Source files are now archived automatically during the extraction process
-* Added a :condition option to the destination configuration Hash that accepts a Proc with a single
-  argument passed to it (the row).
-* Added an :append_rows option to the destination configuration Hash that accepts either a Hash (to
-  append a single row) or an Array of Hashes (to append multiple rows).
-* Only print the read and written row counts if there is at least one source and one destination
-  respectively.
-* Added a depends_on directive that accepts a list of arguments of either strings or symbols. Each
-  symbol is converted to a string and .ctl is appended; strings are passed through directly. The
-  dependencies are executed in the order they are specified.
+* Added a :condition option to the destination configuration Hash that accepts
+  a Proc with a single argument passed to it (the row).
+* Added an :append_rows option to the destination configuration Hash that
+  accepts either a Hash (to append a single row) or an Array of Hashes (to
+  append multiple rows).
+* Only print the read and written row counts if there is at least one source
+  and one destination respectively.
+* Added a depends_on directive that accepts a list of arguments of either strings
+  or symbols. Each symbol is converted to a string and .ctl is appended;
+  strings are passed through directly. The dependencies are executed in the order
+  they are specified.
 * The default field separator in the bulk loader is now a comma (was a tab).

 0.6.1 - Mar 22, 2007
 * Added support for absolute paths in file sources
 * Added CopyFieldProcessor

 0.7 - Apr 8, 2007
-* Job execution is now tracked in a database. This means that ActiveRecord is required regardless
-  of the sources being used in the ETL scripts. An example database configuration for the etl can
-  be found in test/database.example.yml. This file is loaded from either a.) the current working
-  directory or b.) the location specified using the -c command line argument when running the
-  etl command.
+* Job execution is now tracked in a database. This means that ActiveRecord is
+  required regardless of the sources being used in the ETL scripts. An example
+  database configuration for the etl can be found in test/database.example.yml.
+  This file is loaded from either a.) the current working directory or b.) the
+  location specified using the -c command line argument when running the etl
+  command.
 * etl script now supports the following command line arguments:
 ** -h or --help: Prints the usage
-** -l or --limit: Specifies a limit for the number of source rows to read, useful for testing
-   your control files before executing a full ETL process
-** -o or --offset: Specified a start offset for reading from the source, useful for testing your
-   control files before executing a full ETL process
-** -c or --config: Specify the database.yml file to configure the ETL execution data store
+** -l or --limit: Specifies a limit for the number of source rows to read,
+   useful for testing your control files before executing a full ETL process
+** -o or --offset: Specified a start offset for reading from the source, useful
+   for testing your control files before executing a full ETL process
+** -c or --config: Specify the database.yml file to configure the ETL
+   execution data store
 ** -n or --newlog: Write to the logfile rather than appending to it
-* Database source now supports specifying the select, join and order parts of the query.
-* Database source understands the limit argument specified on the etl command line
+* Database source now supports specifying the select, join and order parts of
+  the query.
+* Database source understands the limit argument specified on the etl command
+  line
 * Added CheckExistProcessor
 * Added CheckUniqueProcessor
-* Added SurrogateKeyProcessor. The SurrogateKey processor should be used in conjunction with the
-  CheckExistProcessor and CheckUniqueProcessor to provide
+* Added SurrogateKeyProcessor. The SurrogateKey processor should be used in
+  conjunction with the CheckExistProcessor and CheckUniqueProcessor to provide
+  surrogate keys for all dimension records.
 * Added SequenceProcessor
 * Added OrdinalizeTransform
 * Fixed a bug in the trim transform
-* Sources now provide a trigger file which can be used to indicate that the original source
-  data has been completely extracted to the local file system. This is useful if you need to
-  recover from a failed ETL process.
+* Sources now provide a trigger file which can be used to indicate that the
+  original source data has been completely extracted to the local file system.
+  This is useful if you need to recover from a failed ETL process.
 * Updated README

 0.7.1 - Apr 8, 2007
 * Fixed source caching

 0.7.2 - Apr 8, 2007
 * Fixed quoting bug in CheckExistProcessor

 0.8.0 - Apr 12, 2007
 * Source now available through the current row source accessor.
-* Added new_rows_only configuration option to DatabaseSource. A date field must be specified and
-  only records that are greater than the date value in that field, relative to the last successful
+* Added new_rows_only configuration option to DatabaseSource. A date field must
+  be specified and only records that are greater than the date value in that
+  field, relative to the last successful
   execution, will be returned from the source.
-* Added an (untested) count feature which returns the number of rows for processing.
-* If no natural key is defined then an empty array will now be used, resulting in the row being
-  written to the output without going through change checks.
-* Mapping argument in destination is now optional. An empty hash will be used if the mapping
-  hash is not specified. If the mapping hash is not specified then the order will be determined
-  using the originating source's order.
-* ActiveRecord configurations loaded from database.yml by the etl tool will be merged with
-  ActiveRecord::Base.configurations.
+* Added an (untested) count feature which returns the number of rows for
+  processing.
+* If no natural key is defined then an empty array will now be used, resulting
+  in the row being written to the output without going through change checks.
+* Mapping argument in destination is now optional. An empty hash will be used
+  if the mapping hash is not specified. If the mapping hash is not specified
+  then the order will be determined using the originating source's order.
+* ActiveRecord configurations loaded from database.yml by the etl tool will be
+  merged with ActiveRecord::Base.configurations.
 * Fixed several bugs in how record change detection was implemented.
-* Fixed how the read_locally functionality was implemented so that it will find that last
-  completed local source copy using the source's trigger file (untested).
+* Fixed how the read_locally functionality was implemented so that it will find
+  that last completed local source copy using the source's trigger file (untested).

 0.8.1 - Apr 12, 2007
 * Added EnumerableSource
-* Added :type configuration option to the source directive, allowing the source type to be
-  explicitly specified. The source type can be a string or symbol (in which case the class will
-  be constructed by appending Source to the type name), a class (which will be instantiated
-  and passed the control, configuration and mapping) and finally an actual Source instance.
+* Added :type configuration option to the source directive, allowing the source
+  type to be explicitly specified. The source type can be a string or symbol
+  (in which case the class will be constructed by appending Source to the type
+  name), a class (which will be instantiate and passed the control,
+  configuration and mapping) and finally an actual Source instance.

 0.8.2 - April 15, 2007
 * Fixed bug with premature destination closing.
 * Added indexes to execution records table.
 * Added a PrintRowProcessor.
@@ -137,6 +161,18 @@

 0.8.3 - May 13, 2007
 * Added patches from Andy Triboletti

 0.8.4 - May 24, 2007
-* Added fix for backslash in file writer
+* Added fix for backslash in file writer
+
+0.9.0 -
+* Added support for batch processing through .ebf files. These files are
+  essentially control files that apply settings to an entire ETL process.
+* Implemented support for screen blocks. These blocks can be used to test
+  the data and raise an error if the screens do not pass.
+* Connections are now cached in a Hash available through
+  ETL::Engine.connection(name). This should be used rather than including
+  connection information in the control files.
+* Implemented temp table support throughout.
+* DateDimensionBuilder now included in ActiveWarehouse ETL directly.
+* Time calculations for fiscal year now included in ActiveWarehouse ETL.
\ No newline at end of file