CHANGELOG in activewarehouse-etl-0.8.4 vs CHANGELOG in activewarehouse-etl-0.9.0
- old
+ new
@@ -1,33 +1,42 @@
0.1.0 - Dec 6, 2006
* Initial release
0.2.0 - Dec 7, 2006
* Added an XML parser for source parsing
-* Added support for compound key constraints in destinations via the :unique => [] option
+* Added support for compound key constraints in destinations via the
+ :unique => [] option
* Added ability to declare explicit columns in bulk import
* Added support for generators in destinations
-* Added a SurrogateKeyGenerator for cases where the database doesn't support auto generated surrogate keys
+* Added a SurrogateKeyGenerator for cases where the database doesn't support
+ auto generated surrogate keys
0.3.0 - Dec 19, 2006
* Added support for calculated values in virtual fields with Proc
0.4.0 - Jan 11, 2006
-* Added :skip_lines option to file source configurations, which can be used to skip the first n lines in the source data file
-* Added better error handling in delimited parser - an error is now raised if the expected and actual field lengths do not match
-* Added :truncate option for database destination. Set to true to truncate before importing data.
-* Added support for :unique => [] option and virtual fields for the database destination
+* Added :skip_lines option to file source configurations, which can be used
+ to skip the first n lines in the source data file
+* Added better error handling in delimited parser - an error is now raised
+ if the expected and actual field lengths do not match
+* Added :truncate option for database destination. Set to true to truncate
+ before importing data.
+* Added support for :unique => [] option and virtual fields for the database
+ destination
0.5.0 - Feb 17, 2007
-* Changed require_gem to gem and added alias to allow for older versions of rubygems.
-* Added support for Hash in the source configuration where :name => :parser_name defines the parser to use and
- :options => {} defines options to pass to the parser.
+* Changed require_gem to gem and added alias to allow for older versions of
+ rubygems.
+* Added support for Hash in the source configuration where :name => :parser_name
+ defines the parser to use and :options => {} defines options to pass to the
+ parser.
* Added support for passing a custom Parser class in the source configuration.
* Removed the need to include Enumerable in each parser implementation.
* Added new date_to_string and string_to_date transformers.
* Implemented foreign_key_lookup transform including an ActiveRecordResolver.
-* Added real time activity logging which is called when the etl bin script is invoked.
+* Added real time activity logging which is called when the etl bin script is
+ invoked.
* Improved error handling.
* Default logger level is now WARN.
0.5.1 - Feb 18, 2007
* Fixed up truncate processor.
@@ -38,96 +47,111 @@
* Fixed problem with transform error handling.
0.6.0 - Mar 8, 2007
* Fixed missing method problem in validate in Control class.
* Removed control validation for now (source could be code in the control file).
-* Transform interface now defined as taking 3 arguments, the field name, field value and the row. This
- is not backwards compatible.
+* Transform interface now defined as taking 3 arguments, the field name, field
+ value and the row. This is not backwards compatible.
* Added HierarchyLookupTransform.
-* Added DefaultTransform which will return a specified value if the initial value is blank.
+* Added DefaultTransform which will return a specified value if the initial
+ value is blank.
* Added row-level processing.
-* Added HierarchyExploderProcessor which takes a single hierarchy row and explodes it to multiple rows
- as used in a hierarchy bridge.
-* Added ApacheCombinedLogParser which parses Apache Combined Log format, including parsing of the
+* Added HierarchyExploderProcessor which takes a single hierarchy row and
+ explodes it to multiple rows as used in a hierarchy bridge.
+* Added ApacheCombinedLogParser which parses Apache Combined Log format,
+ including parsing of the
user agent string and the URI, returning a Hash.
-* Fixed bug in SAX parser so that attributes are now set when the start_element event is received.
-* Added an HttpTools module which provides some parsing methods (for user agent and URI).
-* Database source now uses its own class for establishing an ActiveRecord connection.
+* Fixed bug in SAX parser so that attributes are now set when the start_element
+ event is received.
+* Added an HttpTools module which provides some parsing methods (for user agent
+ and URI).
+* Database source now uses its own class for establishing an ActiveRecord
+ connection.
* Log files are now timestamped.
* Source files are now archived automatically during the extraction process
-* Added a :condition option to the destination configuration Hash that accepts a Proc with a single
- argument passed to it (the row).
-* Added an :append_rows option to the destination configuration Hash that accepts either a Hash (to
- append a single row) or an Array of Hashes (to append multiple rows).
-* Only print the read and written row counts if there is at least one source and one destination
- respectively.
-* Added a depends_on directive that accepts a list of arguments of either strings or symbols. Each
- symbol is converted to a string and .ctl is appended; strings are passed through directly. The
- dependencies are executed in the order they are specified.
+* Added a :condition option to the destination configuration Hash that accepts
+ a Proc with a single argument passed to it (the row).
+* Added an :append_rows option to the destination configuration Hash that
+ accepts either a Hash (to append a single row) or an Array of Hashes (to
+ append multiple rows).
+* Only print the read and written row counts if there is at least one source
+ and one destination respectively.
+* Added a depends_on directive that accepts a list of arguments of either strings
+ or symbols. Each symbol is converted to a string and .ctl is appended;
+ strings are passed through directly. The dependencies are executed in the order
+ they are specified.
* The default field separator in the bulk loader is now a comma (was a tab).
0.6.1 - Mar 22, 2007
* Added support for absolute paths in file sources
* Added CopyFieldProcessor
0.7 - Apr 8, 2007
-* Job execution is now tracked in a database. This means that ActiveRecord is required regardless
- of the sources being used in the ETL scripts. An example database configuration for the etl can
- be found in test/database.example.yml. This file is loaded from either a.) the current working
- directory or b.) the location specified using the -c command line argument when running the
- etl command.
+* Job execution is now tracked in a database. This means that ActiveRecord is
+ required regardless of the sources being used in the ETL scripts. An example
+ database configuration for the etl can be found in test/database.example.yml.
+ This file is loaded from either a.) the current working directory or b.) the
+ location specified using the -c command line argument when running the etl
+ command.
* etl script now supports the following command line arguments:
** -h or --help: Prints the usage
-** -l or --limit: Specifies a limit for the number of source rows to read, useful for testing
- your control files before executing a full ETL process
-** -o or --offset: Specified a start offset for reading from the source, useful for testing your
- control files before executing a full ETL process
-** -c or --config: Specify the database.yml file to configure the ETL execution data store
+** -l or --limit: Specifies a limit for the number of source rows to read,
+ useful for testing your control files before executing a full ETL process
+** -o or --offset: Specified a start offset for reading from the source, useful
+ for testing your control files before executing a full ETL process
+** -c or --config: Specify the database.yml file to configure the ETL
+ execution data store
** -n or --newlog: Write to the logfile rather than appending to it
-* Database source now supports specifying the select, join and order parts of the query.
-* Database source understands the limit argument specified on the etl command line
+* Database source now supports specifying the select, join and order parts of
+ the query.
+* Database source understands the limit argument specified on the etl command
+ line
* Added CheckExistProcessor
* Added CheckUniqueProcessor
-* Added SurrogateKeyProcessor. The SurrogateKey processor should be used in conjunction with the
- CheckExistProcessor and CheckUniqueProcessor to provide
+* Added SurrogateKeyProcessor. The SurrogateKey processor should be used in
+ conjunction with the CheckExistProcessor and CheckUniqueProcessor to provide
+ surrogate keys for all dimension records.
* Added SequenceProcessor
* Added OrdinalizeTransform
* Fixed a bug in the trim transform
-* Sources now provide a trigger file which can be used to indicate that the original source
- data has been completely extracted to the local file system. This is useful if you need to
- recover from a failed ETL process.
+* Sources now provide a trigger file which can be used to indicate that the
+ original source data has been completely extracted to the local file system.
+ This is useful if you need to recover from a failed ETL process.
* Updated README
0.7.1 - Apr 8, 2007
* Fixed source caching
0.7.2 - Apr 8, 2007
* Fixed quoting bug in CheckExistProcessor
0.8.0 - Apr 12, 2007
* Source now available through the current row source accessor.
-* Added new_rows_only configuration option to DatabaseSource. A date field must be specified and
- only records that are greater than the date value in that field, relative to the last successful
+* Added new_rows_only configuration option to DatabaseSource. A date field must
+ be specified and only records that are greater than the date value in that
+ field, relative to the last successful
execution, will be returned from the source.
-* Added an (untested) count feature which returns the number of rows for processing.
-* If no natural key is defined then an empty array will now be used, resulting in the row being
- written to the output without going through change checks.
-* Mapping argument in destination is now optional. An empty hash will be used if the mapping
- hash is not specified. If the mapping hash is not specified then the order will be determined
- using the originating source's order.
-* ActiveRecord configurations loaded from database.yml by the etl tool will be merged with
- ActiveRecord::Base.configurations.
+* Added an (untested) count feature which returns the number of rows for
+ processing.
+* If no natural key is defined then an empty array will now be used, resulting
+ in the row being written to the output without going through change checks.
+* Mapping argument in destination is now optional. An empty hash will be used
+ if the mapping hash is not specified. If the mapping hash is not specified
+ then the order will be determined using the originating source's order.
+* ActiveRecord configurations loaded from database.yml by the etl tool will be
+ merged with ActiveRecord::Base.configurations.
* Fixed several bugs in how record change detection was implemented.
-* Fixed how the read_locally functionality was implemented so that it will find that last
- completed local source copy using the source's trigger file (untested).
+* Fixed how the read_locally functionality was implemented so that it will find
+ that last completed local source copy using the source's trigger file (untested).
0.8.1 - Apr 12, 2007
* Added EnumerableSource
-* Added :type configuration option to the source directive, allowing the source type to be
- explicitly specified. The source type can be a string or symbol (in which case the class will
- be constructed by appending Source to the type name), a class (which will be instantiated
- and passed the control, configuration and mapping) and finally an actual Source instance.
+* Added :type configuration option to the source directive, allowing the source
+ type to be explicitly specified. The source type can be a string or symbol
+ (in which case the class will be constructed by appending Source to the type
+ name), a class (which will be instantiate and passed the control,
+ configuration and mapping) and finally an actual Source instance.
0.8.2 - April 15, 2007
* Fixed bug with premature destination closing.
* Added indexes to execution records table.
* Added a PrintRowProcessor.
@@ -137,6 +161,18 @@
0.8.3 - May 13, 2007
* Added patches from Andy Triboletti
0.8.4 - May 24, 2007
-* Added fix for backslash in file writer
+* Added fix for backslash in file writer
+
+0.9.0 -
+* Added support for batch processing through .ebf files. These files are
+ essentially control files that apply settings to an entire ETL process.
+* Implemented support for screen blocks. These blocks can be used to test
+ the data and raise an error if the screens do not pass.
+* Connections are now cached in a Hash available through
+ ETL::Engine.connection(name). This should be used rather than including
+ connection information in the control files.
+* Implemented temp table support throughout.
+* DateDimensionBuilder now included in ActiveWarehouse ETL directly.
+* Time calculations for fiscal year now included in ActiveWarehouse ETL.
\ No newline at end of file