README.md in embulk-output-bigquery-0.1.5 vs README.md in embulk-output-bigquery-0.1.6
- old
+ new
@@ -124,10 +124,10 @@
### Data Consistency
When `prevent_duplicate_insert` is set to true, embulk-output-bigquery generates a job ID from the MD5 hash of the file and other options to prevent duplicate data insertion.
-`job ID = md5(md5(file) + dataset + table + schema + source_format + file_delimiter + max_bad_records + encoding)`
+`job ID = md5(md5(file) + dataset + table + schema + source_format + file_delimiter + max_bad_records + encoding + ignore_unknown_values)`
[Job IDs must be unique (including failures)](https://cloud.google.com/bigquery/loading-data-into-bigquery#consistency), so the same data can't be inserted twice with the same settings.
In other words, you can retry as many times as you like if an error (such as a network error) happens before job insertion.
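
The job ID formula above can be sketched in Python as follows. This is only an illustration of the idea, not the plugin's actual code: the function names `file_md5` and `make_job_id` are hypothetical, and the way the values are converted to strings and concatenated (no separator, UTF-8) is an assumption.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_job_id(path, dataset, table, schema, source_format,
                file_delimiter, max_bad_records, encoding,
                ignore_unknown_values):
    """Hash the file digest together with the load options.

    Assumption: values are stringified and joined without a separator.
    """
    parts = [
        file_md5(path), dataset, table, schema, source_format,
        file_delimiter, str(max_bad_records), encoding,
        str(ignore_unknown_values),
    ]
    return hashlib.md5("".join(parts).encode("utf-8")).hexdigest()
```

Because the ID depends only on the file contents and the listed options, retrying the same load produces the same ID, which BigQuery would reject as a duplicate job. Changing any option, such as `ignore_unknown_values`, yields a different ID.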