Friday, June 11, 2010

Needle in a Haystack

While working on loading 13m rows into a staging table, a name-parsing routine came across unexpected data, and the load failed. One of the greatest drawbacks to set-based ETL such as T-SQL, versus row- or batch-based ETL tools such as Informatica or SSIS, is that data discrepancies cause failures that are difficult to diagnose. In this case, the load failed with the error:

"Msg 537, Level 16, State 5, Procedure spd_Load_Stg_Names, Line 31
Invalid length parameter passed to the LEFT or SUBSTRING function.
The statement has been terminated."

This is almost useless as far as finding the problem data, as it does not tell me what value, or which row, caused the failure.

No comments:

Post a Comment