Tuesday, December 8, 2009

TSQL Challenge #18 Blues

I'm working on TSQL Challenge #18, and hitting up against a brick wall. The challenge involves building a calendar for given months of interest (month/year that is). So I build a CTE that first figures out the first & last day of the month, then the week numbers for those days. Matching the week numbers against a tally table, I can then tell which weeks that the days of a particular month will span - those weeks then become rows in the results. I then build a CTE to hold all of my days of interest, with the week number and day-of-month as columns. Using my 'Weeks' CTE as a base, I LEFT JOIN the 'Days' dataset seven times - one for each day of the week.

This is the error I encounter: "The conversion of a char data type to a datetime data type resulted in an out-of-range datetime value." If I limit my day-related tally table to a max of 28 (rather than 31), the error disappears, but I don't get any days in my calendar past the 28th of course. The weird thing is, if I set the tally to max of 31, and I select from the 'Days2' CTE, I don't get an error, and all the days I expect to see are present. Something strange is going on under the hood of SQL.

Thursday, December 3, 2009

Kimball Design Tip #119 Updating the Date Dimension

I've always thought of the Time or Date dimension to be a static one, rather than slowly-changing, until I read this tip from Kimball U: "...there are some attributes you may add to the basic date dimension that will change over time. These include indicators such as IsCurrentDay, IsCurrentMonth, IsPriorDay, IsPriorMonth, and so on. IsCurrentDay obviously must be updated each day."

Creating and automatically updating these attributes of the Date dimension will save time (sorry, no pun intended), for report writers and data mart developers, and also standardize such calculations across the enterprise. For example, let's say you are responsible for the data warehouse of a catalog company, and you want to calculate lag time between ordering and shipping, a common metric of customer satisfaction. Furthermore, management decides to give everyone an extra holiday as a bonus for company profitability from last year, and so lets everyone take off Groundhog Day. To account for this in your metric, you are going to want to have an attribute like "BusinessDayOfYear" as an incrementing integer, so that the first business day in January gets assigned BusinessDayOfYear=1, the second day is 2, and so forth, until you have the highest number as the last business day in December (which is most likely Dec 30th). If the company now has Groundhog Day as a holiday, then the IsHoliday for Feb 2nd is changed from 0 to 1, and then the BusinessDayOfYear attribute is recalculated.

Calculating how many business days between ordering and shipping is then trivial. Of course, this does not account for orders placed in December that ship in January, so you might want to have an attribute BusinessDayLifetime, which starts at 1 the first day the company was in business, and increments freely forever.