Hi,
I am currently loading dimensions using a Sql Server Destination and i was wondering if i can create a middle step to insert a certain row so i would identify in my dimension the undefined records... I know i can do it with sql statement but i was wondering if there is a better way.
Best Regards,
Luis Sim?es
Would this not be part of the fact load, you wish to infer the dim member from the fact data when there is not already a matching dim member? You would use either the lookup or join to detect the rows in the fact that do not have a corresponding dim member. Normally I would expect a lookup to be better for the job as there you have many more fact rows than dim rows, so lookups are normally quite effective. The lookup would be used to get the dim key (surrogate key value) and this is then inserted. You could use the error output of the lookup, or even ignore lookup failures and then use a conditional split to get a path with only rows that having missing values. You would then insert the missing Dim members then insert the fact. You will need to ensure only you do only one insert as you could have several facts missing the same dim member in one buffer.
|||I think that was not the desired output...
What i really want to do is to insert just one row with a specific caracteristic and add it to the dimension if it doens't exists already...
That record will be used to identify "undefined" records of dimension X.
Best Regards,
|||You can very easily use a custom script component in "Source" mode to create rows.The rest depends on your architecture, but perhaps you could union your new "unknown" member with the rest of your incoming dimension data. I assume you already have a mechanism to discard/update members that already exist.
Alternatively, you could have a separate Data Flow that tries to load unknowns for all dimensions. Each dimension would have a separate custom source to create the row, a lookup to detect it if already exists, and a destination to insert it if it doesn't.
|||
If you want to do this in the data-flow that populates the dimension table then use a script transformation to create your "Unknown" row and use a UNION transformaiton to put it together with the est of the incoming data. Very simple.
-Jamie
|||
Script and Unions will do the trick, but how often do you populate a Dim table from scratch? Would it not be easier to include the "unknown" record as part of your create table script? You only create a Dim table once, and therefore only need to do this once, so it would make more sense to me just do it manually, and avoid complicating the packge.
It also makes it easier to break that cardinal rule and hard code a surrogate key value. Code a -1 for example as the unknown, and then you can use that as a default value assigned in your pipeline if the regular lookup fails. People may complain about this logic, but it can perform rather well.
No comments:
Post a Comment