When and How to Snowflake Dimension Sources : SSAS Design Part 3
Reposted from Jason Thomas blog with the author's permission.
In my last post, I had quickly hinted why I prefer using a star schema and also explained the second scenario in which I would snowflake the dimension sources for the DSV. The three scenarios as well as the example product dimension are repeated below for quick reference
Scenario 3 – Multiple Levels joining to same Parent
Consider the above example where there are multiple levels (Local SKU and Nielsen SKU) joining to the same parent. Invariably, a “diamond” or cyclic relationship would be involved the moment you say that there are multiple levels joining to the same parent in SSAS dimensions (You could avoid this by splitting the dimensions and making referenced dimensions in SSAS but then you would lose the ability to make hierarchies between the levels in the two split dimensions). For Scenario 3 to be applied, an additional condition also needs to be satisfied – at least one of the intermediate attributes involved in the cyclic relationship needs to have fact coming in at that level and this attribute should not be the start point or end point of the cyclic relationship. Some examples of diamond relationship are given below
In our example, both Local SKU and Nielsen SKU have fact coming in at their levels and they are neither the start point (Dummy Level) nor the end point (Retail SKU) of the cyclic relationship, and hence Retail SKU qualifies under Scenario 3. In this scenario, the following views would be made
a) View for parent level – A view is made for the parent entity (Retail SKU in our example) which will contain all the attributes / entities till the next entity where one of the three scenarios’ happens. For our example, one view is made.
// View for Retail SKU and attributes
SELECT DISTINCT
// Scenario 2 at Brand Pack
Retail_SKU, Standard_SKU, BrandPack#FK
FROM
<table>
WHERE
// <levels inside “diamond” at which fact granularity comes>
<level=Local_SKU_ID or level=Nielsen_SKU_ID>
Some things to note here
-
As mentioned in the note section of my previous post, there is a basic difference in the WHERE clause of the views in Scenario 3 and the rest. For the rest of the scenarios, the clause would be for the lowest level of the view. So if the same principle was applied to the above view, then the WHERE clause would have been for <level=Retail_SKU>. Consider a case where there are Retail SKUs which don’t have any Local SKUs or Nielsen SKUs. There would not be any facts associated with them, and would be a pain for the business users if they start appearing in the dropdowns for the reports. Having the Scenario 3 WHERE clause would prevent this from happening as only the Retails SKUs which have facts associated with them would be brought in the cube.
-
The reason why Brand Family doesn’t qualify for Scenario 3 is because apart from the lowest level (Brand Pack), there are no attributes in the “diamond” relationship which has fact coming at that level. Hence, all these attributes have come under the view for Brand Pack in Scenario 2.
-
In both Scenario 1 and Scenario 3, there is a cyclic relationship or diamond relationship present in the model. These views will only ensure that the master data is correct. Care should be taken to handle the cyclic relationships within SSAS by defining multiple hierarchies or by making the relationship linear. Else you might see the blue squiggly line in the dimension attribute relationship manager in SSAS warning you of the occurrence of cyclic relationships.
That said, this marks the end of my trilogy on When and How to Snowflake Dimension Sources. I have made an effort to make it as generic as possible, but there could be scenarios which I have missed. Feel free to throw bricks or debate on the above concepts, constructive criticism is always welcome!
Jason has been working with Microsoft BI tools since he joined the IT industry in 2006. Since then he has worked across SSAS, SSRS and SSIS for a large number of clients. He is currently working for Mariner and is based out of Charlotte, NC. His personal blog can be found at http://road-blogs.blogspot.com |
Tags: design