How to create duplicate members in Anlaysis Services 2005 (by Richard Tkachuk)
Duplicate Members in Analysis Services 2005
Introduction
In a previous document, I discussed how many-to-many dimensions could be used when a single fact was associated with more than one member of a dimension. But there are situations where one member can have multiple parents In this document, I’ll outline how a dimension can be configured so that the same member appears in multiple places in the same hierarchy.
Problem Statement
Consider the example of an organization where employees have a primary manager but matrix report to other managers. For analysis purposes, the same employee needs to appear in multiple places.
If the employee is to appear under multiple managers without its descendants, the recommended solution is to create additional members with the same caption but with a new key with a custom member formula that populates its values with the original.
If the employee is to appear in multiple places with its descendants intact, the solution is requires some extra work – this is what this document describes.
Creating the Dimension table
We’re starting with the employee dimension table from the Foodmart sample distributed with Analysis Services 2000 with the following structure:
This is a parent-child dimension table with the recursive relationship between employee_id and supervisor_id. In it, each employee has a single manager.
I’ve created one more table that contains the matrix reporting relationship between an employee and his other managers:
There is a row in this table for every extra reporting relationship between an employee and a supervisor (it does not contain the primary employee-supervisor relationship in employees, just the matric reporting replationships).
From these two tables, a view can be created that replicates the original parent child hierarchy and adds the new members from the matrix reporting with their descendants. In this example, we’re just going to have a single row in this table that indicate that addition we’re going to have Michael Spence (of Foodmart fame) will also report to Derrick Whelpy (also of Foodmart fame). To do this a single row is added to the matrix reporting table where employee_id=2 (Michael Spence’s employee_id) and supervisor_id=4 (Derrick Whelpy’s employee_id).
From this single row, we want to add to this view rows for Michael Spence and all his descendants. Each has to be given a new employee_id to distinguish them from their original. I chose the convention:
<matrix_reporting.supervisor_id> + “_” + <matrix_reporting.employee_id> + “_” + <employee.employee_id>
Why this identifier? We have to guarantee it’s uniqueness. There is no reason why a truly complex implementation of this approach would have an employee reporting to almost all other employees of an organization (well, except for herself or anyone reporting to her which would cause an infinite recursion problem). Including the employee_id and supervisor_id from the matrix_reporting table in the id of the new employee guarantees uniqueness. I’ve called these values root_employee_id and root_supervisor_id.
The result is as follows.
Each column in this table is defined as:
s_employee_id: a surrogate key for each employee. If the employee is from the table, this is just its integer id converted to a string. If the employee is added from the matrix_reporting table, the structure of the key is <matrix_reporting.supervisor_id> + “_” + <matrix_reporting.employee_id> + “_” + <employee.employee_id>. The same employee appears multiple times, but is guaranteed a unique key using this method.
s_supervisor_id: the employee’s supervisor. This has the same key structure as s_employe_id.
employee_id: the original key for the employee (preserving the original datatype)
root_employee_id: the original value of employee_id for the row in matrix_reporting that generated the row.
root_supervisor_id: the original value of supervisor_id for the row in matrix_reporting that generated the row in this table.
The purpose of the last two columns are for generating the view itself. I’ve used the new SQL feature common table expressions (or CTEs) to generate the view that uses these internally. This documented isn’t intended to be an overview of CTEs, so I’ll just reproduce the view definition in an appendix (sadly, CTEs cannot be used in a DSV named query, otherwise I’d have defined it there).
Creating a dimension table from this view gives us the dimension we want. The next issue is getting the data for the copied employees to be the same value as their original.
Getting the Data Right
In the dimension table the key changed from an integer to a string. To keep the fact table in synch with this modified key, I added a named calculation to the fact table in the DSV that applies the same conversion. The expression for this named calculation is:
convert( varchar(4) , employee_id)
Binding this named calculation to the new key of the modified dimension table associates the data to the original employees. To get the same data for the duplicate members, the following assignment is needed in the MDX Script:
scope ({[Employee].[Root Employee].[Root Employee].members - [Employee].[Root Employee].&[0]});
this = iif(
[employee].[employee].currentmember is [Employee].[Employee].[All],
[Employee].[Root Employee].&[0],
( strtomember("[employee].[employee].&[" + [employee].[employee].currentmember.properties("Source Employee") + "]") , [Employee].[Root Employee].&[0]));
end scope;
The scope limits the assignment to copied members. (This is actually a bit of a hack. I’m assuming that there is no employee has an employee_id of 0 because nulls are converted to 0 in my example. An actual implementation might have to accommodate for id’s that included a 0 !)
The assignment copies the value from the original employee to the copied employee.
Results
That’s it. Browsing the date reveals the following values:
Notice how the values for our friend Michael Spence and his descendants are the same as whether he reports to Sheri Nowmer (as Foodmart 2000 was designed) or Derrick Whelpy (as modified here).
Notes, Cautions and Worries
I haven’t done any performance testing on this approach. I don’t expect issues – as this approach was designed that fact data applying to the original member aggregates normally. The assignment kicks in only for copied members.
Appendix A: View Definition for Dimension table with Duplicate Members
The query below generates the view used as a basis for the extended parent child dimension table containing the duplicate members (and their descendants).
WITH
employee_cte(
s_employee_id
,s_supervisor_id
,employee_id
,root_employee_id
,root_supervisor_id
)
AS
(
SELECT
Convert(varchar(4), mr.supervisor_id) + '_' + Convert(varchar(4), mr.employee_id) + '_' + Convert(varchar(4), e.employee_id) as s_employee_id
,Convert(varchar(4), mr.supervisor_id) + '_' + Convert(varchar(4), mr.employee_id) + '_' + Convert(varchar(4), e.supervisor_id) as s_supervisor_id
,e.employee_id
,mr.employee_id as root_employee_id
,mr.supervisor_id as root_supervisor_id
FROM employee e inner join matrix_reporting mr
on e.supervisor_id = mr.employee_id
-- Add the descendants of the matrix employees
union all
SELECT
Convert(varchar(4), e2.root_supervisor_id) + '_' + Convert(varchar(4), e2.root_employee_id) + '_' + Convert(varchar(4), e.employee_id) as s_employee_id
,Convert(varchar(4), e2.root_supervisor_id) + '_' + Convert(varchar(4), e2.root_employee_id) + '_' + Convert(varchar(4), e.supervisor_id) as s_supervisor_id
,e.employee_id
,e2.root_employee_id
,e2.root_supervisor_id
FROM employee e
inner join
employee_cte e2 on
e.supervisor_id = e2.employee_id
)
-- Get the original list of employees
select
Convert(varchar(4), employee_id) as s_employee_id
,Convert(varchar(4), supervisor_id) as s_supervisor_id
,employee_id
,null as root_employee_id
,null as root_supervisor_id
from employee
-- add the matrix employees
union all
select
Convert(varchar(4), supervisor_id) + '_' + Convert(varchar(4), employee_id) + '_' + Convert(varchar(4), employee_id) as s_employee_id
,Convert(varchar(4), supervisor_id) as s_supervisor_id
,employee_id
,employee_id as root_employee_id
,supervisor_id as root_supervisor_id
from matrix_reporting
-- add the descendants of the matrixed employees
union all
select * from employee_cte