Attribute relationships explained (by Richard Tkachuk)
Introduction
The purpose of this document is to define how attribute relationships affect calculations. Attribute relationships have other impacts such as query performance that are only briefly mentioned here.
In Analysis Services 2005, Attribute relationships define functional dependencies between attributes. In other words, if A has a related attribute B, written A è B, there is one member in B for every member in A, and many members in A for a given member in B. More specifically, given an attribute relationship City è State, if the current city is Seattle, then we know the State must be Washington.
A quick definition of terms: if B is related to A, then AèB and A is said to be a relating attribute to B. For example, given City è State, State is related to City and City is relating to State.
Given this definition, consider the query:
Select measures.sales on columns
from Sales
where
(Customer.Name.Richard)
What cell is actually being queried? There are two choices; it’s either:
(measures.sales, Customer.Name.Richard, Customer.Gender.All, Customer.City.All, Customer.State.All, Customer.Country.All)
or
(measures.sales, Customer.Name.Richard, Customer.Gender.Male, Customer.City.Sammamish, Customer.State.Washington, Customer.Country.USA)
The two choices reflect whether or not attributes not specified in the query are left alone or whether they are moved to the members that exist with those specified in the query. If there are no attribute relationships defined, the answer is the first option. But if the query is executed in the context of the following attributes and attribute relationships:
Attribute |
Attribute Relationships |
Name |
Name->City |
|
Name->Gender |
Gender |
|
City |
City->State |
State |
State->Country |
Country |
|
Then the answer is the second of the two choices. When the engine resolves the coordinates of a cell, it uses the tuple specified by the query (Customer.Name.Richard, measures.Sales) and then uses the attribute relationships to get the coordinates of related and relating attributes. Here’s the rule:
This is pretty straightforward. If you tell me you live in Seattle, I know you live in Washington, USA (yes, yes, many cities have the same name – let’s assume not for now). Analysis Services 2005 behaves the same way – if you indicate that a city can only exist in one and only one state by defining the attribute relationship, it moves the state coordinate to that state when the city is defined.
When you create a dimension in Analysis Services 2005, an attribute relationship is automatically created between the key attribute and all the other attributes. For example, if you create the customer dimension, it will look something like this:
Attribute |
Attribute Relationships |
Name |
Name->City |
|
Name->Gender |
|
Name->State |
|
Name->Country |
Gender |
|
City |
|
State |
|
Country |
|
It’s up to you to know the relationships in your data and define them. You should do this – not only do you get behavior that makes sense, you’ll get a significant performance boost as well whenever the dimension is queried.
Attribute Relationships and Granularity Attributes
Most of the time measures are associated with the dimension by its key. Defining or not defining the attribute relationships affects performance and overwrite behavior. But there are situations where defining attribute relationships incorrectly can result in errors in query results.
Sometimes the granularity of a measuregroup is at something other than the dimension’s key attribute. Consider actual and budgeted sales against a time dimension. Actual sales can have a finer granularity than budgeted sales. Actual sales can be by day and budgets can be by week. Now consider attribute relationship defined this way:
Attribute |
Attribute Relationships |
Day |
Day->Week |
|
Day->Month |
Week |
Week->Year |
Month |
Month->Year |
The dimension and measure groups might be best visualized as:
I’ve drawn the attribute relationships as arrows and the relationship between measure groups and the dimension as lines. Now note that there is no relationship between week and month – so querying the budget measures by month can result in incorrect values. So structures like these should be avoided if possible.
Attribute Relationships and Calculations
Attribute relationships are used in calculations and effect query results. Whenever an attribute is overwritten, its related and relating attributes are overwritten as well.
Perhaps some explanation is required as to what exactly is an overwrite: whenever a calculation specifies a member of a hierarchy, this is an overwrite of the currentmember. As the simplest example, consider the following statement in a script:
Create member measures.Profit as (measures.Revenue – measures.Expense);
The first part of the expression overwrites the current member in the measures dimension with Revenue. The second part overwrites the current member with Expense. Virtually all calculations overwrite one or more hierarchies.
For example, consider a query with and without attribute relationship defined to a customer dimension. The customer dimension contains attributes Name, City, State and Country and a user hierarchy Geography containing these attributes. Now consider the expression:
with member measures.CityName as
Customers.Geography.Currentmember.Name
select
measures.CityName on rows,
Customers.City.members on columns
from Sales
One would expect the name of the city but, without attribute relationships defined, that’s not the result. In fact, one would see something like this
Customers.City |
Measures.CityName |
All Customers |
All Customers |
Seattle |
All Customers |
Redmond |
All Customers |
Tacoma |
Al Customers |
… |
|
What’s going on?
The explanation for the unexpected result is how the currentmember in the Geography hierarchy is determined. For each member in Customer.City the expression attempts to find the member in the Geography hierarchy that corresponds to that specific city, and the other members Name.All, State.All, Country.All (no attribute relationships defined means these attributes are unaffected by changing city). There are no such members in the user hierarchy. In such circumstances the Analysis Services engine applies the so-called “gap rule” and overwrites attributes starting from the lowest level to the top until a valid coordinate is found.
But this is clearly not the intent of the calculation’s author. Generally a customer lives in one city and the city determines the state which then determines the country. But the Analysis Services engine doesn’t know this – to the engine, they’re just attributes. The engine has to be told of the functional dependencies with the attribute relationships: NameèCity, CityèState, and StateèCountry.
With this new information, you’ll see the expected results and the Name in the geography hierarchy will match the name in the user hierarchy:
Customers.City |
Measures.CityName |
All Customers |
All Customers |
Seattle |
Seattle |
Redmond |
Redmond |
Tacoma |
Tacoma |
… |
|
Explicit vs Implicit Overwrite
When an attribute is overwritten in an expression, it is said to be explicitly overwritten. Related and relating attributes are implicitly overwritten. For example, the following expression is an explicit overwrite of City and an implicit overwrite of State:
With member measures.x as (measures.Sales, City.Seattle)
Select measures.x on 0, State.Or on 1 from sales
The query result returns the value for (measures.Sales, City.Seattle, State.Wa). The calculated member explicitly overwrites City from City.All to City.Seattle. State undergoes an implicit overwrite from State.Or to State.Wa; thus the result of (measures.Sales, City.Seattle, State.Wa).
Why the distinction? It is because an explicit overwrite moves directly and indirectly related and relating attributes based on the rules below. An implicit overwrite has no impact on its own related and relating attributes.
For example, in the presence of the following attributes and attribute relationships:
Attribute |
Attribute Relationships |
Name |
Name->City |
|
Name->Gender |
Gender |
|
City |
City->State |
State |
State->Country |
Country |
|
Consider the statement below:
with member measures.x as (measures.Sales, City.Portland)
Select measures.x on columns
from sales where
(Customer.Name.Richard)
The result of the calculation yields
(Name.All, Gender.Male, City.Portland, State.Oregon, Country.USA)
Why? The initial coordinate before the calculated member overwrites the coordinates is (assuming Richard is male and lives in Sammamish, Washington, USA) is
(Name.Richard, Gender.Male, City.Sammamish, State.Washington, Country.USA)
The calculation asks for Sales in City.Portland – an explicit overwrite of City which drives an implicit overwrite of its relating attribute (Name) and its related attributes (State and Country) but has no affect on Gender.
Overwriting with the currentmember function
In prior versions of Analysis Services, the currentmember function could be used in calculations with no effect other than improving readability. For example, the contribution of sales of a product to its parent could be written as
(measures.sales, product.hierarchy.currentmember) /
(measures.sales, product.hierarchy.currentmember.parent)
Because a hierarchy’s currentmember is always used unless overwritten, this could be expressed as
measures.Sales /
(measures.Sales, product.hierarchy.currentmember.parent)
In AS2005, specifying the currentmember is an overwrite and has consequences on other attributes and the two expressions might not give identical results in all situations. For example, consider this:
with member measures.x as (measures.sales)
select measures.x, measures.sales on columns
where (Customer.City.Seattle)
The State attribute’s original coordinate is (Name.All, City.Seattle, State.Washington, Country.USA) and the calculated member does not overwrite any attribute in the customer dimension. The result is sales for the same tuple.
Do the same thing (well, almost the same) but now overwrite the state attribute with the currentmember function:
with member measures.x as (measures.sales, customer.[State].currentmember)
select measures.x, measures.sales on columns
where (Customer.City.Seattle)
The calculation refers to the state attribute’s currentmember – but this is still considered an overwrite and the City attribute is moved from City.Seattle to City.All. If the calculation did not contain the currentmember function, the state attribute would not get overwritten and the City attribute coordinate would be unchanged.
Detailed Rules and Examples
Assuming the existence of an attribute relationship between two attributes A and B such that AèB or Aè … è B; that is, A and B are directly or indirectly related via a chain of attribute relationships. . The examples use City and State as attributes A and B, respectively with example data below (in the examples the shorthand “..” sometimes indicates the level directly below the all level.)
State |
City |
Sales |
All |
All |
70 |
WA |
All |
30 |
|
Seattle |
10 |
|
Sammamish |
20 |
OR |
All |
40 |
|
Portland |
40 |
The table below illustrates the effect of an explicit overwrite of A or B has on the other attribute:
Explicit Overwrite |
Result |
Example |
A.All to A.All |
B unaffected |
With member measures.x as (measures.Sales, City.All) Select measures.x on 0, State.WA on 1 from sales Result: Calculated member overwrites City.All with itself. No affect on State attribute. |
A..x to A.All |
B to B.All |
With member measures.x as (measures.Sales, City.All) Select measures.x on 0, City.Seattle on 1 from sales Result: Calculated member overwrites City from Seattle to City.All. State moves from State.WA to State.All |
A to A..x |
B to Exists(B..members, A..x) |
With member measures.x as (measures.Sales, City.Seattle) Select measures.x on 0, State.OR on 1 from sales Result: Calculated member moves City from City.All to City.Seattle. State moves to State.WA. |
B.All to B.All |
A to A.All |
With member measures.x as (measures.Sales, OR.All) Select measures.x on 0, City.Seattle on 1 from sales Result: Calculated member moves State moves to State.OR. City moves from City.Seattle to City.All. |
B..x to B.All |
A to A.All |
With member measures.x as (measures.Sales, State.All) Select measures.x on 0, City.Seattle on 1 from sales Result: Calculated member moves State from State.WA to State.All. City moves to City.All. |
B to B..x |
A to A.All |
With member measures.x as (Sales, State.WA) Select measures.x on 0, City.Seattle on 1 from sales Result: Calculated member moves State to itself. This still constitutes an overwrite and City moves to City.All |
Conclusions
Take care when defining your attribute relationships – they impact performance, accuracy and query semantics:
- overwriting a member overwrites its related and relating attributes
- the currentmember function is not a “noop”; it changes results
- if you err in defining attribute relationships, you can get incorrect and even inconsistent query results
Tags: design, attribute relationship