5 August 2015

Many-to-Many Parent Child

This article will help you developing a parent child dimension that can have multiple hierarchies. It is a complement of the famous paper "The Many-to-Many Revolution 2.0" written by Marco Russo and Alberto Ferrari.

Issue

Imagine you were given the challenge of developing a model about different types of documents that have guidelines to follow in airplane maintenance: documents of the type A, B and C. The documents have a hierarchy to fulfill: "A" documents are fulfilled by "B" documents, "B" documents are fulfilled by "C" documents, "C" documents can fulfill multiple "B" documents and "B" documents can fulfill multiple "A" documents. Besides this, there is also the possibility of "A" documents being fulfilled directly by "C" documents" without "B" documents and "B" documents being the upper level without "A" documents and being fulfilled by "C" documents.

The following picture describes the idea and examples of the hierarchies that can occur:

Solution

A simple solution would be to use the parent child dimension feature of Analysis Service. We can use it to model hierarchical and fast changing dimensions:

The problem of this feature is that you can define only one parent child hierarchy in a dimension. As so you will have to create a many-to-many parent child model.

As proposed by the paper "The Many-to-Many Revolution 2.0" written by Marco Russo and Alberto Ferrari we need to create a bridge table and connect it to both Document and Document Hierarchies dimension.

Although Figure 3 demonstrates a correct cube structure, it can be improved: in this situation we want to be able to query the fact table for specific additional related information. When we define a dimension based on such a fact table item, the dimension is called a fact dimension. Fact dimensions are also known as degenerate dimensions. Fact dimensions are useful for grouping together related fact table rows. Although you can put this information in a separate dimension table in the relational database, creating a separate dimension table for the information provides no benefit because the dimension table would grow at the same rate as the fact table, and would just create duplicate data and unnecessary complexity (MSDN, Defining a Fact Relationship).

As so, we can use the bridge table as a fact and a dimension table:

Implementation

We are going to implement Figure 4 using SQL Server Analysis Services but first we need to add data to our tables. We will use the examples given in Figure 1.

For the Document dimension we just need to add the list of the documents:

1 A1 Document A1 A
2 A2 Document A2 A
3 A3 Document A3 A
4 A4 Document A4 A
5 A5 Document A5 A
6 B1 Document B1 B
7 B2 Document B2 B
8 B3 Document B3 B
9 B4 Document B4 B
10 B5 Document B5 B
11 B6 Document B6 B
12 C1 Document C1 C
13 C2 Document C2 C
14 C3 Document C3 C
15 C4 Document C4 C
16 C5 Document C5 C
17 C6 Document C6 C

For the Type dimension simply add types A, B and C:

1 A Type A
2 B Type B
3 C Type C

For the Document Hierarchy dimension (which will also work as a fact table) we need to pay close attention because this is the most important table for this model.

It is important to present all types of hierarchies and have a HIERARCHY_TYPE column to distinguish them. This will be useful when analyzing information through a pivot table.

First of all insert records for each parent even if it's of the document type A, B or C (for the example given they are: A1, A2, A3, A4, A5, B5 and C6). Then insert records for the next level of the hierarchy: A to B, A to C, B to C and A to B. Finally insert records for the hierarchies that have three levels: A to B to CKeep in mind that ID_HIERARCHY_PARENT column refers to ID_HIERARCHY column (AUX_ID_DOCUMENT and AUX_ID_HIERARCHY_PARENT help you to better understand which ID we are referring to):

1 1 A1 1 A1 Parent
2 2 A2 2 A2 Parent
3 3 A3 3 A3 Parent
4 4 A4 4 A4 Parent
5 5 A5 5 A5 Parent
6 10 B5 6 B5 Parent
7 17 C6 7 C6 Parent
8 6 B1 2 A2 A-B
9 7 B2 2 A2 A-B
10 7 B2 3 A3 A-B
11 8 B3 3 A3 A-B
12 9 B4 4 A4 A-B
13 16 C5 4 A4 A-C
14 16 C5 6 B5 B-C
15 11 B6 5 A5 A-B
16 12 C1 8 B1,A2 A-B-C
17 13 C2 8 B1,A2 A-B-C
18 14 C3 9 B2,A2 A-B-C
19 14 C3 10 B2,A3 A-B-C
20 15 C4 11 B3,A3 A-B-C
21 15 C4 12 B4,A4 A-B-C

Finally we will insert some records to the fact table (just one for each document is enough):

20150101 1 1 1
20150101 2 1 1
20150101 3 1 1
20150101 4 1 1
20150101 5 1 1
20150101 6 2 1
20150101 7 2 1
20150101 8 2 1
20150101 9 2 1
20150101 10 2 1
20150101 11 2 1
20150101 12 3 1
20150101 13 3 1
20150101 14 3 1
20150101 15 3 1
20150101 16 3 1
20150101 17 3 1

Now we can create our SSAS project. We need to create Date, Type, Document and Document Hierarchies dimensions and a cube named Documents with the bridge table and the fact table as measure groups.

The most important aspects you have to keep in mind for this model are: Document Hierarchies dimension.

As this is a parent child dimension you need to:

Add the key attribute "Id Hierarchy" and change some properties: Hierarchy Visible as False NameColumn as V_DIM_DOCUMENT.COD_DOCUMENT;

Add the parent child attribute "Document Hierarchies" and change some properties: Usage as Parent Members with Data as NonLeadfDataHidden (to hide data members contained by the parent attribute); and RootMemberIf as ParentIsSelf (only members with themselves as parents are treated as root members);

Add the attribute "Hierarchy Type" in order to distinguish the different hierarchies you will have.

  • Cube – Dimension Usage

It is important to have the correct relationships in Dimension Usage:

  • Bridge Document Hierarchies Measure Group needs to have a Fact relationship with Document Hierarchies dimension as this dimension table is, at the same time, the fact table;

  • Documents Measure Group needs to have a Many-to-Many relationship with the Document Hierarchies dimension using the Bridge Document Hierarchies as the intermediate measure group.

Final Results

The output in excel in the tabular form will be like this:

Now you will see the importance of the Hierarchy Type column: in Figure 14 you can see that "Type – Description" column is always about the lower level of the hierarchy. For example, for line 6, "Type – Description" column shows "Type C". What if we want to know the type of other levels of the hierarchy? Use Hierarchy Type for filtering and you will be able to choose which hierarchies you want to see. For example, by filtering for A-B hierarchy, the lower level of the hierarchy will be type "B" documents:

This means that we can analyze all levels of our Many-to-Many Parent Child model.
 
References

Ferrari, Alberto. & Russo, Marco. (2011). The Many-to-Many Revolution 2.0.
http://www.sqlbi.com/articles/many2many/
MSDN, Defining a Fact Relationship
https://msdn.microsoft.com/en-us/library/ms167409.aspx

.

.

.

.
     Nuno Sousa
      Consultant