Thursday, July 16, 2015

How to order/sort data in cube view used as a data source in Turbo Integrator (TI)?

An IBM Cognos TM1 cube view can be used as data source in a Turbo Integrator (TI) process. When such a view is used as a data source in TI, there is no control offered on how the cell intersections appear for processing. The order in which they appear, is determined by dimension index in the cube, and the subsets assigned for these dimensions.

Sometimes there is a need to process the data in an ordered format (say work top down i.e. process the data in descending order). We will look at ways to accomplish this.
Irrespective of the approach we take, it involves 3 steps namely –
  1. Export the entire data
  2. Order the data
  3. Use the Ordered data as data source
I will not delve into #1 (Export the data) and #3 (Use the ordered data).  These are straight forward steps.  Let’s discuss #2 (Order the data).  Broadly there are couple of options:
  1. Perform it outside of TM1
  2. Use a workaround and do it within TM1

Order/Sort the data outside of TM1

In this method, we’d need to export the data to a table in a database.  Use ODBC as data source and write a query with ORDER BY clause to perform the operation we need, there by leveraging the power of SQL.
Alternately, we could export to a flat file and do the ordering using excel or a scripting language of your choice.  Later this file can be used as data source.

Use Work around technique and perform the ordering/sorting withing TM1

There is no predefined function/method available in TM1 which can accomplish the work.  Instead, we will be relying on MDX and a temporary dimension to get the work done.  When everything is done, it will produce a file that is ordered/sorted per users choice – ascending or descending.  At a high level we will perform these steps:
  1. We will need one process that will read the cube view (which needs to be sorted) as data source
    • In this process, create a temp dimension in prolog
    • In addition, create ‘S’ attributes equal to the number of dimensions in the cube; plus one ‘N’ type attribute
    • In Data tab, add elements to the dimension
    • For each string attribute, add the dimension values of the cube
    • In the numeric attribute put the NValue
    • In Epilog, create a subset on this dimension using MDX expression.  We will use MDX function ORDER to sort the dimension by it’s attribute (numeric attribute) and create a subset
  2. We will need 2nd process that will use the subset as the data source.  Since the subset is already ordered, looping through the elements in this subset is equivalent to ordering the data
    • In the Data tab, we will retrieve all the attribute of the element and put it in a file
To summarize
  • We are creating a dimension which has elements equal to number of cells we will be working with
  • Add attribute to each element in the dimension.  Each attribute corresponding to the element information that make up the intersection
  • Order the elements in the dimension by using MDX function ORDER
  • Use this ordered subset and export the attributes of the element to a flat file.  This is the data we want to work with
If the number of intersections is very high (to the tune of millions) then there is overhead in terms of creating a dimension with that many elements as well as ordering it using MDX.  It might be faster to perform it in a DB … YMMV

As of writing this post, I am NOT aware of any method to attach file in the blog.  So I will try explaining it with snippets of code and screenshots.  We will use Planning Sample server and work with cube plan_BudgetPlan


In the Prolog tab of the 1st process we will build a view with the context elements highlighted.  We will also create the attributes.  The number of string type attributes equal the number of dimensions in cube and in addition, a numeric attribute to hold the intersection value.  The process accepts a parameter of ps_TempDimName … this is the dimension we will be working with to push the data and sort its elements.  Code snippets are taken from a process, which accepts 3 parameters:
  • TempDimName (explained above)
  • Sort Order (1 for Ascending and 2 for Descending)
  • Cube Name (whose data we want ORDERed BY)
AttrDelete (ps_TempDimName, ‘CubeValue’);
AttrInsert (ps_TempDimName, ”, ‘CubeValue’, ‘N’);
i = 0;
WHILE (i < vi_NumOfDims);
vs_AttrName = ‘Attr_’ | NumberToString (vi_NumOfDims – i);
AttrDelete (ps_TempDimName, vs_AttrName);
AttrInsert (ps_TempDimName, ”, vs_AttrName, ‘S’);
i = i + 1;
END;
AttrDelete (ps_TempDimName, ‘Attr_0′);

Now head over to the Data tab.  In here, we will add elements to the dimension as well as populate the attributes.  For those of you raising the eyebrows and wondering “Shouldn’t element insert happen in Metadata?“.  Take a look at the code below and check the underlined piece.  That should clear the doubt.

vs_ElemName = ‘Elem_’ | NumberToString (vi_RowCnt);
DimensionElementInsertDirect (ps_TempDimName, ”, vs_ElemName, ‘N’);

i = 1;
WHILE ( i <= vi_NumOfDims );
vs_AttrName = ‘Attr_’ | NumberToString (i);
CellPutS ( Expand ( ‘%v’ | NumberToString (i) | ‘%’), vs_AttrCube, vs_ElemName, vs_AttrName);
i = i + 1;
END;
vi_RowCnt = vi_RowCnt + 1;
CellPutN (NValue, vs_AttrCube, vs_ElemName, ‘CubeValue’);
 Let’s move on to the Epilog tab.  Here we will build the MDX expression and create the subset order in ascending or descending manner.  As the last step call the 2nd process

IF (pi_SortOrder = 1);
    vs_OrderBy = ‘ASC';
ELSE;
    vs_OrderBy = ‘DESC';
ENDIF;

vs_MDX = ‘{‘ | ‘ORDER ( TM1SubsetAll ( [‘ | ps_TempDimName | ‘]) , [‘ | ps_TempDimName | ‘].[CubeValue], ‘ | vs_OrderBy | ‘)}';

vs_Sorted_SubName = ‘zSubsetOrdered_’ | ‘vs_OrderBy';
IF (SubsetExists (ps_TempDimName, vs_Sorted_SubName) <> 0);
SubsetDeleteAllElements (ps_TempDimName, vs_Sorted_SubName);
ELSE;
SubsetCreate (ps_TempDimName, vs_Sorted_SubName);
ENDIF;

SubsetMDXSet (ps_TempDimName, vs_Sorted_SubName, vs_MDX);

ExecuteProcess (‘SECOND_PROCESS_NAME’, ‘ps_SubName’, vs_Sorted_SubName, ‘ps_TempDimName’, ps_TempDimName, ‘pi_NumOfDims’, vi_NumOfDims);

Below is a screen grab showing the attribute cube.  This would be output of the 1st process.


The 2nd process is fairly simple.  It will use Dimension Subset as data source and it will assign the Temp Dim Name’s subset  we are working with, as the source.  The Data tab of the code is pretty small.  All it does is, loop through the S and N type attributes and AsciiOutput to a file
vs_OutputStr = ”;
i = 1;
WHILE (i <= vi_NumOfDims);
IF (i=1);
vs_OutputStr = vs_OutputStr | CellGetS (vs_CubeName, V1, ‘Attr_’ | NumberToString (i));
ELSE;
vs_OutputStr = vs_OutputStr | ‘, ‘ | CellGetS (vs_CubeName, V1, ‘Attr_’ | NumberToString (i));
ENDIF;
i = i + 1;
END;
vs_OutputStr = vs_OutputStr | ‘, ‘ | numberToString (CellGetN (vs_CubeName, V1, ‘CubeValue’));
AsciiOutput (vs_FileName, V1, vs_OutputStr);
The sample file output will look like this:


 Now that you have the flat file, you can use this as data source for your 3rd process and move forward.  Just remember to handle the temp dimension and the flat file, after you are done working.  They need to cleared out.

No comments:

Post a Comment