SSIS package that processes all partitions/measure groups/cubes in one database
April 10th, 2007 by Vidas Matelis
Recently I posted a blog entry on how to process all dimensions in a single Analysis Services 2005 database using a SQL Server Integration Services (SSIS) package. Here I'll add code to that package so you can process the whole database, cubes, measure groups, or partitions.
Please follow the steps from the previous post to create a new package and add a script component. Then use the code below for the script component:
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

Public Class ScriptMain

    Public Sub Main()

        Dim ExecutionSuccessfull As Boolean = True ' If true, package executed without errors
        Dim sProcessType As String = "ProcessFull"

        Dim oConnection As ConnectionManager
        oConnection = Dts.Connections("DM")
        Dim sServer As String = CStr(oConnection.Properties("ServerName").GetValue(oConnection))
        Dim sDatabaseID As String = CStr(oConnection.Properties("InitialCatalog").GetValue(oConnection))

        Dim oServer As New Microsoft.AnalysisServices.Server
        oServer.Connect(sServer) ' connect to the server and start scanning down the object hierarchy

        Dim oDB As Microsoft.AnalysisServices.Database = oServer.Databases.FindByName(sDatabaseID)
        If oDB Is Nothing Then
            ExecutionSuccessfull = False
            GoTo Done
        Else
            Dim oDim As Microsoft.AnalysisServices.Dimension
            Dim oCube As Microsoft.AnalysisServices.Cube
            Dim oMeasureGroup As Microsoft.AnalysisServices.MeasureGroup
            Dim oPartition As Microsoft.AnalysisServices.Partition

            oServer.CaptureXml = True ' Start capturing XML

            ' oDB.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Use this to process the whole DB at once

            ' Process all dimensions
            'For Each oDim In oDB.Dimensions
            '    ' This will generate XMLA, but because CaptureXml is True, will not execute it!
            '    oDim.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all dimensions
            'Next

            For Each oCube In oDB.Cubes
                ' oCube.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all cubes
                For Each oMeasureGroup In oCube.MeasureGroups
                    ' oMeasureGroup.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all measure groups
                    For Each oPartition In oMeasureGroup.Partitions
                        oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all partitions
                    Next
                Next
            Next

            oServer.CaptureXml = False ' Stop capturing XML

            ' Execute the captured XML. First parameter: Transactional; second parameter: Parallel; third optional parameter: processAffected.
            ' These are very important parameters!
            Dim oResults As Microsoft.AnalysisServices.XmlaResultCollection
            Dim oResult As Microsoft.AnalysisServices.XmlaResult
            Dim dataBytes(0) As Byte
            oResults = oServer.ExecuteCaptureLog(True, True)

            ' Log the errors and warnings
            Dim oMessage As Microsoft.AnalysisServices.XmlaMessage
            For Each oResult In oResults
                For Each oMessage In oResult.Messages
                    If oMessage.GetType.Name = "XmlaError" Then
                        ' The processing failed
                        Dts.Log(oMessage.Description, 0, dataBytes) ' You need to set up proper package logging to see this!
                        ExecutionSuccessfull = False
                    Else
                        ' It's just a warning.
                        ExecutionSuccessfull = True ' if you want to fail on warning, change this to False
                    End If
                Next oMessage
            Next oResult
        End If

Done:
        oServer.Disconnect() ' disconnect from the server, we are done

        If ExecutionSuccessfull Then
            Dts.TaskResult = Dts.Results.Success
        Else
            Dts.TaskResult = Dts.Results.Failure
        End If

    End Sub

End Class
By default this package builds code to process each partition. This is just an example; the commented-out code shows how to process all measure groups, all cubes, or the whole database at once. Use this code as guidance to build your own SSIS package that does what you want.
In my data warehouse design I decided to maintain the list of partitions in a SQL Server table. I have a "partition base" table that contains base information about each measure group and its partitioning field (I have just one, usually a date field). Then I have a stored procedure that, during each DW load, calculates whether new partitions have to be created or old ones deleted, and loads this information into an extended partition information table. It also marks which partitions are affected by the latest load and have to be reprocessed. Although this might seem like a complicated task, it is not, and it is well worth the effort. My SSIS package then compares this extended partition table with the actual SSAS database, drops/creates partitions as needed, and processes the partitions that have to be processed.
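A rough sketch of what these two tables might look like (all names and columns here are illustrative, not my actual schema):

-- Base information about each partitioned measure group
CREATE TABLE dbo.PartitionBase (
    MeasureGroupID    int IDENTITY PRIMARY KEY,
    CubeName          nvarchar(100) NOT NULL,
    MeasureGroupName  nvarchar(100) NOT NULL,
    PartitioningField nvarchar(128) NOT NULL -- usually a date field
);

-- Extended partition information, recalculated by a stored procedure on each DW load
CREATE TABLE dbo.PartitionInfo (
    PartitionID        int IDENTITY PRIMARY KEY,
    MeasureGroupID     int NOT NULL REFERENCES dbo.PartitionBase,
    PartitionName      nvarchar(128) NOT NULL,
    RangeStart         datetime NOT NULL,
    RangeEnd           datetime NOT NULL,
    AffectedByLastLoad bit NOT NULL DEFAULT 0 -- marks partitions that must be reprocessed
);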
I would be interested to hear how other people are managing the processing of partitions.
Posted in SSAS, SSIS | 33 Comments »
April 11th, 2007 at 8:14 am
Just what I was looking for… I recently posted the URLs for the other two posts that have to do with processing dims and designing aggregates on my blog. I will add this URL too. Thanks for the code!
April 12th, 2007 at 2:48 am
We handle processing of partitions through AMO code which is compiled as a DLL and called through an SSIS package. All SSAS processing actions (updates, new partitions, deletes) are logged to a SQL table (we call it CubeProcessingLog). This table is then used with the ETL logging table to see what needs to be processed every hour or day. By doing it this way we only process the new incoming data, never older data. It's fairly complex; however, I could let it run for years and not have to touch it.
April 12th, 2007 at 7:03 am
Jesse,
Why do you compile the AMO code as a DLL? Are there any benefits to that?
April 24th, 2007 at 5:30 pm
I tried to use only AMO to create the scheduler for ETL and dimension/partition data processing, but hit some bugs. Some code for partition/dimension automation is on my blog. Basically, the scheduler controls and records the status and steps for the SSIS ETL of data into the database, and then the parallel processing of cubes, separating dimension processing from partition processing, etc. All configuration is saved in an .xml file (web config) as metadata to drive the ETL and cube schema changes, if any.
April 24th, 2007 at 5:57 pm
Sergey,
It looks like in your blog you are talking about a special case where, during processing, you are also changing the source table of the measure group. Is that right?
Have you considered having a SQL view that is used as the source for the measure group? Then you would change just the view definition (for example, to select from a different table) instead of trying to solve this in AMO.
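For example (view and table names made up):

CREATE VIEW dbo.vwFactSales AS SELECT * FROM dbo.FactSales_Current;
-- later, repoint the measure group source without touching the cube definition:
ALTER VIEW dbo.vwFactSales AS SELECT * FROM dbo.FactSales_New;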
Vidas Matelis
May 2nd, 2007 at 4:59 pm
Yes, I have developed a controller that automatically updates data (ETL with SSIS packages) in the data warehouse tables and then processes the dimensions and partitions, based on successful or unsuccessful runs of the SSIS packages. The logic of the controller is based on the statuses of 'steps': ETL, process dimensions, process partitions, final cleansing of staging data. It also has a locking mechanism (based on logical grouping of certain cubes to be ETLed and processed) to prevent any duplication of processing for dimensions shared by many cubes. I am now thinking about PARALLEL processing; so far it is sequential: based on success, the next block of steps is executed. The controller supports all types of MSAS processing. It can also process a certain date, month, or year, or any FROM-TO interval with granularity of hour, day, week, month, or year.
And, yes, you are correct, Vidas: one of the tags in the XML file that stores the metadata to control/execute the controller specifies the query binding and staging table to process a partition based on time granularity for a specified date or FROM-TO interval.
This controller also records log info into SQL tables about what is happening during execution, to be able to troubleshoot where and why. On my blog I have posted just one class that gave me a headache with AMO; only XMLA batches were able to do the job, and the XMLA file needed to be 'parametrized', like ###database###, to be able to replace certain names from the controller.
May 2nd, 2007 at 7:08 pm
Thank you, Vidas, for sharing the source code – I will use it for ProcessFull.
May 4th, 2007 at 1:11 pm
ProcessAdd for dimensions from a query binding is a 'feature' that does not work as it is supposed to.
see http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1117591&SiteID=1
February 5th, 2008 at 1:13 pm
Else
    ' It's just a warning.
    ExecutionSuccessfull = True ' if you want to fail on warning, change this to False
End If
Small possible issue: I think the ExecutionSuccessfull = True line should be commented out, unless failures are always the last item in the results collection. Otherwise a warning may set the flag back to true even if an error did occur but was processed earlier in the collection.
Regardless, thanks for the code.
April 8th, 2008 at 3:50 pm
Vidas:
Do you know whether there’s a way to control the parallelism of the partition reprocessing? I’m trying to write a package that will reprocess multiple partitions in one measure group, and I want to regulate the load on the server.
Thanks!
April 8th, 2008 at 7:03 pm
John,
I could not find a way to control parallelism during processing using the ExecuteCaptureLog method. I'll try to research a bit more, and if I find an answer, I'll post it here.
You probably already know that you can control parallelism with XMLA. There is a Parallel parameter for that, as in this example:
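<!-- Sketch only; the object IDs below are placeholders -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel MaxParallel="4">
    <Process>
      <Object>
        <DatabaseID>MyDatabase</DatabaseID>
        <CubeID>MyCube</CubeID>
        <MeasureGroupID>MyMeasureGroup</MeasureGroupID>
        <PartitionID>MyPartition</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
    </Process>
  </Parallel>
</Batch>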
Vidas
April 18th, 2008 at 2:49 am
Hello Vidas,
The code above helps a lot. I built similar code in an SSIS package for my project and it runs successfully, but I cannot see the detailed processing log information, i.e. which dimensions and partitions are processed. Is there any log file generated by Analysis Services for this processing?
Thanks,
Emily
April 22nd, 2008 at 9:56 pm
Emily,
I have not implemented proper logging in this package. My goal was just to share code to process the objects.
May 14th, 2008 at 8:52 am
I built a package that processes the measure groups in parallel by launching the same package four times in a sequence container. Prior to calling the sequence container, I inserted a record for each measure group I wanted to process into a transient table. Each measure group process registers itself via a SP and a registered process table and then uses the returned registered process id as a unique value to request the next available measure group to process. When I run it on my machine locally, it works like a champ. When I deploy it to a server with four processors, I randomly get one or two deadlocks which cause those instances to be shut down. The remaining processes still successfully process the rest of the measure groups, but I cannot figure out why I am encountering deadlocks. Any ideas?
May 14th, 2008 at 9:01 am
Ron,
By default this package will already process everything in parallel. There is no way to limit how many parallel tasks will run, so it will use all available processors.
If you need to control parallelism, a better way would be to build a process XMLA script as I did in my post here: http://www.ssas-info.com/VidasMatelisBlog/7_ssis-package-to-process-all-dimensions-in-ssas-2005-db
Can you please give a bit more detail on why you are running 4 packages in parallel?
July 2nd, 2008 at 10:59 am
Vidas, this is just what I was looking for. Your explanations are great and easy to understand. Thanks a million.
October 10th, 2008 at 11:55 am
Hi Vidas,
I am thinking about processing the indexes after processing the dimensions, measure groups and partitions in the cube. Is it required to do that?
If yes, does the following modified code work?
For Each oCube In oDB.Cubes
    oCube.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all cubes
    For Each oMeasureGroup In oCube.MeasureGroups
        oMeasureGroup.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all measure groups
        For Each oPartition In oMeasureGroup.Partitions
            oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all partitions
        Next
    Next
    oCube.Process(Microsoft.AnalysisServices.ProcessType.ProcessIndexes)
Next
October 10th, 2008 at 12:06 pm
Naveen,
The ProcessFull option already processes all data and indexes. If you want to split the processing of data and indexes, then instead of doing ProcessFull you should do ProcessData and then ProcessIndexes.
Please note that in my script the oCube.Process and oMeasureGroup.Process lines are commented out and left there just to show that you have the option to do processing at a higher level.
So your script should be:
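' Sketch: same loops as above, but splitting data and index processing
For Each oCube In oDB.Cubes
    For Each oMeasureGroup In oCube.MeasureGroups
        For Each oPartition In oMeasureGroup.Partitions
            oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessData) ' load data only
        Next
    Next
    oCube.Process(Microsoft.AnalysisServices.ProcessType.ProcessIndexes) ' then build indexes and aggregations
Next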
By the way, it is recommended to run ProcessData and ProcessIndexes as separate statements when you want to monitor whether you have too many aggregations, which can make processing take too long.
October 10th, 2008 at 4:31 pm
Hi Vidas,
Thanks for the swift reply!
I changed the script accordingly. Now, would you be able to point me to the right resources where I can learn to set up 'proper error handling – save errors/warnings to the DTS log'?
February 5th, 2009 at 10:16 pm
Hi Vidas,
How can I process a specific measure group using your code? Any ideas?
April 15th, 2009 at 12:05 pm
Hi Vidas,
In my implementation, I am following exactly the same steps as you did above, and it worked well. I am handling the KeyNotFound issue by setting it to IgnoreError in the Error Configuration.
However, as per my requirements, I need to capture the records that are causing the KeyNotFound error and write them to a log. Is there a way to achieve this?
I've tried to do this but didn't find a way.
Could you please let me know how? Your early reply will be greatly appreciated.
Thanks
Sekhar
April 16th, 2009 at 5:28 am
Hi Vidas,
Thanks a lot, this is very useful.
Is there something similar for partitioning the cube?
Thanks & Regards,
Ram
April 16th, 2009 at 5:40 am
Hi Vidas,
I have a question on partitioning.
I have partitions based on year, but data can be added to a market (country) for any year,
so we end up processing all the partitions.
What should our strategy be in this case?
Would partitioning on the market help?
Waiting for your reply…
Thanks,
Eshwar
April 17th, 2009 at 5:55 pm
Hi Sekhar,
Currently I process a partition using the statement:
oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull)
The .Process method has another optional parameter of type 'Microsoft.AnalysisServices.ErrorConfiguration'.
This parameter "Specifies settings for handling errors that can occur when the parent element is processed." (from BOL)
I have not used it myself, as I do this type of cleanup during ETL, but you should be able to capture key errors using it.
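Something along these lines might work (untested; the log file name and options are just examples):

Dim oErrConfig As New Microsoft.AnalysisServices.ErrorConfiguration("C:\Logs\KeyErrors.log")
oErrConfig.KeyNotFound = Microsoft.AnalysisServices.ErrorOption.ReportAndContinue ' report bad keys to the log file and continue processing
oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull, oErrConfig)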
Regards,
Vidas Matelis
April 17th, 2009 at 5:56 pm
Ram,
For partitioning cube please see my blog post:
http://www.ssas-info.com/VidasMatelisBlog/13_ssis-package-to-dropcreate-partitions-based-on-partition-list-in-the-sql-server-table
Regards,
Vidas Matelis
April 17th, 2009 at 5:57 pm
Eshwar,
At the end of this post I have a note where I explain that, during the ETL load, I actually track in SQL Server which partitions have changed, and then I process just the affected partitions. You should do the same. There is a bit of info about that in this post:
http://www.ssas-info.com/VidasMatelisBlog/13_ssis-package-to-dropcreate-partitions-based-on-partition-list-in-the-sql-server-table
But you will have to fill in the missing code yourself. For various reasons I cannot share my full code.
Regards,
Vidas Matelis
October 12th, 2009 at 4:38 pm
I wrote a VB.NET assembly and call it from SQL Agent, as we are not able to use SSIS for processing. I've looked, but I can't find any sign that anyone else has done it this way, and I am looking for help with the error config. Any suggestions?
October 12th, 2009 at 4:44 pm
Sue,
Post your questions in the SSAS MSDN forum here:
http://social.msdn.microsoft.com/Forums/en-US/sqlanalysisservices/threads/
I am sure somebody from the forum will help you.
May 21st, 2010 at 10:14 am
Hi Vidas,
I've already implemented incremental processing based on your posts and everything works as expected!! Thank you so much for sharing it ;)
Now I'm trying to implement incremental processing using the partition ProcessAdd type, but I'm facing some problems… In my script, everything is very similar to your post, except that instead of the partition ProcessFull type I use the ProcessAdd type and a query binding. The thing is, when I try to process only one partition using the ProcessAdd type it works fine… but when I try to process two partitions in parallel it returns a script error…
After some "digging", I've concluded that the XML generated by the script is not correct when compared to the XML generated by BIDS… Have you faced this problem too? Do you know a workaround?
Thanks in advance!!
Best regards
May 23rd, 2010 at 10:41 pm
Hi Rui,
Sorry, I am not using the ProcessAdd type, so I have not seen this problem, and I don't know what a good workaround would be. I am assuming that doing ProcessAdd one partition at a time is not acceptable, as this would take much more time than running the processing in parallel.
September 30th, 2010 at 3:18 pm
[…] have to give credit to Vidas Matelis here for providing some of the AMO coding for the processing, SSIS package that processes all partitions/measure groups/cubes in one database. I modified it a bit to be able to limit the scope to a particular cube along with a Try Catch […]
November 1st, 2010 at 1:10 pm
Hello
Your blog is very informative for people learning SSAS.
I have a query regarding ProClarity.
I want to make a report that gives information about visitors to my dashboard. I got some data from the ProClarity event viewer, but it does not give information about which dashboard or view the user is accessing.
Can you help with this?
Thanks in advance; waiting for a reply :)
April 26th, 2011 at 2:45 am
I have an issue regarding dynamic partition processing using an SSIS Script Task. While processing a selected partition, other partitions are also getting processed. Why?