Mar 10

“The truth is, there’s a good way to model data in a database… but the way business users want to see the data is far different than that.” (Bansal, 2010)

I disagree. This is the exact reason why Data Warehouses need to be modeled using dimensional modeling techniques that are designed around how the business wants to see the data and are setup for fast query performance. With the data already transformed into the format users are expecting to see it with natural hierarchies and relationships built in, queries are simple and much faster.

There is also the potential to pre-calculate all or most of the aggregations into a multi-dimensional database structure. I have done this countless times with Microsoft’s SQL Server Analysis Services and routinely am able to query fact tables with 20M+ rows in milliseconds.

In my opinion BI interfaces trying to make queries faster is a valiant effort however, without a solid dimensional data structure modeled after the way users want to see the data there is only so much they will be able to accomplish.

Mar 09

My approach as a BI professional has always been to organize the data into conformed structures, present that to end users for consumption in their ad-hoc analysis, and only build formal reports and dashboards as necessary. I think the conformed structures provide great consistency for BI across the enterprise no matter who is building it and allow users to get answers to questions faster.

Of course, there is no way IT can keep up with the onslaught of report requests so it makes sense to provide users with the tools to do these themselves. Typically, I would offer up a connection to a cube through Excel and now with the addition of PowerPivot I believe this type of process is going to become even more prevalent.

Some I have talked with have expressed concern of decentralization of BI by tools like PowerPivot however; I see these tools as giving users a sort of proving ground for their BI applications. If an information worker successfully creates a BI application in PowerPivot that becomes widely used across the enterprise IT does not have to guess that formalizing it into a more formal BI application will have value.

Other methods exist that I think are worth for determining the value of a BI application (eg. BI Pathway Method) however I believe this is for more formal BI versus the ad-hoc type of analysis that users clamor for on a day-to-day basis.

Oct 29

Quick one here…I was recently setting up a SSRS 2008 dev environment integrated with a MOSS 2007 dev environment and encountered some strange problems with the SSRS 2008 Add-In for SharePoint. Basically every time it would run and state the installation was “interrupted” and nothing was changed. To my frustration nothing appeared to be wrong with the MOSS or SSRS setups however I did notice an error in the event viewer related to access to the SQL Server database. It turns out that wasn’t the problem either. There seems to be an issue with the rsSharePoint[_x64].msi installer as mentioned in Prash’s blog here (http://blogs.msdn.com/prash/…)

To overcome this hurdle you must follow the following steps:

  1. Open a command prompt with “Run as Administrator…
  2. Run rsSharePoint.msi SKIPCA=1.
  3. Locate rsCustomAction.exe file on the file system. This file is copied to your computer by rsSharePoint.msi and is generally located in the C:\Users\YourUserName\AppData\Local\Temp directory.
  4. Run rsCustomAction.exe /i

You will see lots of messages similar to those below in the command window but it will eventually finish and you’re setup of the SSRS 2008 Add-In for your MOSS 2007 environment will be complete.

SSRS 2008 Add-In for MOSS 2007 Install

Hope this helps!

Oct 28

In SSIS 2005 a single config source (XML or SQL Server) with many connection strings could be used to setup the connections for multiple packages even if those packages did not contain all of the connections specified in the config source. In this scenario the native package configurations would read the source and attempt to apply connection strings for all those specified in the source. In cases where the connection did not exist in the SSIS package a warning would be reported however, these warnings could be ignored by setting the ‘SupressConfigurationWarnings’ property to ‘True’ within the package. This configuration strategy worked well for me in past experience since I would only have to manage the connection strings for almost all packages within a single config location. This method however, does not work in SSIS 2008 :(

Below is the solution I decided on however I am not totally satisfied with it. If you are also running into this issue and have something to share please comment below. Thanks!

Variable Approach
1. Inside the package there is a variable for every connection manager that will contain it’s connection string
2. The value for these variables will be derived from a package configuration pulling from a SQL Server source
3. Connection managers all are then set by an expression assigning the value of the variable to their connection string property

Benefits
1. A warning (versus error) is thrown when there is a variable in the database not found in the package that can be ignored by updating the ‘SupressConfigurationWarnings’ property to true. This is also set by another configuration.
2. A single place to manage all connection strings for all SSIS packages

Cons
1. Extra steps are involved when adding new connection managers to a package

Below are the results from my tests using the same strategy in SSIS 2005 and SSIS 2008.

2005 Output

Warning: 0×8001F02F at Package: Cannot resolve a package path to an object in the package “.Connections[Conn2].ConnectionString”. Verify that the package path is valid.
SSIS package “Package.dtsx” starting.
SSIS package “Package.dtsx” finished: Success.

2008 Output

Error at Package: The connection “Conn2″ is not found. This error is thrown by the Connections collection when the specific connection element is not found.”

Sep 24

Last Thursday I presented at the San Diego SQL Server User Group (www.sdsqlug.org) a codeplex project I developed based on the SSIS Event Logs. Below are links to the codeplex project as well as the slides I presented.

The inspiration for this project comes from a former position I held managing a team of database developers in which we were responsible for all the Customer Data Integration (CDI) processes as well as the Extract Transform and Load (ETL) processes for our Enterprise Data Warehouse (EDW).

While in that role I found that some of the CDI proceses were running slow to the point that it was affecting business operations. Poking around the native SSIS Event Log data it was really difficult to get any clue as to what part of the process was the bottleneck.

Now I knew that Microsoft had previously released some report packs (link) for the SSIS Event Log data that contained the logic I needed to extract this data. So I took these reports and reverse engineered them into an ETL process to load a new Data Mart (DM) based on the SSIS Event Log data.

Add an SSAS cube and some basic reports and I was now able to analyze the performance of my CDI and ETL processes easily and determine where bottlenecks exist and improvements can be made. Another interesting outcome of this is the ability to measure the improvements of changes to the CDI and ETL processes over time.

This project is designed to be generic so that anyone using SSIS can take advantage of this and gain some insight into their process performance. If you are interested in obtaining a copy of the source code please create a codeplex user account and send me a request on the project site (below).

Hope this helps!
Ben

SSIS Event Log Business Intelligence Codeplex Home – http://ssisbi.codeplex.com
Codeplex Registration – https://www.codeplex.com/site/register
San Diego SQL Server User Group – http://www.sdsqlug.org
SSIS on MSDN – http://msdn.microsoft.com/en-us/library/ms141026.aspx