Fletcher or DeMaio, Who’s the next Mayor of San Diego?

The 2012 San Diego Mayoral race is heating up. Most excitedly, Nathan Fletcher, whom I support, has surged into second place behind Carl DeMaio by only 3 points. Below, I have collected some of the data from the public polls and created my own analysis of how the race is progressing. This should be a fun one to keep track of! As more polls get released to the public I will update this if there is interest from the community.

What do you think? Who will be the next Mayor of San Diego, Nathan Fletcher or Carl DeMaio?

Hacking Excel Web Queries

If you are unfamiliar with web quries its ok, their easy, and I’m gonna explain how they work and how to manipulate them beyond their intended use to increase their flexability.

Breifly, a web query is a great way to automate the mundane task of going to a web page and copying the data into an excel sheet. Basically you tell excel where to look (web page) and what to copy (tables of data) and Excel will automatically import the data into a worksheet for you.

Lets get started by creating a simple web query. I decided to use some basketball stats from espn.com. So to create my basic web query I open up a new workbook, find the ‘Data’ item from the menu then go to ‘Import external data’ and ‘New web query’.

In the URL portion of the web query window enter the URL of the web site you want to pull data from and click Go. I used ‘http://sports.espn.go.com/nba/teams/stats?team=pho’.

Next you may want to setup some options as far as how the import is gonna look. To do this click on the ‘options’ button in the top left corner of the query window.

Excel Web Query Import Options Dialogue

In the options window I chose to return the data using full html formatting.

After clicking ok you should be returned to the query window. Now we need to select the tables of data on the page to return.

Note: If the page layout changes frequently you may want to just return the entire page. In that case skip this step.

To select the table of data simply click on the yellow arrow next to the data and it should turn green.

The last thing before we actually import the data that were going to want to do is to save the web query somewhere.

To save the web query simply click on the save icon next to the options button of the query window.

After clicking import you may see a dialogue box asking you where to put it and to select any additional properties. Some that you may want to play with here would be the ‘overwrite existing cells’ and ‘fill down formulas’.

Ok, so now that we have our basic web query setup, lets have some fun with it.

I have compiled a list of nba teams and their associated 3 character abbreviation used in the URL on espn.com. I’m going to load these into a visual basic combo/edit box and get the web query to update automatically when I select a new team. If you are unfamiliar with how to setup a visual basic combo/edit box please read my tutorial on this found here.

With our teams loaded into a combo box I’m going to open up the saved web query and make some changes that will allow it to accept the three character team abbreviation as a parameter when executing.

To edit he web query file simply right click on it and select ‘edit with notepad’.

WEB
1

http://sports.espn.go.com/nba/teams/stats?team=pho

Selection=2
Formatting=All
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False

To make this web query accept parameters for our selected team we need to replace the ‘pho’ of the URL with ‘[“team”,””]’. The new web query will look like this…

WEB
1

http://sports.espn.go.com/nba/teams/stats?team=[“team”,””]

Selection=2
Formatting=All
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False

Note: You can setup multiple parameters here for any part of the URL

After making the adjustments to your URL save the web query.

Now that we have a dynamic web query ready to be used, we need to first delete our old web query cached in the workbook and use the new one we just created. Once you have deleted the original web query in the workbook, go to Data>>Import External Data>>Import Data. A dialogue box will open asking you to select a data source, browse to the web query you just saved and select it.

If everything worked as planned the ‘Import Data’ box should have the ‘Parameters’ button avialbale.

After setting any properties from the properties box, click on Parameters.

Hacking Excel Web Queries Parameters Select Window

As you can see our ‘team’ parameter is avialble to us now. On the ‘teams’ tab setup we are going to want to have a field that looks up the three character code for the selected team. Then choose the third option on the ‘parameters’ box and browse to that cell.

Hacking Excel Web Queries Parameter Selected Options

You may also wish to check the ‘refresh automatically’ check box so that when you change your selected team, the information returned updates as well.

If everything went well you should be done besides any formatting changes you may need to make…

One thing that I did have to mess with to get to work was in the cell that had the three character code used in the URL. I had to use the TEXT() formula to convert make sure it was in the correct format when being passed to the web query. I’ve found that no matter what you’re passing, (dates, numbers, text, etc…) its always good to use the TEXT() function to make sure the value is being read in the correct format.

Downloads:

Hacking Excel Web Queries Sample File

 

My Job

This is from a recent assignment I submitted. I thought it was relevant and decided to share. Enjoy!

~~

Business Intelligence Developer Position

My position is a Business Intelligence Developer, which means that I build analytical database systems, which allows business people to make better decisions based on the improved visibility into the measures of their business. I first started this career as an analyst building reports in Microsoft Excel. My curious mind and ambitious nature lead me to quicker and more efficient ways of delivering these analyses. Ten years later I have built many relationships, done consulting, presented at many events, and am currently working on my first book. With that context provided, here are my interpretations of how my position measures up against Hackman and Oldham’s Job Characteristics model.

Skill Variety

Business Intelligence is a highly specialized field. This field still contains however, much variety. From a pure technical perspective, being a Business Intelligence Developer can require Relational Database Management Systems (RDBMS) knowledge, Multi-Dimensional Database Design (OLAP), Extract Transform and Load practice (ETL), and presentation layer / report design. Doing the job of a Business Intelligence developer well also requires a solid understanding of the business. This is where Business Intelligence is unique in its attempt to bridge the gap between the business side and the technology side.

Task Identity

In small BI shops, the BI developer may be responsible for designing, building, and delivering the entire solution. In larger organizations, the one I work in, tasks are a bit more segregated however goals are still defined at a team level which encourages the team to work together and gain an understanding of the entire process.

Task Significance

The goal of Business Intelligence is to improve decision making in the company that eventually leads to increased revenue, reduced costs and increased profits. The correlation between BI efforts and these eventual outcomes however, is not always so clear.

Autonomy

Similar to most Business Technology Services (BTS) projects, Business Intelligence traditionally is managed from a program and project approach. Programs define the strategy and overall goal of the BI program while projects make up the individual deliverables all working towards the same goal. Depending on the project management method being used (eg. waterfall or agile) it can be difficult to have much autonomy over organizing your tasks. In my position we are lucky to use an Agile project management methodology that allows the core team, including myself, to have a great say in what tasks are required to do my job well.

Feedback

The most common way feedback is received by the BI team is when something goes wrong with the nightly ETL processing that updates the data in the presentation layer. This feedback of course is always negative however, positive feedback is also received when the data is on time and accurate. There is some gray area here where no feedback is given and the data being presented is assumed to be correct when in fact it is not. This is why we have placed a laser like focus on the quality of our data being presented.

Where to improve

Since I have had a lot of input into the processes and procedures, we have in place currently it is difficult for me to say where we can improve much.  I will note however that the challenges we face with relating task significance and feedback are common among all Business Intelligence programs and can always use some more attention.

Short response to Sanju Bansal about BI and Performance

“The truth is, there’s a good way to model data in a database… but the way business users want to see the data is far different than that.” (Bansal, 2010)

I disagree. This is the exact reason why Data Warehouses need to be modeled using dimensional modeling techniques that are designed around how the business wants to see the data and are setup for fast query performance. With the data already transformed into the format users are expecting to see it with natural hierarchies and relationships built in, queries are simple and much faster.

There is also the potential to pre-calculate all or most of the aggregations into a multi-dimensional database structure. I have done this countless times with Microsoft’s SQL Server Analysis Services and routinely am able to query fact tables with 20M+ rows in milliseconds.

In my opinion BI interfaces trying to make queries faster is a valiant effort however, without a solid dimensional data structure modeled after the way users want to see the data there is only so much they will be able to accomplish.

Centralized Business Intelligence

My approach as a BI professional has always been to organize the data into conformed structures, present that to end users for consumption in their ad-hoc analysis, and only build formal reports and dashboards as necessary. I think the conformed structures provide great consistency for BI across the enterprise no matter who is building it and allow users to get answers to questions faster.

Of course, there is no way IT can keep up with the onslaught of report requests so it makes sense to provide users with the tools to do these themselves. Typically, I would offer up a connection to a cube through Excel and now with the addition of PowerPivot I believe this type of process is going to become even more prevalent.

Some I have talked with have expressed concern of decentralization of BI by tools like PowerPivot however; I see these tools as giving users a sort of proving ground for their BI applications. If an information worker successfully creates a BI application in PowerPivot that becomes widely used across the enterprise IT does not have to guess that formalizing it into a more formal BI application will have value.

Other methods exist that I think are worth for determining the value of a BI application (eg. BI Pathway Method) however I believe this is for more formal BI versus the ad-hoc type of analysis that users clamor for on a day-to-day basis.