Leveraging your Tableau Server to create large Data Extracts

Recently I’ve been doing a lot of work leveraging the Tableau Data Server. This is a great new feature of Tableau Server that allows BI groups to publish standard sets of data to Tableau for analysts to consume. This is a great alternative to traditional approaches which take 100x as long plus don’t allow for rapid prototyping of the data asset.

Although this feature is great it hasn’t always worked the best. The most often problem I run into is creating the extract itself when dealing with “large” data volumes of 20M+ rows. I developed a work-around to leverage the power of the server to create the extract and save your poor laptop from having to crunch all that data for you. Here goes…if you end up trying this please let me know how it works for you or if you have any alternatives. I’d like to use this to feedback to Tableau on how to improve this feature of their product.

1. Create a view in your source database that pulls a small amount of the total rows you want in your extract. In SQL Server I do this using the ‘top’ function. In MySQL you’ll need to use the Limit() function. Before moving to the next step test that querying this view works in a reasonable time frame from your local query environment (eg. SSMS, Toad, SQLYog, etc.) and there are no obvious data issues (nulles, dupes, etc.)

2. Create your extract in Tableau importing all data from the view. Make sure to create whatever hierarchies you want, rename columns so they make sense, correct any data types Tableau messed up, and whatever calculations you know of at this point. Again, before moving on, test that this works relatively well with this subset of data.

3. Publish your data extract to Tableau Server. With the extract now on the server, go and manually run a full refresh. Watch the progress in the Background Tasks Maintenance View. Once it finishes successfully test that you can connect to it and all looks good.

4. Now, go change your view to return all rows. Remove whatever condition you put in to trim down the rows returned. This should allow the extract to pull in the entire data set next time it is refreshed on the Tableau Server.

5. Go back to Tableau Server and execute another full refresh of your extract. This time, the extract should get the entire population of data and the beauty is all the horsepower of your Tableau Server will be used to create the extract instead of your poor laptop.

6. Make sure the extract finishes on the server (this may take a while) and then test again that you can connect via the Tableau Data Server. Lastly, revel in your accomplishment :)

Hope this helps! Please let me know how it works for you if you end up trying it out…

Here are the results of my most recent use of this technique. I started w/ a view only showing 10 rows then changed it behind the scenes to pull in all 35M rows. Works like a charm!

Cheers,
Ben

Customizing Tableau Server Logos and Icons

When deploying Tableau Server for a client, I often find that clients wish to customize their Tableau Server environment with their own branding. The Tableau Server Administration Guide offers some options for customizing Tableau Server however it is limited in what it allows.

Here are the default options that you can customize using tabadmin commands found in the Tableau Server Administration Guide.

Changing the Name

You can customize Tableau Server’s look and feel by customizing the name that
appears in screen tips and messages. To change the name:

  1. Open a command prompt as an administrator and type the following:

  2. 32-bit: cd "C:\Program Files\Tableau\Tableau Server\7.0\bin"
    64-bit: cd "C:\Program Files (x86)\Tableau\Tableau Server\7.0\bin"

  3. Change the name by typing the following:

  4. tabadmin customize name "new_name"

    In the above line, replace “new_name” with the text that you want to appear as
    the name on the server. Example: tabadmin customize “Company
    Server”

  5. Restart the server for the change to take effect by typing:
  6. tabadmin restart

Changing the Logo

You can customize Tableau Server’s look and feel by customizing the logo that appears
on the Tableau Server login page and in the left column of most pages. To change the
logo:

Note: This does not include the site favicon or Tableau Server Icon displayed when viewing a dashboard. More on that below.

  1. Open a command prompt as an administrator and type the following:

  2. 32-bit: cd "C:\Program Files\Tableau\Tableau Server\7.0\bin"
    64-bit: cd "C:\Program Files (x86)\Tableau\Tableau Server\7.0\bin"

  3. Change the logo by typing the following:
  4. tabadmin customize logo "C:\My Pictures\logo.png"

    In the above line, replace “C:\My Pictures\logo.png” with the path and file
    name of the image that you want to appear as the logo on the server. For best
    results, use an image that is 125 pixels x 35 pixels in size. The image can be a
    .png, jpg, or .gif file.

  5. Restart the server for the change to take effect by typing:
  6. tabadmin restart

 

Customizing the Tableau Server icons

(Not officially supported by Tableau)
In this example E is the root of where I installed Tableau. I do this so that I can have a faster disk with more storage capacity than the C drive. I recommend this practice for all instances where you can afford it.

E:\Tableau\Tableau Server\7.0\server.ico
E:\Tableau\Tableau Server\7.0\wgserver\public\favicon.ico
E:\Tableau\Tableau Server\7.0\wgserver\public\v_70001201131054\favicon.ico
E:\Tableau\Tableau Server\7.0\wgserver\public\v_70001201131054\images\server-logo.gif
E:\Tableau\Tableau Server\7.0\wgserver\public\v_70001201131054\images\tableau-icon.gif

If you try this solution and find additional images that need to be replace, or if anyone wants to make a script for this please reply in the comments with what you found!

Hacking Excel Web Queries

If you are unfamiliar with web quries its ok, their easy, and I’m gonna explain how they work and how to manipulate them beyond their intended use to increase their flexability.

Breifly, a web query is a great way to automate the mundane task of going to a web page and copying the data into an excel sheet. Basically you tell excel where to look (web page) and what to copy (tables of data) and Excel will automatically import the data into a worksheet for you.

Lets get started by creating a simple web query. I decided to use some basketball stats from espn.com. So to create my basic web query I open up a new workbook, find the ‘Data’ item from the menu then go to ‘Import external data’ and ‘New web query’.

In the URL portion of the web query window enter the URL of the web site you want to pull data from and click Go. I used ‘http://sports.espn.go.com/nba/teams/stats?team=pho’.

Next you may want to setup some options as far as how the import is gonna look. To do this click on the ‘options’ button in the top left corner of the query window.

Excel Web Query Import Options Dialogue

In the options window I chose to return the data using full html formatting.

After clicking ok you should be returned to the query window. Now we need to select the tables of data on the page to return.

Note: If the page layout changes frequently you may want to just return the entire page. In that case skip this step.

To select the table of data simply click on the yellow arrow next to the data and it should turn green.

The last thing before we actually import the data that were going to want to do is to save the web query somewhere.

To save the web query simply click on the save icon next to the options button of the query window.

After clicking import you may see a dialogue box asking you where to put it and to select any additional properties. Some that you may want to play with here would be the ‘overwrite existing cells’ and ‘fill down formulas’.

Ok, so now that we have our basic web query setup, lets have some fun with it.

I have compiled a list of nba teams and their associated 3 character abbreviation used in the URL on espn.com. I’m going to load these into a visual basic combo/edit box and get the web query to update automatically when I select a new team. If you are unfamiliar with how to setup a visual basic combo/edit box please read my tutorial on this found here.

With our teams loaded into a combo box I’m going to open up the saved web query and make some changes that will allow it to accept the three character team abbreviation as a parameter when executing.

To edit he web query file simply right click on it and select ‘edit with notepad’.

WEB
1

http://sports.espn.go.com/nba/teams/stats?team=pho

Selection=2
Formatting=All
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False

To make this web query accept parameters for our selected team we need to replace the ‘pho’ of the URL with ‘[“team”,””]’. The new web query will look like this…

WEB
1

http://sports.espn.go.com/nba/teams/stats?team=[“team”,””]

Selection=2
Formatting=All
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False

Note: You can setup multiple parameters here for any part of the URL

After making the adjustments to your URL save the web query.

Now that we have a dynamic web query ready to be used, we need to first delete our old web query cached in the workbook and use the new one we just created. Once you have deleted the original web query in the workbook, go to Data>>Import External Data>>Import Data. A dialogue box will open asking you to select a data source, browse to the web query you just saved and select it.

If everything worked as planned the ‘Import Data’ box should have the ‘Parameters’ button avialbale.

After setting any properties from the properties box, click on Parameters.

Hacking Excel Web Queries Parameters Select Window

As you can see our ‘team’ parameter is avialble to us now. On the ‘teams’ tab setup we are going to want to have a field that looks up the three character code for the selected team. Then choose the third option on the ‘parameters’ box and browse to that cell.

Hacking Excel Web Queries Parameter Selected Options

You may also wish to check the ‘refresh automatically’ check box so that when you change your selected team, the information returned updates as well.

If everything went well you should be done besides any formatting changes you may need to make…

One thing that I did have to mess with to get to work was in the cell that had the three character code used in the URL. I had to use the TEXT() formula to convert make sure it was in the correct format when being passed to the web query. I’ve found that no matter what you’re passing, (dates, numbers, text, etc…) its always good to use the TEXT() function to make sure the value is being read in the correct format.

Downloads:

Hacking Excel Web Queries Sample File

 

Excel Tip – Defaulting to Non-Compatibility Mode

I was recently fighting with Excel because I was sick of it wanting to run in Compatibility Mode by default. Everything I found out there said that you simply need to change the option for Save-As to be the 2007 format. This only seemed to work on the workbooks I had open. If I closed and re-opened Excel it would be back in Compatibility Mode.

Remembering from my old days of hacking Excel 97 workbooks and creating templates with Macro’s I recalled a way to change the default workbook that Excel would use when starting up. This is where the ‘ah-ha’ happened.

Below are the steps you may use to force Excel out of Compatibility Mode when it starts up if changing the aforementioned Save-As method does not work for you.

1. Open Excel (you should get Book1 [Compatibility Mode] by default)
2. Save-As (F12) and save it as Book1.xlsx (the 2007 format) in the following location:

C:\Program Files\Microsoft Office\Office12\XLSTART

3. Restart Excel

That’s it! Excel by default will look in this location (based on where you installed Excel) for the default workbook it uses when starting up. Since you have saved it in the 2007 format Excel will no longer run in Compatibility Mode by default. It will only do this when opening .xls files that were saved in Compatibility Mode. For those, just save as .xlsx and you should be good.

Hope this helps!

Cheers,
Ben