If you are unfamiliar with web queries, it’s OK. They’re easy, and I’m going to explain how they work and how to manipulate them beyond their intended use to increase their flexibility.
Excel Web Queries Overview
Briefly, an Excel Web Query is a great way to automate the mundane task of going to a web page and copying the data into an Excel sheet. Basically you tell Excel where to look (web page) and what to copy (tables of data) and Excel will automatically import the data into a worksheet for you. These are what we call Excel Web Queries.
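If it helps to see the idea outside of Excel, here is a rough Python sketch of the same "where to look, what to copy" step. It is not how Excel does it internally; it only pulls the rows out of any tables in a page. The sample page and its numbers are made up, and the sketch assumes simple, non-nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the rows of every <table> found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def extract_tables(html):
    """Return every table in the page as a list of rows of cell text."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables


# Made-up sample page standing in for a real stats page.
SAMPLE_PAGE = """
<html><body>
<table>
  <tr><th>Player</th><th>PPG</th></tr>
  <tr><td>Player A</td><td>20.1</td></tr>
  <tr><td>Player B</td><td>15.4</td></tr>
</table>
</body></html>
"""

tables = extract_tables(SAMPLE_PAGE)
for row in tables[0]:
    print(row)
```

Each table comes back as a list of rows, which is roughly the shape of data that ends up in the worksheet cells.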
Creating Excel Web Queries
Let’s get started by creating a simple Excel Web Query. I decided to use some basketball stats from espn.com. To create my basic Excel Web Query, I open a new workbook, find the ‘Data’ item in the menu, then go to ‘Import external data’ and ‘New web query’.
In the URL portion of the Excel Web Query window enter the URL of the web site you want to pull data from and click Go. I used ‘http://sports.espn.go.com/nba/teams/stats?team=pho’.
Next you may want to set up some options for how the import is going to look. To do this, click on the ‘options’ button in the top left corner of the Excel Web Query window.
In the Excel Web Query options window I chose to return the data using full HTML formatting.
After clicking OK you should be returned to the Excel Web Query window. Now we need to select the tables of data on the page to return.
Note: If the page layout changes frequently you may want to just return the entire page. In that case skip this step.
To select the table of data simply click on the yellow arrow next to the data and it should turn green.
The last thing we’re going to want to do before we actually import the data is save the Excel Web Query somewhere.
To save the web query simply click on the save icon next to the options button of the query window.
After clicking import you may see a dialogue box asking you where to put it and to select any additional properties. Some that you may want to play with here would be the ‘overwrite existing cells’ and ‘fill down formulas’.
OK, so now that we have our basic web query set up, let’s have some fun with it.
Hacking Excel Web Queries
I have compiled a list of NBA teams and the three-character abbreviation espn.com uses for each one in the URL. I’m going to load these into a Visual Basic combo/edit box and get the web query to update automatically when I select a new team. If you are unfamiliar with how to set up a Visual Basic combo/edit box, please read my tutorial on this found here.
With our teams loaded into a combo box I’m going to open up the saved Excel Web Query and make some changes that will allow it to accept the three character team abbreviation as a parameter when executing.
To edit the Excel Web Query file simply right click on it and select ‘edit with notepad’.
WEB
1
sports.espn.go.com/nba/teams/stats?team=pho
Selection=2
Formatting=All
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False
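To make the layout of that file easier to see, here is a small Python sketch that splits it into its parts. It assumes the layout shown in the listing above: a type line, a version line, the URL, then key=value options. It also assumes there is no POST data line, so treat it as an illustration rather than a complete .iqy parser. The function name `parse_iqy` is made up for this example.

```python
def parse_iqy(text):
    """Split the text of a web query file into type, version, URL and options."""
    # Blank lines carry no information in this simple layout, so drop them.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    query = {
        "type": lines[0],      # "WEB" in the listing
        "version": lines[1],   # "1" in the listing
        "url": lines[2],
        "options": {},
    }
    for line in lines[3:]:
        key, _, value = line.partition("=")
        query["options"][key] = value
    return query


# The saved query from the article, copied as text.
SAVED_QUERY = """WEB
1
sports.espn.go.com/nba/teams/stats?team=pho
Selection=2
Formatting=All
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False
"""

query = parse_iqy(SAVED_QUERY)
print(query["url"])
print(query["options"]["Selection"])
```

The only line we care about for the hack is the URL; everything after it is just the import options chosen earlier.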
To make this Excel Web Query accept a parameter for our selected team, we need to replace the ‘pho’ in the URL with ["team",""] (square brackets and straight double quotes, typed exactly like that). The new web query will look like this…
WEB
1
sports.espn.go.com/nba/teams/stats?team=["team",""]
Selection=2
Formatting=All
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False
Note: You can set up multiple parameters here for any part of the URL.
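Here is a rough Python sketch of what the parameter placeholder amounts to: a ["name","prompt"] token in the URL, written with plain square brackets and straight double quotes, gets swapped for a value when the query runs. This is my own illustration of the idea, not Excel's actual code. The function names, the second URL, and the "year" parameter are hypothetical.

```python
import re

# Matches a placeholder of the form ["name","prompt text"].
PARAM_PATTERN = re.compile(r'\["([^"]+)","([^"]*)"\]')


def parameter_names(url_template):
    """List the parameter names found in a URL template, in order."""
    return [match.group(1) for match in PARAM_PATTERN.finditer(url_template)]


def fill_parameters(url_template, values):
    """Replace every placeholder with the matching entry from values."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for parameter {name!r}")
        return str(values[name])

    return PARAM_PATTERN.sub(replace, url_template)


# The URL line from the edited query file.
template = 'sports.espn.go.com/nba/teams/stats?team=["team",""]'
print(parameter_names(template))
print(fill_parameters(template, {"team": "pho"}))

# Hypothetical URL with two parameters, to show that more than one works.
two_params = 'example.com/stats?team=["team",""]&year=["year","Season year"]'
print(fill_parameters(two_params, {"team": "pho", "year": 2010}))
```

Filling in "pho" gives back the same URL the original query used, which is a handy sanity check that the placeholder sits in the right spot.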
After making the adjustments to your URL save the Excel Web Query.
Now that we have a dynamic Excel Web Query ready to be used, we first need to delete the old Excel Web Query cached in the workbook and use the new one we just created. Once you have deleted the original web query in the workbook, go to Data>>Import External Data>>Import Data. A dialogue box will open asking you to select a data source; browse to the web query you just saved and select it.
If everything worked as planned the ‘Import Data’ box should have the ‘Parameters’ button available.
After setting any properties from the properties box, click on Parameters.
As you can see, our ‘team’ parameter is available to us now. On the ‘teams’ tab, we are going to want a cell that looks up the three-character code for the selected team. Then choose the third option in the ‘parameters’ box and browse to that cell.
You may also wish to check the ‘refresh automatically’ check box so that when you change your selected team, the information returned updates as well.
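The lookup cell plus the automatic refresh boil down to "when the selection changes, find the code and re-run the query with it". A minimal Python sketch of that flow is below. Only the ‘pho’ code comes from this article; the second table entry is a placeholder, and `fake_refresh` merely builds the URL instead of fetching anything.

```python
# Team name -> code used in the URL. Only "pho" is taken from the article.
TEAM_CODES = {
    "Phoenix": "pho",
    "Example City": "exc",  # hypothetical placeholder entry
}


def lookup_code(team_name):
    """Play the role of the worksheet cell that looks up the team's code."""
    return TEAM_CODES[team_name]


def on_team_selected(team_name, refresh):
    """Look up the code for the chosen team and re-run the query with it."""
    return refresh(lookup_code(team_name))


def fake_refresh(code):
    # Stand-in for the web query refresh: it only builds the URL.
    return "sports.espn.go.com/nba/teams/stats?team=" + code


print(on_team_selected("Phoenix", fake_refresh))
```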
If everything went well you should be done besides any formatting changes you may need to make…
One thing that I did have to mess with to get this to work was the cell that holds the three-character code used in the URL. I had to use the TEXT() function to make sure the value was in the correct format when being passed to the web query. I’ve found that no matter what you’re passing (dates, numbers, text, etc.), it’s always good to use the TEXT() function to make sure the value is being read in the correct format.
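Dates are a good example of why the format matters. Excel stores a date as a serial number, so if the underlying value rather than a formatted string ended up in the URL, the site would get something like 40179 instead of a readable date. The Python sketch below mimics the difference. The URL and the "date" parameter are hypothetical, and the conversion assumes Excel's default 1900 date system with serials later than February 1900.

```python
from datetime import date, timedelta

# Day zero that makes modern serials line up in the 1900 date system.
EXCEL_EPOCH = date(1899, 12, 30)


def serial_to_date(serial):
    """Convert an Excel date serial number to a Python date."""
    return EXCEL_EPOCH + timedelta(days=serial)


raw_value = 40179  # what the cell holds underneath for 1 January 2010
formatted = serial_to_date(raw_value).strftime("%Y-%m-%d")

# Hypothetical URL, only to contrast the two values.
print("example.com/report?date=" + str(raw_value))
print("example.com/report?date=" + formatted)
```

The second line is roughly what wrapping the cell in something like TEXT(A1,"yyyy-mm-dd") is meant to guarantee.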
Downloads:
Hacking Excel Web Queries Sample File
I run a fantasy golf league and each golfer’s points are their prize $ for each tournament. Is there a way to automatically get this information onto an Excel spreadsheet?
Hi there, these days I would probably use a Google Spreadsheet that you can share w/ folks online. It’s all free, just search for Google Docs and you’ll find a link.
Excel is not really a database or a tool for the web. For a better tool, consider cmas.systems. To really automate, you need CMAS.
I recommend www.listly.io/.
It makes web page to Excel in seconds, and finds the CSS-path for list-like contents.
You can extract and download the contents without a parsing code.
Hey Ben, I appreciate you provisioning this page. I have used MSOffice for two decades and recently bought a Mac. Office is not optimized for Mac; in fact, it’s safe to say it offers almost none of the useful features I am accustomed to. That said, when connecting to HTML websites, I expect navigation to the Internet, but that is not an option. I have access to local machine repositories only, while simultaneously connected to the Internet via multiple browsers. Excel cannot ‘see’ the Internet, I think. I’ve asked this question a dozen times of Microsoft and never hear back, and the stores have no idea. Any thoughts? Thank you, Ben.