Tableau Startup: First Lessons Learned

XAs I mentioned in the February post, I’m new to Tableau, and as the tone of that post implied,enjoying it very much. Tableau is a robust and flexible solution for data delivery. Like Qlikview, which I worked with a while ago, it is supported by outstanding, and free, introductory training and a very active user community.

As I’ve made my first steps in Tableau I’ve been a frequent user community visitor, and generally have gotten the answers I’ve been looking for. However, like any tool there still have been a few surprises. I’ll run down the top few in this post:

  • Measures can have complex logic
  • Big extracts are tricky
  • Changing data sources is really tricky
  • Sorry, there are some things you just can’t do

Hopefully this post helps other novices negotiate those first few steps a bit more easily.

Measures can have complex logic

My first Tableau assignment was really simple, but the second was hideously complex. Just when I thought there was no hope it dawned on me that a calculation definition could include IF and CASE constructs, so similar but subtly different measures on different worksheets displayed on the same dashboard can be different. Here’s an example from the community.

I know. Not exactly rocket surgery, but still it took me a while to catch on and discovering this unlocked a lot of cool possibilities.

Data sources are tricky

Where I work the data is big for a relational DMBS. I inherited a live Tableau data source that returned about 750,000 rows per month over two years history. After a few rounds of thrashing and waiting I replaced the data source with a SQL group-by extract that reduced the row count to about a million rows total for the two years.

If you’re working against a big relational database, and you expect decent response time for a query that’s not super optimized, you’ll want to use an extract rather than a live database connection. If you have Tableau Server available you’ll want to publish the data source to the server. The catch is that you can’t edit the data source on the server, you have to bring the whole thing (data included) down to your workstation to change the SQL, then in order to publish it back you’ll have to run the query on your desktop, then export the thing data and all back to the server to publish a change.

The community site and lots of blog posts talk about a technique to fake out the desktop so the extract only ever happens on the server. Maybe you can get it to work, I couldn’t. A colleague working nearby said that if you “include false values rather than exclude true values it worked for me sometimes.”

Seriously? Come on Tableau, you can do better than this. Give us the ability to edit data sources on the server, and to save them without downloading the data to the workstation.

Changing data sources is tricky

When I switched from the live data source to the summary extract, most of my dimensions and measures turned from green or blue to red, meaning they don’t work any more. When you change data sources, unless your new query has exactly the same result set columns (case sensitive) you have to rebuild anything in your workbook that refers to the changed result set names.

Makes total sense, but you don’t think of that until you see all of the red pills.

Here are Tableau’s instructions for changing data sources. I have used Option 1 several times at this point. If you have “calculation fields, groups, sets, parameters, etc.” in your workbook, and you will, don’t forget step 3. Expect complications just because it’s hard to remember to copy everything for complex workbooks.

Sorry, there are some things you just can’t do

Tableau is so intuitive that it’s easy to forget that there’s software between you and the data, and that software is written in a certain way and has certain limitations. Here’s an example that I banged my head against.

I needed to use US state codes (AL, VA, MA, etc) in two filters, one just the state code and another with state code concatenated with a different string in my workbook. The data source only had a six digit numeric “State County Code” (SSC). The US government publishes a cross-reference spreadsheet translating the SSC to State and County names/codes. I was able to pull State Codes from the spreadsheet using Tableau’s primary group functionality then set up the state filter. Nice! But when I tried to use the state code in the concatenation, no dice. Primary group members can’t be used in calculations.

OK, so a colleague suggested bringing the spreadsheet into a separate data source and blending the data based on the State County Code. Brilliant! But that didn’t work either.

When you reference a character column from a blended secondary data source, Tableau wraps it inside an ATTR() function. You can use that ATTR() enclosed attribute as a dimension/filter (for example, I could use the blended State Code as a filter). But if you use that attribute in any kind of calculation, including a string concatenation, the result becomes a discrete measure that can’t be used in a dimension or filter.

Apparently the latter situation is fixed in version 9.2. Hope so.

Don’t take all this as a knock on Tableau. It is an amazing tool that’s unlocking data visualization capabilities for a lot of businesses. My intent is to help those who have just gotten started and maybe have only seen the training videos.

Perhaps reading this post will help get you past the first few roadblocks you encounter as you begin to deliver your first dashboards. You’ll find that as you get used to the few inevitable quirks and work through them you’ll become accustomed to and very comfortable with the workarounds.

Leave a Reply

Your email address will not be published. Required fields are marked *