Two key interview questions for SQL developer candidates

Frequently in my career I’ve selected or helped select ETL and reporting professionals who need SQL skills. For some of those opportunities, placement firms returned resumes with interminable, and nearly identical, lists of technical achievements with excruciating unnecessary detail (paraphrasing: “Wrote SELECT statements using GROUP BY”, “Applied both inner and outer joins”). Before interviewing we typically ranked candidates in order of preference based on resumes. Candidates’ interview success bore little relation to resume-based rankings.

With some candidates I’ve encountered consistent pauses after fact questions, sometimes accompanied by keyboard clicks. They obviously used “think time” to look up answers on the internet. As a result, “fact” questions didn’t distinguish one candidate from another.

On the other hand, open ended questions worked well. I’ve written before that interviews should ask opinion rather than fact questions. Open ended questions or thought exercise, as opposed to fact questions, assess SQL skill level, are hard to quickly look up on the web, and have the added benefit of demonstrating a candidate’s reasoning and communication skills. Here are two of my favorite examples:

What’s your approach to troubleshooting a slow query?
Why would you ever join to the same table twice in the same SELECT statement?

What’s your approach to troubleshooting a slow query?

I’ve mentioned this one before — it’s one of my favorites because it tests a candidate’s understanding of how queries and databases work without punishing more junior candidates who’ve learned to be successful without formal training.

Query performance is very situational, depending on the data being accessed, the database management system (DBMS) involved, and the way the query is written, so there are many different ways to “correctly” answer.

My preference is for candidates to emphasize the query in relation to the data. For a fictional example, a candidate might say, “I review all of the joins to make sure there’s no missing join key causing too much data to be returned. One time, I forgot to include order line number in a search for bicycle sales; the query returned millions of irrelevant results and took forever.” This indicates an understanding that inefficient queries are often incorrect queries, and a basic understanding that knowledge of the data is key to resolving query performance.

A candidate should also have some knowledge of the specifics of your DBMS. Experienced junior candidates should be conversant in the basics of how to operate in the DBMS toolset. Mid-level and senior candidates should understand query optimization concepts and be familiar at the appropriate level with execution plans specific to the DBMS. In addition, mid and senior level candidates should understand common complexities in the given DBMS environment, like spooling in Teradata, or dynamic SQL in SQL Server.

At any level, the candidate should show that they have a process for addressing poor query performance, and be able to talk you through that process.

Why would you ever join to the same table twice in the same SELECT statement?

This question requires a candidate to create or recount an example query, and explain the rationale for the query logic.

Here are two examples of self joins that a candidate might cite:

A customer table might include the key of a customer’s spouse, who is also recorded in the same table as a customer. This post shows how to list customers and spouses on the same table by joining to the customer table twice.
Say a table includes many orders per customer. The query to show latest order per customer might include a join to an embedded query selecting the max order date for the given customer:

SELECT
   customer_id, 
   order_date,
   [more columns]
FROM
   customer cust
JOIN
   (
    SELECT customer_ID, MAX(date) AS `date` FROM customer
   ) latest
ON cust.date = latest.date

This question might stump junior candidates. Senior candidates should have encountered an example once or twice in their careers and have an example in their toolbag. In either case, a successful candidate should be able to think through an example that, even if not correct, demonstrates communication and reasoning ability. (A great follow up question for those offering the second example would be to ask about efficiency considerations, mentioned in this post.)

While, in my experience, fact questions are of little value in choosing qualified candidates, thought questions like these can help identify candidates who can think through problems and explain solutions, and who have the technical nous to address challenges they’ll face as developers in your environment.

Bob Lambert