MySQL has traditionally lagged behind in support for the SQL standard. Unfortunately, from my experience, MySQL is often used as the sandbox for SQL code challenges and interviews. If you are used to work with Vertica SQL, writing SQL statements in MySQL can be challenging exercises, NOT necessarily in a good way, because many useful features are not supported.
As discussed in this blog post,
WITH clause syntax, also known as Common Table Expressions (CTE), is thankfully supported in Vertica.
WITH clause allows us to arrange sub-queries, usually intermediate steps, in a complex SQL query in sequential, logical order.
This will make the complex queries easier to compose and read: we can write steps by steps of the query from top to bottom like a story (i.e., literate programming).
WITH clause is not supported by MySQL although this feature has been requested since 2006.
There are work-arounds for MySQL’s lack of CTE, but the easiest way is probably to revert back to using nested subqueries.
Personally, lack of
WITH clause support in MySQL is my greatest hindrance as I often ended up writing queries using
WITH clauses as first draft before rewriting those queries using nested subqueries.
This might appear clumsy in SQL interviews even though writing SQL codes with CTE instead of subqueries is the recommended practice for maintainable code.
Another regrettable hindrance when working in MySQL is its lack of analytic functions such as
Those analytic functions are supported in Vertica.
The difference between these three functions can be a bit subtle, and would be best described in the following example:
1 2 3 4 5 6 7
The outputs of these functions are only different if there are duplicates in
SUM(annual_income) value, as seen in rows 75-81 in the example output below:
|Theodore R. King||97444||71||71||71|
|Laura Y. Pavlov||97417||72||72||72|
|Carla . Garcia||97371||73||73||73|
|Jack Z. Miller||97356||74||74||74|
|Steve W. Williams||97343||75||75||75|
|Lauren Y. Rodriguez||97343||76||75||75|
|Lucas . Webber||97318||77||77||76|
|Sarah N. Moore||97243||78||78||77|
|Lucas O. Li||97184||79||79||78|
|Doug K. Reyes||97166||80||80||79|
|Michael . Weaver||97162||81||81||80|
Sadly, these useful analytic functions are not supported in MySQL. Fortunately, MySQL supports user variables in SQL queries and we can reproduce those functionalities in MySQL using variables and subqueries as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
The MySQL work-around is intentionally generic so that I can adapt it to any use case.
In addition, it intentionally has a single pass (no
SET statements, temporary table) since most SQL code challenges expect a single query.
Finally, note that the above MySQL solution is intentionally incomplete to make it less convoluted.
You need to put that solution in a subquery and
SELECT only relevant columns from it.
As an example, the above code template is used to solve this Rank Scores problem.
In summary, the question asks for
DENSE_RANK functionality to be applied on Score column.
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
The solution in Vertica SQL would be straight-forward as follows:
1 2 3
In MySQL, apply the above code template and note that there is no
partition clause to arrive at the following solution:
1 2 3 4 5 6
Note that the outer
SELECT is used to only expose only columns of interest while the main SQL code is enclosed in a subquery.