MySQL has traditionally lagged behind in support for the SQL standard. Unfortunately, from my experience, MySQL is often used as the sandbox for SQL code challenges and interviews. If you are used to work with Vertica SQL, writing SQL statements in MySQL can be challenging exercises, NOT necessarily in a good way, because many useful features are not supported.
WITH clause
As discussed in this blog post, WITH
clause syntax, also known as Common Table Expressions (CTE), is thankfully supported in Vertica.
In summary, WITH
clause allows us to arrange sub-queries, usually intermediate steps, in a complex SQL query in sequential, logical order.
This will make the complex queries easier to compose and read: we can write steps by steps of the query from top to bottom like a story (i.e., literate programming).
Unfortunately, WITH
clause is not supported by MySQL although this feature has been requested since 2006.
There are work-arounds for MySQL’s lack of CTE, but the easiest way is probably to revert back to using nested subqueries.
Personally, lack of WITH
clause support in MySQL is my greatest hindrance as I often ended up writing queries using WITH
clauses as first draft before rewriting those queries using nested subqueries.
This might appear clumsy in SQL interviews even though writing SQL codes with CTE instead of subqueries is the recommended practice for maintainable code.
Analytic functions
Another regrettable hindrance when working in MySQL is its lack of analytic functions such as ROW_NUMBER
, RANK
and DENSE_RANK
.
Those analytic functions are supported in Vertica.
The difference between these three functions can be a bit subtle, and would be best described in the following example:
1 2 3 4 5 6 7 |
|
The outputs of these functions are only different if there are duplicates in SUM(annual_income)
value, as seen in rows 75-81 in the example output below:
customer_name | SUM | row_number | rank | dense_rank |
---|---|---|---|---|
Theodore R. King | 97444 | 71 | 71 | 71 |
Laura Y. Pavlov | 97417 | 72 | 72 | 72 |
Carla . Garcia | 97371 | 73 | 73 | 73 |
Jack Z. Miller | 97356 | 74 | 74 | 74 |
Steve W. Williams | 97343 | 75 | 75 | 75 |
Lauren Y. Rodriguez | 97343 | 76 | 75 | 75 |
Lucas . Webber | 97318 | 77 | 77 | 76 |
Sarah N. Moore | 97243 | 78 | 78 | 77 |
Lucas O. Li | 97184 | 79 | 79 | 78 |
Doug K. Reyes | 97166 | 80 | 80 | 79 |
Michael . Weaver | 97162 | 81 | 81 | 80 |
Sadly, these useful analytic functions are not supported in MySQL. Fortunately, MySQL supports user variables in SQL queries and we can reproduce those functionalities in MySQL using variables and subqueries as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
|
The MySQL work-around is intentionally generic so that I can adapt it to any use case.
In addition, it intentionally has a single pass (no SET
statements, temporary table) since most SQL code challenges expect a single query.
Finally, note that the above MySQL solution is intentionally incomplete to make it less convoluted.
You need to put that solution in a subquery and SELECT
only relevant columns from it.
As an example, the above code template is used to solve this Rank Scores problem.
In summary, the question asks for DENSE_RANK
functionality to be applied on Score column.
1 2 3 4 5 6 7 8 9 10 |
|
1 2 3 4 5 6 7 8 9 10 |
|
The solution in Vertica SQL would be straight-forward as follows:
1 2 3 |
|
In MySQL, apply the above code template and note that there is no partition clause
to arrive at the following solution:
1 2 3 4 5 6 |
|
Note that the outer SELECT
is used to only expose only columns of interest while the main SQL code is enclosed in a subquery.
Reference
- ROW_NUMBER in MySQL
- DENSE_RANK in MySQL: this link actually shows
RANK
implementation. - Vertica Analytic Functions
- MySQL user variables