一道有趣的 MySQL 小题目

在开源中国每日动弹中看到这么一道题目,蛮有意思,还学到了一个新的 MySQL 语法:CTE。

题目

原始数据

SELECT 'A' Name, 8 Score UNION
SELECT 'B' Name, 7 Score UNION
SELECT 'C' Name, 6 Score UNION
SELECT 'D' Name, 5 Score
+------+-------+
| Name | Score |
+------+-------+
| A    |     8 |
| B    |     7 |
| C    |     6 |
| D    |     5 |
+------+-------+

目标数据

+------+-------+------+-------+
| name | score | name | score |
+------+-------+------+-------+
| A    |     8 | B    |     7 |
| C    |     6 | D    |     5 |
+------+-------+------+-------+

过程

尝试 MySQL CTE:

WITH t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score
)
SELECT * FROM t1;
WITH t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score
), t2 AS (
    SELECT t1.*, @rownum := @rownum + 1 AS row_ FROM t1, (SELECT @rownum := 0) t21
)
SELECT * FROM t2;
WITH t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score
), t2 AS (
    SELECT t1.*, @rownum := @rownum + 1 AS row_ FROM t1, (SELECT @rownum := 0) t21
)
SELECT t2.*, @rownum2 := @rownum2 + 1 AS r FROM t2, (SELECT @rownum2 := 0) a1 WHERE MOD(t2.row_, 2) = 0
UNION
SELECT t2.*, @rownum3 := @rownum3 + 1 AS r FROM t2, (SELECT @rownum3 := 0) b1 WHERE MOD(t2.row_, 2) != 0;
WITH t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score
), t2 AS (
    SELECT t1.*, @rownum := @rownum + 1 AS row_ FROM t1, (SELECT @rownum := 0) t21
)
SELECT a.name, a.score, b.name, b.score
FROM (SELECT t2.*, @rownum2 := @rownum2 + 1 AS r FROM t2, (SELECT @rownum2 := 0) a1 WHERE MOD(t2.row_, 2) != 0) a,
     (SELECT t2.*, @rownum3 := @rownum3 + 1 AS r FROM t2, (SELECT @rownum3 := 0) b1 WHERE MOD(t2.row_, 2) = 0) b
WHERE a.r = b.r;

如果行数是奇数呢?

WITH t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score UNION
    SELECT 'E' Name, 4 Score
), t2 AS (
    SELECT t1.*, @rownum := @rownum + 1 AS row_ FROM t1, (SELECT @rownum := 0) t21
)
SELECT a.name, a.score, b.name, b.score
FROM (SELECT t2.*, @rownum2 := @rownum2 + 1 AS r FROM t2, (SELECT @rownum2 := 0) a1 WHERE MOD(t2.row_, 2) != 0) a,
     (SELECT t2.*, @rownum3 := @rownum3 + 1 AS r FROM t2, (SELECT @rownum3 := 0) b1 WHERE MOD(t2.row_, 2) = 0) b
WHERE a.r = b.r;
+------+-------+------+-------+
| name | score | name | score |
+------+-------+------+-------+
| A    |     8 | A    |     8 |
| C    |     6 | C    |     6 |
| E    |     4 | E    |     4 |
+------+-------+------+-------+
WITH t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score UNION
    SELECT 'E' Name, 4 Score
), t2 AS (
    SELECT t1.*, @rownum := @rownum + 1 AS row_ FROM t1, (SELECT @rownum := 0) t21
)
SELECT *
FROM (SELECT t2.*, @rownum2 := @rownum2 + 1 AS r FROM t2, (SELECT @rownum2 := 0) a1 WHERE MOD(t2.row_, 2) != 0) a
LEFT JOIN (SELECT t2.*, @rownum3 := @rownum3 + 1 AS r FROM t2, (SELECT @rownum3 := 0) b1 WHERE MOD(t2.row_, 2) = 0) b
ON a.r = b.r;

结果非常诡异:

+------+-------+------+------+------+-------+------+------+
| Name | Score | row_ | r    | Name | Score | row_ | r    |
+------+-------+------+------+------+-------+------+------+
| A    |     8 |    1 |    1 | A    |     8 |    6 |    1 |
| C    |     6 |    3 |    2 | C    |     6 |    8 |    2 |
| E    |     4 |    5 |    3 | E    |     4 |   10 |    3 |
+------+-------+------+------+------+-------+------+------+

真实表 LEFT JOIN 实验

结论:不是我用错连接了,LEFT JOIN 没有问题。
可能是 MariaDB 的 CTE 实现方式有什么问题。

CREATE DATABASE `test` /*!40100 COLLATE 'utf8mb4_unicode_ci' */;
CREATE TABLE `test`.`test20190419` (
    `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `Name` VARCHAR(50) NOT NULL,
    `Score` INT(10) UNSIGNED NOT NULL,
    PRIMARY KEY (`id`)
) COLLATE='utf8mb4_unicode_ci' ENGINE=InnoDB;
INSERT INTO `test`.`test20190419` (`Name`, `Score`) VALUES ('A', 9);
INSERT INTO `test`.`test20190419` (`Name`, `Score`) VALUES ('B', 8);
INSERT INTO `test`.`test20190419` (`Name`, `Score`) VALUES ('C', 7);
INSERT INTO `test`.`test20190419` (`Name`, `Score`) VALUES ('D', 6);
INSERT INTO `test`.`test20190419` (`Name`, `Score`) VALUES ('E', 5);
SELECT a.name, a.score, b.name, b.score
FROM (
    SELECT `test`.`test20190419`.*, @rownum2 := @rownum2 + 1 AS r
    FROM `test`.`test20190419`, (
    SELECT @rownum2 := 0) a1
    WHERE MOD(id, 2) != 0
) a LEFT JOIN (
    SELECT `test`.`test20190419`.*, @rownum3 := @rownum3 + 1 AS r
    FROM `test`.`test20190419`, (
    SELECT @rownum3 := 0) b1
    WHERE MOD(id, 2) = 0
) b ON a.r = b.r;
+------+-------+------+-------+
| name | score | name | score |
+------+-------+------+-------+
| A    |     9 | B    |     8 |
| C    |     7 | D    |     6 |
| E    |     5 | NULL |  NULL |
+------+-------+------+-------+

找到原因

去掉奇偶判断逻辑,之后:

WITH t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score UNION
    SELECT 'E' Name, 4 Score
), t2 AS (
    SELECT t1.*, @rownum := @rownum + 1 AS row_ FROM t1, (SELECT @rownum := 0) t21
)
SELECT *
FROM (SELECT t2.*, @rownum2 := @rownum2 + 1 AS r FROM t2, (SELECT @rownum2 := 0) a1) a
LEFT JOIN (SELECT t2.*, @rownum3 := @rownum3 + 1 AS r FROM t2, (SELECT @rownum3 := 0) b1) b
ON a.r = b.r;
+------+-------+------+------+------+-------+------+------+
| Name | Score | row_ | r    | Name | Score | row_ | r    |
+------+-------+------+------+------+-------+------+------+
| A    |     8 |    1 |    1 | A    |     8 |    6 |    1 |
| B    |     7 |    2 |    2 | B    |     7 |    7 |    2 |
| C    |     6 |    3 |    3 | C    |     6 |    8 |    3 |
| D    |     5 |    4 |    4 | D    |     5 |    9 |    4 |
| E    |     4 |    5 |    5 | E    |     4 |   10 |    5 |
+------+-------+------+------+------+-------+------+------+

大胆猜测:

LEFT JOIN 连表操作导致 WITH 重新执行,最后 rownum 变量...悲剧了。

WITH t0 AS (
    SELECT @rownum := 0
), t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score UNION
    SELECT 'E' Name, 4 Score
), t2 AS (
    SELECT t1.*, @rownum := @rownum + 1 AS row_ FROM t1
)
SELECT * FROM t2;

这么写,变量不起作用:

+------+-------+------+
| Name | Score | row_ |
+------+-------+------+
| A    |     8 | NULL |
| B    |     7 | NULL |
| C    |     6 | NULL |
| D    |     5 | NULL |
| E    |     4 | NULL |
+------+-------+------+

感慨:我对 SQL 的执行方式还真是一无所知。

下面是开源中国上贴的图(看这 SQL 写法,应该是 Oracle 吧,反正不是 MySQL,MongoDB, SQLite,PostgreSQL)

  1. 除了连表操作,没有别的方式,不像 Oracle 有个 (+) 的写法。
  2. 算行号,除了使用变量,也没有别的方式,不像 Oracle 有个 rownum 关键字。
  3. Oracle 才有这个 FROM DUAL 的说法,一张虚拟表。


解决方法

MariaDB 支持 row_number() over() 算行号。
我不知道是从哪个版本开始支持的,反正在我本地的 10.3 版本中运行良好:
PS:MySQL 最新版本(8.0.15)也支持,测试通过。

WITH t1 AS (
    SELECT 'A' Name, 8 Score UNION
    SELECT 'B' Name, 7 Score UNION
    SELECT 'C' Name, 6 Score UNION
    SELECT 'D' Name, 5 Score UNION
    SELECT 'E' Name, 4 Score
), t2 AS (
    SELECT t1.*, row_number() over() AS row_ FROM t1, (SELECT @rownum := 0) t21
)
SELECT a.name, a.score, b.name, b.score
FROM (SELECT t2.*, @rownum2 := @rownum2 + 1 AS r FROM t2, (SELECT @rownum2 := 0) a1 WHERE MOD(t2.row_, 2) != 0) a
LEFT JOIN (SELECT t2.*, @rownum3 := @rownum3 + 1 AS r FROM t2, (SELECT @rownum3 := 0) b1 WHERE MOD(t2.row_, 2) = 0) b
ON a.r = b.r;

知识点

CTE (Common Table Expression)

之前的方案

  1. 派生表语句,写在 FROM 后面
    由 DBMS 自动管理的临时表
  2. 数据库视图

CTE 语法

WITH
    cte1 AS (SELECT a, b FROM table1),
    cte2 AS (SELECT c, d FROM table2)
SELECT b, d FROM cte1 JOIN cte2
WHERE cte1.a = cte2.c;

# Recursive Common Table Expression
WITH RECURSIVE cte (n) AS
(
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM cte WHERE n < 5
)
SELECT * FROM cte;
  1. column_list 可以省略

MySQL/MariaDB 支持情况

MySQL

8.0 之后支持,参考:

  • http://mysqlserverteam.com/mysql-8-0-labs-recursive-common-table-expressions-in-mysql-ctes/
  • http://mysqlserverteam.com/mysql-8-0-labs-recursive-common-table-expressions-in-mysql-ctes-part-two-how-to-generate-series/
  • http://mysqlserverteam.com/mysql-8-0-labs-recursive-common-table-expressions-in-mysql-ctes-part-three-hierarchies/
  • https://dev.mysql.com/doc/refman/8.0/en/with.html
MariaDB
  • https://mariadb.com/kb/en/library/with/

Common Table Expression WITH was introduced in MariaDB 10.2.1.
Recursive WITH has been supported since MariaDB 10.2.2.

参考资料与拓展阅读