Why is a correlated scalar subquery 10x slower in MySQL than in PG?

Posted by Anonymous » 03 Oct 2025, 09:11

Let's create a table with 3000 rows:
create table tt(id int, txt text);

insert into tt
with recursive r(id) as
(select 1 union all select id + 1 from r where id < 3e3)
select id, concat('name', id)
from r;
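A note for anyone reproducing the setup: in MySQL 8.0 the recursive CTE above may stop with an error once it exceeds the `cte_max_recursion_depth` limit, which defaults to 1000. A minimal sketch, assuming that limit is what you hit; the value 5000 is an arbitrary choice, and PostgreSQL needs no equivalent setting:

```sql
-- MySQL only: raise the recursion limit for this session (default is 1000),
-- then re-run the insert above.
set session cte_max_recursion_depth = 5000;

-- Sanity check after loading: the CTE generates ids 1..3000,
-- so this should report 3000 rows.
select count(*) from tt;
```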
The same query performs very differently in the two databases:
select sum(id),
sum((select count(*)
from tt t1
where t1.id = t2.id)) cnt
from tt t2

mysql
mysql> explain analyze
-> select sum(id), sum((select count(*) from tt t1 where t1.id = t2.id)) cnt
-> from tt t2\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: sum(t2.id), sum((select #2)) (cost=602 rows=1) (actual time=7542..7542 rows=1 loops=1)
-> Table scan on t2 (cost=302 rows=3000) (actual time=0.025..2.75 rows=3000 loops=1)
-> Select #2 (subquery in projection; dependent)
-> Aggregate: count(0) (cost=62.5 rows=1) (actual time=2.51..2.51 rows=1 loops=3000)
-> Filter: (t1.id = t2.id) (cost=32.5 rows=300) (actual time=1.25..2.51 rows=1 loops=3000)
-> Table scan on t1 (cost=32.5 rows=3000) (actual time=0.00256..2.31 rows=3000 loops=3000)

1 row in set, 1 warning (7.54 sec)

pg
postgres=# explain analyze
postgres-# select sum(id), sum((select count(*) from tt t1 where t1.id = t2.id)) cnt
postgres-# from tt t2;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate (cost=163599.50..163599.51 rows=1 width=40) (actual time=684.339..684.340 rows=1 loops=1)
-> Seq Scan on tt t2 (cost=0.00..47.00 rows=3000 width=4) (actual time=0.013..0.223 rows=3000 loops=1)
SubPlan 1
-> Aggregate (cost=54.50..54.51 rows=1 width=8) (actual time=0.227..0.227 rows=1 loops=3000)
-> Seq Scan on tt t1 (cost=0.00..54.50 rows=1 width=0) (actual time=0.113..0.223 rows=1 loops=3000)
Filter: (id = t2.id)
Rows Removed by Filter: 2999
Planning Time: 0.663 ms
Execution Time: 684.512 ms
(9 rows)
As you can see, the difference is ~7.0 seconds vs ~0.7 seconds.
In both cases the subquery is not unnested and is executed 3000 times.
But in MySQL one execution takes 2+ ms, while in PG it takes about 0.2 ms. The question is not how to make the query faster.
Obviously, we can create an index or rewrite the query as an explicit join:
select sum(t1.id), sum(cnt) cnt
from tt t2
join (select id, sum(1) cnt from tt group by id) t1 on t1.id = t2.id;
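The index option mentioned above is not shown in the post. A minimal sketch, assuming an ordinary index on `id` (the index name `tt_id_idx` is made up for illustration); the statement is valid in both MySQL and PostgreSQL:

```sql
-- Hypothetical index name; gives the correlated subquery a way to look up
-- t1.id without scanning all 3000 rows for every outer row.
create index tt_id_idx on tt (id);

-- Re-run the original query to compare plans and timings.
explain analyze
select sum(id),
       sum((select count(*) from tt t1 where t1.id = t2.id)) cnt
from tt t2;
```

With the index in place, each subquery execution would be expected to become an index lookup rather than a full table scan, though the planner's actual choice should be confirmed in the plan output.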
P.S. Both RDBMSs are running with default settings.
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.43 |
+-----------+
1 row in set (0.00 sec)

postgres=# select version();
version
------------------------------------------------------------
PostgreSQL 15.1, compiled by Visual C++ build 1914, 64-bit
(1 row)

Update

As requested in the comments, here is the same comparison on Ubuntu. The difference is still about 10x.

mysql
mysql> explain analyze
-> select sum(id), sum((select count(*) from tt t1 where t1.id = t2.id)) cnt
-> from tt t2\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: sum(t2.id), sum((select #2)) (cost=600 rows=1) (actual time=5416..5416 rows=1 loops=1)
-> Table scan on t2 (cost=300 rows=3000) (actual time=0.0244..5.38 rows=3000 loops=1)
-> Select #2 (subquery in projection; dependent)
-> Aggregate: count(0) (cost=60.3 rows=1) (actual time=1.79..1.79 rows=1 loops=3000)
-> Filter: (t1.id = t2.id) (cost=30.3 rows=300) (actual time=0.906..1.79 rows=1 loops=3000)
-> Table scan on t1 (cost=30.3 rows=3000) (actual time=0.00668..1.56 rows=3000 loops=3000)

1 row in set, 1 warning (5.41 sec)

pg
postgres=# explain analyze
select sum(id), sum((select count(*) from tt t1 where t1.id = t2.id)) cnt
from tt t2;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate (cost=163599.50..163599.51 rows=1 width=40) (actual time=526.720..526.721 rows=1 loops=1)
-> Seq Scan on tt t2 (cost=0.00..47.00 rows=3000 width=4) (actual time=0.012..0.301 rows=3000 loops=1)
SubPlan 1
-> Aggregate (cost=54.50..54.51 rows=1 width=8) (actual time=0.175..0.175 rows=1 loops=3000)
-> Seq Scan on tt t1 (cost=0.00..54.50 rows=1 width=0) (actual time=0.087..0.173 rows=1 loops=3000)
Filter: (id = t2.id)
Rows Removed by Filter: 2999
Planning Time: 0.078 ms
Execution Time: 526.766 ms
(9 rows)
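
As a rough consistency check on the plans above, multiplying each subquery's per-loop time by its 3000 loops should land close to the reported totals. A small sketch of that arithmetic, using only figures quoted in the plans (the column aliases are made up here):

```sql
-- per-loop subquery time (ms) * 3000 loops, for both test runs
select 3000 * 2.51  as mysql_run1_ms,    -- 7530, reported total 7542
       3000 * 0.227 as pg_run1_ms,       -- 681,  reported total 684.3
       3000 * 1.79  as mysql_ubuntu_ms,  -- 5370, reported total 5416
       3000 * 0.175 as pg_ubuntu_ms;     -- 525,  reported total 526.7
```

In each run nearly all of the elapsed time is the subquery, so the gap comes from the cost of one 3000-row scan (roughly 2 ms vs 0.2 ms), not from how many times the scan is repeated.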

More details here: https://stackoverflow.com/questions/797 ... ring-to-pg
