我的笔记本电脑上安装了PostgreSQL 9.2和MySQL 5.5(InnoDB) . 两个数据库引擎都使用默认安装并从同一CSV文件填充 . 我有'sales_reports'表与ca. 700K行 .
Scenario 1:
- 以下查询:
按名称,年份,地区,分支从sales_reports组中选择名称,年份,地区,分支;
-
PostgreSQL 9.2:总查询运行时:42.14秒,检索到18064行
-
PostgreSQL解释:
Group (cost=165091.16..174275.61 rows=73476 width=58) (actual time=35196.959..41896.739 rows=18064 loops=1)
-> Sort (cost=165091.16..166928.05 rows=734756 width=58) (actual time=35196.956..41704.549 rows=734756 loops=1)
Sort Key: name, year, region, branch
Sort Method: external merge Disk: 49920kB
-> Seq Scan on sales_reports (cost=0.00..38249.56 rows=734756 width=58) (actual time=0.048..282.331 rows=734756 loops=1)
Total runtime: 41906.628 ms
-
MySQL 5.5:总查询运行时间:4.4秒,检索到18064行
-
MySQL解释:
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| 1 | SIMPLE | sales_reports | ALL | NULL | NULL | NULL | NULL | 729433 | Using temporary; Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
- PostgreSQL 10x times slower
Scenario 2:
- 以下查询:
select name, year, region, branch, sum(sale) as sale from sales_reports group by name, year, region, branch;
-
PostgreSQL 9.2:总查询运行时:42.51秒,检索到18064行
-
PostgreSQL解释:
GroupAggregate (cost=165091.16..176847.26 rows=73476 width=64) (actual time=35160.911..42254.060 rows=18064 loops=1)
-> Sort (cost=165091.16..166928.05 rows=734756 width=64) (actual time=35160.489..41857.986 rows=734756 loops=1)
Sort Key: name, year, region, branch
Sort Method: external merge Disk: 54760kB
-> Seq Scan on sales_reports (cost=0.00..38249.56 rows=734756 width=64) (actual time=0.047..296.347 rows=734756 loops=1)
Total runtime: 42264.790 ms
-
MySQL 5.5:总查询运行时间:8.15秒,检索到18064行
-
MySQL解释:
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| 1 | SIMPLE | sales_reports | ALL | NULL | NULL | NULL | NULL | 729433 | Using temporary; Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
- PostgreSQL 5x times slower
Scenario 3:
- 以下查询:
select name, year, region, sum(sale) as sale from sales_reports group by name, year, region;
-
PostgreSQL 9.2:总查询运行时间:1秒,检索到18064行
-
PostgreSQL解释:
HashAggregate (cost=45597.12..45655.62 rows=5850 width=37) (actual time=758.396..759.756 rows=4644 loops=1)
-> Seq Scan on sales_reports (cost=0.00..38249.56 rows=734756 width=37) (actual time=0.061..116.541 rows=734756 loops=1)
Total runtime: 760.133 ms
-
MySQL 5.5:查询总运行时间:5.8秒,检索到18064行
-
MySQL解释:
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
| 1 | SIMPLE | sales_reports | ALL | NULL | NULL | NULL | NULL | 729433 | Using temporary; Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+--------+---------------------------------+
- PostgreSQL 5x times faster
任何想法为什么前两个场景在PostgreSQL上如此之慢?
顺便说一句,我为PostgreSQL上的查询中使用的字段创建了索引,我没有在MySQL上创建任何索引 .
谢谢,
马雷克
2 回答
默认的postgresql配置相当保守 . 首先,尝试将
shared_buffers
增加到1GB . (请记住重新启动服务器以使更改生效 . )另外,尝试增加
work_mem
,直到GroupAggregate在解释中切换到HashAggregate . 您无需重启即可更改此设置 .一句警告:在弄乱 生产环境 中的设置之前,请阅读友好的手册,其中涉及一些问题 .
一些东西:
在varchar / text列上创建索引时(比如名称可能是这样),在PostgreSQL中使用哈希类型索引来获取group by语句可能会更快
如果mysql正在使用MyISAM表,那么非并发负载可能会更快,因为MyISAM忽略了InnoDB和PgSQL所具有的许多安全性 .
你应该在你的pgsql db上尝试真空分析 .
您可能希望在名称,年份,区域,分支上创建复合索引,以便更快地按部分创建该组 .