首页 文章

使用嵌套而不是在PostgreSQL中提高查询性能

提问于
浏览
2

我正在尝试将应用程序从MySQL 5.6移植到PostgreSQL 9.2原始应用程序使用的视图我已经设法至少运行但查询时间很糟糕 .

我想知道PostgreSQL中优化“不在”查询的最佳方法 .

我的第一个想法是创建一个临时表,但由于这是一个视图,我不认为这是一个选项 .

create VIEW ready_ports AS 
    SELECT ports.id AS id, 
           ports.run AS run,
           ports.name AS name, 
           ports.pkgname AS pkgname,
           ports.version AS version,
           ports.description AS description,
           ports.license AS license,
           ports.www AS www, 
           ports.status AS status, 
           ports.updated AS updated,
          (SELECT count(0) AS COUNT 
           FROM depends 
           WHERE depends.dependency = ports.id) AS priority 
    FROM ports 
    WHERE (ports.status = 'untested' and 
          (not(ports.id in 
                 (SELECT locks.port AS port 
                  FROM locks 
                  WHERE locks.port = ports.id)
               )
          ) and 
          (
             (not(ports.id in (SELECT depends.port AS port 
                               FROM depends 
                               WHERE depends.port = ports.id))) or 
             (not(ports.id in 
                  (SELECT depends.port AS port 
                   FROM depends
                   WHERE ((not(depends.dependency in 
                     (SELECT ports.id AS dep_id 
                      FROM ports   
                      WHERE (ports.id = depends.dependency 
                             and (ports.status = 'pass' 
                                  or ports.status = 'warn')
                             )
                     ))) or 
    depends.dependency in 
    (SELECT locks.port AS port 
     FROM locks 
     WHERE locks.port = ports.id)))))))
ORDER BY priority desc
QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Sort  (cost=367498265655.68..367498265763.29 rows=43047 width=136)
   Sort Key: ((SubPlan 1))
   ->  Index Scan using ports_1_idx on ports  (cost=0.00..367498259398.93 rows=43047 width=136)
         Index Cond: ((status)::text = 'untested'::text)
         Filter: ((NOT (SubPlan 2)) AND ((NOT (SubPlan 3)) OR (NOT (SubPlan 6))))
         SubPlan 1
           ->  Aggregate  (cost=9.62..9.63 rows=1 width=0)
                 ->  Index Only Scan using depends_dependency_idx on depends  (cost=0.00..9.47 rows=60 width=0)
                       Index Cond: (dependency = public.ports.id)
         SubPlan 2
           ->  Index Only Scan using locks_port_key on locks  (cost=0.00..8.27 rows=1 width=4)
                 Index Cond: (port = public.ports.id)
         SubPlan 3
           ->  Index Only Scan using depends_pkey on depends  (cost=0.00..8.72 rows=14 width=4)
                 Index Cond: (port = public.ports.id)
         SubPlan 6
           ->  Seq Scan on depends  (cost=8.27..6399946.81 rows=1150079 width=4)
                 Filter: ((NOT (SubPlan 4)) OR (hashed SubPlan 5))
                 SubPlan 4
                   ->  Index Scan using ports_pkey on ports  (cost=0.00..8.31 rows=1 width=4)
                         Index Cond: (id = public.depends.dependency)
                         Filter: (((status)::text = 'pass'::text) OR ((status)::text = 'warn'::text))
                 SubPlan 5
                   ->  Index Only Scan using locks_port_key on locks  (cost=0.00..8.27 rows=1 width=4)
                         Index Cond: (port = public.ports.id)

3 回答

  • 4

    您可以尝试 NOT EXISTS 和反连接版本,因为 NOT IN 不能使用很多索引(因为NULL处理issues):

    SELECT *
    FROM table1
    WHERE table1.id NOT IN (SELECT id FROM table2)
    
    -- vs NOT EXISTS
    
    SELECT *
    FROM table1
    WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.id = table2.id)
    
    -- vs anti-join
    
    SELECT *
    FROM table1
    LEFT JOIN table2 ON table1.id = table2.id
    WHERE table2.id IS NULL
    
  • 2

    我最终使用了连接和不存在查询的组合来获得最终的工作查询 .

    create VIEW ready_ports AS 
    SELECT ports.id AS id, 
           ports.run AS run,
           ports.name AS name, 
           ports.pkgname AS pkgname,
           ports.version AS version,
           ports.description AS description,
           ports.license AS license,
           ports.www AS www, 
           ports.status AS status, 
           ports.updated AS updated,
          (SELECT count(0) AS COUNT 
           FROM depends 
           WHERE depends.dependency = ports.id) AS priority 
    FROM ports 
    LEFT JOIN locks on locks.port = ports.id
    LEFT JOIN depends on depends.port = ports.id
    WHERE ports.status = 'untested' and locks.id is null and 
          (depends.port is null or 
             not exists
                  (SELECT depends.port AS port 
                   FROM depends WHERE ports.id = depends.port and not exists              
                     (SELECT ports.id as dep_id
                      FROM ports   
                      WHERE ports.id = depends.dependency and 
                      (ports.status = 'pass' or ports.status = 'warn'))
                      or 
    depends.dependency = locks.port))
    ORDER BY priority desc, ports.name asc
    
  • 1

    您需要使用连接重写查询:

    select ...
    from ports
    left join locks ...
    left join depends ...
    where criteria
    

    这样,你就可以开发一套大套装,这是三套装的结果,而不是半套装 .

    将计数移出您的视图也是一个优点 . 使用单独的查询或加入您的视图以获取该部分 . (视图中的聚合很少是一个好主意,除非在报告中 . )

相关问题