
Postgres slow query (slow index scan)


I have a table with 3 million rows, about 1.3 GB in size, on Postgres 9.3 running on my laptop with 4 GB of RAM.

explain analyze
select act_owner_id from cnt_contacts where act_owner_id = 2

I have a btree index on cnt_contacts.act_owner_id, defined as follows:

CREATE INDEX cnt_contacts_idx_act_owner_id 
   ON public.cnt_contacts USING btree (act_owner_id, status_id);

The query takes about 5 seconds to run:

Bitmap Heap Scan on cnt_contacts  (cost=2598.79..86290.73 rows=6208 width=4) (actual time=5865.617..5875.302 rows=5444 loops=1)
  Recheck Cond: (act_owner_id = 2)
  ->  Bitmap Index Scan on cnt_contacts_idx_act_owner_id  (cost=0.00..2597.24 rows=6208 width=0) (actual time=5865.407..5865.407 rows=5444 loops=1)
        Index Cond: (act_owner_id = 2)
Total runtime: 5875.684 ms

Why does it take so long? My settings are:

work_mem = 1024MB; 
shared_buffers = 128MB;
effective_cache_size = 1024MB
seq_page_cost = 1.0         # measured on an arbitrary scale
random_page_cost = 15.0         # same scale as above
cpu_tuple_cost = 3.0

2 Answers

  • 2

    You are selecting 5444 records scattered across a 1.3 GB table, on a laptop. How long do you expect that to take?

    It looks like your index is not cached, either because it cannot stay in cache or because this is the first time you have used that part of it. What happens if you run the exact same query repeatedly? What about the same query with a different constant?

    Running the query under EXPLAIN (ANALYZE, BUFFERS) would help you gather more information, especially if you enable track_io_timing first.
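
    A minimal sketch of what that could look like (the second constant, 3, is just an illustrative value):

    -- enable timing of I/O calls for this session (superuser only on 9.3)
    SET track_io_timing = on;

    -- run the same query twice to compare cold vs. warm cache,
    -- then once more with a different constant
    EXPLAIN (ANALYZE, BUFFERS)
        SELECT act_owner_id FROM cnt_contacts WHERE act_owner_id = 2;
    EXPLAIN (ANALYZE, BUFFERS)
        SELECT act_owner_id FROM cnt_contacts WHERE act_owner_id = 2;
    EXPLAIN (ANALYZE, BUFFERS)
        SELECT act_owner_id FROM cnt_contacts WHERE act_owner_id = 3;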

  • 2

    OK, so PG has a big table, an index, and long execution times. Let's think about ways to improve the plan and reduce the time. You insert and delete rows; PG writes and removes tuples, so tables and indexes can become bloated. For fast searches, PG loads the index into shared buffers, and you want to keep that index as clean as possible so it fits there. On a SELECT, PG reads from shared buffers and then searches. Try to tune buffer memory, reduce index and table bloat, and keep the database clean.
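
    As a rough sketch (not part of the original answer), you can check how often reads on this table and its indexes are served from shared buffers rather than disk:

    -- heap vs. index cache hit counters for the table in question
    SELECT relname,
           heap_blks_read, heap_blks_hit,
           idx_blks_read,  idx_blks_hit
    FROM pg_statio_user_tables
    WHERE relname = 'cnt_contacts';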

    Things to do and think about:

    1) Check for duplicate or unused indexes, and whether your indexes have good selectivity:

    WITH table_scans as (
        SELECT relid,
            tables.idx_scan + tables.seq_scan as all_scans,
            ( tables.n_tup_ins + tables.n_tup_upd + tables.n_tup_del ) as writes,
                    pg_relation_size(relid) as table_size
            FROM pg_stat_user_tables as tables
    ),
    all_writes as (
        SELECT sum(writes) as total_writes
        FROM table_scans
    ),
    indexes as (
        SELECT idx_stat.relid, idx_stat.indexrelid,
            idx_stat.schemaname, idx_stat.relname as tablename,
            idx_stat.indexrelname as indexname,
            idx_stat.idx_scan,
            pg_relation_size(idx_stat.indexrelid) as index_bytes,
            indexdef ~* 'USING btree' AS idx_is_btree
        FROM pg_stat_user_indexes as idx_stat
            JOIN pg_index
                USING (indexrelid)
            JOIN pg_indexes as indexes
                ON idx_stat.schemaname = indexes.schemaname
                    AND idx_stat.relname = indexes.tablename
                    AND idx_stat.indexrelname = indexes.indexname
        WHERE pg_index.indisunique = FALSE
    ),
    index_ratios AS (
    SELECT schemaname, tablename, indexname,
        idx_scan, all_scans,
        round(( CASE WHEN all_scans = 0 THEN 0.0::NUMERIC
            ELSE idx_scan::NUMERIC/all_scans * 100 END),2) as index_scan_pct,
        writes,
        round((CASE WHEN writes = 0 THEN idx_scan::NUMERIC ELSE idx_scan::NUMERIC/writes END),2)
            as scans_per_write,
        pg_size_pretty(index_bytes) as index_size,
        pg_size_pretty(table_size) as table_size,
        idx_is_btree, index_bytes
        FROM indexes
        JOIN table_scans
        USING (relid)
    ),
    index_groups AS (
    SELECT 'Never Used Indexes' as reason, *, 1 as grp
    FROM index_ratios
    WHERE
        idx_scan = 0
        and idx_is_btree
    UNION ALL
    SELECT 'Low Scans, High Writes' as reason, *, 2 as grp
    FROM index_ratios
    WHERE
        scans_per_write <= 1
        and index_scan_pct < 10
        and idx_scan > 0
        and writes > 100
        and idx_is_btree
    UNION ALL
    SELECT 'Seldom Used Large Indexes' as reason, *, 3 as grp
    FROM index_ratios
    WHERE
        index_scan_pct < 5
        and scans_per_write > 1
        and idx_scan > 0
        and idx_is_btree
        and index_bytes > 100000000
    UNION ALL
    SELECT 'High-Write Large Non-Btree' as reason, index_ratios.*, 4 as grp 
    FROM index_ratios, all_writes
    WHERE
        ( writes::NUMERIC / ( total_writes + 1 ) ) > 0.02
        AND NOT idx_is_btree
        AND index_bytes > 100000000
    ORDER BY grp, index_bytes DESC )
    SELECT reason, schemaname, tablename, indexname,
        index_scan_pct, scans_per_write, index_size, table_size
    FROM index_groups;
    

    2) Check whether your tables and indexes are bloated:

    SELECT
            current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
            ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::FLOAT/otta END)::NUMERIC,1) AS tbloat,
            CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
          iname, /*ituples::bigint, ipages::bigint, iotta,*/
          ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::FLOAT/iotta END)::NUMERIC,1) AS ibloat,
          CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
        FROM (
          SELECT
            schemaname, tablename, cc.reltuples, cc.relpages, bs,
            CEIL((cc.reltuples*((datahdr+ma-
              (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::FLOAT)) AS otta,
            COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
            COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::FLOAT)),0) AS iotta -- very rough approximation, assumes all cols
          FROM (
            SELECT
              ma,bs,schemaname,tablename,
              (datawidth+(hdr+ma-(CASE WHEN hdr%ma=0 THEN ma ELSE hdr%ma END)))::NUMERIC AS datahdr,
              (maxfracsum*(nullhdr+ma-(CASE WHEN nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
            FROM (
              SELECT
                schemaname, tablename, hdr, ma, bs,
                SUM((1-null_frac)*avg_width) AS datawidth,
                MAX(null_frac) AS maxfracsum,
                hdr+(
                  SELECT 1+COUNT(*)/8
                  FROM pg_stats s2
                  WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
                ) AS nullhdr
              FROM pg_stats s, (
                SELECT
                  (SELECT current_setting('block_size')::NUMERIC) AS bs,
                  CASE WHEN SUBSTRING(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
                  CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
                FROM (SELECT version() AS v) AS foo
              ) AS constants
              GROUP BY 1,2,3,4,5
            ) AS foo
          ) AS rs
          JOIN pg_class cc ON cc.relname = rs.tablename
          JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
          LEFT JOIN pg_index i ON indrelid = cc.oid
          LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
        ) AS sml
        ORDER BY wastedbytes DESC
    

    3) Have you cleaned dead tuples off disk? Is it time to VACUUM? (See the sketch after the query below.)

    SELECT 
        relname AS TableName
        ,n_live_tup AS LiveTuples
        ,n_dead_tup AS DeadTuples
    FROM pg_stat_user_tables;
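
    If the dead-tuple counts are high, a manual cleanup could look like the following sketch (plain VACUUM is usually preferable to VACUUM FULL on a live system, since FULL rewrites the table under an exclusive lock):

    -- reclaim dead tuples and refresh planner statistics for this table
    VACUUM (ANALYZE, VERBOSE) cnt_contacts;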
    

    4) Think about it. If there are 10 records in the database and 8 of the 10 have id = 2, your index has poor selectivity and PG ends up scanning all 8 records anyway. But if you filter on id != 2, the index works very well. Try to build indexes with good selectivity.
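
    A quick way to inspect the selectivity of the indexed column is the pg_stats view (a sketch, assuming statistics have been gathered by ANALYZE):

    -- n_distinct and the most common values show how skewed act_owner_id is
    SELECT attname, n_distinct, most_common_vals, most_common_freqs
    FROM pg_stats
    WHERE tablename = 'cnt_contacts'
      AND attname = 'act_owner_id';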

    5) Use appropriate column types for your data. If a smaller type is enough for a column, just convert it.
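
    A hypothetical example of such a conversion (the column and target type are only illustrative; this is valid only if every existing value fits into the smaller type):

    -- shrink a column that only ever holds small values (assumes status_id is currently integer)
    ALTER TABLE cnt_contacts
        ALTER COLUMN status_id TYPE smallint;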

    6) Check your database and its condition. As a starting point, just verify that the tables do not carry unused data, that the indexes are kept clean and analyzed, and that the indexes are selective. For other data try a BRIN index, or try recreating your indexes.
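
    A sketch of those two suggestions (note that BRIN indexes require PostgreSQL 9.5 or later, so they are not available on the 9.3 install from the question; the BRIN statement is shown only to illustrate the idea):

    -- rebuild the existing index to shed bloat
    REINDEX INDEX cnt_contacts_idx_act_owner_id;

    -- BRIN example (9.5+ only; index name is illustrative)
    CREATE INDEX cnt_contacts_brin_act_owner_id
        ON cnt_contacts USING brin (act_owner_id);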
