PostgreSQL: SELECT that must match across multiple rows

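I have a table containing a pair of timestamps that represent a time span. The rows are scoped by user ID, and each user can have one or more rows associated with them.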

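This data is generated from an abstract "availability" form and represents the times during a week when a user is available. I need to pass in a series of time ranges as a query and get back every user ID whose rows in the table match all of them.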

Given this table:

CREATE TABLE "public"."availability" (
  "id" int4 NOT NULL,
  "user_id" int4,
  "starts_at" timestamp(6),
  "ends_at" timestamp(6),
  PRIMARY KEY ("id")
) WITH (OIDS=FALSE)

And this data:

User #1 is available Mon-Tue between 08:00 and 17:00

+----+---------+---------------------+---------------------+
| id | user_id | starts_at           | ends_at             |
+----+---------+---------------------+---------------------+
| 1  | 1       | 2013-03-18 08:00:00 | 2013-03-18 17:00:00 |
+----+---------+---------------------+---------------------+
| 2  | 1       | 2013-03-19 08:00:00 | 2013-03-19 17:00:00 |
+----+---------+---------------------+---------------------+

User #2 is available Sun-Sat all day

+----+---------+---------------------+---------------------+
| 3  | 2       | 2013-03-17 00:00:00 | 2013-03-23 23:59:59 |
+----+---------+---------------------+---------------------+

User #3 is available Wed between 06:00 and 18:00

+----+---------+---------------------+---------------------+
| 4  | 3       | 2013-03-20 06:00:00 | 2013-03-20 18:00:00 |
+----+---------+---------------------+---------------------+

I can easily select the users who are available for any one of the given time spans:

SELECT * FROM "public"."availability"
  WHERE ('2013-03-19 08:35:00' BETWEEN starts_at AND ends_at 
     AND '2013-03-19 18:25:00' BETWEEN starts_at AND ends_at)
    OR  ('2013-03-20 12:00:00' BETWEEN starts_at AND ends_at
     AND '2013-03-20 18:00:00' BETWEEN starts_at AND ends_at);

+----+---------+---------------------+---------------------+
| id | user_id | starts_at           | ends_at             |
+----+---------+---------------------+---------------------+
| 3  | 2       | 2013-03-17 00:00:00 | 2013-03-23 23:59:59 |
+----+---------+---------------------+---------------------+
| 4  | 3       | 2013-03-20 06:00:00 | 2013-03-20 18:00:00 |
+----+---------+---------------------+---------------------+

But what I really need is to query on multiple time spans and return only the user_ids that match all of the conditions.

Query: 2013-03-17 10:00:00 - 2013-03-17 16:00:00 and 2013-03-23 10:00:00 - 2013-03-23 16:00:00 should return:

+----+---------+---------------------+---------------------+
| id | user_id | starts_at           | ends_at             |
+----+---------+---------------------+---------------------+
| 3  | 2       | 2013-03-17 00:00:00 | 2013-03-23 23:59:59 |
+----+---------+---------------------+---------------------+

Query: 2013-03-18 09:00:00 - 2013-03-18 16:00:00 and 2013-03-19 08:00:00 - 2013-03-19 15:45:00 should return:

+----+---------+---------------------+---------------------+
| id | user_id | starts_at           | ends_at             |
+----+---------+---------------------+---------------------+
| 1  | 1       | 2013-03-18 08:00:00 | 2013-03-18 17:00:00 |
+----+---------+---------------------+---------------------+
| 2  | 1       | 2013-03-19 08:00:00 | 2013-03-19 17:00:00 |
+----+---------+---------------------+---------------------+
| 3  | 2       | 2013-03-17 00:00:00 | 2013-03-23 23:59:59 |
+----+---------+---------------------+---------------------+

Query: 2013-03-18 07:00:00 - 2013-03-18 18:00:00 should return nothing.

SQLFiddle example

2 Answers

  • 4

    For an application like this, if you are using PostgreSQL 9.2 or later, you may want to try a range type. Here is an example of creating, loading, and displaying the data:

    CREATE TABLE availability (
      id      int4 NOT NULL,
      user_id int4,
      avail   tstzrange,
      PRIMARY KEY (id)
    );
    INSERT INTO availability VALUES
      (1, 1, '[2013-03-18 08:00:00, 2013-03-18 17:00:00)'),
      (2, 1, '[2013-03-19 08:00:00, 2013-03-19 17:00:00)'),
      (3, 2, '[2013-03-17 00:00:00, 2013-03-23 24:00:00)'),
      (4, 3, '[2013-03-20 06:00:00, 2013-03-20 18:00:00)');
    SELECT * FROM availability ;
    
    id | user_id |                        avail                        
    ----+---------+-----------------------------------------------------
      1 |       1 | ["2013-03-18 08:00:00-05","2013-03-18 17:00:00-05")
      2 |       1 | ["2013-03-19 08:00:00-05","2013-03-19 17:00:00-05")
      3 |       2 | ["2013-03-17 00:00:00-05","2013-03-24 00:00:00-05")
      4 |       3 | ["2013-03-20 06:00:00-05","2013-03-20 18:00:00-05")
    (4 rows)
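
    If the two-column table from the question has to stay as it is, a range can also be built on the fly from the existing columns. The following is only a sketch, not part of the original answer. It assumes the question's table with its starts_at / ends_at columns of type timestamp (hence tsrange rather than tstzrange) and uses closed bounds to mirror the inclusive BETWEEN in the question. The view name availability_ranges is made up for illustration.

    ```sql
    -- Hypothetical view over the question's original two-column table.
    CREATE VIEW availability_ranges AS
      SELECT id,
             user_id,
             tsrange(starts_at, ends_at, '[]') AS avail  -- '[]' = both bounds inclusive
        FROM "public"."availability";

    -- Containment test against the derived range column.
    SELECT * FROM availability_ranges
      WHERE avail @> '[2013-03-20 12:00:00, 2013-03-20 18:00:00]'::tsrange;
    ```

    Note that a NULL starts_at or ends_at would produce an unbounded range on that side, which may or may not be what the application wants.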
    

    You can then query with a variety of operators. If you want all availability ranges that contain any of the specified query ranges:

    SELECT * FROM availability
      WHERE avail @> '[2013-03-19 08:35:00, 2013-03-19 18:25:00)'
         OR avail @> '[2013-03-20 12:00:00, 2013-03-20 18:00:00)';
    

    Or:

    SELECT * FROM availability
      WHERE avail @> ANY
              (ARRAY ['[2013-03-19 08:35:00, 2013-03-19 18:25:00)'::tstzrange,
                      '[2013-03-20 12:00:00, 2013-03-20 18:00:00)'::tstzrange]);
    
    id | user_id |                        avail                        
    ----+---------+-----------------------------------------------------
      3 |       2 | ["2013-03-17 00:00:00-05","2013-03-24 00:00:00-05")
      4 |       3 | ["2013-03-20 06:00:00-05","2013-03-20 18:00:00-05")
    (2 rows)
    

    If you want all availability ranges that contain all of the specified query ranges within a single range:

    SELECT * FROM availability
      WHERE avail @> '[2013-03-17 10:00:00, 2013-03-17 16:00:00)'
        AND avail @> '[2013-03-23 10:00:00, 2013-03-23 16:00:00)';
    

    Or:

    SELECT * FROM availability
      WHERE avail @> ALL
              (ARRAY ['[2013-03-17 10:00:00, 2013-03-17 16:00:00)'::tstzrange,
                      '[2013-03-23 10:00:00, 2013-03-23 16:00:00)'::tstzrange]);
    
    id | user_id |                        avail                        
    ----+---------+-----------------------------------------------------
      3 |       2 | ["2013-03-17 00:00:00-05","2013-03-24 00:00:00-05")
    (1 row)
    

    If you want all availability ranges that contain any of the specified query ranges, but only for users whose availability ranges cover all of the specified query ranges:

    WITH s(ts) AS
    (
      VALUES
        ('[2013-03-18 09:00:00, 2013-03-18 16:00:00)'::tstzrange),
        ('[2013-03-19 08:00:00, 2013-03-19 15:45:00)'::tstzrange)
    )
    SELECT DISTINCT a1.*
      FROM s s1
      JOIN availability a1 ON a1.avail @> s1.ts
        AND NOT EXISTS
            (
              SELECT * FROM s s2
                WHERE NOT EXISTS
                      (
                        SELECT * FROM availability a2
                          WHERE a2.user_id = a1.user_id
                            AND a2.avail @> s2.ts
                      )
            );
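
    The double NOT EXISTS above is a form of relational division. If only the user IDs are needed, the same idea can be sketched by counting how many distinct query ranges each user covers. This is an alternative formulation rather than the answer's own query, and it assumes the availability table with the avail column defined above.

    ```sql
    WITH s(ts) AS
    (
      VALUES
        ('[2013-03-18 09:00:00, 2013-03-18 16:00:00)'::tstzrange),
        ('[2013-03-19 08:00:00, 2013-03-19 15:45:00)'::tstzrange)
    )
    SELECT a.user_id
      FROM availability a
      JOIN s ON a.avail @> s.ts          -- one joined row per (availability row, covered query range)
      GROUP BY a.user_id
      -- DISTINCT keeps two rows that cover the same query range from counting twice.
      HAVING count(DISTINCT s.ts) = (SELECT count(DISTINCT ts) FROM s)
      ORDER BY a.user_id;
    ```

    With the sample data this keeps users 1 and 2: user 1 covers one query range per row, and user 2 covers both with a single row.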
    

    Or (adapting Clodoaldo Neto's query to use ranges):

    SELECT a.*
      FROM availability a
      JOIN (
             SELECT
                 user_id,
                 sum(('[2013-03-18 09:00:00, 2013-03-18 16:00:00)'::tstzrange
                       <@ avail)::integer
                     +
                     ('[2013-03-19 08:00:00, 2013-03-19 15:45:00)'::tstzrange
                       <@ avail)::integer
                    ) period
               FROM availability
               GROUP BY user_id
           ) s ON a.user_id = s.user_id
      WHERE period >= 2;
    

    You can create an index to make these kinds of searches fast on a large table, like this:

    CREATE INDEX availability_avail ON availability USING gist (avail);
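
    To see whether the planner actually picks the index for a containment search, the query can be prefixed with EXPLAIN. A small sketch, assuming the table and index above. The plan depends on table size and statistics, so no particular output is shown here.

    ```sql
    -- Show the chosen plan without running the query.
    EXPLAIN
    SELECT * FROM availability
      WHERE avail @> '[2013-03-19 08:35:00, 2013-03-19 18:25:00)'::tstzrange;

    -- For experimenting on a tiny table only: discourage sequential scans
    -- in this session so that an index plan becomes visible, then reset.
    SET enable_seqscan = off;
    EXPLAIN
    SELECT * FROM availability
      WHERE avail @> '[2013-03-19 08:35:00, 2013-03-19 18:25:00)'::tstzrange;
    RESET enable_seqscan;
    ```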
    

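    Notes:

    • I dropped the schema qualification and the identifier quoting for readability.

    • The index is unlikely to be used with four rows, because all of the data can be fetched faster by reading the single data page directly. On a large table it can make a big difference.

    • I used ranges of TIMESTAMP WITH TIME ZONE because the default (bare) TIMESTAMP clock moves backward once a year when daylight saving time ends. To capture a moment in time, use TIMESTAMP WITH TIME ZONE ( timestamptz for short).

    • When used directly, the literals do not need an explicit cast; with the ANY / ALL forms of the query, the explicit cast is required.

    • A square bracket on a range means the range includes that bound, while a parenthesis means it excludes it. Timestamp ranges are usually specified with [) so that a range ending at a given time and another range starting at that same time are considered adjacent rather than overlapping.

    • '24:00:00' on one date and '00:00:00' on the next date are the same moment.

    • The previous two points make it easier to specify ranges that end at midnight. There is no risk of a "lost second" or other strangeness.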
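
    The notes on bounds and midnight can be checked directly in psql. A minimal sketch using literal values only, so it does not depend on any table; the column aliases are arbitrary.

    ```sql
    -- '24:00:00' on one date is the same instant as '00:00:00' on the next.
    SELECT '2013-03-23 24:00:00'::timestamptz
         = '2013-03-24 00:00:00'::timestamptz AS same_instant;        -- true

    -- Half-open ranges that share an endpoint are adjacent, not overlapping.
    SELECT '[2013-03-18 08:00:00, 2013-03-18 17:00:00)'::tstzrange
        && '[2013-03-18 17:00:00, 2013-03-18 18:00:00)'::tstzrange AS does_overlap,  -- false
           '[2013-03-18 08:00:00, 2013-03-18 17:00:00)'::tstzrange
       -|- '[2013-03-18 17:00:00, 2013-03-18 18:00:00)'::tstzrange AS is_adjacent;   -- true

    -- With closed bounds the shared endpoint belongs to both ranges.
    SELECT '[2013-03-18 08:00:00, 2013-03-18 17:00:00]'::tstzrange
        && '[2013-03-18 17:00:00, 2013-03-18 18:00:00]'::tstzrange AS does_overlap;  -- true
    ```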

  • 3

    SQL Fiddle

    This casts each boolean to the integer 0 or 1.

    select a.*
    from
        availability a
        inner join
        (
            select
                user_id,
                sum (
                    ('2013-03-18 09:00:00' between starts_at and ends_at
                     and
                     '2013-03-18 16:00:00' between starts_at and ends_at
                    )::integer
                    +
                    ('2013-03-19 08:00:00' between starts_at and ends_at
                     and
                     '2013-03-19 15:45:00' between starts_at and ends_at
                    )::integer
                ) period
            from availability
            group by user_id
        ) s on a.user_id = s.user_id
    where period >= 2
    

    Change the where condition to the number of periods to be matched.
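
    One thing to watch with the summed casts: a user with two rows that both cover the same period would also reach period >= 2 without covering the other period. A hedged variant, not from the original answer, tests each period separately with the bool_or aggregate. It uses the question's starts_at / ends_at columns.

    ```sql
    select a.*
    from
        availability a
        inner join
        (
            select user_id
            from availability
            group by user_id
            having
                -- true if at least one of the user's rows covers the first period
                bool_or('2013-03-18 09:00:00' between starts_at and ends_at
                        and '2013-03-18 16:00:00' between starts_at and ends_at)
                and
                -- true if at least one of the user's rows covers the second period
                bool_or('2013-03-19 08:00:00' between starts_at and ends_at
                        and '2013-03-19 15:45:00' between starts_at and ends_at)
        ) s on a.user_id = s.user_id;
    ```

    Each additional period to match becomes one more bool_or term joined with and, so there is no count to keep in sync.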
