
SQL query: delete all records from a table except the latest N?


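Is it possible to build a single MySQL query (without variables) that deletes all records from a table except the latest N (sorted by id desc)?

Something like this, only it doesn't work :)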

delete from table order by id ASC limit ((select count(*) from table ) - N)

Thanks.

16 Answers

  • 87

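    You cannot delete the records that way; the main issue is that you cannot use a subquery to specify the value of a LIMIT clause.

    This works (tested in MySQL 5.0.67):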

    DELETE FROM `table`
    WHERE id NOT IN (
      SELECT id
      FROM (
        SELECT id
        FROM `table`
        ORDER BY id DESC
        LIMIT 42 -- keep this many records
      ) foo
    );
    

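    The intermediate subquery is required. Without it we would run into two errors:

    • SQL Error (1093): You can't specify target table 'table' for update in FROM clause - MySQL doesn't allow you to refer to the table you are deleting from within a direct subquery.

    • SQL Error (1235): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery' - you can't use the LIMIT clause within a direct subquery of a NOT IN operator.

    Fortunately, using an intermediate subquery lets us bypass both of these limitations.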
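    The row arithmetic of the statement above can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only for testability: SQLite does not impose errors 1093 and 1235, so this shows which rows survive, not that the workaround is needed. The table name `items` and the helper `keep_newest` are hypothetical.

```python
import sqlite3

def keep_newest(conn, n):
    """Delete every row of the hypothetical table `items` except the n largest ids."""
    conn.execute(
        """
        DELETE FROM items
        WHERE id NOT IN (
          SELECT id FROM (
            SELECT id FROM items ORDER BY id DESC LIMIT ?
          ) AS foo
        )
        """,
        (n,),
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 101)])

keep_newest(conn, 42)
remaining = [row[0] for row in conn.execute("SELECT id FROM items ORDER BY id")]
print(len(remaining), remaining[0], remaining[-1])  # prints: 42 59 100
```

    With ids 1 to 100 and N = 42, the rows left are ids 59 through 100.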


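    NickC pointed out that this query can be optimized significantly for certain use cases (such as this one). I recommend reading that answer to see whether it fits yours.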

  • 0

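    I know I'm resurrecting quite an old question, but I recently ran into this issue and needed something that scales well to large numbers. There wasn't any existing performance data, and since this question has had quite a bit of attention, I thought I'd post what I found.

    The solutions that actually worked were Alex Barrett's double sub-query/NOT IN method (similar to Bill Karwin's) and Quassnoi's LEFT JOIN method.

    Unfortunately, both of these methods create very large intermediate temporary tables, and performance degrades quickly as the number of records being kept gets large.

    I settled on Alex Barrett's double sub-query (thanks!), but with <= instead of NOT IN: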

    DELETE FROM `test_sandbox`
      WHERE id <= (
        SELECT id
        FROM (
          SELECT id
          FROM `test_sandbox`
          ORDER BY id DESC
          LIMIT 1 OFFSET 42 -- keep this many records
        ) foo
      )
    

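    It uses OFFSET to skip the N newest records and fetch the id of the next one, then deletes that record and all earlier ones.

    Since ordering is already an assumption of this problem (ORDER BY id DESC), <= is a perfect fit.

    It is much faster, because the temporary table generated by the subquery contains just one record instead of N records.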
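    A minimal sketch of the OFFSET variant, again using Python's sqlite3 as an assumed stand-in for MySQL (the table name `items` and all helper names are hypothetical). It checks two things: exactly N rows survive even when the ids have gaps, and nothing is deleted when the table holds N rows or fewer, because the subquery then yields NULL and `id <= NULL` matches no row.

```python
import sqlite3

def make_items(ids):
    """Build an in-memory table `items` (hypothetical name) holding the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in ids])
    return conn

def keep_newest_offset(conn, n):
    """Skip the n newest ids, take the next one, delete it and everything older."""
    conn.execute(
        """
        DELETE FROM items
        WHERE id <= (
          SELECT id FROM (
            SELECT id FROM items ORDER BY id DESC LIMIT 1 OFFSET ?
          ) AS foo
        )
        """,
        (n,),
    )

def ids_left(conn):
    return [row[0] for row in conn.execute("SELECT id FROM items ORDER BY id")]

big = make_items(range(1, 301, 3))   # 100 ids with gaps: 1, 4, ..., 298
keep_newest_offset(big, 42)
big_left = ids_left(big)

small = make_items(range(1, 11))     # only 10 rows, fewer than n
keep_newest_offset(small, 42)
small_left = ids_left(small)

print(len(big_left), big_left[0], len(small_left))  # prints: 42 175 10
```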

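    Test cases

    I tested the three working methods and the new method above in two test cases.

    Both test cases use 10000 existing rows; the first keeps 9000 (deleting the oldest 1000) and the second keeps 50 (deleting the oldest 9950).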

    +-----------+------------------------+----------------------+
    |           | 10000 TOTAL, KEEP 9000 | 10000 TOTAL, KEEP 50 |
    +-----------+------------------------+----------------------+
    | NOT IN    |         3.2542 seconds |       0.1629 seconds |
    | NOT IN v2 |         4.5863 seconds |       0.1650 seconds |
    | <=,OFFSET |         0.0204 seconds |       0.1076 seconds |
    +-----------+------------------------+----------------------+
    

    Interestingly, the <= method performs better across the board, and actually gets faster the more rows you keep rather than slower.

  • 5

    Unfortunately for all the answers given by other folks, you can't DELETE and SELECT from a given table in the same query.

    DELETE FROM mytable WHERE id NOT IN (SELECT MAX(id) FROM mytable);
    
    ERROR 1093 (HY000): You can't specify target table 'mytable' for update 
    in FROM clause
    

    MySQL also doesn't support LIMIT in a subquery used with IN/ALL/ANY/SOME. These are limitations of MySQL.

    DELETE FROM mytable WHERE id NOT IN 
      (SELECT id FROM mytable ORDER BY id DESC LIMIT 1);
    
    ERROR 1235 (42000): This version of MySQL doesn't yet support 
    'LIMIT & IN/ALL/ANY/SOME subquery'
    

    The best answer I can come up with is to do this in two stages:

    SELECT id FROM mytable ORDER BY id DESC LIMIT n;
    

    Collect the ids and turn them into a comma-separated string:

    DELETE FROM mytable WHERE id NOT IN ( ...comma-separated string... );
    

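    (Normally, interpolating a comma-separated list into an SQL statement introduces some risk of SQL injection, but in this case the values don't come from an untrusted source; they are id values from the database itself.)

    Note: although this doesn't get the job done in a single query, sometimes a simpler, get-it-done solution is the most effective.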
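    The two stages could look roughly like this from client code. This is a sketch under assumptions: it uses Python's sqlite3 in place of a MySQL driver, the helper name `keep_newest_two_stage` is hypothetical, and it binds the ids through placeholders instead of pasting a string, which is a small departure from the comma-separated string described above.

```python
import sqlite3

def keep_newest_two_stage(conn, n):
    """Two-stage delete on the table `mytable`; returns the ids that were kept."""
    # Stage 1: collect the ids to keep.
    keep_ids = [row[0] for row in conn.execute(
        "SELECT id FROM mytable ORDER BY id DESC LIMIT ?", (n,))]
    if not keep_ids:
        # Nothing to keep; MySQL rejects an empty NOT IN () list, so branch here.
        conn.execute("DELETE FROM mytable")
        return keep_ids
    # Stage 2: one placeholder per id, so the values are bound rather than pasted in.
    placeholders = ", ".join("?" for _ in keep_ids)
    conn.execute(f"DELETE FROM mytable WHERE id NOT IN ({placeholders})", keep_ids)
    return keep_ids

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO mytable (id) VALUES (?)", [(i,) for i in range(1, 21)])

kept = keep_newest_two_stage(conn, 5)
left = [row[0] for row in conn.execute("SELECT id FROM mytable ORDER BY id")]
print(kept, left)  # prints: [20, 19, 18, 17, 16] [16, 17, 18, 19, 20]
```

    Rows inserted between the two statements would not be in the keep list and would be deleted, so running both stages inside one transaction is probably wise. Drivers and servers also cap the number of bound parameters, which limits how large N can be with this form.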

  • 3
    DELETE  i1.*
    FROM    items i1
    LEFT JOIN
            (
            SELECT  id
            FROM    items ii
            ORDER BY
                    id DESC
            LIMIT 20
            ) i2
    ON      i1.id = i2.id
    WHERE   i2.id IS NULL
    
  • 8

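    If your ids are incremental with no gaps, use something like this (the derived table sidesteps error 1093):

    delete from table where id <= (select m from (select max(id) as m from table) t) - N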
    
  • 1

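    To delete all records except the last N, you can use the query shown below.

    It's a single query but with many statements, so it's not really what the original question intended.

    You also need a variable and a built-in (in the query) prepared statement, due to a bug in MySQL.

    Hope it may be useful anyway...

    nnn is the number of rows to keep, and theTable is the table you're working on.

    I'm assuming you have an auto-increment column named id.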

    SELECT @ROWS_TO_DELETE := COUNT(*) - nnn FROM `theTable`;
    SELECT @ROWS_TO_DELETE := IF(@ROWS_TO_DELETE<0,0,@ROWS_TO_DELETE);
    PREPARE STMT FROM "DELETE FROM `theTable` ORDER BY `id` ASC LIMIT ?";
    EXECUTE STMT USING @ROWS_TO_DELETE;
    

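    The nice thing about this approach is performance: I tested the query on a local database with about 13,000 records, keeping the last 1,000. It runs in 0.08 seconds.

    The script from the accepted answer...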

    DELETE FROM `table`
    WHERE id NOT IN (
      SELECT id
      FROM (
        SELECT id
        FROM `table`
        ORDER BY id DESC
        LIMIT 42 -- keep this many records
      ) foo
    );
    

    takes 0.55 seconds, about 7 times longer.

    I'm running MySQL 5.5.25 on an i7 MacBook Pro with an SSD.
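    The arithmetic behind the prepared statement (count the rows, subtract the number to keep, clamp at zero, delete that many of the oldest) can be sketched as follows. Assumptions: Python's sqlite3 stands in for MySQL, the helper `trim_by_count` is hypothetical, and because stock SQLite builds do not accept `DELETE ... ORDER BY ... LIMIT`, the delete is written with an `IN` subquery instead, so this is not the exact statement used above.

```python
import sqlite3

def trim_by_count(conn, keep):
    """Delete the oldest rows of `theTable` so that at most `keep` rows remain."""
    (total,) = conn.execute("SELECT COUNT(*) FROM theTable").fetchone()
    # Same clamp as IF(@ROWS_TO_DELETE<0,0,@ROWS_TO_DELETE) in the script above.
    rows_to_delete = max(total - keep, 0)
    conn.execute(
        "DELETE FROM theTable WHERE id IN "
        "(SELECT id FROM theTable ORDER BY id ASC LIMIT ?)",
        (rows_to_delete,),
    )
    return rows_to_delete

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theTable (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO theTable (id) VALUES (?)", [(i,) for i in range(1, 14)])

first = trim_by_count(conn, 10)    # 13 rows, keep 10
second = trim_by_count(conn, 100)  # fewer rows than the limit, nothing to do
left = [row[0] for row in conn.execute("SELECT id FROM theTable ORDER BY id")]
print(first, second, left[0], left[-1])  # prints: 3 0 4 13
```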

  • 2
    DELETE FROM table WHERE ID NOT IN
    (SELECT MAX(ID) ID FROM table)
    
  • 0

    Try the following query:

    DELETE FROM tablename WHERE id < (SELECT * FROM (SELECT (MAX(id)-10) FROM tablename ) AS a)
    

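    The inner subquery returns MAX(id) - 10, and the outer query deletes every record with a smaller id. This assumes the ids are contiguous; note that the strict < leaves the ids from MAX(id) - 10 through MAX(id), which is 11 rows rather than 10.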
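    How many rows a `MAX(id) - 10` threshold leaves depends on the id distribution. A small sketch, assuming Python's sqlite3 as a stand-in for MySQL and a hypothetical helper `rows_kept`, runs the same statement against contiguous ids and against ids with gaps:

```python
import sqlite3

def rows_kept(ids, n):
    """Run the MAX(id) - n delete on a fresh `tablename` and count the survivors."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO tablename (id) VALUES (?)", [(i,) for i in ids])
    conn.execute(
        "DELETE FROM tablename WHERE id < "
        "(SELECT * FROM (SELECT (MAX(id) - ?) FROM tablename) AS a)",
        (n,),
    )
    return conn.execute("SELECT COUNT(*) FROM tablename").fetchone()[0]

contiguous = rows_kept(range(1, 101), 10)    # ids 1..100
gappy = rows_kept(range(10, 101, 10), 10)    # ids 10, 20, ..., 100
print(contiguous, gappy)  # prints: 11 2
```

    With contiguous ids the statement keeps 11 rows; with ids spaced 10 apart it keeps only 2, so a threshold computed from MAX(id) is only a rough cut unless the ids are dense.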

  • 0

    DELETE FROM table WHERE id NOT IN (SELECT id FROM table ORDER BY id DESC LIMIT 0, 10)

  • 8

    This should work as well:

    DELETE FROM [table] INNER JOIN (SELECT [id] FROM (SELECT [id] FROM [table] ORDER BY [id] DESC LIMIT N) AS Temp) AS Temp2 ON [table].[id] = [Temp2].[id]
    
  • 0

    What about:

    SELECT * FROM table del 
             LEFT JOIN table keep
             ON del.id < keep.id
             GROUP BY del.* HAVING count(*) > N;
    

    It returns the rows that have more than N rows ahead of them (rows with a larger id). Might be useful?

  • 113

    In many cases, using the id for this task is not an option, for example a table of Twitter statuses. Here is a variant that uses a specified timestamp field.

    delete from table 
    where access_time <= 
    (
        select access_time from  
        (
            select access_time from table 
                order by access_time desc limit 150000,1 -- skip the 150000 newest rows
        ) foo    
    )
    
  • -1

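    Just wanted to throw this into the mix for anyone using Microsoft SQL Server instead of MySQL. The keyword 'Limit' isn't supported by MSSQL, so you'll need to use an alternative. This code worked in SQL 2008 and is based on this SO post: https://stackoverflow.com/a/1104447/993856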

    -- Keep the last 10 most recent passwords for this user.
    DECLARE @UserID int; SET @UserID = 1004
    DECLARE @ThresholdID int -- Position of 10th password.
    SELECT  @ThresholdID = UserPasswordHistoryID FROM
            (
                SELECT ROW_NUMBER()
                OVER (ORDER BY UserPasswordHistoryID DESC) AS RowNum, UserPasswordHistoryID
                FROM UserPasswordHistory
                WHERE UserID = @UserID
            ) sub
    WHERE   (RowNum = 10) -- Keep this many records.
    
    DELETE  UserPasswordHistory
    WHERE   (UserID = @UserID)
            AND (UserPasswordHistoryID < @ThresholdID)
    

    Admittedly, this is not elegant. If you're able to optimize this for Microsoft SQL, please share your solution. Thanks!

  • -1

    If you also need to delete records based on some other column, here is a solution:

    DELETE
    FROM articles
    WHERE id IN
        (SELECT id
         FROM
           (SELECT id
            FROM articles
            WHERE user_id = :userId
            ORDER BY created_at DESC LIMIT 500, 10000000) abc)
      AND user_id = :userId
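    A sketch of the per-user variant above, assuming Python's sqlite3 as a stand-in for MySQL. The helper `trim_user_articles`, the integer `created_at` values, and the `:keep` parameter (replacing the hard-coded 500) are illustrative assumptions.

```python
import sqlite3

def trim_user_articles(conn, user_id, keep):
    """Delete one user's articles except the `keep` newest by created_at."""
    conn.execute(
        """
        DELETE FROM articles
        WHERE id IN (
          SELECT id FROM (
            SELECT id FROM articles
            WHERE user_id = :userId
            ORDER BY created_at DESC
            LIMIT :keep, 10000000
          ) AS abc
        )
        AND user_id = :userId
        """,
        {"userId": user_id, "keep": keep},
    )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, user_id INTEGER, created_at INTEGER)"
)
rows = [(1, t) for t in range(1, 6)] + [(2, t) for t in range(1, 5)]
conn.executemany("INSERT INTO articles (user_id, created_at) VALUES (?, ?)", rows)

trim_user_articles(conn, 1, 3)
user1 = [r[0] for r in conn.execute(
    "SELECT created_at FROM articles WHERE user_id = 1 ORDER BY created_at")]
user2_count = conn.execute(
    "SELECT COUNT(*) FROM articles WHERE user_id = 2").fetchone()[0]
print(user1, user2_count)  # prints: [3, 4, 5] 4
```

    Only user 1's two oldest articles are removed; user 2's rows are untouched. If several rows share the same created_at at the cut-off, which of them survives is not determined by this ordering alone.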
    
  • 0

    Why not

    DELETE FROM table ORDER BY id DESC LIMIT 1, 123456789
    

    This just deletes all rows except the first one (the order is DESC!), using a very large number as the second LIMIT argument.

  • 0

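    Answering this after a long time... I ran into the same situation, and instead of using the answers mentioned above, I came up with the following: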

    DELETE FROM table_name order by ID limit 10
    

    This deletes the first 10 records and keeps the latest ones.
