
Deleting elements from an STL set while iterating


I need to go through a set and remove the elements that meet a predefined criterion.

This is the test code I wrote:

#include <algorithm>
#include <iostream>
#include <set>

void printElement(int value) {
    std::cout << value << " ";
}

int main() {
    int initNum[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    std::set<int> numbers(initNum, initNum + 10);
    // print '0 1 2 3 4 5 6 7 8 9'
    std::for_each(numbers.begin(), numbers.end(), printElement);

    std::set<int>::iterator it = numbers.begin();

    // iterate through the set and erase all even numbers
    for (; it != numbers.end(); ++it) {
        int n = *it;
        if (n % 2 == 0) {
            // wouldn't invalidate the iterator?
            numbers.erase(it);
        }
    }

    // print '1 3 5 7 9'
    std::for_each(numbers.begin(), numbers.end(), printElement);

    return 0;
}

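At first, I thought that erasing an element from the set while iterating through it would invalidate the iterator, and that the increment in the for loop would have undefined behavior. Even so, I executed this test code and all went well, and I can't explain why.

My question: Is this the defined behavior for std sets, or is it implementation specific? I am using gcc 4.3.3 on ubuntu 10.04 (32-bit version), by the way.

Thanks!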

Proposed solution:

Is this a correct way to iterate and erase elements from the set?

while(it != numbers.end()) {
    int n = *it;
    if (n % 2 == 0) {
        // post-increment operator returns a copy, then increment
        numbers.erase(it++);
    } else {
        // pre-increment operator increments, then return
        ++it;
    }
}

Edit: PREFERRED SOLUTION

I came across a solution that seems more elegant to me, even though it does exactly the same thing.

while(it != numbers.end()) {
    // copy the current iterator then increment it
    std::set<int>::iterator current = it++;
    int n = *current;
    if (n % 2 == 0) {
        // don't invalidate iterator it, because it is already
        // pointing to the next element
        numbers.erase(current);
    }
}

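If there are several test conditions inside the while loop, each one of them must increment the iterator. I like this code better because the iterator is incremented in only one place, making the code less error-prone and more readable.

8 Answers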

  • 0

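    This is implementation dependent:

    Standard 23.1.2.8:

    The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.

    Maybe you could try this; it is standard conforming: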

    for (it = numbers.begin(); it != numbers.end(); ) {
        if (*it % 2 == 0) {
            numbers.erase(it++);
        }
        else {
            ++it;
        }
    }
    

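    Note that it++ is postfix, hence it passes the old position to erase, but first moves on to the next position because of the operator.

    2015.10.27 update: C++11 has resolved the defect. iterator erase (const_iterator position); returns an iterator to the element that follows the last element removed (or set::end, if the last element was removed). So the C++11 style is: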

    for (it = numbers.begin(); it != numbers.end(); ) {
        if (*it % 2 == 0) {
            it = numbers.erase(it);
        }
        else {
            ++it;
        }
    }
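
    As a self-contained illustration of the C++11 loop above, here is a minimal sketch wrapped in a function. The name eraseEvenFromSet is hypothetical and not part of the original answer, and the code assumes a compiler in C++11 mode or later.

```cpp
#include <set>

// Hypothetical helper (the name is an assumption): erase every even value
// from the set in place. Requires C++11 or later, where std::set::erase
// returns an iterator.
void eraseEvenFromSet(std::set<int>& numbers) {
    for (auto it = numbers.begin(); it != numbers.end(); ) {
        if (*it % 2 == 0) {
            // erase() returns the iterator following the removed element,
            // so `it` is never used after the node it referred to is gone.
            it = numbers.erase(it);
        } else {
            ++it;
        }
    }
}
```

    Calling eraseEvenFromSet on a set holding 0 through 9 should leave 1 3 5 7 9, and the iterator is advanced on exactly one path per loop pass.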
    
  • 2

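    If you run your program through valgrind, you'll see a bunch of read errors. In other words, yes, the iterators are being invalidated, but you're getting lucky in your example (or really unlucky, as you're not seeing the negative effects of undefined behavior). One solution is to create a temporary iterator, increment the temporary, erase through the target iterator, then set the target to the temporary. For example, rewrite your loop as follows: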

    std::set<int>::iterator it = numbers.begin();                               
    std::set<int>::iterator tmp;                                                
    
    // iterate through the set and erase all even numbers                       
    for ( ; it != numbers.end(); )                                              
    {                                                                           
        int n = *it;                                                            
        if (n % 2 == 0)                                                         
        {                                                                       
            tmp = it;                                                           
            ++tmp;                                                              
            numbers.erase(it);                                                  
            it = tmp;                                                           
        }                                                                       
        else                                                                    
        {                                                                       
            ++it;                                                               
        }                                                                       
    }
    
  • 154

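    You misunderstand what "undefined behavior" means. Undefined behavior does not mean "if you do this, your program will crash or produce unexpected results." It means "if you do this, your program could crash or produce unexpected results", or do anything else, depending on your compiler, your operating system, the phase of the moon, etc.

    If something executes without crashing and behaves as you expect it to, that is not proof that it is not undefined behavior. All it proves is that its behavior happened to be as observed for that particular run, after compiling with that particular compiler on that particular operating system.

    Erasing an element from a set invalidates the iterator to the erased element. Using an invalidated iterator is undefined behavior. It just so happened that the observed behavior was what you intended in this particular instance; that does not mean the code is correct.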

  • 18

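    Just a warning: in the case of a deque container, all solutions that compare the deque iterator against numbers.end() are likely to fail on gcc 4.8.4. Namely, erasing an element of the deque generally invalidates the iterator to numbers.end():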

    #include <iostream>
    #include <deque>
    
    using namespace std;
    int main() 
    {
    
      deque<int> numbers;
    
      numbers.push_back(0);
      numbers.push_back(1);
      numbers.push_back(2);
      numbers.push_back(3);
      //numbers.push_back(4);
    
      deque<int>::iterator  it_end = numbers.end();
    
      for (deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
        if (*it % 2 == 0) {
          cout << "Erasing element: " << *it << "\n";
          numbers.erase(it++);
          if (it_end == numbers.end()) {
            cout << "it_end is still pointing to numbers.end()\n";
          } else {
            cout << "it_end is not anymore pointing to numbers.end()\n";
          }
        }
        else {
          cout << "Skipping element: " << *it << "\n";
          ++it;
        }
      }
    }
    

    Output:

    Erasing element: 0
    it_end is still pointing to numbers.end()
    Skipping element: 1
    Erasing element: 2
    it_end is not anymore pointing to numbers.end()
    

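    Note that while the deque transformation is correct in this particular case, the end iterator has been invalidated along the way. With a deque of a different size the error is more apparent: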

    int main() 
    {
    
      deque<int> numbers;
    
      numbers.push_back(0);
      numbers.push_back(1);
      numbers.push_back(2);
      numbers.push_back(3);
      numbers.push_back(4);
    
      deque<int>::iterator  it_end = numbers.end();
    
      for (deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
        if (*it % 2 == 0) {
          cout << "Erasing element: " << *it << "\n";
          numbers.erase(it++);
          if (it_end == numbers.end()) {
            cout << "it_end is still pointing to numbers.end()\n";
          } else {
            cout << "it_end is not anymore pointing to numbers.end()\n";
          }
        }
        else {
          cout << "Skipping element: " << *it << "\n";
          ++it;
        }
      }
    }
    

    Output:

    Erasing element: 0
    it_end is still pointing to numbers.end()
    Skipping element: 1
    Erasing element: 2
    it_end is still pointing to numbers.end()
    Skipping element: 3
    Erasing element: 4
    it_end is not anymore pointing to numbers.end()
    Erasing element: 0
    it_end is not anymore pointing to numbers.end()
    Erasing element: 0
    it_end is not anymore pointing to numbers.end()
    ...
    Segmentation fault (core dumped)
    

    Here is one of the ways to fix this:

    #include <iostream>
    #include <deque>
    
    using namespace std;
    int main() 
    {
    
      deque<int> numbers;
      bool done_iterating = false;
    
      numbers.push_back(0);
      numbers.push_back(1);
      numbers.push_back(2);
      numbers.push_back(3);
      numbers.push_back(4);
    
      if (!numbers.empty()) {
        deque<int>::iterator it = numbers.begin();
        while (!done_iterating) {
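          if (it + 1 == numbers.end()) {
            done_iterating = true;
          }
          if (*it % 2 == 0) {
            cout << "Erasing element: " << *it << "\n";
            // use the iterator returned by erase(); erase(it++) would leave
            // `it` invalidated, since a deque erase can invalidate all iterators
            it = numbers.erase(it);
          }
          else {
            cout << "Skipping element: " << *it << "\n";
            ++it;
          }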
        }
      }
    }
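
    For comparison, here is a minimal sketch that needs no done_iterating flag. It relies on two facts: deque::erase returns an iterator to the element after the erased one, and numbers.end() is re-read on every pass instead of being cached. The function name eraseEvenFromDeque is hypothetical and not part of the original answer.

```cpp
#include <deque>

// Hypothetical helper (the name is an assumption): erase every even value
// from a deque in place.
void eraseEvenFromDeque(std::deque<int>& numbers) {
    // numbers.end() is evaluated on each pass, never cached, because
    // erasing from a deque can invalidate a previously obtained end().
    for (std::deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
        if (*it % 2 == 0) {
            // Continue from the iterator erase() hands back; every older
            // iterator into the deque must be treated as invalid.
            it = numbers.erase(it);
        } else {
            ++it;
        }
    }
}
```

    With the five-element deque 0 1 2 3 4 used above, this should leave 1 3 without reading through an invalidated iterator.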
    
  • 1

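    This behaviour is implementation specific. To guarantee the correctness of the iterator, you should use the "it = numbers.erase(it);" statement when you need to delete the element, and simply increment the iterator in the other case.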

  • 6

    I came across the same old issue and found the code below more understandable; it is in a way a variation of the solutions above.

    std::set<int*>::iterator beginIt = listOfInts.begin();
    while(beginIt != listOfInts.end())
    {
        // Use your member
        std::cout<<(*beginIt)<<std::endl;
    
        // delete the object
        delete (*beginIt);
    
        // erase item from the set
        listOfInts.erase(beginIt );
    
        // re-calculate the begin
        beginIt = listOfInts.begin();
    }
    
  • 0

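    I think using the STL method 'remove_if' could help to prevent some weird issues when trying to delete the object that is wrapped by the iterator.

    This solution may be less efficient.

    Let's say we have some kind of container, like a vector or a list, called m_bullets:

    Bullet::Ptr is a shared_ptr<Bullet>
    

    'it' is the iterator that 'remove_if' returns; the third argument is a lambda function that is executed on every element of the container. Because the container holds Bullet::Ptr, the lambda function needs to take that type (or a reference to that type) as its argument.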

    auto it = std::remove_if(m_bullets.begin(), m_bullets.end(), [](Bullet::Ptr bullet){
        // dead bullets need to be removed from the container
        if (!bullet->isAlive()) {
            // lambda function returns true, thus this element is 'removed'
            return true;
        }
        else{
            // in the other case, that the bullet is still alive and we can do
            // stuff with it, like rendering and what not.
            bullet->render(); // while checking, we do render work at the same time
            // then we could either do another check or directly say that we don't
            // want the bullet to be removed.
            return false;
        }
    });
    // The interesting part is, that all of those objects were not really
    // completely removed, as the space of the deleted objects does still 
    // exist and needs to be removed if you do not want to manually fill it later 
    // on with any other objects.
    // erase dead bullets
    m_bullets.erase(it, m_bullets.end());
    

    'remove_if' does not shrink the container: it moves the elements for which the lambda returned false to the front of the container and returns 'it', an iterator to the first position past them. The elements from 'it' to m_bullets.end() are left in a valid but unspecified state and can be considered garbage. They still occupy space in the container, so the 'erase' method is called on that range to remove them.
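
    The erase/remove_if combination described above can be sketched as a compact, compilable unit. The Bullet struct, the BulletPtr typedef and the removeDead function below are hypothetical stand-ins, since the original answer does not show its Bullet class; the predicate here only tests liveness and does no rendering.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical stand-in for the answer's Bullet class (an assumption).
struct Bullet {
    bool alive;
    bool isAlive() const { return alive; }
};
typedef std::shared_ptr<Bullet> BulletPtr;

// Hypothetical helper: drop every dead bullet from the vector.
void removeDead(std::vector<BulletPtr>& bullets) {
    // remove_if moves the surviving elements to the front and returns an
    // iterator to the start of the leftover tail; erase then trims that tail.
    bullets.erase(
        std::remove_if(bullets.begin(), bullets.end(),
                       [](const BulletPtr& b) { return !b->isAlive(); }),
        bullets.end());
}
```

    Note that this idiom applies to sequence containers such as vector, deque or list. std::remove_if cannot be used on a std::set, because set elements are const and cannot be moved around in place.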

  • 0

    C++20 will have "uniform container erasure", and you'll be able to write:

    std::erase_if(numbers, [](int n){ return n % 2 == 0; });
    

    And that will work for vector, set, deque, etc. See cppreference for more info.
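
    For compilers without C++20 support, a helper of the same shape can be sketched for std::set. The name eraseIf is hypothetical and this is not the standard library's implementation; it assumes C++11 or later and mirrors std::erase_if in returning the number of elements removed.

```cpp
#include <set>

// Hypothetical pre-C++20 stand-in for std::erase_if on a std::set
// (the name is an assumption). Returns the number of erased elements.
template <typename Key, typename Compare, typename Alloc, typename Pred>
typename std::set<Key, Compare, Alloc>::size_type
eraseIf(std::set<Key, Compare, Alloc>& c, Pred pred) {
    const auto oldSize = c.size();
    for (auto it = c.begin(); it != c.end(); ) {
        if (pred(*it)) {
            it = c.erase(it);  // C++11: erase returns the next iterator
        } else {
            ++it;
        }
    }
    return oldSize - c.size();
}
```

    Usage would look like eraseIf(numbers, [](int n){ return n % 2 == 0; }); which keeps the call site close to the C++20 form.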
