Does Python have an ordered set?

356

Python has an ordered dictionary. What about an ordered set?

14 answers

-1
For many purposes, simply calling sorted is enough. For example:
```
>>> s = set([0, 1, 2, 99, 4, 40, 3, 20, 24, 100, 60])
>>> sorted(s)
[0, 1, 2, 3, 4, 20, 24, 40, 60, 99, 100]
```
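If you are going to use the result repeatedly, each call to sorted incurs overhead, so you may want to save the resulting list once you are done changing the set. If you need to maintain unique elements in sorted order, I agree with the suggestion to use an OrderedDict from collections with an arbitrary value such as None.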
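
If the set keeps changing, one alternative to re-sorting is to keep a sorted list and insert each new element at its sorted position. The following is a minimal sketch using the standard library's bisect module, not something from the answer above; the helper name add_sorted_unique is made up for illustration.

```python
import bisect

def add_sorted_unique(items, value):
    """Insert value into the sorted list items unless it is already present."""
    pos = bisect.bisect_left(items, value)  # binary search for the slot
    if pos == len(items) or items[pos] != value:
        items.insert(pos, value)

items = []
for v in [40, 3, 99, 3, 0, 40]:
    add_sorted_unique(items, v)
print(items)  # [0, 3, 40, 99]
```

The search is O(log n), but list.insert still shifts elements, so each insertion is O(n) in the worst case.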
Answered 2024-05-05T21:00:49+08:00
-9

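If you are already using pandas in your code, its Index object behaves much like an ordered set, as shown in this article.

Answered 2024-05-05T21:00:49+08:00
6
The answer is no, but you can use collections.OrderedDict from the Python standard library with just keys (and None as the values) for the same purpose.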

Here is an example of using OrderedDict as an ordered set to filter out duplicate items while preserving order:
```
>>> keywords = ['foo', 'bar', 'bar', 'foo', 'baz', 'foo']

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(keywords).keys())
['foo', 'bar', 'baz']
```
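On Python 3.7 and later, the built-in dict also preserves insertion order as a language guarantee, so the same de-duplication should work without OrderedDict. A minimal sketch, reusing the keywords list from the example above:

```python
keywords = ['foo', 'bar', 'bar', 'foo', 'baz', 'foo']

# dict keys are unique and keep first-insertion order (Python 3.7+)
unique_keywords = list(dict.fromkeys(keywords))
print(unique_keywords)  # ['foo', 'bar', 'baz']
```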
Answered 2024-05-05T21:00:49+08:00

-4

```
>>> a = {3, 4, 2, 6, 1, 7}
>>> type(a)
<class 'set'>
>>> sorted(a, reverse=True)
[7, 6, 4, 3, 2, 1]
>>> sorted(a)
[1, 2, 3, 4, 6, 7]
```

Answered 2024-05-05T21:00:49+08:00

33
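There is an ordered set (possible new link) recipe for this, which is referred to from the Python 2 Documentation. It runs on Py2.6 or later and 3.0 or later without any modifications. The interface is almost exactly the same as a normal set, except that initialization should be done with a list.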
```
OrderedSet([1, 2, 3])
```
This is a MutableSet, so the signature of .union does not match that of set, but since it includes __or__, something similar can easily be added:
```
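def union(self, *sets):
    """Return a new OrderedSet holding self plus every element of sets."""
    result = OrderedSet(self)
    for other in sets:  # avoid shadowing the built-in name `set`
        result |= other  # __ior__ is inherited from MutableSet
    return result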
```
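To try the idea end to end without the recipe, here is a self-contained sketch. It assumes a dict-backed class (Python 3.7+) rather than the recipe's own implementation; the class body is hypothetical and only meant to show where a union method fits on top of MutableSet.

```python
from collections.abc import MutableSet

class OrderedSet(MutableSet):
    """Hypothetical minimal ordered set; not the recipe's implementation."""

    def __init__(self, iterable=()):
        self._items = dict.fromkeys(iterable)  # keys keep insertion order

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, item):
        self._items[item] = None

    def discard(self, item):
        self._items.pop(item, None)

    def union(self, *others):
        """Return a new OrderedSet; self is left unchanged."""
        result = OrderedSet(self)
        for other in others:
            result |= other  # __ior__ is supplied by MutableSet
        return result

    def __repr__(self):
        return 'OrderedSet(%r)' % list(self._items)

a = OrderedSet([3, 1, 2])
b = a.union([2, 5], [4, 1])
print(a)  # OrderedSet([3, 1, 2])
print(b)  # OrderedSet([3, 1, 2, 5, 4])
```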
Answered 2024-05-05T21:00:49+08:00

A bit late to the game, but I wrote a setlist class as part of collections-extended that fully implements both Sequence and Set:

```
>>> from collections_extended import setlist
>>> sl = setlist('abracadabra')
>>> sl
setlist(('a', 'b', 'r', 'c', 'd'))
>>> sl[3]
'c'
>>> sl[-1]
'd'
>>> 'r' in sl  # testing for inclusion is fast
True
>>> sl.index('d')  # so is finding the index of an element
4
>>> sl.insert(1, 'd')  # inserting an element already in raises a ValueError
ValueError
>>> sl.index('d')
4
```

GitHub: https://github.com/mlenzen/collections-extended

Documentation: http://collections-extended.lenzm.net/en/latest/

PyPI: https://pypi.python.org/pypi/collections-extended

Answered 2024-05-05T21:00:49+08:00

186
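I can do you one better than an OrderedSet: boltons has a pure-Python, 2/3-compatible IndexedSet type that is not only an ordered set, but also supports indexing (as with lists).

Simply pip install boltons (or copy setutils.py into your codebase), import IndexedSet and: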
```
>>> from boltons.setutils import IndexedSet
>>> x = IndexedSet(list(range(4)) + list(range(8)))
>>> x
IndexedSet([0, 1, 2, 3, 4, 5, 6, 7])
>>> x - set(range(2))
IndexedSet([2, 3, 4, 5, 6, 7])
>>> x[-1]
7
>>> fcr = IndexedSet('freecreditreport.com')
>>> ''.join(fcr[:fcr.index('.')])
'frecditpo'
```
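Everything is unique and retained in order. Full disclosure: I wrote IndexedSet, but that also means you can bug me if there are any issues. :)
Answered 2024-05-05T21:00:49+08:00

An ordered set is functionally a special case of an ordered dictionary.

The keys of a dictionary are unique. Thus, if you disregard the values in an ordered dictionary (e.g. by assigning them None), you essentially have an ordered set.

As of Python 3.1 there is collections.OrderedDict. The following is an example implementation of an OrderedSet. (Note that only a few methods need to be defined or overridden: collections.OrderedDict and collections.MutableSet, which lives in collections.abc on current Python versions, do the heavy lifting.)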

```
import collections
import collections.abc  # the collections.MutableSet alias was removed in Python 3.10


class OrderedSet(collections.OrderedDict, collections.abc.MutableSet):

    def update(self, *args, **kwargs):
        if kwargs:
            raise TypeError("update() takes no keyword arguments")

        # Each positional argument is an iterable of elements to add
        for s in args:
            for e in s:
                self.add(e)

    def add(self, elem):
        # Elements are stored as keys; the value is a placeholder
        self[elem] = None

    def discard(self, elem):
        self.pop(elem, None)

    def __le__(self, other):
        return all(e in other for e in self)

    def __lt__(self, other):
        return self <= other and self != other

    def __ge__(self, other):
        return all(e in self for e in other)

    def __gt__(self, other):
        return self >= other and self != other

    def __repr__(self):
        return 'OrderedSet([%s])' % (', '.join(map(repr, self.keys())))

    def __str__(self):
        return '{%s}' % (', '.join(map(repr, self.keys())))

    # Named set methods delegate to the corresponding operators
    difference = property(lambda self: self.__sub__)
    difference_update = property(lambda self: self.__isub__)
    intersection = property(lambda self: self.__and__)
    intersection_update = property(lambda self: self.__iand__)
    issubset = property(lambda self: self.__le__)
    issuperset = property(lambda self: self.__ge__)
    symmetric_difference = property(lambda self: self.__xor__)
    symmetric_difference_update = property(lambda self: self.__ixor__)
    union = property(lambda self: self.__or__)
```

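Answered 2024-05-05T21:00:49+08:00

31
Implementations on PyPI

While others have pointed out that there is no built-in implementation of an insertion-order-preserving set in Python (yet), I feel this question is missing an answer that states what can be found on PyPI.

To the best of my knowledge, there currently are: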
- ordered-set
- oset
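Both implementations are based on the recipe posted by Raymond Hettinger to ActiveState, which is also mentioned in other answers here. I checked out both and identified the following.

Critical differences:
- ordered-set (version 1.1)
  - advantage: O(1) for lookups by index (e.g. my_set[5])
  - disadvantage: remove(item) not implemented
- oset (version 0.1.3)
  - advantage: O(1) for remove(item)
  - disadvantage: apparently O(n) for lookups by index
Both implementations have O(1) for add(item) and __contains__(item) (item in my_set).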
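
The trade-off between index lookup and removal can be illustrated with a sketch. It assumes an ordered set that stores its elements in a list plus a dict mapping each element to its position; this is an assumption about one general approach, not a description of either package's actual source, and the class name IndexableOrderedSet is hypothetical.

```python
class IndexableOrderedSet:
    """Illustrative only: list for order, dict for position lookup."""

    def __init__(self, iterable=()):
        self._items = []   # elements in insertion order
        self._pos = {}     # element -> index in self._items
        for item in iterable:
            self.add(item)

    def add(self, item):           # O(1) amortized
        if item not in self._pos:
            self._pos[item] = len(self._items)
            self._items.append(item)

    def __contains__(self, item):  # O(1) on average
        return item in self._pos

    def __getitem__(self, index):  # O(1)
        return self._items[index]

    def remove(self, item):        # O(n): later elements are renumbered
        index = self._pos.pop(item)  # raises KeyError if item is absent
        del self._items[index]
        for later in self._items[index:]:
            self._pos[later] -= 1

s = IndexableOrderedSet('abcde')
s.remove('b')
print(s[1], 'b' in s, s._pos['e'])  # c False 3
```

A linked-list layout makes the opposite trade: removal becomes O(1), but reaching the element at a given index requires walking the list.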

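Unfortunately, neither implementation has method-based set operations like set1.union(set2), so you have to use the operator-based form like set1 | set2 instead. See the Python documentation on Set Objects for a full list of the set operation methods and their operator-based equivalents.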
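
For reference, the pairs in question look like this on the built-in set. This sketch uses only the built-in type, not either PyPI package:

```python
a = {1, 2, 3}
b = {3, 4}

assert a | b == a.union(b)                 # union
assert a & b == a.intersection(b)          # intersection
assert a - b == a.difference(b)            # difference
assert a ^ b == a.symmetric_difference(b)  # symmetric difference
assert (a <= b) == a.issubset(b)           # subset test
assert (a >= b) == a.issuperset(b)         # superset test
```

On the built-in set, the method forms accept any iterable, while the operator forms require both operands to be sets.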

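I first went with ordered-set until I used remove(item) for the first time, which crashed my script with a NotImplementedError. As I have never used lookup by index so far, I have since switched to oset.

If you know about other implementations on PyPI, let me know in the comments.
Answered 2024-05-05T21:00:49+08:00
15
I also had a small list into which I clearly could introduce non-unique values.

I searched for some sort of unique list, but then realized that testing for the existence of the element before adding it works just fine.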
```
if new_element not in my_list:
    my_list.append(new_element)
```
I don't know if there are caveats to this simple method, but it solves my problem.
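
One caveat worth noting: `in` on a list scans the list, so each check is O(n) and building a list of n unique elements this way is O(n^2). For small lists that is fine. For larger inputs, a sketch that keeps a companion set for the membership test (which requires hashable elements) might look like this; the names append_unique and seen are illustrative:

```python
def append_unique(my_list, seen, new_element):
    """Append new_element only if it has not been seen before."""
    if new_element not in seen:  # O(1) on average for a set
        seen.add(new_element)
        my_list.append(new_element)

my_list, seen = [], set()
for element in ['b', 'a', 'b', 'c', 'a']:
    append_unique(my_list, seen, element)
print(my_list)  # ['b', 'a', 'c']
```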
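Answered 2024-05-05T21:00:49+08:00
6
If you are using the ordered set to maintain a sorted order, consider using a sorted set implementation from PyPI. The sortedcontainers module provides a SortedSet for just this purpose. Some benefits: pure Python, fast implementations, 100% unit test coverage, hours of stress testing.

Installing from PyPI is easy with pip: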
```
pip install sortedcontainers
```
Note that if you can't pip install, you can simply pull down the sortedlist.py and sortedset.py files from the open-source repository.

Once installed you can simply:
```
from sortedcontainers import SortedSet
help(SortedSet)
```
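The sortedcontainers module also maintains a performance comparison with several alternative implementations.

For the comment that asked about Python's bag data type: there is alternatively a SortedList data type, which can be used to efficiently implement a bag.
Answered 2024-05-05T21:00:49+08:00

There is no OrderedSet in the official library. I made an exhaustive cheat sheet of all the data structures for your reference.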

```
DataStructure = {
    'Collections': {
        'Map': [
            ('dict', 'OrderedDict', 'defaultdict'),
            ('ChainMap', 'types.MappingProxyType')
        ],
        'Set': [('set', 'frozenset'), {'multiset': 'collections.Counter'}]
    },
    'Sequence': {
        'Basic': ['list', 'tuple', 'iterator']
    },
    'Algorithm': {
        'Priority': ['heapq', 'queue.PriorityQueue'],
        'Queue': ['queue.Queue', 'multiprocessing.Queue'],
        'Stack': ['collections.deque', 'queue.LifoQueue']
    },
    'text_sequence': ['str', 'bytes', 'bytearray']
}
```

Answered 2024-05-05T21:00:49+08:00

3
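I believe there are four kinds of ordering one might want:
1. Ordered by key
2. Ordered by value (though I have not heard of anyone asking for this one)
3. Ordered by modification time
4. Ordered by addition time
I believe collections.OrderedDict gets you #4. Or you could remove a key and re-add it, for #3.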
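
A short sketch of #4 and #3 with collections.OrderedDict, using keys only so that it behaves like an ordered set. The move_to_end call is an addition not mentioned above; it is a standard OrderedDict method on Python 3.2 and later.

```python
from collections import OrderedDict

d = OrderedDict.fromkeys(['a', 'b', 'c'])  # ordered by addition time (#4)

# Re-assigning an existing key does not move it ...
d['a'] = None
print(list(d))  # ['a', 'b', 'c']

# ... so for modification-time order (#3), remove the key and re-add it,
d.pop('a')
d['a'] = None
print(list(d))  # ['b', 'c', 'a']

# or use move_to_end, which has the same effect in one call.
d.move_to_end('b')
print(list(d))  # ['c', 'a', 'b']
```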

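For #1, you should probably look into a red-black tree or a treap:
Red-black trees have low variability in operation times (so they might be better for interactive applications), but are not as fast as treaps on average (which might be better for batch processing: treaps do not reorganize themselves often, which makes them fast on average, but when they do reorganize it can take a fairly long while).

Both of these are established data structures with implementations in many languages.
Answered 2024-05-05T21:00:49+08:00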
122

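The ParallelRegression package provides a setList( ) ordered set class that is more method-complete than the options based on the ActiveState recipe. It supports all methods available for lists and most, if not all, methods available for sets.

Answered 2024-05-05T21:00:49+08:00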
