SortedCollection for tuples where 'contains' considers only the first part of each tuple

I want to write a sorted container for tuples of the form

('A', 'b', 1, 0.3, 0.5)
('A', 'b', 2, 0.2, 0.1)
('A', 'b', 3, 0.6, 0.4)
('B', 'e', 1, 0.1, 0.9)
('B', 'e', 2, 0.5, 0.3)

Two tuples are considered equal if their first 3 entries are the same. The two trailing floats should be ignored when sorting.

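The SortedCollection recipe at http://code.activestate.com/recipes/577197-sortedcollection/ is a step in the right direction, so I used it as a starting point. It supports sorting by a user-defined key. What it does not do is restrict the contains method to the first n tuple elements.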

For example:

def __contains__(self, item):
    k = self._key(item)
    i = bisect_left(self._keys, k)
    j = bisect_right(self._keys, k)
    return item in self._items[i:j]

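The problem is that after the bisect calls, the whole item is passed to the in operator. That compares the full contents of the tuples, which is not what I want.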
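
The mismatch can be reproduced without the recipe itself. The following is a minimal sketch using plain lists and the bisect module; the names key, items, keys and probe are made up for illustration, and the key function is assumed to be the first three tuple elements.

```python
from bisect import bisect_left, bisect_right

# Hypothetical key function: only the first three elements matter.
key = lambda t: t[:3]

items = [('A', 'b', 1, 0.3, 0.5), ('A', 'b', 2, 0.2, 0.1)]
keys = [key(t) for t in items]  # parallel list of keys, already sorted

# Same key as items[0], but different trailing floats.
probe = ('A', 'b', 1, 0.2, 0.1)
k = key(probe)
i = bisect_left(keys, k)
j = bisect_right(keys, k)

# Full-tuple comparison, as in the __contains__ shown above.
found_by_full_compare = probe in items[i:j]

# Key-only check: a non-empty bisect range means the key is present.
found_by_key = i != j
```

Here found_by_full_compare is False even though an item with the same key is stored, while found_by_key is True.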

The expected result is that these two tuples are considered the same

('A', 'b', 1, 0.3, 0.5)
('A', 'b', 1, 0.2, 0.1)

because their first 3 elements are identical.

A fix for the in test inside contains might look like

return item[:3] in [ele[:3] for ele in self._items[i:j]]

This is of course quite expensive, because the list comprehension creates temporary objects on every call.

Is there a more efficient way (for example, using islice)?

1 Answer

Take a look at the following:

a = [('A', 'b', 1, 0.3, 0.5),
     ('A', 'b', 1, 0.2, 0.1),
     ('A', 'b', 1, 0.4, 0.4),
     ('A', 'b', 3, 0.6, 0.4),
     ('A', 'a', 2, 0.1, 0.3)]  # your initial list of tuples

# to get rid of the identical ones, we will use a dictionary.
b = {x[:3]:x[3:] for x in a}
print(b)  # -> {('A', 'b', 1): (0.4, 0.4), ('A', 'b', 3): (0.6, 0.4), ('A', 'a', 2): (0.1, 0.3)}
# as you can see, only the last value for a duplicated key survives: ('A', 'b', 1, 0.4, 0.4)

# then we build the list again from the dictionary
c = [(*k, *v) for k, v in b.items()]
print(c)  # -> [('A', 'b', 1, 0.4, 0.4), ('A', 'b', 3, 0.6, 0.4), ('A', 'a', 2, 0.1, 0.3)]

# and we finally sort based on your rules.    
d = sorted(c, key=lambda x: x[:3])
print(d)  # -> [('A', 'a', 2, 0.1, 0.3), ('A', 'b', 1, 0.4, 0.4), ('A', 'b', 3, 0.6, 0.4)]
# note that ('A', 'a', ...) now sorts first.
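
The dictionary approach above deduplicates a list, but it does not change how membership is tested in a sorted container. As a complement, here is a minimal sketch of a key-only membership test. It assumes the key function is lambda t: t[:3]; under that assumption every item in the bisect range already has an equal key, so no slicing or item comparison is needed. The class KeySortedList is a hypothetical, stripped-down stand-in and not the recipe's actual SortedCollection.

```python
from bisect import bisect_left


class KeySortedList:
    """Hypothetical minimal sorted container; membership is decided by key only."""

    def __init__(self, iterable=(), key=None):
        self._key = key if key is not None else (lambda x: x)
        # Sort by key only, so the trailing floats never take part in ordering.
        decorated = sorted(((self._key(item), item) for item in iterable),
                           key=lambda pair: pair[0])
        self._keys = [k for k, _ in decorated]
        self._items = [item for _, item in decorated]

    def __contains__(self, item):
        k = self._key(item)
        i = bisect_left(self._keys, k)
        # The key is present exactly when the leftmost insertion point holds it.
        return i < len(self._keys) and self._keys[i] == k


data = [('A', 'b', 1, 0.3, 0.5),
        ('A', 'b', 2, 0.2, 0.1),
        ('B', 'e', 1, 0.1, 0.9)]
sc = KeySortedList(data, key=lambda t: t[:3])

same_key_different_floats = ('A', 'b', 1, 0.2, 0.1) in sc
unknown_key = ('A', 'b', 3, 0.3, 0.5) in sc
```

With this design the test costs one binary search and creates no temporary lists. The trade-off is that membership now means "an item with this key exists", which differs from the recipe's behavior of matching the whole item.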

Answered 2024-04-26T08:29:25+08:00
