I have this dictionary:

num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)],
    }

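I need to find the number of unique length-3 combinations of the first values of these tuples, where a key may only be followed by one of the tuples in that key's value list.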

My current code for finding all unique length-3 combinations is:

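def count_unique_triples(num_dict):  # wrapper name is illustrative; the original snippet was a bare function body
    ans_set = set()
    for x in num_dict:
        for y in num_dict[x]:
            for z in num_dict[y]:
                ans_set.add((x[0], y[0], z[0]))
    return len(ans_set)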

This returns 10, and ans_set ends up as:

{
 (2, 2, 2), (1, 2, 2), (1, 4, 4),
 (2, 2, 4), (1, 1, 2), (4, 4, 4),
 (1, 2, 4), (1, 1, 4), (1, 1, 1),
 (2, 4, 4)
}

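But I don't actually care what they are, only how many there are.

This approach is not particularly efficient, because it generates every possible combination and puts it into a set.

I don't need to know each unique combination, I only need to know how many there are.

I feel this can be done, perhaps using the lengths of the value lists? But I can't wrap my head around it.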
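To illustrate why the list lengths alone do not quite give the answer, here is a minimal sketch, assuming the num_dict above. Summing the lengths counts every chain of keys without building it, but chains whose keys share the same first values collapse into a single triple, so the length-based figure overcounts. The function names count_key_chains and count_unique_first_values are made up for this illustration.

```python
num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)],
}


def count_key_chains(d):
    # Number of (x, y, z) key chains with y in d[x] and z in d[y],
    # computed from list lengths only; no chain is materialised.
    return sum(len(d[y]) for x in d for y in d[x])


def count_unique_first_values(d):
    # The set-based count from the question, kept here for comparison.
    return len({(x[0], y[0], z[0]) for x in d for y in d[x] for z in d[y]})


print(count_key_chains(num_dict))           # 38 chains of keys
print(count_unique_first_values(num_dict))  # 10 unique first-value triples
```

The gap between the two numbers is the duplication that the set removes, so any length-only shortcut would need some extra way to account for keys that share a first value.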

I realize I may not have explained this in the clearest way, so questions asking me to clarify what I need are welcome.

Final edit

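After re-evaluating what I needed this to do, I found the best way to get the number of triples. This approach does not actually find the triples, it only counts them.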

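def foo(l):
    llen = len(l)
    total = 0
    # cache[i] counts the earlier indices whose value divides l[i]
    cache = {}
    for i in range(llen):
        cache[i] = 0
    for x in range(llen):
        for y in range(x + 1, llen):
            if l[y] % l[x] == 0:
                cache[y] += 1
                # every pair ending at x extends to a triple ending at y
                total += cache[x]
    return total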
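As a sanity check, the counting idea can be compared against a brute-force enumeration on a small list. This is only a sketch: count_triples restates the function above so the block runs on its own, count_triples_brute_force is a hypothetical helper, and both assume a list of positive integers (a zero would raise ZeroDivisionError).

```python
from itertools import combinations


def count_triples(l):
    # Same counting idea as foo above, restated so this block is self-contained.
    pairs_ending_at = [0] * len(l)
    total = 0
    for x in range(len(l)):
        for y in range(x + 1, len(l)):
            if l[y] % l[x] == 0:
                pairs_ending_at[y] += 1
                total += pairs_ending_at[x]
    return total


def count_triples_brute_force(l):
    # combinations() keeps list order, so (a, b, c) always come from
    # indices i < j < k; test both divisibility links directly.
    return sum(1 for a, b, c in combinations(l, 3) if b % a == 0 and c % b == 0)


sample = [1, 2, 3, 4, 5, 6]
print(count_triples(sample), count_triples_brute_force(sample))  # 3 3
```

For this sample the three triples are (1, 2, 4), (1, 2, 6) and (1, 3, 6). The brute-force version is cubic in the list length, while the counting version needs only the two nested loops.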

Here is a version of the function that explains the thought process (although it is bad for large lists because of the print spam):

def bar(l):
    list_length = len(l)
    total_triples = 0
    cache = {}
    for i in range(list_length):
        cache[i] = 0
    for x in range(list_length):
        print("\n\nfor index[{}]: {}".format(x, l[x]))
        for y in range(x + 1, list_length):
            print("\n\ttry index[{}]: {}".format(y, l[y]))
            if l[y] % l[x] == 0:
                print("\n\t\t{} can be evenly divided by {}".format(l[y], l[x]))
                cache[y] += 1
                total_triples += cache[x]
                print("\t\tcache[{0}] is now {1}".format(y, cache[y]))
                print("\t\tcount is now {}".format(total_triples))
                print("\t\t(+{} from cache[{}])".format(cache[x], x))
            else:
                print("\n\t\tfalse")
    print("\ntotal number of triples:", total_triples)

1 Answer

If I understand you correctly:

from itertools import combinations

num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)]
    }
set(combinations([k[0] for k in num_dict.keys()], 3))

Output:

{(1, 4, 1),
 (2, 1, 1),
 (2, 1, 4),
 (2, 2, 1),
 (2, 2, 4),
 (2, 4, 1),
 (2, 4, 4),
 (4, 1, 1),
 (4, 1, 4),
 (4, 4, 1)}

and its len() is 10.

So basically, you use itertools.combinations to build all length-3 combinations of the first elements of the dict keys, then take the set to eliminate duplicates.

UPDATE

Since you updated the question with the desired output data, you can do the following:

from itertools import combinations_with_replacement
list(combinations_with_replacement(set([k[0] for k in num_dict.keys()]), 3))

Output:

[(1, 1, 1),
 (1, 1, 2),
 (1, 1, 4),
 (1, 2, 2),
 (1, 2, 4),
 (1, 4, 4),
 (2, 2, 2),
 (2, 2, 4),
 (2, 4, 4),
 (4, 4, 4)]
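Since only the count is wanted, the list does not have to be built at all: the number of combinations with replacement of k items from n distinct values is the binomial coefficient C(n + k - 1, k). A minimal sketch, assuming Python 3.8+ for math.comb. Note the assumption that every non-decreasing triple of first values is actually reachable through num_dict, which happens to hold for this sample data but is not guaranteed in general.

```python
from itertools import combinations_with_replacement
from math import comb  # available in Python 3.8+

# Distinct first elements of the keys in the sample num_dict.
first_values = {2, 4, 1}
n, k = len(first_values), 3

# Closed form: no tuples are generated.
closed_form = comb(n + k - 1, k)

# Enumeration, kept only to confirm the closed form on this small input.
enumerated = sum(1 for _ in combinations_with_replacement(sorted(first_values), k))

print(closed_form, enumerated)  # 10 10
```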

UPD2

As for the time consumption, I ran:

num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)]
    }
def a(num_dict):
    ans_set = set()
    for x in num_dict:
        for y in num_dict[x]:
            for z in num_dict[y]:
                ans_set.add((x[0], y[0], z[0]))
    return len(ans_set)
def b(num_dict):
    from itertools import combinations_with_replacement
    return len(list(combinations_with_replacement(set([k[0] for k in num_dict.keys()]), 3)))
%timeit a(num_dict)
%timeit b(num_dict)

The results are:

The slowest run took 4.90 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 12.1 µs per loop

The slowest run took 5.37 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.77 µs per loop

So the solution I proposed here is roughly 2.5 times faster (12.1 µs versus 4.77 µs per loop).
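%timeit is an IPython magic, so the comparison above only runs inside IPython or Jupyter. Outside of it, a similar measurement can be sketched with the standard timeit module. The loop counts below are arbitrary choices, and the absolute numbers will differ from machine to machine.

```python
import timeit
from itertools import combinations_with_replacement

num_dict = {
    (2, 3): [(2, 2), (4, 4), (4, 5)],
    (2, 2): [(2, 3), (4, 4), (4, 5)],
    (4, 5): [(4, 4)],
    (1, 0): [(1, 1), (2, 2), (2, 3), (4, 4), (4, 5)],
    (4, 4): [(4, 5)],
    (1, 1): [(1, 0), (2, 2), (2, 3), (4, 4), (4, 5)],
}


def a(num_dict):
    # Set-based count from the question.
    ans_set = set()
    for x in num_dict:
        for y in num_dict[x]:
            for z in num_dict[y]:
                ans_set.add((x[0], y[0], z[0]))
    return len(ans_set)


def b(num_dict):
    # combinations_with_replacement over the distinct first values.
    firsts = set(k[0] for k in num_dict)
    return len(list(combinations_with_replacement(firsts, 3)))


loops = 1000  # arbitrary; raise it for steadier numbers
for fn in (a, b):
    # Take the best of three repeats, as %timeit does.
    best = min(timeit.repeat(lambda: fn(num_dict), number=loops, repeat=3))
    print("{}: {:.2f} µs per loop".format(fn.__name__, best / loops * 1e6))
```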

Answered 2024-05-07T13:08:16+08:00

What is the most efficient way to find the number of possible unique fixed-length permutations?
