Merging dictionaries in C#

What is the best way to merge two or more dictionaries (Dictionary<T1,T2>) in C#? (3.0 features such as LINQ are fine.)

I'm thinking of a method signature along these lines:

public static Dictionary<TKey,TValue>
                 Merge<TKey,TValue>(Dictionary<TKey,TValue>[] dictionaries);

or

public static Dictionary<TKey,TValue>
                 Merge<TKey,TValue>(IEnumerable<Dictionary<TKey,TValue>> dictionaries);

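EDIT: I got a cool solution from JaredPar and Jon Skeet, but I was thinking of something that handles duplicate keys. In case of a collision, it doesn't matter which value is saved to the dictionary, as long as it's consistent.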

20 Answers

  • 2

    Or:

    public static IDictionary<TKey, TValue> Merge<TKey, TValue>( IDictionary<TKey, TValue> x, IDictionary<TKey, TValue> y)
        {
            return x
                .Except(x.Join(y, z => z.Key, z => z.Key, (a, b) => a))
                .Concat(y)
                .ToDictionary(z => z.Key, z => z.Value);
        }
    

    The result is a union in which, for duplicate entries, "y" wins.

  • 5

    This partly depends on what you want to happen if you run into duplicates. For instance, you could do:

    var result = dictionaries.SelectMany(dict => dict)
                             .ToDictionary(pair => pair.Key, pair => pair.Value);
    

    That will throw an exception if you get any duplicate keys.

    EDIT: If you use ToLookup, you'll get a lookup, which can have multiple values per key. You can then convert that to a dictionary:

    var result = dictionaries.SelectMany(dict => dict)
                             .ToLookup(pair => pair.Key, pair => pair.Value)
                             .ToDictionary(group => group.Key, group => group.First());
    

    This is a bit ugly, and inefficient, but it's the quickest way to do it in terms of code. (I haven't tested it, admittedly.)

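    You could of course write your own ToDictionary2 extension method (with a better name, but I don't have time to think of one now). It's not terribly hard to do: it just overwrites (or ignores) duplicate keys. The important bit (to my mind) is using SelectMany, and realising that a dictionary supports iteration over its key/value pairs.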
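
    As a rough illustration of such an extension method, here is a minimal sketch. The name ToDictionaryLastWins and its parameters are hypothetical (they do not come from the answer), and it assumes that a later pair should overwrite an earlier one with the same key.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class MergeSketch
    {
        // Hypothetical helper: behaves like ToDictionary, except that a later
        // duplicate key overwrites the earlier value instead of throwing.
        public static Dictionary<TKey, TValue> ToDictionaryLastWins<TSource, TKey, TValue>(
            this IEnumerable<TSource> source,
            Func<TSource, TKey> keySelector,
            Func<TSource, TValue> valueSelector)
        {
            var result = new Dictionary<TKey, TValue>();
            foreach (var item in source)
            {
                // The indexer adds the key or replaces the existing value.
                result[keySelector(item)] = valueSelector(item);
            }
            return result;
        }
    }
    ```

    It could then be used as dictionaries.SelectMany(dict => dict).ToDictionaryLastWins(pair => pair.Key, pair => pair.Value). To ignore duplicates instead, the loop body would only assign when the key is not yet present.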

  • 1

    I would do it like this:

    dictionaryFrom.ToList().ForEach(x => dictionaryTo.Add(x.Key, x.Value));
    

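    Simple and easy. According to this blog post, it is even faster than most loops, because its underlying implementation accesses elements by index rather than through an enumerator (see this answer).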

    It will of course throw an exception if there are duplicates, so you'll have to check before merging.
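
    One way to do that check up front is sketched below. The helper name FindDuplicateKeys is hypothetical and not part of the answer; it simply lists the keys that would make Add throw.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;

    public static class DuplicateCheckSketch
    {
        // Hypothetical helper: returns the keys of 'source' that already exist in 'target'.
        public static List<TKey> FindDuplicateKeys<TKey, TValue>(
            IDictionary<TKey, TValue> source,
            IDictionary<TKey, TValue> target)
        {
            return source.Keys.Where(key => target.ContainsKey(key)).ToList();
        }
    }
    ```

    If FindDuplicateKeys(dictionaryFrom, dictionaryTo) returns an empty list, the Add-based one-liner will not throw for duplicate keys; otherwise the caller can decide how to resolve each collision first.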

  • 2

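    Well, I'm late to the party, but here is what I use. It doesn't explode if there are duplicate keys ("righter" keys replace "lefter" keys), it can merge any number of dictionaries (if desired), and it preserves the type (with the restriction that the type needs a meaningful public default constructor):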

    public static class DictionaryExtensions
    {
        // Works in C#3/VS2008:
        // Returns a new dictionary of this ... others merged leftward.
        // Keeps the type of 'this', which must be default-instantiable.
        // Example: 
        //   result = map.MergeLeft(other1, other2, ...)
        public static T MergeLeft<T,K,V>(this T me, params IDictionary<K,V>[] others)
            where T : IDictionary<K,V>, new()
        {
            T newMap = new T();
            foreach (IDictionary<K,V> src in
                (new List<IDictionary<K,V>> { me }).Concat(others)) {
                // ^-- echk. Not quite there type-system.
                foreach (KeyValuePair<K,V> p in src) {
                    newMap[p.Key] = p.Value;
                }
            }
            return newMap;
        }
    
    }
    
  • 3

    The trivial solution would be:

    using System.Collections.Generic;
    ...
    public static Dictionary<TKey, TValue>
        Merge<TKey,TValue>(IEnumerable<Dictionary<TKey, TValue>> dictionaries)
    {
        var result = new Dictionary<TKey, TValue>();
        foreach (var dict in dictionaries)
            foreach (var x in dict)
                result[x.Key] = x.Value;
        return result;
    }
    
  • 44

    Try the following:

    static Dictionary<TKey, TValue>
        Merge<TKey, TValue>(this IEnumerable<Dictionary<TKey, TValue>> enumerable)
    {
        return enumerable.SelectMany(x => x).ToDictionary(x => x.Key, y => y.Value);
    }
    
  • 10
    // Union drops pairs that are identical in both key and value;
    // ToDictionary still throws if one key appears with two different values.
    Dictionary<String, String> allTables =
        tables1.Union(tables2).ToDictionary(pair => pair.Key, pair => pair.Value);
    
  • 2

    The following works for me. If there are duplicates, it uses dictA's value.

    public static IDictionary<TKey, TValue> Merge<TKey, TValue>(this IDictionary<TKey, TValue> dictA, IDictionary<TKey, TValue> dictB)
        where TValue : class
    {
        return dictA.Keys.Union(dictB.Keys).ToDictionary(k => k, k => dictA.ContainsKey(k) ? dictA[k] : dictB[k]);
    }
    
  • 0

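    I'm very late to the party and perhaps missing something, but if either there are no duplicate keys or, as the OP says, "in case of a collision, it doesn't matter which value is saved to the dictionary, as long as it's consistent," what's wrong with this (merging D2 into D1)?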

    foreach (KeyValuePair<string,int> item in D2)
    {
        D1[item.Key] = item.Value;
    }
    

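    It seems simple enough, maybe too simple, so I wonder whether I'm missing something. This is what I use in some code where I know there are no duplicate keys. I'm still testing, though, so if I'm overlooking something I'd rather know now than find out later.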

  • 221

    Here is a helper function I use:

    using System.Collections.Generic;
    namespace HelperMethods
    {
        public static class MergeDictionaries
        {
            public static void Merge<TKey, TValue>(this IDictionary<TKey, TValue> first, IDictionary<TKey, TValue> second)
            {
                if (second == null || first == null) return;
                foreach (var item in second) 
                    if (!first.ContainsKey(item.Key)) 
                        first.Add(item.Key, item.Value);
            }
        }
    }
    
  • 8

    How about adding a params overload?

    Also, you should type them as IDictionary for maximum flexibility.

    public static IDictionary<TKey, TValue> Merge<TKey, TValue>(IEnumerable<IDictionary<TKey, TValue>> dictionaries)
    {
        // ...
    }
    
    public static IDictionary<TKey, TValue> Merge<TKey, TValue>(params IDictionary<TKey, TValue>[] dictionaries)
    {
        return Merge((IEnumerable<IDictionary<TKey, TValue>>) dictionaries);
    }
    
  • 17

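    Considering the performance of dictionary key lookups and deletes (they are hash operations), and considering that the question asks for the best way, I think the approach below is perfectly valid, and the others are a bit over-complicated, IMHO.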

    public static void MergeOverwrite<T1, T2>(this IDictionary<T1, T2> dictionary, IDictionary<T1, T2> newElements)
        {
            if (newElements == null) return;
    
            foreach (var e in newElements)
            {
                dictionary.Remove(e.Key); // to keep existing values instead of overwriting, skip keys where dictionary.ContainsKey(e.Key)
                dictionary.Add(e);
            }
        }
    

    Or, if you're working in a multithreaded application and your dictionary needs to be thread-safe anyway, you should do this:

    public static void MergeOverwrite<T1, T2>(this ConcurrentDictionary<T1, T2> dictionary, IDictionary<T1, T2> newElements)
        {
            if (newElements == null || newElements.Count == 0) return;
    
            foreach (var ne in newElements)
            {
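                // The update delegate receives the existing value; return ne.Value so the new value overwrites it.
                dictionary.AddOrUpdate(ne.Key, ne.Value, (key, existing) => ne.Value);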
            }
        }
    

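    You can then wrap this to make it handle an enumeration of dictionaries. Either way, you're looking at roughly ~O(3n) (all conditions being perfect), since .Add() does an additional, unnecessary but practically free, Contains() behind the scenes. I don't think it gets much better.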

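    If you want to limit extra operations on large collections, sum the Count of each dictionary you're about to merge and set the capacity of the target dictionary to that value, which avoids the cost of resizing later. So the end product is something like this: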

    public static IDictionary<T1, T2> MergeAllOverwrite<T1, T2>(IList<IDictionary<T1, T2>> allDictionaries)
        {
            var initSize = allDictionaries.Sum(d => d.Count);
            var resultDictionary = new Dictionary<T1, T2>(initSize);
            // IList<T> has no ForEach method (only List<T> does), so use a plain loop.
            foreach (var d in allDictionaries) resultDictionary.MergeOverwrite(d);
            return resultDictionary;
        }
    

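    Note that this method takes an IList<T>, mostly because if you take an IEnumerable<T> you open yourself up to multiple enumerations of the same set, which can be very costly if the collection of dictionaries comes from a deferred LINQ statement.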
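
    If accepting an IEnumerable<T> is still preferred, one option is to materialize it once at the top of the method. The following is only a sketch under that assumption; the class name MergeAllSketch and the inlined copy loop are illustrative and not part of the answer.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;

    public static class MergeAllSketch
    {
        // Hypothetical variant that accepts IEnumerable but enumerates it only once.
        public static Dictionary<TKey, TValue> MergeAllOverwrite<TKey, TValue>(
            IEnumerable<IDictionary<TKey, TValue>> dictionaries)
        {
            // Materialize first, so a deferred LINQ query is evaluated a single time.
            var list = dictionaries.ToList();

            // Pre-size the result to the total entry count to avoid resizing.
            var result = new Dictionary<TKey, TValue>(list.Sum(d => d.Count));

            foreach (var dictionary in list)
                foreach (var pair in dictionary)
                    result[pair.Key] = pair.Value; // later dictionaries overwrite earlier ones

            return result;
        }
    }
    ```

    The trade-off is one extra list allocation in exchange for a signature that accepts any sequence of dictionaries.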

  • 6

    Based on the answers above, but adding a Func parameter to let the caller handle duplicates:

    public static Dictionary<TKey, TValue> Merge<TKey, TValue>(this IEnumerable<Dictionary<TKey, TValue>> dicts, 
                                                               Func<IGrouping<TKey, TValue>, TValue> resolveDuplicates)
    {
        if (resolveDuplicates == null)
            resolveDuplicates = new Func<IGrouping<TKey, TValue>, TValue>(group => group.First());
    
        return dicts.SelectMany<Dictionary<TKey, TValue>, KeyValuePair<TKey, TValue>>(dict => dict)
                    .ToLookup(pair => pair.Key, pair => pair.Value)
                    .ToDictionary(group => group.Key, group => resolveDuplicates(group));
    }
    
  • 266

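    The party's pretty much dead by now, but here's an "improved" version of user166390's answer that made its way into my extension library. Apart from some details, I added a delegate to calculate the merged value.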

    /// <summary>
    /// Merges a dictionary against an array of other dictionaries.
    /// </summary>
    /// <typeparam name="TResult">The type of the resulting dictionary.</typeparam>
    /// <typeparam name="TKey">The type of the key in the resulting dictionary.</typeparam>
    /// <typeparam name="TValue">The type of the value in the resulting dictionary.</typeparam>
    /// <param name="source">The source dictionary.</param>
    /// <param name="mergeBehavior">A delegate returning the merged value. (Parameters in order: The current key, The current value, The previous value)</param>
    /// <param name="mergers">Dictionaries to merge against.</param>
    /// <returns>The merged dictionary.</returns>
    public static TResult MergeLeft<TResult, TKey, TValue>(
        this TResult source,
        Func<TKey, TValue, TValue, TValue> mergeBehavior,
        params IDictionary<TKey, TValue>[] mergers)
        where TResult : IDictionary<TKey, TValue>, new()
    {
        var result = new TResult();
        var sources = new List<IDictionary<TKey, TValue>> { source }
            .Concat(mergers);
    
        foreach (var kv in sources.SelectMany(src => src))
        {
            TValue previousValue;
            result.TryGetValue(kv.Key, out previousValue);
            result[kv.Key] = mergeBehavior(kv.Key, kv.Value, previousValue);
        }
    
        return result;
    }
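
    To show how the mergeBehavior delegate might be used, here is a small sketch that sums the values of colliding keys. MergeLeft is restated in condensed form only so that the example compiles on its own; the class name MergeLeftSketch, the Demo method and the sample data are made up for illustration.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class MergeLeftSketch
    {
        // Condensed restatement of the MergeLeft method above.
        public static TResult MergeLeft<TResult, TKey, TValue>(
            this TResult source,
            Func<TKey, TValue, TValue, TValue> mergeBehavior,
            params IDictionary<TKey, TValue>[] mergers)
            where TResult : IDictionary<TKey, TValue>, new()
        {
            var result = new TResult();
            var sources = new[] { (IDictionary<TKey, TValue>)source }.Concat(mergers);
            foreach (var kv in sources.SelectMany(src => src))
            {
                // previous stays default(TValue) when the key has not been seen yet.
                TValue previous;
                result.TryGetValue(kv.Key, out previous);
                result[kv.Key] = mergeBehavior(kv.Key, kv.Value, previous);
            }
            return result;
        }

        // Illustrative usage with made-up data: add up the counts for keys present in both dictionaries.
        public static Dictionary<string, int> Demo()
        {
            var monday = new Dictionary<string, int> { { "apples", 2 }, { "pears", 1 } };
            var tuesday = new Dictionary<string, int> { { "apples", 3 } };
            // int's default of 0 makes the first sighting of a key work with plain addition.
            return monday.MergeLeft((key, current, previous) => previous + current, tuesday);
        }
    }
    ```

    Note that the delegate is also called for keys seen for the first time, with the previous value set to default(TValue), so a delegate for reference types may need to handle null.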
    
  • 3

    @Tim: This should be a comment, but comments don't allow code formatting.

    Dictionary<string, string> t1 = new Dictionary<string, string>();
    t1.Add("a", "aaa");
    Dictionary<string, string> t2 = new Dictionary<string, string>();
    t2.Add("b", "bee");
    Dictionary<string, string> t3 = new Dictionary<string, string>();
    t3.Add("c", "cee");
    t3.Add("d", "dee");
    t3.Add("b", "bee");
    Dictionary<string, string> merged = t1.MergeLeft(t2, t2, t3);
    

    Note: I applied @ANeves's modification to @Andrew Orsich's solution, so MergeLeft now looks like this:

    public static Dictionary<K, V> MergeLeft<K, V>(this Dictionary<K, V> me, params IDictionary<K, V>[] others)
        {
            var newMap = new Dictionary<K, V>(me, me.Comparer);
            foreach (IDictionary<K, V> src in
                (new List<IDictionary<K, V>> { me }).Concat(others))
            {
                // ^-- echk. Not quite there type-system.
                foreach (KeyValuePair<K, V> p in src)
                {
                    newMap[p.Key] = p.Value;
                }
            }
            return newMap;
        }
    
  • 91

    I know this is an old question, but since we now have LINQ, you can do it in a single line like this:

    Dictionary<T1,T2> merged;
    Dictionary<T1,T2> mergee;
    mergee.ToList().ForEach(kvp => merged.Add(kvp.Key, kvp.Value));
    

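    Or, to overwrite duplicate keys instead of throwing:

    // merged.Append(kvp) would not work here: Enumerable.Append returns a new sequence and leaves merged unchanged.
    mergee.ToList().ForEach(kvp => merged[kvp.Key] = kvp.Value);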
    
  • 0

    I got scared seeing the complex answers, being new to C#.

    Here are some simple answers.
    They merge dictionaries d1, d2 and so on, and handle any overlapping keys ("b" in the examples below):

    Example 1

    {
        // 2 dictionaries,  "b" key is common with different values
    
        var d1 = new Dictionary<string, int>() { { "a", 10 }, { "b", 21 } };
        var d2 = new Dictionary<string, int>() { { "c", 30 }, { "b", 22 } };
    
        var result1 = d1.Concat(d2).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.First().Value);
        // result1 is  a=10, b=21, c=30    That is, took the "b" value of the first dictionary
    
        var result2 = d1.Concat(d2).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.Last().Value);
        // result2 is  a=10, b=22, c=30    That is, took the "b" value of the last dictionary
    }
    

    Example 2

    {
        // 3 dictionaries,  "b" key is common with different values
    
        var d1 = new Dictionary<string, int>() { { "a", 10 }, { "b", 21 } };
        var d2 = new Dictionary<string, int>() { { "c", 30 }, { "b", 22 } };
        var d3 = new Dictionary<string, int>() { { "d", 40 }, { "b", 23 } };
    
        var result1 = d1.Concat(d2).Concat(d3).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.First().Value);
        // result1 is  a=10, b=21, c=30, d=40    That is, took the "b" value of the first dictionary
    
        var result2 = d1.Concat(d2).Concat(d3).GroupBy(ele => ele.Key).ToDictionary(ele => ele.Key, ele => ele.Last().Value);
        // result2 is  a=10, b=23, c=30, d=40    That is, took the "b" value of the last dictionary
    }
    

    For more complex scenarios, see the other answers.
    Hope that helps.

  • 13

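    Merge using an extension method. It doesn't throw an exception when there are duplicate keys; instead, the values for those keys are replaced by the values from the second dictionary.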

    internal static class DictionaryExtensions
    {
        public static Dictionary<T1, T2> Merge<T1, T2>(this Dictionary<T1, T2> first, Dictionary<T1, T2> second)
        {
            if (first == null) throw new ArgumentNullException("first");
            if (second == null) throw new ArgumentNullException("second");
    
            var merged = new Dictionary<T1, T2>();
            first.ToList().ForEach(kv => merged[kv.Key] = kv.Value);
            second.ToList().ForEach(kv => merged[kv.Key] = kv.Value);
    
            return merged;
        }
    }
    

    Usage:

    Dictionary<string, string> merged = first.Merge(second);
    
  • 20
    using System.Collections.Generic;
    using System.Linq;
    
    public static class DictionaryExtensions
    {
        public enum MergeKind { SkipDuplicates, OverwriteDuplicates }
        public static void Merge<K, V>(this IDictionary<K, V> target, IDictionary<K, V> source, MergeKind kind = MergeKind.SkipDuplicates) =>
            source.ToList().ForEach(_ => { if (kind == MergeKind.OverwriteDuplicates || !target.ContainsKey(_.Key)) target[_.Key] = _.Value; });
    }
    

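    You can either skip/ignore duplicates (the default) or overwrite them, and Bob's your uncle, provided you are not overly fussy about LINQ performance and prefer concise, maintainable code, as I do. In that case you could also remove the default MergeKind.SkipDuplicates to force the caller to choose, so that the developer is aware of what the result will be.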

  • 1

    Merge using an EqualityComparer that maps items to a different value/type for comparison. Here we map from KeyValuePair (the item type when enumerating a dictionary) to Key.

    public class MappedEqualityComparer<T,U> : EqualityComparer<T>
    {
        Func<T,U> _map;
    
        public MappedEqualityComparer(Func<T,U> map)
        {
            _map = map;
        }
    
        public override bool Equals(T x, T y)
        {
            return EqualityComparer<U>.Default.Equals(_map(x), _map(y));
        }
    
        public override int GetHashCode(T obj)
        {
            return _map(obj).GetHashCode();
        }
    }
    

    Usage:

    // if dictA and dictB are of type Dictionary<int,string>
    var dict = dictA.Concat(dictB)
                    .Distinct(new MappedEqualityComparer<KeyValuePair<int,string>,int>(item => item.Key))
                    .ToDictionary(item => item.Key, item=> item.Value);
    
