
C# string comparison: 'ö', 'oe', 'o' [duplicate]


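Possible duplicate: How to identify similar words with different spellings

I am trying to get true when comparing these three strings: 'voest', 'vost' and 'vöst' (German culture), because they are the same word. (Strictly speaking, only 'oe' and 'ö' are equivalent, but a CI database collation, for example, treats all three as equal, which is what I want, because 'vost' is a misspelling of 'voest'.)

No matter which arguments I pass, string.Compare(..) / string.Equals(..) returns false.

How can I make string.Compare() / Equals(..) return true?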

2 Answers

  • 3

    You can try the IgnoreNonSpace option in the comparison. It will not solve voest vs. vost, but it does help with vost vs. vöst.

    int a = new CultureInfo("de-DE").CompareInfo.Compare("vost", "vöst", CompareOptions.IgnoreNonSpace);
    // a = 0; strings are equal.
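
    To cover 'voest' as well, one option is to fold the ASCII digraphs before doing the IgnoreNonSpace comparison. The following is only a sketch under that assumption: GermanLooseCompare, FoldDigraphs and LooselyEqual are hypothetical names, not framework APIs. It also uses the string.Compare overload that takes a CultureInfo and CompareOptions, which is the form the question asks about.

    ```csharp
    using System;
    using System.Globalization;

    static class GermanLooseCompare
    {
        static readonly CultureInfo German = CultureInfo.GetCultureInfo("de-DE");

        // Hypothetical helper: folds the digraphs ae/oe/ue to the bare vowel.
        // string.Replace(string, string) performs an ordinal replacement.
        static string FoldDigraphs(string s) =>
            s.Replace("ae", "a").Replace("oe", "o").Replace("ue", "u")
             .Replace("Ae", "A").Replace("Oe", "O").Replace("Ue", "U");

        // Treats ö, oe and o as the same letter: digraphs are folded first,
        // then IgnoreNonSpace makes the comparison ignore the diaeresis.
        public static bool LooselyEqual(string x, string y) =>
            string.Compare(FoldDigraphs(x), FoldDigraphs(y), German,
                           CompareOptions.IgnoreNonSpace) == 0;
    }
    ```

    The folding is deliberately crude: it also rewrites words in which 'ue' or 'oe' is not a transliterated umlaut (for example 'aktuell' or 'Poet'), so it only suits cases where such false matches are acceptable.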
    
  • 0

    You can create a custom comparer that ignores umlauts:

    using System.Collections.Generic;
    using System.Linq;

    class IgnoreUmlautComparer : IEqualityComparer<string>
    {
        // Maps each umlaut to its base vowel.
        static readonly Dictionary<char, char> umlautReplacer = new Dictionary<char, char>()
        {
            {'ä','a'}, {'Ä','A'},
            {'ö','o'}, {'Ö','O'},
            {'ü','u'}, {'Ü','U'},
        };
        // Maps the ASCII transliterations (ae, oe, ue) to the same base vowel.
        static readonly Dictionary<string, string> pseudoUmlautReplacer = new Dictionary<string, string>()
        {
            {"ae","a"}, {"Ae","A"},
            {"oe","o"}, {"Oe","O"},
            {"ue","u"}, {"Ue","U"},
        };

        // Returns s with umlauts and their transliterations folded to the base vowel.
        private static string ignoreUmlaut(string s)
        {
            string replaced = new string(s.Select(c => umlautReplacer.TryGetValue(c, out char value) ? value : c).ToArray());
            foreach (var kv in pseudoUmlautReplacer)
                replaced = replaced.Replace(kv.Key, kv.Value);
            return replaced;
        }

        public bool Equals(string x, string y)
        {
            // Two nulls are equal; a null never equals a non-null string.
            if (x == null || y == null)
                return x == y;
            return ignoreUmlaut(x) == ignoreUmlaut(y);
        }

        public int GetHashCode(string obj)
        {
            // Hash the folded form so that equal strings get equal hash codes.
            return ignoreUmlaut(obj).GetHashCode();
        }
    }
    

    Now you can use this comparer with Enumerable methods such as Distinct:

    string[] allStrings = new[]{"voest","vost","vöst"};
    bool allEqual = allStrings.Distinct(new IgnoreUmlautComparer()).Count() == 1;
    // --> true
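
    The lookup table above only covers the six German umlauts. A different technique, shown here as a sketch and not as part of the original answer, is to strip every combining mark through Unicode normalization. DiacriticFolding and StripDiacritics are hypothetical names; the framework calls used are string.Normalize and CharUnicodeInfo.GetUnicodeCategory.

    ```csharp
    using System;
    using System.Globalization;
    using System.Linq;
    using System.Text;

    static class DiacriticFolding
    {
        // Hypothetical helper: decomposes each character (FormD), drops the
        // combining marks, and recomposes whatever is left (FormC).
        public static string StripDiacritics(string s)
        {
            var kept = s.Normalize(NormalizationForm.FormD)
                        .Where(c => CharUnicodeInfo.GetUnicodeCategory(c)
                                    != UnicodeCategory.NonSpacingMark);
            return new string(kept.ToArray()).Normalize(NormalizationForm.FormC);
        }
    }
    ```

    This maps 'vöst' to 'vost' but leaves 'voest' untouched, so the 'oe' to 'o' replacement step is still needed if the transliterated spelling should match too.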
    
