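Possible duplicate: How to recognize similar words with different spellings
I'm trying to get true when comparing these three strings: 'voest', 'vost' and 'vöst' (German culture), because they are the same word. (Strictly speaking, only oe and ö are equivalent, but for a case-insensitive DB collation, for example, treating all three as equal is correct, because 'vost' is a misspelling of 'voest'.)
No matter which arguments I pass to the method, string.Compare(..) / string.Equals(..) always reports the strings as different.
How can I make string.Compare() / string.Equals(..) treat them as equal?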
You can try the IgnoreNonSpace option in the comparison. It won't solve voest vs. vost, but it will help with vost vs. vöst.
int a = new CultureInfo("de-DE").CompareInfo.Compare("vost", "vöst", CompareOptions.IgnoreNonSpace); // a = 0; strings are equal.
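Since IgnoreNonSpace alone does not cover 'voest', one possible way to handle all three spellings is to rewrite the digraphs as umlauts first and only then compare with IgnoreNonSpace. The following is a minimal sketch under that assumption, not a definitive implementation: the class GermanCompare and the helpers ExpandDigraphs and LooseEquals are hypothetical names introduced here for illustration. Note that it treats every "ae", "oe" and "ue" as an umlaut, which is wrong for words such as "Poesie" or "aktuell".

```csharp
using System;
using System.Globalization;

static class GermanCompare
{
    // Hypothetical helper: rewrites the digraphs ae/oe/ue as umlauts.
    // Assumption: every such digraph really stands for an umlaut.
    static string ExpandDigraphs(string s) =>
        s.Replace("ae", "ä").Replace("oe", "ö").Replace("ue", "ü")
         .Replace("Ae", "Ä").Replace("Oe", "Ö").Replace("Ue", "Ü");

    // Hypothetical helper: compares two strings under the de-DE culture,
    // ignoring nonspacing marks (diacritics) after expanding the digraphs.
    public static bool LooseEquals(string x, string y)
    {
        CompareInfo ci = CultureInfo.GetCultureInfo("de-DE").CompareInfo;
        return ci.Compare(ExpandDigraphs(x), ExpandDigraphs(y),
                          CompareOptions.IgnoreNonSpace) == 0;
    }
}
```

With this sketch, GermanCompare.LooseEquals("voest", "vöst") compares "vöst" with "vöst", and GermanCompare.LooseEquals("voest", "vost") falls back to the same IgnoreNonSpace comparison shown above.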
You can create a custom comparer that ignores umlauts:
using System.Collections.Generic;
using System.Linq;

class IgnoreUmlautComparer : IEqualityComparer<string>
{
    // Maps each umlaut to its base vowel.
    Dictionary<char, char> umlautReplacer = new Dictionary<char, char>()
    {
        {'ä','a'}, {'Ä','A'}, {'ö','o'}, {'Ö','O'}, {'ü','u'}, {'Ü','U'},
    };

    // Maps each two-letter umlaut transcription to the same base vowel.
    Dictionary<string, string> pseudoUmlautReplacer = new Dictionary<string, string>()
    {
        {"ae","a"}, {"Ae","A"}, {"oe","o"}, {"Oe","O"}, {"ue","u"}, {"Ue","U"},
    };

    // Returns s with umlauts and their transcriptions reduced to base vowels.
    private string ignoreUmlaut(string s)
    {
        char value;
        string replaced = new string(s.Select(c => umlautReplacer.TryGetValue(c, out value) ? value : c).ToArray());
        foreach (var kv in pseudoUmlautReplacer)
            replaced = replaced.Replace(kv.Key, kv.Value);
        return replaced;
    }

    public bool Equals(string x, string y)
    {
        return ignoreUmlaut(x) == ignoreUmlaut(y);
    }

    // Hash the normalized form so that equal strings get equal hash codes.
    public int GetHashCode(string obj)
    {
        return ignoreUmlaut(obj).GetHashCode();
    }
}
Now you can use this comparer with Enumerable methods such as Distinct:
string[] allStrings = new[] { "voest", "vost", "vöst" };
bool allEqual = allStrings.Distinct(new IgnoreUmlautComparer()).Count() == 1; // --> true