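String comparison is a common operation. Strings are generally compared for one of two reasons:
- To test for equality
- To sort strings
The String API provides the following methods for testing equality or determining sort order: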
public override bool Equals(object obj);
public bool Equals(string value);
public static bool Equals(string a, string b);
public bool Equals(string value, StringComparison comparisonType);
public static bool Equals(string a, string b, StringComparison comparisonType);
public static int Compare(string strA, string strB);
public static int Compare(string strA, string strB, bool ignoreCase);
public static int Compare(string strA, string strB, StringComparison comparisonType);
public static int Compare(string strA, string strB, bool ignoreCase, CultureInfo culture);
public static int Compare(string strA, string strB, CultureInfo culture, CompareOptions options);
public static int Compare(string strA, int indexA, string strB, int indexB, int length);
public static int Compare(string strA, int indexA, string strB, int indexB, int length, bool ignoreCase);
public static int Compare(string strA, int indexA, string strB, int indexB, int length, StringComparison comparisonType);
public static int Compare(string strA, int indexA, string strB, int indexB, int length, bool ignoreCase, CultureInfo culture);
public static int Compare(string strA, int indexA, string strB, int indexB, int length, CultureInfo culture, CompareOptions options);
public static int CompareOrdinal(string strA, string strB);
public static int CompareOrdinal(string strA, int indexA, string strB, int indexB, int length);
public int CompareTo(object value);
public int CompareTo(string strB);
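Most of the methods above take a parameter of the StringComparison enumeration type. According to MSDN, its members are CurrentCulture, CurrentCultureIgnoreCase, InvariantCulture, InvariantCultureIgnoreCase, Ordinal, and OrdinalIgnoreCase.
Here is a short program that tests Compare(string strA, string strB, StringComparison comparisonType) with StringComparison.CurrentCulture and with StringComparison.Ordinal: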
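using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        string strA = "asdfadsfasdfew我啊地方的asd";
        string strB = "adsfeaqfaead啊多發安德森efadsfa";
        const int iterations = 1000000;

        // Culture-sensitive comparison using the current culture.
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            string.Compare(strA, strB, StringComparison.CurrentCulture);
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);

        // Ordinal comparison through the StringComparison overload.
        // Restart() zeroes the elapsed time and starts timing again;
        // Reset() alone would leave the stopwatch stopped.
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            string.Compare(strA, strB, StringComparison.Ordinal);
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);

        // Ordinal comparison through the dedicated CompareOrdinal method.
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            string.CompareOrdinal(strA, strB);
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}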
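The results are as follows:
The difference is unmistakable. StringComparison.CurrentCulture explicitly applies the rules of the current culture, whereas StringComparison.Ordinal ignores culture entirely, which makes it the fastest way to compare strings.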
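To make the semantic difference concrete, here is a minimal sketch. The class name OrdinalVersusCulture, its helper methods, and the sample strings are illustrative and not part of the original test. An ordinal comparison looks only at the numeric values of the UTF-16 code units, so lowercase "a" (0x61) sorts after uppercase "B" (0x42). A culture-sensitive comparison applies linguistic rules and normally places "a" first, although the exact result depends on the culture data available at run time.

```csharp
using System;

static class OrdinalVersusCulture
{
    // Reduce a comparison result to -1, 0, or 1; only the sign is meaningful.
    public static int OrdinalSign(string a, string b)
    {
        return Math.Sign(string.Compare(a, b, StringComparison.Ordinal));
    }

    public static int InvariantSign(string a, string b)
    {
        return Math.Sign(string.Compare(a, b, StringComparison.InvariantCulture));
    }

    static void Main()
    {
        // 'a' is 0x61 and 'B' is 0x42, so the ordinal result is positive.
        Console.WriteLine(OrdinalSign("a", "B"));

        // Linguistic rules normally order "a" before "B"; the outcome
        // depends on the culture data the runtime has available.
        Console.WriteLine(InvariantSign("a", "B"));
    }
}
```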
Here is the source code as shown by .NET Reflector:
public static int Compare(string strA, string strB, StringComparison comparisonType)
{
    if ((comparisonType < StringComparison.CurrentCulture) || (comparisonType > StringComparison.OrdinalIgnoreCase))
    {
        throw new ArgumentException(Environment.GetResourceString("NotSupported_StringComparison"), "comparisonType");
    }
    if (strA == strB)
    {
        return 0;
    }
    if (strA == null)
    {
        return -1;
    }
    if (strB == null)
    {
        return 1;
    }
    switch (comparisonType)
    {
        case StringComparison.CurrentCulture:
            return CultureInfo.CurrentCulture.CompareInfo.Compare(strA, strB, CompareOptions.None);
        case StringComparison.CurrentCultureIgnoreCase:
            return CultureInfo.CurrentCulture.CompareInfo.Compare(strA, strB, CompareOptions.IgnoreCase);
        case StringComparison.InvariantCulture:
            return CultureInfo.InvariantCulture.CompareInfo.Compare(strA, strB, CompareOptions.None);
        case StringComparison.InvariantCultureIgnoreCase:
            return CultureInfo.InvariantCulture.CompareInfo.Compare(strA, strB, CompareOptions.IgnoreCase);
        case StringComparison.Ordinal:
            return CompareOrdinalHelper(strA, strB);
        case StringComparison.OrdinalIgnoreCase:
            if (!strA.IsAscii() || !strB.IsAscii())
            {
                return TextInfo.CompareOrdinalIgnoreCase(strA, strB);
            }
            return CompareOrdinalIgnoreCaseHelper(strA, strB);
    }
    throw new NotSupportedException(Environment.GetResourceString("NotSupported_StringComparison"));
}
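The test above also measured String.CompareOrdinal, which is just as fast. Its source turns out to be the same as the StringComparison.Ordinal branch of Compare, so the method is simply a special case of Compare: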
public static int CompareOrdinal(string strA, string strB)
{
    if (strA == strB)
    {
        return 0;
    }
    if (strA == null)
    {
        return -1;
    }
    if (strB == null)
    {
        return 1;
    }
    return CompareOrdinalHelper(strA, strB);
}
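As a quick check of that equivalence, the following sketch compares the sign returned by both calls, including the null cases handled at the top of each method. The class name CompareOrdinalDemo and the helper Agree are hypothetical names chosen for illustration.

```csharp
using System;

static class CompareOrdinalDemo
{
    // True when Compare(..., Ordinal) and CompareOrdinal order the pair the same way.
    public static bool Agree(string a, string b)
    {
        int viaCompare = Math.Sign(string.Compare(a, b, StringComparison.Ordinal));
        int viaOrdinal = Math.Sign(string.CompareOrdinal(a, b));
        return viaCompare == viaOrdinal;
    }

    static void Main()
    {
        Console.WriteLine(Agree("apple", "Apple")); // both calls yield the same sign
        Console.WriteLine(Agree(null, "apple"));    // null sorts before any string
        Console.WriteLine(Agree("apple", null));    // and any string sorts after null
        Console.WriteLine(Agree(null, null));       // two nulls compare as equal
    }
}
```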
Next, let's look at the source of String.CompareTo():
[TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
public int CompareTo(string strB)
{
    if (strB == null)
    {
        return 1;
    }
    return CultureInfo.CurrentCulture.CompareInfo.Compare(this, strB, CompareOptions.None);
}
This is the same as calling Compare with StringComparison.CurrentCulture.
StringComparer also implements a Compare() method for comparing strings. Here is its source:
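A small sketch of what this means in practice: because CompareTo always uses the current culture, it orders any pair of strings the same way as Compare with StringComparison.CurrentCulture. The class name CompareToDemo and the helper SameOrdering are made up for this illustration.

```csharp
using System;

static class CompareToDemo
{
    // True when CompareTo and Compare(..., CurrentCulture) give the same ordering.
    public static bool SameOrdering(string a, string b)
    {
        int viaCompareTo = Math.Sign(a.CompareTo(b));
        int viaCompare = Math.Sign(string.Compare(a, b, StringComparison.CurrentCulture));
        return viaCompareTo == viaCompare;
    }

    static void Main()
    {
        Console.WriteLine(SameOrdering("apple", "Apple"));
        Console.WriteLine(SameOrdering("apple", "banana"));
        Console.WriteLine(SameOrdering("apple", null)); // any string sorts after null
    }
}
```

Since the comparison rule is implicit in CompareTo, spelling it out with the StringComparison overload makes the intent easier to see at the call site.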
public int Compare(object x, object y)
{
    if (x == y)
    {
        return 0;
    }
    if (x == null)
    {
        return -1;
    }
    if (y == null)
    {
        return 1;
    }
    string str = x as string;
    if (str != null)
    {
        string str2 = y as string;
        if (str2 != null)
        {
            return this.Compare(str, str2);
        }
    }
    IComparable comparable = x as IComparable;
    if (comparable == null)
    {
        throw new ArgumentException(Environment.GetResourceString("Argument_ImplementIComparable"));
    }
    return comparable.CompareTo(y);
}
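StringComparer is mostly useful when a comparison rule has to be handed to another API, such as a sort routine or a dictionary. The sketch below uses the built-in StringComparer.Ordinal and StringComparer.OrdinalIgnoreCase instances; the class name StringComparerDemo, the helper SortOrdinal, and the sample data are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

static class StringComparerDemo
{
    // Return a copy of the input sorted by UTF-16 code unit values.
    public static string[] SortOrdinal(string[] words)
    {
        string[] copy = (string[])words.Clone();
        Array.Sort(copy, StringComparer.Ordinal);
        return copy;
    }

    static void Main()
    {
        // Uppercase letters have smaller code values than lowercase ones.
        string[] sorted = SortOrdinal(new[] { "b", "A", "a", "B" });
        Console.WriteLine(string.Join(",", sorted)); // A,B,a,b

        // Keys are matched without regard to case and without culture rules.
        var ports = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        ports["HTTP"] = 80;
        Console.WriteLine(ports.ContainsKey("http")); // True
    }
}
```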
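If a program uses strings only for internal purposes, such as path names, file names, URLs, environment variables, reflection, or XML tags, those strings stay inside the program and are never shown to the user. In these cases, use StringComparison.Ordinal or the String.CompareOrdinal() method.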
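For instance, checks on file extensions or URL schemes can be written as follows. This is only a sketch: the class InternalIdentifiers and the helpers IsXmlFile and IsSecureUrl are hypothetical, and OrdinalIgnoreCase is chosen here on the assumption that these identifiers should match regardless of case.

```csharp
using System;

static class InternalIdentifiers
{
    // File extensions are program-level identifiers, not user-facing text.
    public static bool IsXmlFile(string fileName)
    {
        return fileName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase);
    }

    // URL schemes are compared case-insensitively, without culture rules.
    public static bool IsSecureUrl(string url)
    {
        return url.StartsWith("https://", StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        Console.WriteLine(IsXmlFile("REPORT.XML"));             // True
        Console.WriteLine(IsXmlFile("notes.txt"));              // False
        Console.WriteLine(IsSecureUrl("HTTPS://example.com/")); // True
    }
}
```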
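Summary and recommendations:
- Use overloads that explicitly specify the string comparison rules. In general, this means the overloads that take a StringComparison parameter.
- When comparing culture-agnostic strings, use StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase as the default; this also improves performance.
- When displaying results to the user, use string operations based on StringComparison.CurrentCulture.
- Use an overload of String.Equals to test whether two strings are equal.
- Do not call String.Compare or CompareTo and check for a return value of 0 to decide whether two strings are equal. These methods are meant for ordering strings, not for testing equality.
- When normalizing strings for comparison, use String.ToUpperInvariant rather than ToLowerInvariant, because Microsoft has optimized the code that performs uppercase comparisons. ToUpper and ToLower are avoided because they are culture-sensitive.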
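The equality and normalization recommendations above can be sketched as follows. The class name EqualityGuidelines, the helpers SameKey and Normalize, and the sample values are illustrative assumptions.

```csharp
using System;

static class EqualityGuidelines
{
    // Preferred: state the comparison rule explicitly and use Equals for equality.
    public static bool SameKey(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    // Culture-independent normalization, for example before using a string as a key.
    public static string Normalize(string s)
    {
        return s.ToUpperInvariant();
    }

    static void Main()
    {
        Console.WriteLine(SameKey("Content-Type", "content-type")); // True
        Console.WriteLine(SameKey("Content-Type", null));           // False, no exception
        Console.WriteLine(Normalize("Content-Type"));               // CONTENT-TYPE

        // Discouraged: this compiles, but it hides the intent and the
        // comparison rule (the current culture) behind a sort-order call.
        bool equalViaCompare = string.Compare("Content-Type", "content-type") == 0;
        Console.WriteLine(equalViaCompare);
    }
}
```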