Java: two strings look identical, but equals returns false?

Reposted from the original article, 2021-03-04 09:51, JAVA

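Preface

Sometimes while writing code you run into a baffling problem: two strings print exactly the same via toString(), yet equals() insists on returning false.

The problem

Let's go straight to the code: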

public static void main(String[] args) {
   String s1 = "hello‌world‌";
   String s2 = "helloworld";
   System.out.println(s1.equals(s2));
}

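This code could hardly be simpler; surely it prints true. Let's actually run it anyway.

It prints false. What is going on? Were all my years of Java wasted?

Conclusion

The explanation is simple: string s1 contains non-printable characters. You can paste both strings into QQ/TIM to see this, or press F12 and inspect the element in the browser. Alternatively, we can keep investigating in Java code: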

public static void main(String[] args) {
    String s1 = "hello‌world‌";
    String s2 = "helloworld";
    System.out.println(s1.equals(s2));
    System.out.println(Arrays.toString(s1.getBytes()));
    System.out.println(Arrays.toString(s2.getBytes()));
}

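The byte array of s1 contains extra bytes that s2 lacks, which makes it clear why the two strings look identical through toString() while equals() returns false.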
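Another way to make the difference visible without leaving Java is to print each string with every character outside printable ASCII written as an escape. The sketch below is an illustration rather than part of the original post: the class name `InvisibleChars` and the method `escape` are hypothetical, and it assumes the hidden character is U+200C (zero-width non-joiner), written as an explicit escape so that it survives copy and paste.

```java
public class InvisibleChars {

    // Returns the string with every character outside printable ASCII
    // replaced by a backslash-u escape, so hidden characters become visible.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x20 && c <= 0x7E) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the hidden character is U+200C (zero-width non-joiner).
        String s1 = "hello\u200Cworld\u200C";
        String s2 = "helloworld";
        System.out.println(s1.equals(s2));   // false
        System.out.println(escape(s1));      // shows the two hidden characters
        System.out.println(escape(s2));      // helloworld
        System.out.println(s1.length() + " vs " + s2.length()); // 12 vs 10
    }
}
```

Comparing lengths is often the quickest first check: strings that print alike but differ in length almost certainly contain hidden characters.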

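Where do invisible characters come from?

Here is one of the most common scenarios. On Windows, create a file named test.txt with the built-in Notepad and type something into it, say "helloworld". When saving (Save As), choose UTF-8 as the encoding.

A file saved this way gets a marker prepended by Notepad, called the BOM (byte-order mark). In UTF-8 it is three bytes, EF BB BF, and it flags the file as UTF-8 encoded. When a program reads the file, the BOM is read into memory along with the content: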

public static void compareContent() {
        InputStream is = null;
        try {
            is = new FileInputStream(new File("D:\\test.txt"));
            byte[] buff = new byte[16];
            int nRead = 0;
            StringBuilder sb = new StringBuilder();
            while ((nRead = is.read(buff)) != -1) {
                sb.append(new String(buff, 0, nRead));
            }
            // String read from the UTF-8 file
            String fileStr = sb.toString();
            // String defined in the program; contains no non-printable characters
            String localStr = "helloworld";
            System.out.println(fileStr.toString());
            System.out.println(localStr.toString());
            System.out.println("String from file: " + Arrays.toString(fileStr.getBytes()));
            System.out.println("Local string: " + Arrays.toString(localStr.getBytes()));
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (is != null) {
                try {
                    is.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }

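Running it shows that the byte array of the string read from the file starts with three extra bytes, the BOM EF BB BF (printed as -17, -69, -65 because Java bytes are signed).

Apart from those first three bytes, the rest of the content is exactly the same.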
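When decoded as UTF-8, those three bytes become a single character, U+FEFF, which is why the two strings print alike yet differ in length by one. A minimal sketch of this, with the hypothetical class name `BomDemo` and the file content built in memory instead of being read from disk:

```java
import java.nio.charset.StandardCharsets;

public class BomDemo {

    // Bytes of a UTF-8 file with BOM that contains "helloworld":
    // EF BB BF followed by the ASCII text.
    public static byte[] fileBytesWithBom() {
        byte[] text = "helloworld".getBytes(StandardCharsets.UTF_8);
        byte[] all = new byte[text.length + 3];
        all[0] = (byte) 0xEF;
        all[1] = (byte) 0xBB;
        all[2] = (byte) 0xBF;
        System.arraycopy(text, 0, all, 3, text.length);
        return all;
    }

    public static void main(String[] args) {
        // Java's UTF-8 decoder keeps the BOM as a leading U+FEFF character.
        String fileStr = new String(fileBytesWithBom(), StandardCharsets.UTF_8);
        String localStr = "helloworld";
        System.out.println(fileStr.length());                       // 11
        System.out.println(localStr.length());                      // 10
        System.out.println(fileStr.charAt(0) == 0xFEFF);            // true
        System.out.println(fileStr.equals(localStr));               // false
        System.out.println(fileStr.substring(1).equals(localStr));  // true
    }
}
```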

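Solutions

If the file really must be saved as UTF-8, there are a few options:

Save the file without the BOM (Notepad++ supports the "UTF-8 without BOM" encoding)
Handle the BOM in the program; compatibility code follows: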

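public static String deleteUTF8Bom(String fileStr) {
    // UTF-8 BOM bytes: EF BB BF
    byte[] UTF8_BOM_BYTES = new byte[]{(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
    // Uses the platform default charset, as does the reading code above
    byte[] bytes = fileStr.getBytes();
    // Guard against input shorter than the BOM to avoid ArrayIndexOutOfBoundsException
    if (bytes.length >= 3
             && bytes[0] == UTF8_BOM_BYTES[0]
             && bytes[1] == UTF8_BOM_BYTES[1]
             && bytes[2] == UTF8_BOM_BYTES[2]) {
         return new String(bytes, 3, bytes.length - 3);
     }
    return fileStr;
}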
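An alternative that avoids the round trip through bytes is to check the decoded string for a leading U+FEFF. This is a sketch, not the original author's method: `BomStripper` and `stripBom` are hypothetical names, and it assumes the file was decoded as UTF-8, so that the BOM arrived as the single character U+FEFF.

```java
public class BomStripper {

    // The BOM as it appears after UTF-8 decoding: one char, U+FEFF.
    private static final char BOM = (char) 0xFEFF;

    // Returns the string without a leading BOM character, if one is present.
    public static String stripBom(String s) {
        if (s != null && !s.isEmpty() && s.charAt(0) == BOM) {
            return s.substring(1);
        }
        return s;
    }

    public static void main(String[] args) {
        String fileStr = BOM + "helloworld";
        System.out.println(fileStr.equals("helloworld"));            // false
        System.out.println(stripBom(fileStr).equals("helloworld"));  // true
        System.out.println(stripBom("").isEmpty());                  // true
    }
}
```

Working on characters instead of bytes means the check does not depend on the platform default charset at the point where the BOM is removed.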

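Summary

This problem is not easy to spot, yet it is really basic material. It also shows that seeing is not always believing: the string you see is not necessarily the string that is actually there.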

————————————————
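Copyright notice: this is an original article by CSDN blogger "Sicimike", released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/Baisitao_/article/details/92667122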
