Java: two strings look identical, but equals returns false?

Reposted article (2021-03-04 09:51), category: Java

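Preface

When writing code you sometimes run into baffling problems: two strings print exactly the same via toString(), yet equals stubbornly returns false.

The problem

Let's go straight to the code: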

public static void main(String[] args) {
   String s1 = "hello‌world‌";
   String s2 = "helloworld";
   System.out.println(s1.equals(s2));
}

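This code could hardly be simpler, and it should obviously print true. Still, let's actually run it:

What is going on? It prints false. Have all my years of Java been for nothing?

Conclusion

The explanation is actually simple: the string s1 contains non-printable characters. You can see this by pasting the two strings into QQ/TIM, or by pressing F12 and inspecting the element in the browser. Alternatively, we can keep investigating in Java code: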

public static void main(String[] args) {
    String s1 = "hello‌world‌";
    String s2 = "helloworld";
    System.out.println(s1.equals(s2));
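    // Print the raw UTF-8 bytes so that invisible characters show up;
    // an explicit charset avoids depending on the platform default
    System.out.println(Arrays.toString(s1.getBytes(java.nio.charset.StandardCharsets.UTF_8)));
    System.out.println(Arrays.toString(s2.getBytes(java.nio.charset.StandardCharsets.UTF_8)));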
}

Now it should be clear why the two strings look the same when printed with toString(), yet equals returns false.
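Byte arrays show that something extra is present, but printing code points shows which character it is. The following is a minimal sketch, not part of the original post: it assumes the hidden character is a zero-width non-joiner (U+200C), written as an explicit escape so that it is visible in the source, and the class and method names are only illustrative.

```java
public class CodePointDump {

    // Render every code point of s as U+XXXX, separated by spaces
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the invisible character is U+200C (zero-width non-joiner)
        String s1 = "hello\u200Cworld\u200C";
        String s2 = "helloworld";
        System.out.println(s1.equals(s2));
        System.out.println(s1.length() + " vs " + s2.length());
        System.out.println(dump(s1));
        System.out.println(dump(s2));
    }
}
```

Comparing the two dumps side by side makes the extra code points stand out, even though both strings render identically on screen.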

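Where do invisible characters come from

Here is one of the most common scenarios: on Windows, create a new file test.txt with the built-in Notepad and write anything in it, say "helloworld". When saving (Save As), choose UTF-8 encoding.

For a file saved this way, Windows adds a character at the start of the file called the BOM (byte-order mark). In UTF-8 it takes three bytes, EF BB BF, and it marks the file as UTF-8 encoded. When a program reads the file, the BOM is read into memory along with the content: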

// Requires: import java.io.*; import java.nio.charset.StandardCharsets; import java.util.Arrays;
public static void compareContent() {
    // try-with-resources closes the stream even if reading fails
    try (InputStream is = new FileInputStream(new File("D:\\test.txt"))) {
        // Collect the raw bytes first, so that a multi-byte character is
        // never split across two reads before decoding
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buff = new byte[16];
        int nRead;
        while ((nRead = is.read(buff)) != -1) {
            out.write(buff, 0, nRead);
        }
        // String read from the UTF-8 file (the BOM is still in it)
        String fileStr = new String(out.toByteArray(), StandardCharsets.UTF_8);
        // String defined in the program; it contains no non-printable characters
        String localStr = "helloworld";
        System.out.println(fileStr);
        System.out.println(localStr);
        System.out.println("File string:  " + Arrays.toString(fileStr.getBytes(StandardCharsets.UTF_8)));
        System.out.println("Local string: " + Arrays.toString(localStr.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
        e.printStackTrace();
    }
}

The result of running it is as follows:

Apart from the first three bytes, the rest of the content is exactly the same.
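To reproduce the effect without Notepad or a fixed path such as D:\test.txt, the sketch below writes the BOM bytes followed by the text to a temporary file and reads it back. It is only an illustration under stated assumptions: the class name, the method name and the temp-file prefix are made up for this example, and the file is decoded explicitly as UTF-8.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class BomReadDemo {

    // Write the UTF-8 BOM followed by text to a temp file, then read it back as a String
    static String writeAndReadWithBom(String text) {
        try {
            Path tmp = Files.createTempFile("bom-demo", ".txt");
            try {
                byte[] bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
                byte[] body = text.getBytes(StandardCharsets.UTF_8);
                byte[] all = new byte[bom.length + body.length];
                System.arraycopy(bom, 0, all, 0, bom.length);
                System.arraycopy(body, 0, all, bom.length, body.length);
                Files.write(tmp, all);
                // Java's UTF-8 decoder keeps the BOM as the character U+FEFF
                return new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String fileStr = writeAndReadWithBom("helloworld");
        String localStr = "helloworld";
        System.out.println(fileStr.equals(localStr));          // false
        System.out.println((int) fileStr.charAt(0) == 0xFEFF); // true
        System.out.println(Arrays.toString(fileStr.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The three BOM bytes decode to the single character U+FEFF, so the string read from the file is one char longer than the literal, which is why equals fails.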

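Solutions

If the file really has to be saved in UTF-8 encoding, there are the following options:

Remove the BOM when saving (Notepad++ supports encoding as "UTF-8 without BOM")
Make the program tolerate it; the compatibility code is as follows: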

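// Requires: import java.nio.charset.StandardCharsets;
public static String deleteUTF8Bom(String fileStr) {
    // The UTF-8 BOM is the three bytes EF BB BF
    byte[] UTF8_BOM_BYTES = new byte[]{(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
    // Use an explicit charset instead of the platform default
    byte[] bytes = fileStr.getBytes(StandardCharsets.UTF_8);
    // Guard against strings shorter than the BOM itself
    if (bytes.length >= 3
            && bytes[0] == UTF8_BOM_BYTES[0]
            && bytes[1] == UTF8_BOM_BYTES[1]
            && bytes[2] == UTF8_BOM_BYTES[2]) {
        return new String(bytes, 3, bytes.length - 3, StandardCharsets.UTF_8);
    }
    return fileStr;
}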
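If the file content has already been decoded as UTF-8, the BOM shows up in the string as the single character U+FEFF, so the check can also be done at the character level without converting back to bytes. This is a sketch of an alternative, not the original author's method; stripLeadingBom and the class name are hypothetical.

```java
public class BomStrip {

    // U+FEFF is what the UTF-8 bytes EF BB BF decode to
    private static final char BOM = '\uFEFF';

    // Hypothetical helper: drop one leading BOM character, if present
    static String stripLeadingBom(String s) {
        if (s != null && !s.isEmpty() && s.charAt(0) == BOM) {
            return s.substring(1);
        }
        return s;
    }

    public static void main(String[] args) {
        // Assumption: the string was decoded as UTF-8 and still carries the BOM
        String fileStr = "\uFEFFhelloworld";
        System.out.println(fileStr.equals("helloworld"));
        System.out.println(stripLeadingBom(fileStr).equals("helloworld"));
    }
}
```

This variant only works when the bytes were decoded with UTF-8 (or another Unicode encoding) in the first place; with a different charset the BOM bytes turn into other characters and a byte-level check is needed instead.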

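Summary

This problem is not easy to spot, yet it really belongs to the basics. It also shows that seeing is not always believing: the string you see is not necessarily the string that is actually there.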

————————————————
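Copyright notice: this is an original article by CSDN blogger "Sicimike", licensed under CC 4.0 BY-SA. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/Baisitao_/article/details/92667122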

