Converting Between Unicode Escapes and Chinese Text

Reposted article, 2018-09-29 11:21. Tags: common code snippets / Unicode to Chinese / converting between Unicode and Chinese / Chinese to Unicode

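    // Imports needed at the top of the file:
    // import java.util.regex.Matcher;
    // import java.util.regex.Pattern;

    /**
     * Converts Unicode escape sequences (backslash, 'u', four hex digits) into characters.
     * @param unicodeStr the escaped string to decode
     * @return the decoded text
     */
    public static String UnicodeToCN(String unicodeStr) {
        Pattern pattern = Pattern.compile("\\\\u(\\p{XDigit}{4})");
        Matcher matcher = pattern.matcher(unicodeStr);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // group(1) holds the four hex digits, e.g. "674e"
            char ch = (char) Integer.parseInt(matcher.group(1), 16);
            // quoteReplacement stops a decoded '\' or '$' from being treated as a special replacement character
            matcher.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(ch)));
        }
        // Copy whatever follows the last escape; backslashes that are not part of an escape are kept
        matcher.appendTail(out);
        return out.toString();
    }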

    // Import needed at the top of the file:
    // import java.nio.charset.StandardCharsets;

    /**
     * Converts Chinese text into Unicode escape sequences (backslash, 'u', four hex digits).
     * @param CN the Chinese text to convert
     * @return the text as Unicode escape sequences
     */
    public static String CNToUnicode(String CN) {
        StringBuilder out = new StringBuilder();
        // Encode as UTF-16 big-endian: two bytes per code unit, high byte first.
        // Unlike the "unicode" (UTF-16) charset, UTF_16BE does not prepend a byte order mark.
        byte[] bytes = CN.getBytes(StandardCharsets.UTF_16BE);
        // Render each byte pair as four hex digits, zero-padding both bytes
        for (int i = 0; i < bytes.length - 1; i += 2) {
            out.append(String.format("\\u%02x%02x", bytes[i] & 0xff, bytes[i + 1] & 0xff));
        }
        return out.toString();
    }

Test

    public static void main(String[] args) {
        // The doubled backslashes put a literal backslash-u sequence into the string
        String Unicodestr = "\\u674e\\u56db";
        System.out.println("The Chinese text for \\u674e\\u56db is: " + Util.UnicodeToCN(Unicodestr));
        String CNStr = "李四";
        System.out.println("The Unicode escapes for 李四 are: " + Util.CNToUnicode(CNStr));
    }

Test result:

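One thing that may need explaining is \ufeff. It is the byte order mark (BOM) that Java's "unicode" (UTF-16) charset writes at the start of the encoded bytes, here indicating big-endian byte order; it is not part of the text itself. It can be filtered out for display, or avoided altogether by encoding with UTF-16BE, which writes no BOM.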
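
The \ufeff behaviour can be checked with a small standalone program. This is only a sketch: the class name BomDemo and the helper toEscapes are hypothetical and not part of the original Util class. It encodes the same two characters once with UTF-16 and once with UTF-16BE and prints both as escapes.

```java
import java.nio.charset.StandardCharsets;

public class BomDemo {
    // Hypothetical helper: renders bytes two at a time as backslash-u escapes
    public static String toEscapes(byte[] bytes) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i + 1 < bytes.length; i += 2) {
            out.append(String.format("\\u%02x%02x", bytes[i] & 0xff, bytes[i + 1] & 0xff));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The two characters from the test above, written as Java Unicode escapes
        String text = "\u674e\u56db";
        // UTF_16 prepends the big-endian byte order mark FE FF when encoding
        System.out.println(toEscapes(text.getBytes(StandardCharsets.UTF_16)));
        // UTF_16BE uses the same byte order but writes no byte order mark
        System.out.println(toEscapes(text.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

The first line printed starts with \ufeff and the second does not; the remaining escapes are identical, which is why dropping the leading \ufeff is safe for display.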
