Fixing garbled Chinese file names when unzipping with zip4j

This article is a repost. 2020-03-23 19:19

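When using zip4j to unzip uploaded zip files, I kept getting garbled Chinese file names after extraction. At first I handled this by checking the decoded file names for garbled characters.

In fact, macOS uses UTF-8 by default, while Windows (with a Simplified Chinese locale) uses GBK by default, so it is enough to look at the client's operating system on each request and choose the encoding accordingly!!!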

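String header = request.getHeader("user-agent");
log.info("Client User-Agent header: {}", header);
// Default to GBK, the usual encoding for archives created on Chinese-locale Windows
String charset = "GBK";
// Archives created on macOS store file names as UTF-8
if (!StringUtils.isEmpty(header) && header.contains("Mac")) {
    charset = "UTF-8";
}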

File zipFile = new File(zip);
ZipFile zFile = new ZipFile(zipFile);
// Choose the charset used to decode the entry names
zFile.setFileNameCharset(getEncoding(zip));
if (!zFile.isValidZipFile()) {
    throw new ZipException("The zip file is invalid and may be corrupted.");
}




/**
 * Decide which charset to use when decoding entry names.
 * The names are first read as GBK; if any of them contains the
 * replacement character U+FFFD, UTF-8 is used instead.
 *
 * @param path path to the zip file
 * @return "GBK" or "UTF-8"
 * @throws Exception if the archive headers cannot be read
 */
private static String getEncoding(String path) throws Exception {
    String encoding = "GBK";
    ZipFile zipFile = new ZipFile(path);
    zipFile.setFileNameCharset(encoding);
    List<FileHeader> list = zipFile.getFileHeaders();
    for (FileHeader fileHeader : list) {
        String fileName = fileHeader.getFileName();
        if (isMessyCode(fileName)) {
            encoding = "UTF-8";
            break;
        }
    }
    return encoding;
}

private static boolean isMessyCode(String str) {
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        // Converting from Unicode to a charset that has no code for the character yields 0x3f (the question mark '?').
        // Converting from another charset to Unicode, when the bytes do not identify any character in that charset, yields 0xfffd.
        if (c == 0xfffd) {
            // Garbled text found
            return true;
        }
    }
    return false;
}
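The U+FFFD check above depends on the GBK decoder rejecting some byte sequence, which it does not always do for UTF-8 input. As a rough sketch of an alternative (not what the original code does), one could test the raw name bytes directly with a strict UTF-8 decoder and fall back to GBK only when that fails. The class `ZipNameCharset` and its methods are hypothetical names, and the sketch assumes the undecoded name bytes are available, which the code above does not show how to obtain from zip4j.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of zip4j. */
final class ZipNameCharset {

    private static final Charset GBK = Charset.forName("GBK");

    private ZipNameCharset() {
    }

    /** Returns true only if the bytes form well-formed UTF-8. */
    static boolean isValidUtf8(byte[] raw) {
        try {
            // REPORT makes the decoder throw instead of silently
            // substituting U+FFFD for malformed input.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Decodes as UTF-8 when the bytes are valid UTF-8, otherwise as GBK. */
    static String decodeFileName(byte[] raw) {
        Charset charset = isValidUtf8(raw) ? StandardCharsets.UTF_8 : GBK;
        return new String(raw, charset);
    }
}
```

Pure ASCII names pass the UTF-8 test and decode identically either way. A GBK name whose bytes happen to form valid UTF-8 would be misread, so this is a heuristic, though such names should be uncommon in practice.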

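However, this approach worked at first and then stopped working, and I don't know why. Today I finally found a complete fix: (It is not actually complete. The code below only guarantees that archives created on macOS are not garbled; zip files created on Windows and then uploaded still come out garbled!!!!

So I switched back to the code above!!! How embarrassing...)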

Reposted from: https://www.jianshu.com/p/5594952e43f7

public static File[] unzip(String zip, String dest, String passwd) throws Exception {
    File zipFile = new File(zip);
    ZipFile zFile = new ZipFile(zipFile);
    zFile.setFileNameCharset(StandardCharsets.UTF_8.name());
    if (!zFile.isValidZipFile()) {
        throw new ZipException("The zip file is invalid and may be corrupted.");
    }
    File destDir = new File(dest);
    // Create the target directory (and any missing parents) if it does not exist yet
    if (!destDir.exists()) {
        destDir.mkdirs();
    }
    if (zFile.isEncrypted()) {
        zFile.setPassword(passwd.toCharArray());
    }
    zFile.extractAll(dest);

    // Note: extractAll writes files under FileHeader.getFileName(); the list below
    // uses the Unicode extra-field name when present, so the two may differ.
    List<FileHeader> headerList = zFile.getFileHeaders();
    List<File> extractedFileList = new ArrayList<>();
    for (FileHeader fileHeader : headerList) {
        if (!fileHeader.isDirectory()) {
            extractedFileList.add(new File(destDir, getFileNameFromExtraData(fileHeader)));
        }
    }
    File[] extractedFiles = new File[extractedFileList.size()];
    extractedFileList.toArray(extractedFiles);
    return extractedFiles;
}


public static String getFileNameFromExtraData(FileHeader fileHeader) {
    List<ExtraDataRecord> extraDataRecords = fileHeader.getExtraDataRecords();

    if (extraDataRecords != null) {
        for (ExtraDataRecord extraDataRecord : extraDataRecords) {
            long identifier = extraDataRecord.getHeader();
            // 0x7075: Info-ZIP Unicode Path Extra Field
            if (identifier == 0x7075) {
                byte[] bytes = extraDataRecord.getData();
                // Layout: version (1 byte), CRC-32 of the header file name (4 bytes), UTF-8 name
                if (bytes == null || bytes.length <= 5 || bytes[0] != 1) {
                    continue;
                }
                // Skip the 5-byte prefix; the rest is the UTF-8 file name
                return new String(bytes, 5, bytes.length - 5, StandardCharsets.UTF_8);
            }
        }
    }
    return fileHeader.getFileName();
}

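Reading the ZIP specification, we find that the Info-ZIP Unicode Path Extra Field (0x7075) can solve our problem. In the author's tests, archivers that encode file names in GBK, such as WinRAR and Baidu Compress (百度壓縮), also record the UTF-8 form of the file name in this field. However, because the field is optional, tools that already encode file names in UTF-8, such as the macOS and Deepin archive utilities, do not fill it in.
So much left to learn~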
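To make the layout of that extra field concrete, here is a minimal sketch that parses its payload using only the JDK. Per the ZIP specification the payload is a 1-byte version, a 4-byte little-endian CRC-32 of the file name as stored in the header, then the UTF-8 name; if the CRC does not match, the field should be ignored. `UnicodePathField` and `parse` are hypothetical names, and the sketch assumes the caller already has the field payload and the raw header name bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical parser for the payload of extra field 0x7075. */
final class UnicodePathField {

    private UnicodePathField() {
    }

    /**
     * @param data          payload of the extra field (after the tag and size)
     * @param rawHeaderName undecoded file name bytes from the entry header
     * @return the UTF-8 name, or null if the field is unusable or stale
     */
    static String parse(byte[] data, byte[] rawHeaderName) {
        if (data == null || data.length < 5) {
            return null; // too short to hold version + CRC
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.get() != 1) {
            return null; // unknown version
        }
        long storedCrc = buf.getInt() & 0xFFFFFFFFL;

        // The stored CRC covers the header's file name bytes; a mismatch
        // means the header name was changed after the field was written.
        CRC32 crc = new CRC32();
        crc.update(rawHeaderName);
        if (crc.getValue() != storedCrc) {
            return null;
        }
        return new String(data, buf.position(), buf.remaining(), StandardCharsets.UTF_8);
    }
}
```

A caller would fall back to the normally decoded header name whenever `parse` returns null, which is the same fallback `getFileNameFromExtraData` uses when the field is absent.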
