MFC/C++: Decoding a UTF-8 File That Shows Up as Garbled Text When Read as char* (BYTE*)

Reposted article (see the original). 2020-07-15 19:01

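// utf8Str: NUL-terminated UTF-8 bytes (char* or BYTE*), e.g. Chinese text read
// from a file, which displays as garbled characters until it is decoded
CString UTF8toUnicode(const char* utf8Str,UINT length); // defined below

CString UTF8toUnicode(const char* utf8Str)
{
    if (!utf8Str) // strlen must not be called on a null pointer
        return CString();
    UINT theLength=(UINT)strlen(utf8Str);
    return UTF8toUnicode(utf8Str,theLength);
}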
 
CString UTF8toUnicode(const char* utf8Str,UINT length)
{
    CString unicodeStr;
    unicodeStr=_T("");
 
    if (!utf8Str)
        return unicodeStr;
 
    if (length==0)
        return unicodeStr;
 
  
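    WCHAR chr=0; // one decoded character (a single UTF-16 code unit)
    for (UINT i=0;i<length;)
    {
        const unsigned char lead=(unsigned char)utf8Str[i];
        // The three UTF-8 sequence lengths that cover Chinese text (U+0000 to U+FFFF)
        if ((lead&0x80)==0) // 0xxxxxxx: one byte (ASCII)
        {
            chr=lead;
            i++;
        }
        else if ((lead&0xE0)==0xC0 && i+1<length) // 110xxxxx 10xxxxxx: two bytes
        {
            chr =(lead&0x1F)<<6; // the lead byte carries 5 payload bits
            chr|=(utf8Str[i+1]&0x3F);
            i+=2;
        }
        else if ((lead&0xF0)==0xE0 && i+2<length) // 1110xxxx 10xxxxxx 10xxxxxx: three bytes
        {
            chr =(lead&0x0F)<<12; // the lead byte carries 4 payload bits
            chr|=(utf8Str[i+1]&0x3F)<<6;
            chr|=(utf8Str[i+2]&0x3F);
            i+=3;
        }
        else
        {
            // Four-byte sequences, stray continuation bytes, and truncated input
            // are not handled: stop and return what has been decoded so far.
            return unicodeStr;
        }
        unicodeStr.AppendChar(chr);
    }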
 
    return unicodeStr;
}
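
The function above stops at the first four-byte sequence, so characters outside the Basic Multilingual Plane (emoji, rarer CJK ideographs) cut the result short. The following is a minimal sketch of the same bit manipulation without MFC, extended to four-byte sequences by emitting UTF-16 surrogate pairs. The name `Utf8ToUtf16` and the use of `std::u16string` are assumptions for illustration and do not come from the original post. It checks continuation bytes and truncation, but it does not reject overlong encodings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): decodes UTF-8 bytes into
// UTF-16 code units. Decoding stops at the first malformed or truncated
// sequence and returns whatever was decoded up to that point.
std::u16string Utf8ToUtf16(const char* utf8, std::size_t length)
{
    std::u16string out;
    if (!utf8)
        return out;

    for (std::size_t i = 0; i < length;)
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;        // code point being assembled
        std::size_t extra = 0;  // number of continuation bytes expected

        if ((lead & 0x80) == 0x00)      { cp = lead;        extra = 0; } // 0xxxxxxx
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; } // 110xxxxx
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; } // 1110xxxx
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; } // 11110xxx
        else return out; // stray continuation byte or invalid lead byte

        if (length - i <= extra)
            return out; // input ends in the middle of a sequence

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(utf8[i + k]);
            if ((b & 0xC0) != 0x80)
                return out; // continuation bytes must look like 10xxxxxx
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            return out; // beyond the Unicode range
        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Split into a high/low surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

For example, the six bytes `E4 B8 AD E6 96 87` decode to the two code units U+4E2D and U+6587 ("中文"). On Windows, where `wchar_t` is 16 bits wide, each resulting code unit could be appended to a wide `CString` in the same way the original loop does.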

For the details of the byte layout, study the Baidu Baike entry on UTF-8 carefully.
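
One way to make the byte layout concrete is to go in the opposite direction and build the bytes from a code point. The sketch below is only an illustration of the format; `EncodeUtf8` is a hypothetical name that does not appear in the original post.

```cpp
#include <string>

// Hypothetical helper for illustration: encodes one Unicode code point as
// UTF-8. Returns an empty string for surrogates and values above U+10FFFF.
std::string EncodeUtf8(char32_t cp)
{
    std::string out;
    if (cp <= 0x7F)                 // 0xxxxxxx
    {
        out += static_cast<char>(cp);
    }
    else if (cp <= 0x7FF)           // 110xxxxx 10xxxxxx
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp <= 0xFFFF)          // 1110xxxx 10xxxxxx 10xxxxxx
    {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return out;             // surrogates are not encodable characters
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp <= 0x10FFFF)        // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `EncodeUtf8(0x4E2D)` yields the three bytes `E4 B8 AD` for "中": the top 4 bits of the code point go into the lead byte and the remaining 12 bits are split across two continuation bytes, which is exactly what the three-byte branch of the decoder undoes.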

Reposted from: https://blog.csdn.net/chenlu5201314/article/details/8912707?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-3
