String Encoding Conversion in VC++

This article is a repost. 2013-12-02 21:41. Tags: c++ / Programming / character encoding

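In VC++ 6.0 the default character set was the multibyte character set (MBCS: Multi-Byte Character Set), whereas in VS2005 and later the default is Unicode. As a result, many string operations and functions that were simple and convenient in VC6.0 produce all kinds of errors when the same code is built and run under VS2010.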

The character set can be changed in the project properties: "Project - Properties - Character Set".

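How CString differs between the Unicode and multibyte character sets: CString is based on the TCHAR data type. If the symbol _UNICODE is defined for the build, TCHAR is defined as wchar_t (a 16-bit character type on Windows); otherwise it is defined as char (an ordinary 8-bit character type). So under Unicode a CString is made up of 16-bit characters, and without Unicode it is made up of characters of type char (from MSDN).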
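The switch described above can be sketched with a simplified stand-in. The names `DEMO_UNICODE`, `demo_tchar`, `DEMO_T`, `demo_length` and `demo_byte_size` are hypothetical and exist only for this illustration; the real definitions come from `<tchar.h>` and are driven by `_UNICODE`. This is a minimal sketch of the idea, not the actual header contents.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for TCHAR and _T(); the real ones come from <tchar.h>.
// Define DEMO_UNICODE before this point to select the wide-character variant.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;          // plays the role of TCHAR in a Unicode build
#define DEMO_T(x) L##x               // plays the role of _T("...") becoming L"..."
#else
typedef char demo_tchar;             // plays the role of TCHAR in an MBCS build
#define DEMO_T(x) x                  // plays the role of _T("...") staying "..."
#endif

// Counts character units (not bytes) up to the terminator, whichever type is active.
std::size_t demo_length(const demo_tchar* s)
{
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Size in bytes of the same text, excluding the terminator.
std::size_t demo_byte_size(const demo_tchar* s)
{
    return demo_length(s) * sizeof(demo_tchar);
}
```

With either setting, `demo_length(DEMO_T("Hello"))` is 5. The byte size is what changes: 5 in the char variant, and 5 times `sizeof(wchar_t)` in the wide variant (10 on Windows, where wchar_t is 16 bits).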

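The following are several ways to convert between CString and char * under the Unicode character set in Visual C++ 2010. In effect, these are conversions between the Unicode and MBCS character sets.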

1. Converting CString to char * under Unicode

Method 1: use the WideCharToMultiByte API

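CString str = _T("你好，世界！Hello,World");
// Note: n and len differ. n counts characters (wchar_t units); len counts bytes.
int n = str.GetLength();
// First call: ask how many bytes the multibyte result needs (no output buffer is passed)
int len = WideCharToMultiByte(CP_ACP, 0, str, n, NULL, 0, NULL, NULL);
// Allocate the multibyte buffer: len bytes plus one for the terminator
char * pFileName = new char[len + 1];
// Second call: convert the wide-character string to the multibyte encoding
WideCharToMultiByte(CP_ACP, 0, str, n, pFileName, len, NULL, NULL);
// An explicit length was passed, so the output is not null-terminated; add '\0' by hand.
// The index is len, not len + 1, which would write past the end of the buffer.
pFileName[len] = '\0';
// ... use pFileName ...
delete [] pFileName;   // release the buffer when it is no longer needed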
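The difference between n (characters) and len (bytes) in the code above can be made concrete with a portable sketch. Because the ANSI code page selected by CP_ACP varies from system to system, this example swaps in UTF-8 (the encoding CP_UTF8 would select) and does the encoding by hand instead of calling WideCharToMultiByte. `utf16_to_utf8` is a hypothetical helper, and replacing unpaired surrogates with U+FFFD is a choice made for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encodes UTF-16 code units as UTF-8 bytes by hand.
// Unpaired surrogates are replaced with U+FFFD (an assumption of this sketch).
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {                // high surrogate
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;                                        // the pair is consumed
            } else {
                cp = 0xFFFD;                                // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;                                    // stray low surrogate
        }
        assert(cp <= 0x10FFFF);                             // largest Unicode code point
        if (cp < 0x80) {                                    // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {                            // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {                          // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                                            // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For the three-character UTF-16 string u"\u4F60\u597D!" ("你好!"), the function returns 7 bytes: three for each Chinese character and one for "!". Under the GBK code page the same text would take 5 bytes, which is why a buffer must be sized from the byte count the API reports and never from the character count.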

Method 2: use the T2A and W2A macros

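CString str = _T("你好，世界！Hello,World");
// Declare the local variables that the conversion macros rely on
USES_CONVERSION;
// T2A and W2A are conversion macros available in both ATL and MFC projects.
// The result is allocated on the stack (the macros use _alloca), so do not
// return it from the function or call the macro inside a long loop.
char * pFileName = T2A(str);
//char * pFileName = W2A(str); // also performs the conversion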

2. Converting char * to CString under Unicode

Method 1: use the MultiByteToWideChar API

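const char * pFileName = "你好，世界！Hello,World";
// Length of the char array in bytes; in an ANSI code page such as GBK a Chinese character takes two bytes
int charLen = (int)strlen(pFileName);
// First call: ask how many wide characters the result needs (counted in characters, not bytes)
int len = MultiByteToWideChar(CP_ACP, 0, pFileName, charLen, NULL, 0);
// Allocate the wide-character buffer: len characters plus one for the terminator
wchar_t *buf = new wchar_t[len + 1];
// Second call: convert the multibyte string to wide characters
MultiByteToWideChar(CP_ACP, 0, pFileName, charLen, buf, len);
buf[len] = L'\0'; // terminate the string; note the index is len, not len + 1
// Build a CString from the wide-character buffer
CString pWideChar;
pWideChar.Append(buf);
// Free the temporary buffer
delete [] buf;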

Method 2: use the A2T and A2W macros

const char * pFileName = "你好，世界！Hello,World";
USES_CONVERSION;
CString s = A2T(pFileName);
//CString s = A2W(pFileName);

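Below is some conversion code found online. Note that when -1 is passed as the fourth parameter of MultiByteToWideChar() or WideCharToMultiByte(), the string passed as the third parameter is treated as null-terminated, and the returned count (bytes for WideCharToMultiByte, characters for MultiByteToWideChar) then includes the terminating null. See MSDN for details.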

#include "stdafx.h"

#include <windows.h>
#include <clocale>   // setlocale
#include <iostream>
#include <string>
#include <vector>
#include <atlstr.h>

using namespace std;

std::wstring UT2WC(const char* buf)
{
    int len = MultiByteToWideChar(CP_UTF8, 0, buf, -1, NULL, 0);
    std::vector<wchar_t> unicode(len);
    MultiByteToWideChar(CP_UTF8, 0, buf, -1, &unicode[0], len);
    return std::wstring(&unicode[0]);
}

std::string WC2UT(const wchar_t* buf)
{
    int len = WideCharToMultiByte(CP_UTF8, 0, buf, -1, NULL, 0, NULL, NULL);
    std::vector<char> utf8(len);
    WideCharToMultiByte(CP_UTF8, 0, buf, -1, &utf8[0], len, NULL, NULL);
    return std::string(&utf8[0]);
}

std::wstring MB2WC(const char* buf)
{
    int len = MultiByteToWideChar(CP_ACP, 0, buf, -1, NULL, 0);
    std::vector<wchar_t> unicode(len);
    MultiByteToWideChar(CP_ACP, 0, buf, -1, &unicode[0], len);
    return std::wstring(&unicode[0]);
}

std::string WC2MB(const wchar_t* buf)
{
    int len = WideCharToMultiByte(CP_ACP, 0, buf, -1, NULL, 0, NULL, NULL);
    std::vector<char> ansi(len);   // holds ANSI code page bytes, not UTF-8
    WideCharToMultiByte(CP_ACP, 0, buf, -1, &ansi[0], len, NULL, NULL);
    return std::string(&ansi[0]);
}


int main()
{
    setlocale(LC_ALL, "");
    CString str = "UNICODE轉換成UTF-8";
    //cout << WC2UT(str).c_str() << endl; // Unicode build only
    BSTR bstr = str.AllocSysString();
    cout << WC2UT(bstr).c_str() << endl; // works in both MBCS and Unicode builds
    
    std::string s = WC2UT(bstr);
    SysFreeString(bstr);
    std::wstring ws = UT2WC(s.c_str());
    wcout<< ws.c_str() << endl;

    const wchar_t* s1 = L"UNICODE轉換成UTF-8";
    cout << WC2UT(s1).c_str() << endl;

    const char* s2 = "ANSI轉換成UNICODE";
    wcout << MB2WC(s2).c_str() << endl;
    
    const wchar_t* s3 = L"UNICODE轉換成ANSI";
    cout << WC2MB(s3).c_str() << endl;

    return 0;
}
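One consequence of passing -1 is worth spelling out: the returned length includes the terminating null, so a buffer of that size holds the text plus one trailing '\0'. The helper functions above avoid trouble by constructing the string from a pointer, which stops at the first null. The portable sketch below fills such a buffer by hand (no Windows API is called) to show what happens if the whole buffer is copied instead. The names `simulated_result`, `from_whole_buffer` and `from_buffer_without_terminator` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simulates the buffer a -1 call would fill for "abc":
// the reported length is 4 because it includes the '\0'.
std::vector<char> simulated_result()
{
    const char text[] = "abc";
    return std::vector<char>(text, text + sizeof(text));
}

// Copies every element, including the terminator counted in the length.
// The resulting string has an embedded '\0' as its last character.
std::string from_whole_buffer(const std::vector<char>& buf)
{
    return std::string(buf.begin(), buf.end());
}

// Drops the final element, which is the terminator when the length
// came from a call that passed -1.
std::string from_buffer_without_terminator(const std::vector<char>& buf)
{
    assert(!buf.empty() && buf.back() == '\0');
    return std::string(buf.begin(), buf.end() - 1);
}
```

Copying the whole buffer yields a string of size 4 that does not compare equal to "abc", while dropping the last element or constructing from `&buf[0]` yields the expected three-character string.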

References:

http://msdn.microsoft.com/en-us/library/87zae4a3(v=vs.80).aspx

WideCharToMultiByte:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd374130(v=vs.85).aspx

MultiByteToWideChar:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd319072(v=vs.85).aspx
