Encoding Conversion Methods (Unicode/ASCII/UTF-8)

Reposted article, 2021-12-16 10:43. Category: C++

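These notes draw on several methods found online. A narrow ("short") character is one stored in 8 bits; the typical example is ASCII, which itself only defines 7-bit values. A wide character, as the name suggests, is stored in 16 bits on Windows (wchar_t); the typical example is Unicode, which in the Windows API means UTF-16.
   The two code pages used most often with MultiByteToWideChar and WideCharToMultiByte are CP_ACP and CP_UTF8.
   Passing CP_ACP converts between ANSI (the system's active code page) and Unicode.
   Passing CP_UTF8 converts between UTF-8 and Unicode.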
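To make the difference between 8-bit and 16-bit storage units concrete, here is a small sketch that writes the same characters in UTF-8 and UTF-16 and lets you compare how many units each one takes. The constant names are made up for this illustration, and the byte values are written as explicit escapes so the result does not depend on how the source file itself is saved.

```cpp
#include <string>

// "A" (U+0041) is plain ASCII: one unit in either encoding.
const std::string    kAsciiUtf8  = "A";
const std::u16string kAsciiUtf16 = u"A";

// The CJK character U+4E2D: three 8-bit units in UTF-8,
// but a single 16-bit unit in UTF-16.
const std::string    kCjkUtf8  = "\xE4\xB8\xAD";
const std::u16string kCjkUtf16 = u"\u4E2D";

// U+1F600 lies outside the 16-bit range: four 8-bit units in UTF-8,
// and two 16-bit units (a surrogate pair) in UTF-16.
const std::string    kEmojiUtf8  = "\xF0\x9F\x98\x80";
const std::u16string kEmojiUtf16 = u"\U0001F600";
```

Calling size() on each constant gives the unit count. The last pair shows that "16 bits per character" is a simplification: one character can still take two wide units.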

1. ANSI to Unicode (CP_ACP)

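#include <windows.h>
#include <string>

std::wstring string2wstring_CP_ACP(const std::string& str)
{
  if (str.empty()) return std::wstring(); // a zero-length input makes the API call fail
  // First call: query how many wide characters the result needs.
  int len = MultiByteToWideChar(CP_ACP, 0, str.data(), static_cast<int>(str.size()), NULL, 0);
  if (len <= 0) return std::wstring(); // conversion failed
  std::wstring result(len, L'\0'); // receives the Unicode (UTF-16) string
  // Second call: convert directly into the result's storage.
  MultiByteToWideChar(CP_ACP, 0, str.data(), static_cast<int>(str.size()), &result[0], len);
  return result;
}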
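The function above calls the API twice: once with no output buffer to learn the required length, then again to do the conversion. The same pattern exists in standard C++, which may help readers who are not on Windows. The sketch below uses std::mbsrtowcs and is only an analogy, not a replacement: it converts according to the current C locale rather than a Windows code page, it stops at the first NUL in the input, and under the default "C" locale only ASCII text is sure to convert. The name to_wide_portable is made up for this example.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: converts a multibyte string to a wide string using
// the current C locale, with the same "measure, then convert" two-call shape.
std::wstring to_wide_portable(const std::string& str)
{
    std::mbstate_t state = std::mbstate_t();
    const char* src = str.c_str();

    // First call: a null destination means "only compute the length".
    std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        return std::wstring(); // input is not valid in the current locale

    std::wstring result(len, L'\0');

    // Second call: convert into the storage sized by the first call.
    src = str.c_str();
    state = std::mbstate_t();
    std::mbsrtowcs(&result[0], &src, len, &state);
    return result;
}
```

For instance, to_wide_portable("hello") yields the wide string L"hello".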

2. Unicode to ANSI (CP_ACP)

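std::string wstring2string_CP_ACP(const std::wstring& wstr)
{
  if (wstr.empty()) return std::string(); // a zero-length input makes the API call fail
  // First call: query how many bytes the ANSI result needs.
  int len = WideCharToMultiByte(CP_ACP, 0, wstr.data(), static_cast<int>(wstr.size()), NULL, 0, NULL, NULL);
  if (len <= 0) return std::string(); // conversion failed
  std::string result(len, '\0'); // receives the ANSI string
  // Second call: convert directly into the result's storage.
  WideCharToMultiByte(CP_ACP, 0, wstr.data(), static_cast<int>(wstr.size()), &result[0], len,
                      NULL, NULL);
  return result;
}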
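One detail is worth knowing whenever a converted buffer is copied into a std::string or std::wstring. The append overload that takes only a pointer treats the buffer as a C string and stops at the first NUL, while the overload that also takes a length copies exactly that many characters. Since these conversions are driven by an explicit length, the input may legitimately contain embedded NULs. The sketch below shows the difference in portable C++; the function names and the sample buffer are made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// Sample buffer with an embedded NUL: 'a' 'b' NUL 'c' 'd', then a terminator.
const char kSample[] = {'a', 'b', '\0', 'c', 'd', '\0'};
const std::size_t kSampleLen = 5; // payload length, terminator excluded

// Pointer-only append: copying stops at the first '\0'.
std::string copy_until_nul(const char* buffer)
{
    std::string result;
    result.append(buffer);
    return result;
}

// Pointer-plus-length append: copies exactly len chars, NULs included.
std::string copy_with_length(const char* buffer, std::size_t len)
{
    std::string result;
    result.append(buffer, len);
    return result;
}
```

With kSample, the first function returns a two-character string and the second returns all five characters.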

3. UTF-8 to Unicode (CP_UTF8)

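std::wstring string2wstring_CP_UTF8(const std::string& str)
{
  if (str.empty()) return std::wstring(); // a zero-length input makes the API call fail
  // First call: query how many wide characters the result needs.
  int len = MultiByteToWideChar(CP_UTF8, 0, str.data(), static_cast<int>(str.size()), NULL, 0);
  if (len <= 0) return std::wstring(); // conversion failed
  std::wstring result(len, L'\0'); // receives the Unicode (UTF-16) string
  // Second call: convert directly into the result's storage.
  MultiByteToWideChar(CP_UTF8, 0, str.data(), static_cast<int>(str.size()), &result[0], len);
  return result;
}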
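For readers curious about what a UTF-8 to UTF-16 conversion involves underneath, here is a portable sketch that decodes the bytes by hand. It is not how the Windows API is implemented, and utf8_to_utf16_sketch is a name invented for this example. One deliberate simplification: malformed input throws an exception here, which is a choice made for the sketch and not a description of what the API does.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes into UTF-16 code units.
std::u16string utf8_to_utf16_sketch(const std::string& in)
{
    // Smallest code point allowed for each sequence length (rejects overlong forms).
    static const std::uint32_t kMin[] = {0, 0x80, 0x800, 0x10000};

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes that follow

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp < kMin[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp)); // fits in one unit
        } else {
            cp -= 0x10000; // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

As a usage example, the three bytes E4 B8 AD decode to the single unit 0x4E2D, and the four bytes F0 9F 98 80 decode to the pair 0xD83D 0xDE00.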

4. Unicode to UTF-8 (CP_UTF8)

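std::string wstring2string_CP_UTF8(const std::wstring& wstr)
{
  if (wstr.empty()) return std::string(); // a zero-length input makes the API call fail
  // First call: query how many bytes the UTF-8 result needs.
  int len = WideCharToMultiByte(CP_UTF8, 0, wstr.data(), static_cast<int>(wstr.size()), NULL, 0, NULL, NULL);
  if (len <= 0) return std::string(); // conversion failed
  std::string result(len, '\0'); // receives the UTF-8 string
  // Second call: convert directly into the result's storage.
  WideCharToMultiByte(CP_UTF8, 0, wstr.data(), static_cast<int>(wstr.size()), &result[0], len,
                      NULL, NULL);
  return result;
}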
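The reverse direction can be sketched portably as well. The helper below, utf16_to_utf8_sketch (a name invented for this example), combines surrogate pairs back into code points and then emits one to four UTF-8 bytes per code point. As with any hand-written codec it is meant as an illustration of the encoding rules, not as a drop-in for the API call above. Unpaired surrogates throw, which is a choice made for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: encodes UTF-16 code units as UTF-8 bytes.
std::string utf16_to_utf8_sketch(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: it must be followed by a low surrogate.
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            const std::uint32_t low = in[i + 1];
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i; // the low surrogate has been consumed
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }

        if (cp < 0x80) {                 // 1 byte: 0xxxxxxx
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {                         // 4 bytes: 11110xxx and three 10xxxxxx
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For example, the single unit 0x4E2D encodes to the bytes E4 B8 AD, and the surrogate pair 0xD83D 0xDE00 encodes to F0 9F 98 80.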
