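Disclaimer: The solutions to garbled Chinese text in MySQL described in this article are all based on Windows 10. Many of them will not apply on Linux, so treat them with caution there.
I. Symptoms of garbled Chinese text in MySQL
1. Garbled Chinese when logging in to the MySQL client remotely from sqlDeveloper
Figure: garbled Chinese when operating MySQL from sqlDeveloper
2. Garbled Chinese when logging in to the local MySQL client from the command line
Figure: garbled Chinese when operating MySQL from the console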
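II. Causes of garbled Chinese text in MySQL
On Chinese-language Windows the default character encoding is GBK (an extension of the GB2312 national standard that covers both simplified and traditional Chinese characters), whereas Linux systems generally default to UTF-8. When installing MySQL, however, it is easy to overlook the character set and simply accept MySQL's default, latin1, which supports only English and other Western European languages. This is the root cause of the garbled Chinese we see when viewing records in a MySQL database.
Figure: the default MySQL Server Instance Configuration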
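III. Possible solutions for garbled Chinese text in MySQL
1. Reinstalling MySQL
First, why I considered this at all: after searching Google I tried roughly ten of the fixes for garbled Chinese in MySQL that circulate online, and none of them fully solved my problem. However many times I changed the character set in the MySQL configuration file my.ini, some inexplicable new problem always appeared; as the saying goes, push down one gourd and another floats up. These problems are described in detail under "Modifying the original MySQL configuration file".
So we reconfigure the database instance. Reconfiguring the instance does not delete existing database data; it only changes the existing configuration.
Figure: custom MySQL Server Instance Configuration, figure 1
Figure: custom MySQL Server Instance Configuration, figure 2
Choose Reconfigure. As shown in MySQL Server Instance Configuration figure 2, you can pick either "Best Support For Multilingualism" or "Manual Selected Default Character Set / Collation" and, under the latter, choose gbk or utf8. This time I chose utf8.
Then go through the same steps as in the original MySQL configuration and start the MySQL service (that is, run mysqld). Next, run the MySQL client, look at the relevant database, and check whether the garbled Chinese has been fixed.
Figure: garbled Chinese in the MySQL client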
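Unfortunately, what we get is still garbled. Is the approach really wrong? Let us analyze why this attempt failed. The default character set was indeed reset to utf8: the precompiled character set file Index.xml was changed, and so was the startup configuration file my.ini (both default-character-set=utf8 and character-set-server=utf8). But the character set of the databases that already existed had been fixed when they were created, and it was not changed.
How can this theory be verified? I tried creating a new MySQL database, named teacher, with a table named therinfo.
And it failed again. Just as I was running out of ideas, it occurred to me that the command line does not work in UTF-8 by default (on Chinese-language Windows the console defaults to GBK, or plain ASCII). So when Chinese characters are typed at the command line, the MySQL server receives bytes that do not match what it expects, and heaven knows what actually gets stored. To verify that the Chinese Windows command line really does not handle UTF-8, I also used JDBC to operate on the MySQL database and printed the result at the command line. The output is shown below.
Figure: database content printed at the Chinese Windows command line after operating on the database through JDBC
The data can be printed, but the Chinese comes out garbled. Clearly this is caused by the command-line client and the database server using different encodings. In other words, after writing through JDBC the content stored in MySQL really is UTF-8, but the command line interprets the output as GBK, so it is still garbled.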
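The mismatch described above can be reproduced without MySQL at all. The following is a minimal Python sketch, not part of the original experiment: the sample string "中文" and the variable names are my own assumptions. It encodes text as UTF-8 (what the server stored) and then decodes the same bytes as GBK (what a GBK console would do).

```python
# Sketch: UTF-8 bytes interpreted as GBK come out garbled.
# The sample text is an arbitrary choice for illustration.
text = "中文"

# What the server hands back when the data is stored as UTF-8.
utf8_bytes = text.encode("utf-8")

# What a console that assumes GBK makes of those bytes.
# errors="replace" avoids an exception if a byte pair is not valid GBK.
garbled = utf8_bytes.decode("gbk", errors="replace")

# What a client that assumes UTF-8 makes of the same bytes.
restored = utf8_bytes.decode("utf-8")

print("bytes on the wire:", utf8_bytes.hex(" "))
print("read as GBK:      ", garbled)
print("read as UTF-8:    ", restored)
```

The bytes never change; only the interpretation does. That is why the stored data can be intact while the console output still looks broken.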
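(It is advisable to read "Modifying the original MySQL configuration file" first.) Seen this way, it seems that simply changing the client encoding to GBK should resolve the garbling. To avoid errors from editing the configuration file, I specifically checked whether the precompiled character sets in Index.xml include GBK. The result is shown below: gbk is present, which means that restarting the MySQL service should not fail with error 1067.
The change is [mysql] default-character-set=gbk, with [mysqld] character-set-server=utf8 either left as is or changed to gbk. The output is now correct and the problem is solved.
Figure: command-line client with the garbled Chinese resolved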
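From the steps above, we can summarize:
a) [mysql] default-character-set is the client's default character set. If the command line is used as the client, this character set must match the command line's own default character set; otherwise Chinese text is garbled.
b) The character sets under [mysql] and [mysqld] may differ. As long as the character set is one of the precompiled character sets listed in the server's Index.xml, the server can recognize the encoding of the client's data and convert it to the server-side encoding before storing it in the table. The key point is that the character set must be present in Index.xml.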
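To check which character sets a given my.ini actually sets for the client and the server, the file can be read programmatically. The sketch below uses Python's standard configparser on an inline sample. The sample content and the function name read_charsets are assumptions for illustration. The server option is taken to be character-set-server, as it appears in the generated my.ini reproduced in this article.

```python
import configparser

# Hypothetical, trimmed-down my.ini content used only for this sketch.
SAMPLE_MY_INI = """
[client]
port=3306

[mysql]
default-character-set=gbk

[mysqld]
port=3306
character-set-server=utf8
"""


def read_charsets(ini_text):
    """Return (client charset, server charset) found in the ini text.

    Either value is None when the option is absent.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.ini may contain bare flags such as skip-innodb
        strict=False,          # tolerate repeated options
        interpolation=None,    # do not treat '%' in values as special
    )
    parser.read_string(ini_text)
    client = parser.get("mysql", "default-character-set", fallback=None)
    server = parser.get("mysqld", "character-set-server", fallback=None)
    return client, server


client_cs, server_cs = read_charsets(SAMPLE_MY_INI)
print("client [mysql]  :", client_cs)
print("server [mysqld] :", server_cs)
```

With the sample above, the client side is gbk and the server side is utf8, which is the combination that worked in the summary.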
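Based on this theory, now verified in practice, I set out to fix the garbled Chinese that appears when SQL Developer logs in to MySQL remotely. For the detailed steps of logging in to a MySQL database remotely from SQL Developer, see the article "Remote Login to the MySQL Data Management System from sqlDeveloper". One point must be clear: as I understand it, SQL Developer first connects to the MySQL client and then reaches the MySQL server through that client. SQL Developer and the MySQL client must therefore use the same character encoding.
First check SQL Developer's default character encoding: [Help] --> [About] --> [Properties] --> [file.encoding]. SQL Developer's default display encoding turns out to be GBK, as shown in the figure below.
Figure: default character encoding of Oracle SQL Developer
The Oracle SQL Developer client and the command-line client it logs in to remotely use the same encoding, gbk, so data transfer and display work without problems. The query result is shown in the figure below.
Figure: Oracle SQL Developer client with the garbled Chinese resolved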
2. Modifying the original MySQL configuration file
# MySQL Server Instance Configuration File
# ----------------------------------------------------------------------
# Generated by the MySQL Server Instance Configuration Wizard
#
#
# Installation Instructions
# ----------------------------------------------------------------------
#
# On Linux you can copy this file to /etc/my.cnf to set global options,
# mysql-data-dir/my.cnf to set server-specific options
# (@localstatedir@ for this installation) or to
# ~/.my.cnf to set user-specific options.
#
# On Windows you should keep this file in the installation directory
# of your server (e.g. C:\Program Files\MySQL\MySQL Server X.Y). To
# make sure the server reads the config file use the startup option
# "--defaults-file".
#
# To run the server from the command line, execute this in a
# command line shell, e.g.
# mysqld --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
# Note: the argument differs between installations; what matters is the absolute path of my.ini.
#
# To install the server as a Windows service manually, execute this in a
# command line shell, e.g.
# mysqld --install MySQLXY --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
#
# And then execute this in a command line shell to start the server, e.g.
# net start MySQLXY
#
#
# Guildlines for editing this file
# ----------------------------------------------------------------------
#
# In this file, you can use all long options that the program supports.
# If you want to know the options a program supports, start the program
# with the "--help" option.
#
# More detailed information about the individual options can also be
# found in the manual.
#
#
# CLIENT SECTION
# ----------------------------------------------------------------------
#
# The following options will be read by MySQL client applications.
# Note that only client applications shipped by MySQL are guaranteed
# to read this section. If you want your own MySQL client program to
# honor these values, you need to specify it as an option during the
# MySQL client library initialization.
#
[client]

port=3306

[mysql]

default-character-set=latin1
# Note: change this to gbk or utf8.


# SERVER SECTION
# ----------------------------------------------------------------------
#
# The following options will be read by the MySQL Server. Make sure that
# you have installed the server correctly (see above) so it reads this
# file.
#
[mysqld]

# The TCP/IP Port the MySQL Server will listen on
port=3306


#Path to installation directory. All paths are usually resolved relative to this.
basedir="C:/Program Files/MySQL/MySQL Server 5.5/"
# Note: basedir is the base path against which all relative paths are resolved. It varies slightly between installations.

#Path to the database root
datadir="C:/ProgramData/MySQL/MySQL Server 5.5/Data/"

# The default character set that will be used when a new schema or table is
# created and no character set is defined
character-set-server=latin1
# Note: change this to gbk or utf8.

# The default storage engine that will be used when create new tables when
default-storage-engine=INNODB
# Note: InnoDB is one of MySQL's storage engines and one of the standard engines in MySQL AB's binary releases. It was originally developed by Innobase Oy, which was later acquired by Oracle.

# Set the SQL mode to strict
sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"

# The maximum amount of concurrent sessions the MySQL server will
# allow. One of these connections will be reserved for a user with
# SUPER privileges to allow the administrator to login even if the
# connection limit has been reached.
max_connections=100

# Query cache is used to cache SELECT results and later return them
# without actual executing the same query once again. Having the query
# cache enabled may result in significant speed improvements, if your
# have a lot of identical queries and rarely changing tables. See the
# "Qcache_lowmem_prunes" status variable to check if the current value
# is high enough for your load.
# Note: In case your tables change very often or if your queries are
# textually different every time, the query cache may result in a
# slowdown instead of a performance improvement.
query_cache_size=0

# The number of open tables for all threads. Increasing this value
# increases the number of file descriptors that mysqld requires.
# Therefore you have to make sure to set the amount of open files
# allowed to at least 4096 in the variable "open-files-limit" in
# section [mysqld_safe]
table_cache=256

# Maximum size for internal (in-memory) temporary tables. If a table
# grows larger than this value, it is automatically converted to disk
# based table This limitation is for a single table. There can be many
# of them.
tmp_table_size=18M


# How many threads we should keep in a cache for reuse. When a client
# disconnects, the client's threads are put in the cache if there aren't
# more than thread_cache_size threads from before. This greatly reduces
# the amount of thread creations needed if you have a lot of new
# connections. (Normally this doesn't give a notable performance
# improvement if you have a good thread implementation.)
thread_cache_size=8

#*** MyISAM Specific options

# The maximum size of the temporary file MySQL is allowed to use while
# recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE.
# If the file-size would be bigger than this, the index will be created
# through the key cache (which is slower).
myisam_max_sort_file_size=100G

# If the temporary file used for fast index creation would be bigger
# than using the key cache by the amount specified here, then prefer the
# key cache method. This is mainly used to force long character keys in
# large tables to use the slower key cache method to create the index.
myisam_sort_buffer_size=35M

# Size of the Key Buffer, used to cache index blocks for MyISAM tables.
# Do not set it larger than 30% of your available memory, as some memory
# is also required by the OS to cache rows. Even if you're not using
# MyISAM tables, you should still set it to 8-64M as it will also be
# used for internal temporary disk tables.
key_buffer_size=25M

# Size of the buffer used for doing full table scans of MyISAM tables.
# Allocated per thread, if a full scan is needed.
read_buffer_size=64K
read_rnd_buffer_size=256K

# This buffer is allocated when MySQL needs to rebuild the index in
# REPAIR, OPTIMZE, ALTER table statements as well as in LOAD DATA INFILE
# into an empty table. It is allocated per thread so be careful with
# large settings.
sort_buffer_size=256K


#*** INNODB Specific options ***


# Use this option if you have a MySQL server with InnoDB support enabled
# but you do not plan to use it. This will save memory and disk space
# and speed up some things.
# skip-innodb

# Additional memory pool that is used by InnoDB to store metadata
# information. If InnoDB requires more memory for this purpose it will
# start to allocate it from the OS. As this is fast enough on most
# recent operating systems, you normally do not need to change this
# value. SHOW INNODB STATUS will display the current amount used.
innodb_additional_mem_pool_size=2M

# If set to 1, InnoDB will flush (fsync) the transaction logs to the
# disk at each commit, which offers full ACID behavior. If you are
# willing to compromise this safety, and you are running small
# transactions, you may set this to 0 or 2 to reduce disk I/O to the
# logs. Value 0 means that the log is only written to the log file and
# the log file flushed to disk approximately once per second. Value 2
# means the log is written to the log file at each commit, but the log
# file is only flushed to disk approximately once per second.
innodb_flush_log_at_trx_commit=1

# The size of the buffer InnoDB uses for buffering log data. As soon as
# it is full, InnoDB will have to flush it to disk. As it is flushed
# once per second anyway, it does not make sense to have it very large
# (even with long transactions).
innodb_log_buffer_size=1M

# InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and
# row data. The bigger you set this the less disk I/O is needed to
# access data in tables. On a dedicated database server you may set this
# parameter up to 80% of the machine physical memory size. Do not set it
# too large, though, because competition of the physical memory may
# cause paging in the operating system. Note that on 32bit systems you
# might be limited to 2-3.5G of user level memory per process, so do not
# set it too high.
innodb_buffer_pool_size=47M

# Size of each log file in a log group. You should set the combined size
# of log files to about 25%-100% of your buffer pool size to avoid
# unneeded buffer pool flush activity on log file overwrite. However,
# note that a larger logfile size will increase the time needed for the
# recovery process.
innodb_log_file_size=24M

# Number of threads allowed inside the InnoDB kernel. The optimal value
# depends highly on the application, hardware as well as the OS
# scheduler properties. A too high value may lead to thread thrashing.
innodb_thread_concurrency=10
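Problems encountered when editing the configuration file while the default character set/collation is latin1:
2.1) MySQL default character set: client and server both set to gbk
[mysql]
default-character-set=gbk
[mysqld]
character-set-server=gbk
After stopping the MySQL service and starting it again, the service starts normally. But the output of the SQL statement SELECT * FROM XXX is still garbled.
2.2) MySQL default character set: client set to utf-8, server set to gbk
[mysql]
default-character-set=utf-8
[mysqld]
character-set-server=gbk
After stopping the MySQL service and starting it again, the service starts normally. But the command-line client reports the following error on startup:
mysql: Character set 'utf-8' is not a compiled character set and is not specified in the 'C:\Program Files\MySQL\MySQL Server 5.5\share\charsets\Index.xml' file
ERROR 2019 (HY000): Can't initialize character set utf-8 (path: C:\Program Files\MySQL\MySQL Server 5.5\share\charsets\)
In other words, the mysql client (the one started with mysql -u root -p) reports that the character set 'utf-8' is not a compiled character set and is not specified in Index.xml, so it cannot initialize 'utf-8'. Note that MySQL's own name for this character set is utf8, without the hyphen.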
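2.3) MySQL default character set: client set to utf-8, server set to utf-8
[mysql]
default-character-set=utf-8
[mysqld]
character-set-server=utf-8
After stopping the MySQL service, an attempt to start it again fails: the Windows MySQL service cannot be started and error 1067 is reported. There is not even a chance to check whether query results are garbled.
Figure: MySQL service fails to start
For an analysis of problems 2.1 to 2.3, see "MySQL precompiled character sets".
Note: every configuration change in this article assumes that the MySQL service is restarted after the file is edited (stop it first, then start it, either manually or automatically; starting the service means starting the mysqld program), as shown in the figure below:
Figure: the MySQL service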
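3. Changing the MySQL character set from the command line
If MySQL is only needed temporarily, the settings can also be changed temporarily from the command line. The change has to be repeated every time the command line is restarted, so it is not recommended. A large share of the solutions found through Google take this approach, which treats the symptom rather than the cause. It is still a solution of sorts, so this article lists it as one method. To repeat: this article does not recommend it.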
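IV. Problems when editing the configuration file, and their solutions
1. Permission problem when editing my.ini
When saving my.ini after editing, an "Access is denied" error appeared. Since MySQL is installed on drive C by default, I first checked the file permissions and found that writing was allowed. Then an idea struck me: simply move the file to the desktop, edit it there, and put it back in its folder afterwards. That saved a lot of trouble.
2. MySQL precompiled character sets
With the default installation, that is, with latin1 as the default character set, character sets such as gbk and utf-8 were not defined in Index.xml. Before reconfiguring MySQL I tried to download a utf-8 version of Index.xml from the internet, but could not find one. After I reconfigured MySQL to use utf-8, character sets such as utf-8 and gbk were there. The Index.xml after switching to utf-8 is shown below.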

<?xml version='1.0' encoding="utf-8"?>

<charsets max-id="99">

<copyright>
  Copyright (c) 2003-2005 MySQL AB
  Use is subject to license terms

  This program is free software; you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation; version 2 of the License.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program; if not, write to the Free Software
  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA
</copyright>

<description>
This file lists all of the available character sets.
To make maintaining easier please:
 - keep records sorted by collation number.
 - change charsets.max-id when adding a new collation.
</description>

<charset name="big5">
  <family>Traditional Chinese</family>
  <description>Big5 Traditional Chinese</description>
  <alias>big-5</alias>
  <alias>bigfive</alias>
  <alias>big-five</alias>
  <alias>cn-big5</alias>
  <alias>csbig5</alias>
  <collation name="big5_chinese_ci" id="1" order="Chinese">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="big5_bin" id="84" order="Binary">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="latin2">
  <family>Central European</family>
  <description>ISO 8859-2 Central European</description>
  <alias>csisolatin2</alias>
  <alias>iso-8859-2</alias>
  <alias>iso-ir-101</alias>
  <alias>iso_8859-2</alias>
  <alias>iso_8859-2:1987</alias>
  <alias>l2</alias>
  <collation name="latin2_czech_cs" id="2" order="Czech" flag="compiled"/>
  <collation name="latin2_general_ci" id="9" flag="primary">
    <order>Hungarian</order>
    <order>Polish</order>
    <order>Romanian</order>
    <order>Croatian</order>
    <order>Slovak</order>
    <order>Slovenian</order>
    <order>Sorbian</order>
  </collation>
  <collation name="latin2_hungarian_ci" id="21" order="Hungarian"/>
  <collation name="latin2_croatian_ci" id="27" order="Croatian"/>
  <collation name="latin2_bin" id="77" order="Binary" flag="binary"/>
</charset>

<charset name="dec8">
  <family>Western</family>
  <description>DEC West European</description>
  <collation name="dec8_bin" id="69" order="Binary" flag="binary"/>
  <collation name="dec8_swedish_ci" id="3" flag="primary">
    <order>Dutch</order>
    <order>English</order>
    <order>French</order>
    <order>German Duden</order>
    <order>Italian</order>
    <order>Latin</order>
    <order>Portuguese</order>
    <order>Spanish</order>
  </collation>
</charset>

<charset name="cp850">
  <family>Western</family>
  <description>DOS West European</description>
  <alias>850</alias>
  <alias>cspc850multilingual</alias>
  <alias>ibm850</alias>
  <collation name="cp850_general_ci" id="4" flag="primary">
    <order>Dutch</order>
    <order>English</order>
    <order>French</order>
    <order>German Duden</order>
    <order>Italian</order>
    <order>Latin</order>
    <order>Portuguese</order>
    <order>Spanish</order>
  </collation>
  <collation name="cp850_bin" id="80" order="Binary" flag="binary"/>
</charset>

<charset name="latin1">
  <family>Western</family>
  <description>cp1252 West European</description>
  <alias>csisolatin1</alias>
  <alias>iso-8859-1</alias>
  <alias>iso-ir-100</alias>
  <alias>iso_8859-1</alias>
  <alias>iso_8859-1:1987</alias>
  <alias>l1</alias>
  <alias>latin1</alias>
  <collation name="latin1_german1_ci" id="5" order="German Duden"/>
  <collation name="latin1_swedish_ci" id="8" order="Finnish, Swedish">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="latin1_danish_ci" id="15" order="Danish"/>
  <collation name="latin1_german2_ci" id="31" order="German Phonebook" flag="compiled"/>
  <collation name="latin1_spanish_ci" id="94" order="Spanish"/>
  <collation name="latin1_bin" id="47" order="Binary">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="latin1_general_ci" id="48">
    <order>Dutch</order>
    <order>English</order>
    <order>French</order>
    <order>German Duden</order>
    <order>Italian</order>
    <order>Latin</order>
    <order>Portuguese</order>
    <order>Spanish</order>
  </collation>
  <collation name="latin1_general_cs" id="49">
    <order>Dutch</order>
    <order>English</order>
    <order>French</order>
    <order>German Duden</order>
    <order>Italian</order>
    <order>Latin</order>
    <order>Portuguese</order>
    <order>Spanish</order>
  </collation>
</charset>

<charset name="hp8">
  <family>Western</family>
  <description>HP West European</description>
  <alias>hproman8</alias>
  <collation name="hp8_bin" id="72" order="Binary" flag="binary"/>
  <collation name="hp8_english_ci" id="6" flag="primary">
    <order>Dutch</order>
    <order>English</order>
    <order>French</order>
    <order>German Duden</order>
    <order>Italian</order>
    <order>Latin</order>
    <order>Portuguese</order>
    <order>Spanish</order>
  </collation>
</charset>

<charset name="koi8r">
  <family>Cyrillic</family>
  <description>KOI8-R Relcom Russian</description>
  <alias>koi8-r</alias>
  <alias>cskoi8r</alias>
  <collation name="koi8r_general_ci" id="7" order="Russian" flag="primary"/>
  <collation name="koi8r_bin" id="74" order="Binary" flag="binary"/>
</charset>

<charset name="swe7">
  <family>Western</family>
  <description>7bit Swedish</description>
  <alias>iso-646-se</alias>
  <collation name="swe7_swedish_ci" id="10" order="Swedish" flag="primary"/>
  <collation name="swe7_bin" id="82" order="Binary" flag="binary"/>
</charset>

<charset name="ascii">
  <family>Western</family>
  <description>US ASCII</description>
  <alias>us</alias>
  <alias>us-ascii</alias>
  <alias>csascii</alias>
  <alias>iso-ir-6</alias>
  <alias>iso646-us</alias>
  <collation name="ascii_general_ci" id="11" order="English" flag="primary"/>
  <collation name="ascii_bin" id="65" order="Binary" flag="binary"/>
</charset>

<charset name="ujis">
  <family>Japanese</family>
  <description>EUC-JP Japanese</description>
  <alias>euc-jp</alias>
  <collation name="ujis_japanese_ci" id="12" order="Japanese">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="ujis_bin" id="91" order="Japanese">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="sjis">
  <family>Japanese</family>
  <description>Shift-JIS Japanese</description>
  <alias>s-jis</alias>
  <alias>shift-jis</alias>
  <alias>x-sjis</alias>
  <collation name="sjis_japanese_ci" id="13" order="Japanese">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="sjis_bin" id="88" order="Binary">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="cp1251">
  <family>Cyrillic</family>
  <description>Windows Cyrillic</description>
  <alias>windows-1251</alias>
  <alias>ms-cyr</alias>
  <alias>ms-cyrillic</alias>
  <collation name="cp1251_bulgarian_ci" id="14">
    <order>Belarusian</order>
    <order>Bulgarian</order>
    <order>Macedonian</order>
    <order>Russian</order>
    <order>Serbian</order>
    <order>Mongolian</order>
    <order>Ukrainian</order>
  </collation>
  <collation name="cp1251_ukrainian_ci" id="23" order="Ukrainian"/>
  <collation name="cp1251_bin" id="50" order="Binary" flag="binary"/>
  <collation name="cp1251_general_ci" id="51" flag="primary">
    <order>Belarusian</order>
    <order>Bulgarian</order>
    <order>Macedonian</order>
    <order>Russian</order>
    <order>Serbian</order>
    <order>Mongolian</order>
    <order>Ukrainian</order>
  </collation>
  <collation name="cp1251_general_cs" id="52">
    <order>Belarusian</order>
    <order>Bulgarian</order>
    <order>Macedonian</order>
    <order>Russian</order>
    <order>Serbian</order>
    <order>Mongolian</order>
    <order>Ukrainian</order>
  </collation>
</charset>

<charset name="hebrew">
  <family>Hebrew</family>
  <description>ISO 8859-8 Hebrew</description>
  <alias>csisolatinhebrew</alias>
  <alias>iso-8859-8</alias>
  <alias>iso-ir-138</alias>
  <collation name="hebrew_general_ci" id="16" order="Hebrew" flag="primary"/>
  <collation name="hebrew_bin" id="71" order="Binary" flag="binary"/>
</charset>

<charset name="tis620">
  <family>Thai</family>
  <description>TIS620 Thai</description>
  <alias>tis-620</alias>
  <collation name="tis620_thai_ci" id="18" order="Thai">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="tis620_bin" id="89" order="Binary">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="euckr">
  <family>Korean</family>
  <description>EUC-KR Korean</description>
  <alias>euc_kr</alias>
  <alias>euc-kr</alias>
  <collation name="euckr_korean_ci" id="19" order="Korean">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="euckr_bin" id="85">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="latin7">
  <family>Baltic</family>
  <description>ISO 8859-13 Baltic</description>
  <alias>BalticRim</alias>
  <alias>iso-8859-13</alias>
  <alias>l7</alias>
  <collation name="latin7_estonian_cs" id="20">
    <order>Estonian</order>
  </collation>
  <collation name="latin7_general_ci" id="41">
    <order>Latvian</order>
    <order>Lithuanian</order>
    <flag>primary</flag>
  </collation>
  <collation name="latin7_general_cs" id="42">
    <order>Latvian</order>
    <order>Lithuanian</order>
  </collation>
  <collation name="latin7_bin" id="79" order="Binary" flag="binary"/>
</charset>

<charset name="koi8u">
  <family>Cyrillic</family>
  <description>KOI8-U Ukrainian</description>
  <alias>koi8-u</alias>
  <collation name="koi8u_general_ci" id="22" order="Ukranian" flag="primary"/>
  <collation name="koi8u_bin" id="75" order="Binary" flag="binary"/>
</charset>

<charset name="gb2312">
  <family>Simplified Chinese</family>
  <description>GB2312 Simplified Chinese</description>
  <alias>chinese</alias>
  <alias>iso-ir-58</alias>
  <collation name="gb2312_chinese_ci" id="24" order="Chinese">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="gb2312_bin" id="86">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="greek">
  <family>Greek</family>
  <description>ISO 8859-7 Greek</description>
  <alias>csisolatingreek</alias>
  <alias>ecma-118</alias>
  <alias>greek8</alias>
  <alias>iso-8859-7</alias>
  <alias>iso-ir-126</alias>
  <collation name="greek_general_ci" id="25" order="Greek" flag="primary"/>
  <collation name="greek_bin" id="70" order="Binary" flag="binary"/>
</charset>

<charset name="cp1250">
  <family>Central European</family>
  <description>Windows Central European</description>
  <alias>ms-ce</alias>
  <alias>windows-1250</alias>
  <collation name="cp1250_general_ci" id="26" flag="primary">
    <order>Hungarian</order>
    <order>Polish</order>
    <order>Romanian</order>
    <order>Croatian</order>
    <order>Slovak</order>
    <order>Slovenian</order>
    <order>Sorbian</order>
  </collation>
  <collation name="cp1250_croatian_ci" id="44">
    <order>Croatian</order>
  </collation>
  <collation name="cp1250_polish_ci" id="99">
    <order>Polish</order>
  </collation>
  <collation name="cp1250_czech_cs" id="34" order="Czech">
    <flag>compiled</flag>
  </collation>
  <collation name="cp1250_bin" id="66" order="Binary" flag="binary"/>
</charset>

<charset name="gbk">
  <family>East Asian</family>
  <description>GBK Simplified Chinese</description>
  <alias>cp936</alias>
  <collation name="gbk_chinese_ci" id="28" order="Chinese">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="gbk_bin" id="87" order="Binary">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="cp1257">
  <family>Baltic</family>
  <description>Windows Baltic</description>
  <alias>WinBaltRim</alias>
  <alias>windows-1257</alias>
  <collation name="cp1257_lithuanian_ci" id="29" order="Lithuanian"/>
  <collation name="cp1257_bin" id="58" order="Binary" flag="binary"/>
  <collation name="cp1257_general_ci" id="59" flag="primary">
    <order>Latvian</order>
    <order>Lithuanian</order>
  </collation>
  <!--collation name="cp1257_ci" id="60"/-->
  <!--collation name="cp1257_cs" id="61"/-->
</charset>

<charset name="latin5">
  <family>South Asian</family>
  <description>ISO 8859-9 Turkish</description>
  <alias>csisolatin5</alias>
  <alias>iso-8859-9</alias>
  <alias>iso-ir-148</alias>
  <alias>l5</alias>
  <alias>latin5</alias>
  <alias>turkish</alias>
  <collation name="latin5_turkish_ci" id="30" order="Turkish" flag="primary"/>
  <collation name="latin5_bin" id="78" order="Binary" flag="binary"/>
</charset>

<charset name="armscii8">
  <family>South Asian</family>
  <description>ARMSCII-8 Armenian</description>
  <alias>armscii-8</alias>
  <collation name="armscii8_general_ci" id="32" order="Armenian" flag="primary"/>
  <collation name="armscii8_bin" id="64" order="Binary" flag="binary"/>
</charset>

<charset name="utf8">
  <family>Unicode</family>
  <description>UTF-8 Unicode</description>
  <alias>utf-8</alias>
  <collation name="utf8_general_ci" id="33">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="utf8_bin" id="83">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="ucs2">
  <family>Unicode</family>
  <description>UCS-2 Unicode</description>
  <collation name="ucs2_general_ci" id="35">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="ucs2_bin" id="90">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="cp866">
  <family>Cyrillic</family>
  <description>DOS Russian</description>
  <alias>866</alias>
  <alias>csibm866</alias>
  <alias>ibm866</alias>
  <alias>DOSCyrillicRussian</alias>
  <collation name="cp866_general_ci" id="36" order="Russian" flag="primary"/>
  <collation name="cp866_bin" id="68" order="Binary" flag="binary"/>
</charset>

<charset name="keybcs2">
  <family>Central European</family>
  <description>DOS Kamenicky Czech-Slovak</description>
  <collation name="keybcs2_general_ci" id="37" order="Czech" flag="primary"/>
  <collation name="keybcs2_bin" id="73" order="Binary" flag="binary"/>
</charset>

<charset name="macce">
  <family>Central European</family>
  <description>Mac Central European</description>
  <alias>MacCentralEurope</alias>
  <collation name="macce_general_ci" id="38" flag="primary">
    <order>Hungarian</order>
    <order>Polish</order>
    <order>Romanian</order>
    <order>Croatian</order>
    <order>Slovak</order>
    <order>Slovenian</order>
    <order>Sorbian</order>
  </collation>
  <collation name="macce_bin" id="43" order="Binary" flag="binary"/>
</charset>

<charset name="macroman">
  <family>Western</family>
  <description>Mac West European</description>
  <alias>Mac</alias>
  <alias>Macintosh</alias>
  <alias>csmacintosh</alias>
  <collation name="macroman_general_ci" id="39" flag="primary">
    <order>Dutch</order>
    <order>English</order>
    <order>French</order>
    <order>German Duden</order>
    <order>Italian</order>
    <order>Latin</order>
    <order>Portuguese</order>
    <order>Spanish</order>
  </collation>
  <collation name="macroman_bin" id="53" order="Binary" flag="binary"/>
  <!--collation name="macroman_ci" id="54"/-->
  <!--collation name="macroman_ci_ai" id="55"/-->
  <!--collation name="macroman_cs" id="56"/-->
</charset>

<charset name="cp852">
  <family>Central European</family>
  <description>DOS Central European</description>
  <alias>852</alias>
  <alias>cp852</alias>
  <alias>ibm852</alias>
  <collation name="cp852_general_ci" id="40" flag="primary">
    <order>Hungarian</order>
    <order>Polish</order>
    <order>Romanian</order>
    <order>Croatian</order>
    <order>Slovak</order>
    <order>Slovenian</order>
    <order>Sorbian</order>
  </collation>
  <collation name="cp852_bin" id="81" order="Binary" flag="binary"/>
</charset>

<charset name="cp1256">
  <family>Arabic</family>
  <description>Windows Arabic</description>
  <alias>ms-arab</alias>
  <alias>windows-1256</alias>
  <collation name="cp1256_bin" id="67" order="Binary" flag="binary"/>
  <collation name="cp1256_general_ci" id="57" order="Arabic" flag="primary">
    <order>Arabic</order>
    <order>Persian</order>
    <order>Pakistani</order>
    <order>Urdu</order>
  </collation>
</charset>

<charset name="geostd8">
  <family>South Asian</family>
  <description>GEOSTD8 Georgian</description>
  <collation name="geostd8_general_ci" id="92" order="Georgian" flag="primary"/>
  <collation name="geostd8_bin" id="93" order="Binary" flag="binary"/>
</charset>

<charset name="binary">
  <description>Binary pseudo charset</description>
  <collation name="binary" id="63" order="Binary">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="cp932">
  <family>Japanese</family>
  <description>SJIS for Windows Japanese</description>
  <alias>ms_cp932</alias>
  <alias>sjis_cp932</alias>
  <alias>sjis_ms</alias>
  <collation name="cp932_japanese_ci" id="95" order="Japanese">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="cp932_bin" id="96" order="Binary">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

<charset name="eucjpms">
  <family>Japanese</family>
  <description>UJIS for Windows Japanese</description>
  <alias>eucjpms</alias>
  <alias>eucJP_ms</alias>
  <alias>ujis_ms</alias>
  <alias>ujis_cp932</alias>
  <collation name="eucjpms_japanese_ci" id="97" order="Japanese">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
  <collation name="eucjpms_bin" id="98" order="Japanese">
    <flag>binary</flag>
    <flag>compiled</flag>
  </collation>
</charset>

</charsets>
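Rather than reading Index.xml by eye, the character set names and aliases it declares can be listed with a few lines of Python. The sketch below is an illustration only. It parses a short inline excerpt (two charset entries copied from the listing) instead of the real file, and the function name list_charsets is my own. It only reports what the file declares and makes no claim about how MySQL itself treats aliases.

```python
import xml.etree.ElementTree as ET

# A short excerpt in the same shape as Index.xml, inlined so the sketch
# runs without a MySQL installation.
INDEX_XML_EXCERPT = """<charsets max-id="99">
<charset name="gbk">
  <family>East Asian</family>
  <alias>cp936</alias>
  <collation name="gbk_chinese_ci" id="28" order="Chinese">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
</charset>
<charset name="utf8">
  <family>Unicode</family>
  <alias>utf-8</alias>
  <collation name="utf8_general_ci" id="33">
    <flag>primary</flag>
    <flag>compiled</flag>
  </collation>
</charset>
</charsets>"""


def list_charsets(xml_text):
    """Map each declared charset name to the list of its aliases."""
    root = ET.fromstring(xml_text)
    return {
        charset.get("name"): [alias.text for alias in charset.findall("alias")]
        for charset in root.findall("charset")
    }


charsets = list_charsets(INDEX_XML_EXCERPT)
for candidate in ("gbk", "utf8", "utf-8"):
    print(candidate, "is a declared charset name:", candidate in charsets)
print("aliases of utf8:", charsets["utf8"])
```

In the excerpt, utf8 is a declared name while utf-8 appears only as an alias of it, which is worth keeping in mind when reading the error message in section 2.2.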
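V. Reflections and summary
After roughly three to four days of reading up on the theory and experimenting, this article is finally finished. It is probably the blog post for which I consulted the most references and spent the longest time writing. Being a Virgo, I cannot stand online articles that tell you what to do without telling you why, or that copy others without any spirit of exploration. I hope this article encourages those who come after me.
The layout of the article may not be ideal, because I never managed to get in-page HTML links working on cnblogs (博客園). My apologies.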
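1. Character encoding
The UTF-8 character set contains the GBK character set. This does not mean that GBK characters are encoded identically in UTF-8; it means that every GBK character has a corresponding mapping in UTF-8. For example, a Chinese character takes 2 bytes in GBK, while in the variable-length encoding UTF-8 a common Chinese character takes 3 bytes. MySQL maps between character sets automatically inside the database management system, but the client must use a matching encoding for display; otherwise the text cannot be shown correctly and appears garbled.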
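The byte counts mentioned above are easy to confirm. The following Python sketch uses an arbitrary sample string and a helper name, byte_lengths, that are my own choices. It prints how many bytes each character occupies in GBK and in UTF-8, then converts GBK bytes to UTF-8 by decoding and re-encoding, which is roughly the kind of mapping a server has to perform between client and storage encodings.

```python
def byte_lengths(ch):
    """Return (bytes in GBK, bytes in UTF-8) for a single character."""
    return len(ch.encode("gbk")), len(ch.encode("utf-8"))


sample = "中文"  # arbitrary sample text for illustration
for ch in sample:
    gbk_len, utf8_len = byte_lengths(ch)
    print(f"{ch}: GBK {gbk_len} bytes, UTF-8 {utf8_len} bytes")

# Converting between the two encodings goes through the character itself:
# decode the GBK bytes to text, then encode that text as UTF-8.
gbk_bytes = sample.encode("gbk")
utf8_bytes = gbk_bytes.decode("gbk").encode("utf-8")
print("GBK bytes:  ", gbk_bytes.hex(" "))
print("UTF-8 bytes:", utf8_bytes.hex(" "))
```

The same characters are represented by different byte sequences of different lengths, so the two encodings are mappable to each other but not interchangeable.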
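2. The difference between mysql and mysqld
mysql is the client. mysqld (MySQL daemon) is the MySQL background process, that is, the MySQL server. Running mysqld --install at the command line installs mysqld as a service, after which a MySQL service appears in the Windows services list. Starting the MySQL service is equivalent to starting mysqld, that is, starting the MySQL server. Only after that can the mysql client log in and work with MySQL databases.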
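3. Character set levels in the database management system
MySQL has character sets at several levels. The configuration file my.ini defines only default-character-set and character-set-server, yet the MySQL server maintains a multi-level set of character set variables.
character_set_client: character set of the SQL statements the client sends to the server
character_set_connection: character set of the connection; the server converts received statements into it
character_set_results: character set in which the server returns query results (for example from SELECT) to the client
These three generally match default-character-set, though they can also be set individually.
character_set_database: character set of the data stored in the database
character_set_server: server character set
character_set_system: system character set
The first two of these generally follow character-set-server, though they can also be set individually; character_set_system is used for identifiers and metadata and is always utf8.
character_set_filesystem: file system character set
Figure: MySQL database character sets
Running the pattern query SHOW VARIABLES LIKE 'char%'; at the command line displays output similar to the figure above.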
This article is the author's original work. Reproduction requires the author's consent.