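For a MySQL database, garbled Chinese text can be tackled from two directions. One is to change the MySQL server-side configuration file /etc/mysql/my.cnf so that it supports Chinese, for example: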
...
[mysql]
default-character-set=utf8
...
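However, changing the configuration file requires a service restart. For a database that is already running in production, or for an "old" database instance (possibly multi-instance or clustered), this is clearly unsuitable and may not even be permitted by the DBA. The usual alternative is to specify a Chinese-capable encoding (typically UTF-8) on the client or on the JDBC connection, so that MySQL transcodes automatically when data is inserted. There are two workable approaches:
1. If you work with JDBC by calling DriverManager.getConnection(url) in code, appending useUnicode=true&characterEncoding=UTF-8 to the JDBC URL resolves the garbling.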
jdbc.url=jdbc:mysql://127.0.0.1:3306/mydb?useUnicode=true&characterEncoding=UTF-8
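As a rough sketch of approach 1, the following Java program appends the two parameters to the URL and opens a connection through DriverManager. The class name, the buildUrl helper, and the credentials are placeholders made up for illustration; it assumes MySQL Connector/J is on the classpath and a server is listening on 127.0.0.1:3306.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class Utf8UrlExample {
    // Hypothetical helper: appends the charset parameters to a base JDBC URL.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        String url = buildUrl("127.0.0.1", 3306, "mydb");
        // "user" and "password" are placeholder credentials.
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            System.out.println("Connected: " + !conn.isClosed());
        } catch (SQLException e) {
            // Reached when no driver or no server is available.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```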
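2. If the connection comes from another data source that reads a configuration file, such as DBCP, tomcat-jdbc, c3p0, spring-jdbc, or hibernate, appending useUnicode=true&characterEncoding=UTF-8 to the URL may not take effect (in an XML configuration file, for instance, a bare & has to be written as &amp;). In that case the setting goes through the data source's own configuration, for example: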
<!-- Tomcat data source -->
<bean id="dataSource" class="org.apache.tomcat.jdbc.pool.DataSource">
    <property name="driverClassName" value="${jdbc.driverClassName}" />
    <property name="url" value="${jdbc.url}" />
    <property name="username" value="${jdbc.username}" />
    <property name="password" value="${jdbc.password}" />
    <property name="dbProperties">
        <props>
            <prop key="useUnicode">yes</prop>
            <prop key="characterEncoding">utf8</prop>
        </props>
    </property>
    <!-- Configuration refer to optimizing connection performance -->
    <property name="initialSize" value="10" />
    <property name="maxActive" value="100" />
    <property name="maxIdle" value="50" />
    <property name="minIdle" value="10" />
    <property name="suspectTimeout" value="60" />
    <property name="timeBetweenEvictionRunsMillis" value="30000" />
    <property name="minEvictableIdleTimeMillis" value="60000" />
    <property name="testOnBorrow" value="true" />
    <property name="validationQuery" value="SELECT 1" />
    <property name="validationInterval" value="30000" />
    <!-- End Configuration refer to optimizing connection performance -->
</bean>
Here, the fragment
<props>
    <prop key="useUnicode">yes</prop>
    <prop key="characterEncoding">utf8</prop>
</props>
is equivalent to useUnicode=true&characterEncoding=UTF-8 in the URL.
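The same idea can be sketched in plain Java, without a connection pool: the two keys are passed as driver properties instead of being appended to the URL. This is only an illustration under assumptions; the class name, the charsetProperties helper, and the credentials are hypothetical, and it relies on the standard DriverManager.getConnection(String, Properties) overload rather than on tomcat-jdbc itself.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

class CharsetPropertiesExample {
    // Hypothetical helper: mirrors the <props> block from the bean definition.
    static Properties charsetProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("useUnicode", "yes");
        props.setProperty("characterEncoding", "utf8");
        return props;
    }

    public static void main(String[] args) {
        // The URL carries no charset parameters; they travel in the Properties.
        String url = "jdbc:mysql://127.0.0.1:3306/mydb";
        Properties props = charsetProperties("user", "password"); // placeholders
        try (Connection conn = DriverManager.getConnection(url, props)) {
            System.out.println("Connected: " + !conn.isClosed());
        } catch (SQLException e) {
            // Reached when no driver or no server is available.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```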
Check the character sets currently in effect (compare the output before and after switching to a specific database):
mysql> show variables like 'char_%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> use robot_classifymark;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show variables like 'char_%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
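As the output shows, the database robot_classifymark uses utf8, which matches the encoding specified when the database was created. The corresponding CREATE DATABASE statement is: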
CREATE DATABASE IF NOT EXISTS robot_classifymark DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
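Even though the database itself is utf8, the session variables above still report latin1 for character_set_client, character_set_connection, and character_set_results. A rough Java illustration of what that kind of mismatch does to Chinese text, independent of MySQL itself: UTF-8 bytes that get interpreted as latin1 (ISO-8859-1) turn into unreadable characters. The class and method names are made up for this sketch, and it only models the byte-level misreading, not the server's actual conversion path.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode text as UTF-8, then (wrongly) decode those bytes as latin1.
    static String misreadAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverse the mistake: recover the original bytes and decode them as UTF-8.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two characters of "Chinese (language)"
        String garbled = misreadAsLatin1(original);
        // Each Chinese character is 3 bytes in UTF-8, so 2 characters become 6.
        System.out.println("garbled length: " + garbled.length());
        System.out.println("repaired equals original: " + repair(garbled).equals(original));
    }
}
```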
To insert Chinese from the command line, run set names utf8. This command affects three parameters:
mysql> show variables like 'char_%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> set names utf8
    -> ;
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like 'char_%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
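As shown, set names utf8 changed three parameters: character_set_client, character_set_connection, and character_set_results. Chinese can now be inserted from the current command line. However, the command applies only to the current session. To have it take effect on every login to the mysql command line, you again need to modify the MySQL server-side configuration file /etc/mysql/my.cnf: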
...
[client]
default-character-set=utf8
...
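So, if the JDBC URL does not specify useUnicode=true&characterEncoding=UTF-8, a workaround is to execute set names utf8 before each insert or select statement in order to insert or query Chinese. Clearly, this is less concise than specifying useUnicode=true&characterEncoding=UTF-8 in the URL or using the data source configuration shown above.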
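The workaround can be sketched in Java roughly as follows. The class name, the executeWithUtf8Names helper, and the credentials are hypothetical, and the sketch assumes a live connection obtained elsewhere. Note that the Connector/J documentation advises against issuing SET NAMES through the driver, because the driver does not detect the resulting character set change, so treat this purely as an illustration of the fallback rather than a recommended setup.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

class SetNamesFallback {
    // Hypothetical helper: switches the session charset, then runs the statement.
    static void executeWithUtf8Names(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("SET NAMES utf8"); // affects only this session
            stmt.execute(sql);
        }
    }

    public static void main(String[] args) {
        // URL without charset parameters; credentials are placeholders.
        String url = "jdbc:mysql://127.0.0.1:3306/mydb";
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            executeWithUtf8Names(conn, "SELECT 1");
            System.out.println("Statement executed");
        } catch (SQLException e) {
            // Reached when no driver or no server is available.
            System.err.println("Database call failed: " + e.getMessage());
        }
    }
}
```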