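http://www.360doc.com/content/09/0915/15/61497_6003890.shtml

In practice, however, I found that calling HttpClient in the most basic way does not support UTF-8 encoding. I read several articles online without finding a clear answer, so I looked through some of the commons-httpclient 3.0.1 source code. First, in PostMethod, I found the generateRequestEntity() method: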
    /**
     * Generates a request entity from the post parameters, if present. Calls
     * {@link EntityEnclosingMethod#generateRequestBody()} if parameters have not been set.
     *
     * @since 3.0
     */
    protected RequestEntity generateRequestEntity() {
        if (!this.params.isEmpty()) {
            // Use a ByteArrayRequestEntity instead of a StringRequestEntity.
            // This is to avoid potential encoding issues. Form url encoded strings
            // are ASCII by definition but the content type may not be. Treating the content
            // as bytes allows us to keep the current charset without worrying about how
            // this charset will effect the encoding of the form url encoded string.
            String content = EncodingUtil.formUrlEncode(getParameters(), getRequestCharSet());
            ByteArrayRequestEntity entity = new ByteArrayRequestEntity(
                EncodingUtil.getAsciiBytes(content),
                FORM_URL_ENCODED_CONTENT_TYPE
            );
            return entity;
        } else {
            return super.generateRequestEntity();
        }
    }
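It turns out that the HTTP request parameters added through NameValuePair are ultimately converted into a RequestEntity and submitted to the HTTP server. Next, in EntityEnclosingMethod, the parent class of PostMethod, I found the following code: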
    /**
     * Returns the request's charset. The charset is parsed from the request entity's
     * content type, unless the content type header has been set manually.
     *
     * @see RequestEntity#getContentType()
     *
     * @since 3.0
     */
    public String getRequestCharSet() {
        if (getRequestHeader("Content-Type") == null) {
            // check the content type from request entity
            // We can't call getRequestEntity() since it will probably call
            // this method.
            if (this.requestEntity != null) {
                return getContentCharSet(
                    new Header("Content-Type", requestEntity.getContentType()));
            } else {
                return super.getRequestCharSet();
            }
        } else {
            return super.getRequestCharSet();
        }
    }
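To make the lookup above more concrete, here is a minimal standalone sketch of reading a charset from a Content-Type value, with a fallback when none is given. It is not the HttpClient source: the class ContentTypeCharset and the method charsetOf are hypothetical names, and the ISO-8859-1 fallback is an assumption based on my understanding of the HttpClient 3.x default content charset.

```java
import java.util.Locale;

final class ContentTypeCharset {
    // Assumed fallback: HttpClient 3.x is understood to default to ISO-8859-1.
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    // Returns the value of the "charset" parameter, or the fallback if absent.
    static String charsetOf(String contentType) {
        if (contentType != null) {
            String[] parts = contentType.split(";");
            // parts[0] is the media type; the rest are parameters.
            for (int i = 1; i < parts.length; i++) {
                String p = parts[i].trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String value = p.substring("charset=".length()).trim();
                    // Strip optional surrounding double quotes.
                    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                        value = value.substring(1, value.length() - 1);
                    }
                    if (!value.isEmpty()) {
                        return value;
                    }
                }
            }
        }
        return DEFAULT_CHARSET;
    }

    public static void main(String[] args) {
        // prints "UTF-8"
        System.out.println(charsetOf("application/x-www-form-urlencoded; charset=UTF-8"));
        // prints "ISO-8859-1"
        System.out.println(charsetOf("application/x-www-form-urlencoded"));
    }
}
```

If the form content type carries no charset parameter, as appears to be the case for a basic PostMethod, the fallback is what ends up being used, which would explain why UTF-8 is not applied by default.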
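Solution
The two code snippets above show how HttpClient derives the request encoding (charset) from "Content-Type", and how that encoding is then applied when the submitted content is encoded. Following this principle, we only need to override the getRequestCharSet() method so that it returns the name of the encoding (charset) we want. That resolves the garbled-text problem when submitting POST requests in UTF-8 or any other non-default encoding.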
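The effect of the charset on the submitted content can be seen with a small standalone sketch. It uses java.net.URLEncoder from the JDK as a stand-in for HttpClient's EncodingUtil.formUrlEncode(), so it only approximates what the library does; the class FormEncodingDemo and its encode helper are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FormEncodingDemo {
    // Form-url-encodes a value using the given charset name.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the two characters used in the test below

        // With UTF-8, each character becomes three percent-encoded bytes.
        System.out.println(encode(text, "UTF-8"));      // prints %E4%B8%AD%E6%96%87

        // ISO-8859-1 cannot represent these characters, so each one is
        // replaced by '?' before percent-encoding and the text is lost.
        System.out.println(encode(text, "ISO-8859-1")); // prints %3F%3F
    }
}
```

Once the characters have been replaced on the client side, no server-side setting can recover them, which is why the charset has to be fixed where the request is built.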
Test
First, deploy a page named test.jsp under Tomcat's ROOT web application to serve as the test page. Its main code is:
    <%@ page contentType="text/html;charset=UTF-8"%>
    <%@ page session="false" %>
    <%
        request.setCharacterEncoding("UTF-8");
        String val = request.getParameter("TEXT");
        System.out.println(">>>> The result is " + val);
    %>
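The server side of this exchange can be sketched in the same way. The snippet below uses java.net.URLDecoder as a stand-in for the container's parameter parsing, which is an assumption on my part since Tomcat's real implementation is more involved; FormDecodingDemo and decode are hypothetical names. It suggests why the page declares UTF-8 before reading the parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class FormDecodingDemo {
    // Decodes a form-url-encoded value using the given charset name.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String body = "%E4%B8%AD%E6%96%87"; // UTF-8 bytes sent by the client

        // Decoding with the same charset restores the original two characters.
        String ok = decode(body, "UTF-8");
        System.out.println(ok.equals("\u4e2d\u6587"));  // prints true

        // Decoding with a different charset turns each byte into its own
        // character, producing six characters of mojibake instead of two.
        String garbled = decode(body, "ISO-8859-1");
        System.out.println(garbled.length());           // prints 6
    }
}
```

Both ends have to agree: the client must encode in UTF-8 and the server must decode in UTF-8.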
Next, write a test class. Its main code is:
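    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.NameValuePair;
    import org.apache.commons.httpclient.methods.PostMethod;

    // The wrapper class name is arbitrary; the original snippet showed only its members.
    public class UTF8PostTest {

        public static void main(String[] args) throws Exception {
            String url = "http://localhost:8080/test.jsp";
            PostMethod postMethod = new UTF8PostMethod(url);
            // Fill in the value of each form field
            NameValuePair[] data = { new NameValuePair("TEXT", "中文") };
            // Put the form values into postMethod
            postMethod.setRequestBody(data);
            // Execute postMethod
            HttpClient httpClient = new HttpClient();
            try {
                httpClient.executeMethod(postMethod);
            } finally {
                // Return the connection once the request is done
                postMethod.releaseConnection();
            }
        }

        // Inner class for UTF-8 support
        public static class UTF8PostMethod extends PostMethod {
            public UTF8PostMethod(String url) {
                super(url);
            }

            @Override
            public String getRequestCharSet() {
                // return super.getRequestCharSet();
                return "UTF-8";
            }
        }
    }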
Run this test program, and Tomcat's console output correctly prints ">>>> The result is 中文".