Today, while parsing JSON data, I received a response like this:
{"errNum":0,"errMsg":"success","retData":[{"title":"\u6536\u5e9f\u54c1\u5927\u53d4\u521a\u4e0a\u53f0\uff0c\u5c31\u60e8\u906d\u8bc4\u59d4\u706d\u706f\uff0c\u4f46\u63a5\u4e0b\u6765\u5168\u573a\u90fd\u9707\u60ca\u4e86\uff01","url":"http:\/\/toutiao.com\/group\/6263036756505920002\/","abstract":"\u8ba2\u9605\u6211\u83b7\u53d6\u66f4\u591a\u7cbe\u5f69\u5185\u5bb9\uff01","image_url":"http:\/\/p1.pstatp.com\/list\/2f90009a31a7ee8bb15"}]}
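This is because, to transmit Chinese more reliably, the JSON encoder has written the non-ASCII characters as \uXXXX Unicode escapes.
As a result, the raw JSON text has to have its Unicode escapes converted back into the Chinese we actually use before it is readable (a JSON parser normally does this conversion itself, as the experiment in section 3 shows).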
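To make the escaping concrete, here is a minimal sketch of what happens on the sending side. The class and method names are hypothetical, and it assumes the simplest policy (every char above 0x7F is written as a backslash, the letter u, and four hex digits); real encoders may escape fewer characters.

```java
final class JsonUnicodeEscapeDemo {

    /** Writes every char above 0x7F as a backslash, 'u', and four lowercase hex digits. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length() * 6);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                // The doubled backslash yields one literal backslash in the output.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below is two Chinese characters, written as escapes so the
        // source file stays pure ASCII; the compiler resolves them.
        System.out.println(escapeNonAscii("\u82f9\u679c"));
    }
}
```

The output consists only of ASCII characters, which is why the payload survives transports and consoles that mishandle other encodings.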
1. Decoding Unicode escapes in the string fields of JSON returned by an HTTP request
- public static String decodeUnicode(String theString) {
- char aChar;
- int len = theString.length();
- StringBuilder outBuffer = new StringBuilder(len);
- for (int x = 0; x < len;) {
- aChar = theString.charAt(x++);
- if (aChar == '\\') {
- aChar = theString.charAt(x++);
- if (aChar == 'u') {
- // Read the xxxx
- int value = 0;
- for (int i = 0; i < 4; i++) {
- aChar = theString.charAt(x++);
- switch (aChar) {
- case '0':
- case '1':
- case '2':
- case '3':
- case '4':
- case '5':
- case '6':
- case '7':
- case '8':
- case '9':
- value = (value << 4) + aChar - '0';
- break;
- case 'a':
- case 'b':
- case 'c':
- case 'd':
- case 'e':
- case 'f':
- value = (value << 4) + 10 + aChar - 'a';
- break;
- case 'A':
- case 'B':
- case 'C':
- case 'D':
- case 'E':
- case 'F':
- value = (value << 4) + 10 + aChar - 'A';
- break;
- default:
- throw new IllegalArgumentException(
- "Malformed \\uxxxx encoding.");
- }
- }
- outBuffer.append((char) value);
- } else {
- if (aChar == 't')
- aChar = '\t';
- else if (aChar == 'r')
- aChar = '\r';
- else if (aChar == 'n')
- aChar = '\n';
- else if (aChar == 'f')
- aChar = '\f';
- else if (aChar == 'b')
- aChar = '\b';
- outBuffer.append(aChar);
- }
- } else
- outBuffer.append(aChar);
- }
- return outBuffer.toString();
- }
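For comparison, the same idea can be sketched more compactly with a regular expression. This is an alternative under stated assumptions, not a replacement for the method above: the class name is hypothetical, it only handles the four-hex-digit escapes (not \t, \n and friends), and it would wrongly decode an escaped backslash that happens to be followed by u and four hex digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnicodeEscapeDecoder {

    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String decode(String input) {
        Matcher m = ESCAPE.matcher(input);
        // StringBuffer is used because appendReplacement only accepts it before Java 9.
        StringBuffer out = new StringBuffer(input.length());
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '$' or a backslash in the decoded char.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslashes put literal escape text into the string at run time.
        System.out.println(decode("\\u82f9\\u679c"));
    }
}
```

Text without escapes, including the `\/` sequences in the URLs of the sample response, passes through unchanged.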
2. Re-encoding an ordinary string that contains Unicode text
- public static String reEncoding(String text, String newEncoding) {
- String str = null;
- try {
- // Note: getBytes() with no argument uses the platform default charset.
- str = new String(text.getBytes(), newEncoding);
- } catch (UnsupportedEncodingException e) {
- log.error("Unsupported character encoding: " + newEncoding);
- throw new RuntimeException(e);
- }
- return str;
- }
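A typical use for this kind of re-encoding is repairing text whose bytes were decoded with the wrong charset. The sketch below assumes that scenario; the class and method names are hypothetical, and unlike the method above it names both charsets explicitly instead of relying on the platform default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ReEncodingDemo {

    /** Turns text back into bytes using wrongCharset, then decodes them as actualCharset. */
    static String reinterpret(String text, Charset wrongCharset, Charset actualCharset) {
        return new String(text.getBytes(wrongCharset), actualCharset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes to keep the source ASCII-only.
        String original = "\u82f9\u679c";
        // Simulate the mistake: UTF-8 bytes decoded as ISO-8859-1.
        String garbled = new String(original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        String repaired = reinterpret(garbled, StandardCharsets.ISO_8859_1,
                StandardCharsets.UTF_8);
        System.out.println(repaired.equals(original));
    }
}
```

The round trip works here because ISO-8859-1 maps every byte value to a character, so no information is lost in the garbled form. Note that this changes how bytes are interpreted; it does not decode backslash-u escape sequences, which is the job of the method in section 1.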
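3. A rather odd approach that I stumbled on during testing. I have not fully worked out the principle yet (if anyone understands it, please let me know, thanks).
I was fetching the JSON with HttpClient's POST method, and the data came back with Unicode escapes that had to be converted into Chinese, but the conversion seemed slow. So I tried HttpURLConnection's GET method instead, and without any conversion the text came out of Gson parsing already converted into Chinese.
That was a pleasant surprise. The likely explanation is that Gson, like any conforming JSON parser, decodes \uXXXX escapes in string values while parsing, so the conversion comes from the parser rather than from the HTTP client. The request and processing also took less time than with HttpClient's POST, so I switched to HttpURLConnection's GET.
① First, the HttpURLConnection GET version: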
- @Test
- public void test() {
- try {
- long start = System.currentTimeMillis();
- URL url = new URL("http://apis.baidu.com/songshuxiansheng/news/news");
- HttpURLConnection connection = (HttpURLConnection) url.openConnection();
- connection.addRequestProperty("apikey", "0fc807e45a37ce264f45d169646f4a9e");
- String dataString = new String(GsonTools.IsToByte(connection.getInputStream()),"utf-8");
- HeadlineJson newsJson = GsonTools.getObjectData(dataString, HeadlineJson.class);
- List<Headline> list = newsJson.getRetData();
- System.out.println(list.toString());
- long end = System.currentTimeMillis();
- System.out.println("timeGap:"+(end-start));
- } catch (Exception e) {
- // TODO Auto-generated catch block
- e.printStackTrace();
- }
- }
The GsonTools method it calls (covered in an earlier post):
- public static <T> T getObjectData(String jsonString, Class<T> type) {
- T t = null;
- try {
- Gson gson = new Gson();
- t = gson.fromJson(jsonString, type);
- } catch (JsonSyntaxException e) {
- // TODO Auto-generated catch block
- e.printStackTrace();
- }
- return t;
- }
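The other helper used by the test, GsonTools.IsToByte, is not shown in this post. A minimal sketch of what such a helper presumably does (drain an InputStream into a byte array) could look like this; the class and method names here are hypothetical stand-ins.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class StreamBytes {

    /** Reads the stream until end-of-stream and returns everything that was read. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for connection.getInputStream().
        byte[] body = "{\"errNum\":0}".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(body)) {
            System.out.println(new String(readAll(in), StandardCharsets.UTF_8));
        }
    }
}
```

Decoding the bytes with an explicit charset, as the test does with "utf-8", avoids depending on the platform default.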
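② Next, the HttpClient POST version (note that the test as written calls getBaiDuString2, the GET variant shown in ③; swap in getBaiDuString to exercise the POST path):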
- @Test
- public void TestHeadLine() {
- long start = System.currentTimeMillis();
- List<NameValuePair> params = new ArrayList<NameValuePair>();
- String url = "http://apis.baidu.com/songshuxiansheng/news/news";
- String jsonString = HttpUtils.getBaiDuString2(url, params);
- HeadlineJson lineJson = GsonTools.getObjectData(jsonString, HeadlineJson.class);
- System.out.println(lineJson.toString());
- long end = System.currentTimeMillis();
- System.out.println("timeGap:"+(end-start));
- }
The HttpUtils method it calls:
- public static String getBaiDuString(String url,List<NameValuePair> params) {
- String serverDataString = null;
- HttpPost post = new HttpPost(url);
- try {
- post.setEntity(new UrlEncodedFormEntity(params, HTTP.UTF_8));
- post.addHeader("apikey", UrlUtils.BAIDU_API_KEY);
- HttpClient client = new DefaultHttpClient();
- HttpResponse response = client.execute(post);
- int code = response.getStatusLine().getStatusCode();
- System.out.println("StatusCode:" + code);
- if (code == 200) {
- serverDataString = decodeUnicode(EntityUtils.toString(response.getEntity()));
- // serverDataString = EntityUtils.toString(response.getEntity());
- System.out.println("String data received successfully\nServerData:"+serverDataString);
- }
- } catch (Exception e) {
- // TODO Auto-generated catch block
- e.printStackTrace();
- }
- return serverDataString;
- }
③ The HttpClient GET method that is called:
- public static String getBaiDuString2(String url,List<NameValuePair> params) {
- String serverDataString = null;
- HttpGet get = new HttpGet(url);
- try {
- get.addHeader("apikey", UrlUtils.BAIDU_API_KEY);
- HttpClient client = new DefaultHttpClient();
- HttpResponse response = client.execute(get);
- int code = response.getStatusLine().getStatusCode();
- System.out.println("StatusCode:" + code);
- if (code == 200) {
- // serverDataString = decodeUnicode(EntityUtils.toString(response.getEntity()));
- serverDataString = EntityUtils.toString(response.getEntity());
- System.out.println("String data received successfully\nServerData:"+serverDataString);
- }
- } catch (Exception e) {
- // TODO Auto-generated catch block
- e.printStackTrace();
- }
- return serverDataString;
- }
I also compared the request times of Apache's HttpClient and HttpURLConnection. In my tests (one timed run each) HttpURLConnection was clearly faster, so I went with HttpURLConnection.
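A single timed run can be skewed by connection setup, class loading, and network jitter, so a slightly more robust comparison repeats each request and looks at the median. The helper below is only a sketch with hypothetical names; the Runnable would wrap whichever request method is being measured.

```java
import java.util.Arrays;

final class TimingSketch {

    /** Runs the task a few times unmeasured, then returns the median of the timed runs in ms. */
    static long medianMillis(Runnable task, int warmups, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the real HTTP call under test.
        long ms = medianMillis(() -> Math.sqrt(12345.0), 2, 5);
        System.out.println("median ms: " + ms);
    }
}
```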
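4. Java itself provides URLDecoder for decoding URL-encoded text. Note that it decodes %XX percent-encoding, not \uXXXX escapes: in the call below the Java compiler has already resolved the \u82f9\u679c escapes in the string literal, so URLDecoder receives the two Chinese characters and returns them unchanged:
- System.out.println(URLDecoder.decode("\u82f9\u679c", "utf-8"));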
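The difference is easier to see side by side. In this small sketch (class and method names are hypothetical) the first input is percent-encoded UTF-8, which URLDecoder does decode, and the second is literal escape text of the kind found in the JSON response, which it leaves alone.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class UrlDecoderDemo {

    static String decodeUtf8(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Percent-encoded UTF-8 bytes of U+82F9 and U+679C: decoded to two characters.
        System.out.println(decodeUtf8("%E8%8B%B9%E6%9E%9C"));
        // Literal escape text (doubled backslashes): passed through unchanged.
        System.out.println(decodeUtf8("\\u82f9\\u679c"));
    }
}
```

So for escape sequences that arrive as raw text, a decoder like the one in section 1 (or a JSON parser) is still needed.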