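We usually use dom4j, SAX, or the W3C DOM API to parse XML files, and most solutions found online cover exactly these cases.
In real projects, however, we sometimes need to parse a message that arrives as an XML-formatted string rather than as a file.
For example, take the following string: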
String content = "<Response service=\"OrderWebService\"><Head>OK</Head><Body><OrderResponse><customerOrderNo>201605110015</customerOrderNo><mailNo>070000314903</mailNo><printUrl>http://10.202.18.24:8080/osms/wbs/print/printOrder.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</printUrl><invoiceUrl>http://10.202.18.24:8080/osms/wbs/print/printInvoice.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</invoiceUrl></OrderResponse></Body></Response>";
After formatting, the string looks like this:
<Response service="OrderWebService">
  <Head>OK</Head>
  <Body>
    <OrderResponse>
      <customerOrderNo>201605110015</customerOrderNo>
      <mailNo>070000314903</mailNo>
      <printUrl>http://10.202.18.24:8080/osms/wbs/print/printOrder.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</printUrl>
      <invoiceUrl>http://10.202.18.24:8080/osms/wbs/print/printInvoice.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</invoiceUrl>
    </OrderResponse>
  </Body>
</Response>
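Even after formatting, this message is somewhat unusual: the root element describes the service with an attribute (service="OrderWebService") rather than with a child element, so the parser has to read attributes as well as element nodes.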
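The formatting step above can be reproduced in code. The following is a minimal sketch, assuming the JDK's built-in javax.xml.transform API is acceptable for this purpose. The class name XmlPrettyPrinter and the method name format are hypothetical and are not part of the original project. The exact indentation may differ between JDK versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: re-serializes a compact XML string with line breaks and indentation.
public class XmlPrettyPrinter {

    public static String format(String xml) throws TransformerException {
        // An identity transformer copies the input to the output unchanged,
        // apart from the serialization options set below.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific hint understood by the JDK's built-in transformer;
        // other implementations are allowed to ignore it.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Shortened version of the message from the text.
        String xml = "<Response service=\"OrderWebService\"><Head>OK</Head>"
                + "<Body><OrderResponse><mailNo>070000314903</mailNo></OrderResponse></Body></Response>";
        System.out.println(format(xml));
    }
}
```

Formatting is only a reading aid here. The parser does not need the message to be indented.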
Here is a solution based on dom4j:
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.io.SAXReader;

// Parses an XML string and collects attributes and element text, up to three levels deep, into a map.
// Keys are bare names, so a later node with the same name overwrites an earlier one.
public HashMap<String, Object> stringToXmlByDom4j(String content) {
    HashMap<String, Object> result = new HashMap<String, Object>();
    try {
        SAXReader saxReader = new SAXReader();
        org.dom4j.Document docDom4j = saxReader.read(new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8)));
        org.dom4j.Element root = docDom4j.getRootElement();
        // Attributes of the root element, e.g. service="OrderWebService"
        List<Attribute> rootAttrList = root.attributes();
        for (Attribute rootAttr : rootAttrList) {
            System.out.println(rootAttr.getName() + ": " + rootAttr.getValue());
            result.put(rootAttr.getName(), rootAttr.getValue());
        }
        // Level 1: direct children of the root (Head, Body)
        List<org.dom4j.Element> childElements = root.elements();
        for (org.dom4j.Element e1 : childElements) {
            System.out.println("Level 1: " + e1.getName() + ": " + e1.getText());
            result.put(e1.getName(), e1.getText());
        }
        for (org.dom4j.Element child : childElements) {
            // When the attribute names are not known in advance
            List<Attribute> attributeList = child.attributes();
            for (Attribute attr : attributeList) {
                System.out.println("Level 2: " + attr.getName() + ": " + attr.getValue());
                result.put(attr.getName(), attr.getValue());
            }
            // When an attribute name is known in advance:
            // System.out.println("id: " + child.attributeValue("id"));
            // When the child element names are not known in advance
            List<org.dom4j.Element> elementList = child.elements();
            for (org.dom4j.Element ele : elementList) {
                System.out.println("Level 2: " + ele.getName() + ": " + ele.getText());
                result.put(ele.getName(), ele.getText());
                List<Attribute> kidAttrList = ele.attributes();
                for (Attribute kidAttr : kidAttrList) {
                    System.out.println("Level 3: " + kidAttr.getName() + ": " + kidAttr.getValue());
                    result.put(kidAttr.getName(), kidAttr.getValue());
                }
                // Level 3: children of the second-level element (customerOrderNo, mailNo, ...)
                List<org.dom4j.Element> kidList = ele.elements();
                for (org.dom4j.Element e2 : kidList) {
                    System.out.println("Level 3: " + e2.getName() + ": " + e2.getText());
                    result.put(e2.getName(), e2.getText());
                }
            }
            // When the child element names are known in advance:
            // System.out.println("title" + child.elementText("title"));
            // System.out.println("author" + child.elementText("author"));
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return result;
}
Calling the method from a main method produces the following output:
Response: <Response service="OrderWebService"><Head>OK</Head><Body><OrderResponse><customerOrderNo>201605110015</customerOrderNo><mailNo>070000314903</mailNo><printUrl>http://10.202.18.24:8080/osms/wbs/print/printOrder.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</printUrl><invoiceUrl>http://10.202.18.24:8080/osms/wbs/print/printInvoice.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==</invoiceUrl></OrderResponse></Body></Response>
service: OrderWebService
Level 1: Head: OK
Level 1: Body:
Level 2: OrderResponse:
Level 3: customerOrderNo: 201605110015
Level 3: mailNo: 070000314903
Level 3: printUrl: http://10.202.18.24:8080/osms/wbs/print/printOrder.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==
Level 3: invoiceUrl: http://10.202.18.24:8080/osms/wbs/print/printInvoice.pub?mailno=vYrygnDPNe9Csjz35xwzwQ==
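The method above walks exactly three levels and stores every value under its bare name, so a deeper message or two nodes with the same name would lose data. One possible variation is a recursive walk that uses the full path as the key. The sketch below is an assumption-laden illustration, not the original author's method: it uses the JDK's built-in W3C DOM parser instead of dom4j so that it runs without extra dependencies. The class name XmlFlattener, the method names, and the key format ("Response/Body/..." for elements, "/@name" for attributes) are hypothetical choices.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: flattens an XML string into path -> value pairs at any depth.
public class XmlFlattener {

    public static Map<String, String> flatten(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Parse directly from the string, without converting it to bytes first.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Map<String, String> result = new LinkedHashMap<>();
        walk(doc.getDocumentElement(), "", result);
        return result;
    }

    private static void walk(Element element, String parentPath, Map<String, String> result) {
        String path = parentPath.isEmpty()
                ? element.getTagName()
                : parentPath + "/" + element.getTagName();

        // Attributes are stored as "<path>/@<attributeName>".
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            result.put(path + "/@" + attr.getNodeName(), attr.getNodeValue());
        }

        // Recurse into child elements; text is recorded only for leaf elements.
        boolean hasChildElement = false;
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                walk((Element) child, path, result);
            }
        }
        if (!hasChildElement) {
            result.put(path, element.getTextContent());
        }
    }

    public static void main(String[] args) throws Exception {
        // Shortened version of the message from the text.
        String xml = "<Response service=\"OrderWebService\"><Head>OK</Head>"
                + "<Body><OrderResponse><mailNo>070000314903</mailNo></OrderResponse></Body></Response>";
        for (Map.Entry<String, String> entry : flatten(xml).entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Repeated sibling elements with the same name would still share one path and overwrite each other in this sketch. A message that contains lists would need an index in the key or a list as the map value.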
