XML
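Scala has built-in support for XML literals, so XML fragments are easy to produce in program code.
The Scala library also includes support for common XML processing tasks.
Scala sometimes mistakes an expression for an XML literal: x < y is fine, but x <y is an error. The fix is to put a
space after the <. A related pitfall: with no space between = and the literal, =< is read as a single operator.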
scala> val doc= <html><head><title>hello world!</title></head><body></body></html> // space after =
doc: scala.xml.Elem = <html><head><title>hello world!</title></head><body></body></html>
scala> val doc=<html><head><title>hello world!</title></head><body></body></html> // no space after =
|
|
You typed two blank lines. Starting a new command.
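XML nodes
The Node class is the ancestor of all XML node classes. Its two most important subclasses are Text and Elem.
The Elem class describes an XML element.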
scala> val elm= <a href="http://salca-lang.org"> the <em>scala</em> language</a>
elm: scala.xml.Elem = <a href="http://salca-lang.org"> the <em>scala</em> language</a>
The label property yields the tag name (here "a"), and child is the sequence of child nodes, in this example two Text nodes and one Elem node,
as shown here:
scala> for(i<-elm.child)println(i)
the
<em>scala</em>
language
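As a small sketch of the label and child properties, the following rebuilds the same element and classifies each child by its node class. The names NodeKindsDemo and kind are made up for this illustration and are not part of scala.xml.

```scala
import scala.xml._

object NodeKindsDemo {
  // Same element as in the transcript above.
  val elm: Elem = <a href="http://salca-lang.org"> the <em>scala</em> language</a>

  // Name the node class of a child; Text and Elem are the two common cases.
  def kind(n: Node): String = n match {
    case _: Text => "Text"
    case _: Elem => "Elem"
    case _       => "other"
  }

  val label = elm.label
  val kinds = elm.child.map(kind)

  def main(args: Array[String]): Unit = {
    println(label)
    println(kinds.mkString(", "))
  }
}
```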
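The type of a node sequence is NodeSeq, a subtype of Seq[Node] that adds support for XPath-like operations. You can use any Seq operation
on XML node sequences.
A single node is equivalent to a sequence of length 1.
Comments (<!-- ... -->), entity references (&...;), and processing instructions (<? ... ?>) each have a corresponding node class as well.
To build a node sequence programmatically, you can use a NodeBuffer, which is a subclass of ArrayBuffer[Node].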
scala> val item=new NodeBuffer()
item: scala.xml.NodeBuffer = ArrayBuffer()
scala> item+=<li>apple</li> // no space: error
<console>:17: error: value +=< is not a member of scala.xml.NodeBuffer
item+=<li>apple</li>
scala> item+= <li>apple</li>
res11: item.type = ArrayBuffer(<li>apple</li>)
scala> item+= <li>banana</li>
res12: item.type = ArrayBuffer(<li>apple</li>, <li>banana</li>)
scala> item+= <li>pear</li>
res13: item.type = ArrayBuffer(<li>apple</li>, <li>banana</li>, <li>pear</li>)
scala> item+= <li>orange</li>
res15: item.type = ArrayBuffer(<li>apple</li>, <li>banana</li>, <li>pear</li>, <li>orange</li>)
scala> item
res16: scala.xml.NodeBuffer = ArrayBuffer(<li>apple</li>, <li>banana</li>, <li>pear</li>, <li>orange</li>)
scala> val nodes:NodeSeq=item
nodes: scala.xml.NodeSeq = NodeSeq(<li>apple</li>, <li>banana</li>, <li>pear</li>, <li>orange</li>)
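A NodeBuffer is a Seq[Node] and can be implicitly converted to a NodeSeq. Once converted, it is best not to modify it any more, because XML node
sequences are supposed to be immutable.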
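Element attributes
To process the attribute keys and values of an element, use the attributes property. It yields an object of type MetaData,
which is almost, but not quite, a map from attribute keys to values. You can use the () operator to access the value for a given key.
The result is a node sequence, not a string, because an XML attribute can contain entity references.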
scala> elm
res22: scala.xml.Elem = <a href="http://salca-lang.org"> the <em>scala</em> language</a>
scala> elm.attribute("href")
res23: Option[Seq[scala.xml.Node]] = Some(http://salca-lang.org)
scala> elm.attributes("href")
res24: Seq[scala.xml.Node] = http://salca-lang.org
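If you are sure the attribute contains no unresolved entities, call the text method to turn the node sequence into a string. Note that () returns null for a missing key, while get returns an Option.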
scala> elm.attributes("href").text
res48: String = http://salca-lang.org
scala> elm.attributes("gg")
res52: Seq[scala.xml.Node] = null
scala> elm.attributes.get("name")
res53: Option[Seq[scala.xml.Node]] = None
scala> for(it<-elm.attributes) println(it.key+":"+it.value)
href:http://salca-lang.org
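Because () returns null for a missing key, it can be safer to go through Option. The following is a minimal sketch of that idea. The helper attr and the object name AttributeDemo are hypothetical names introduced here, and the sketch assumes the attribute holds no unresolved entities.

```scala
import scala.xml._

object AttributeDemo {
  val elm: Elem = <a href="http://salca-lang.org">link</a>

  // Hypothetical helper: read an attribute as Option[String] instead of risking null.
  def attr(e: Elem, key: String): Option[String] =
    e.attribute(key).map(_.map(_.text).mkString)

  val href        = attr(elm, "href")
  val missing     = attr(elm, "name")
  val withDefault = attr(elm, "name").getOrElse("(none)")

  // The XPath-like form yields an empty sequence, hence an empty string, when the key is absent.
  val viaXPath = (elm \ "@name").text

  def main(args: Array[String]): Unit = {
    println(href)
    println(withDefault)
  }
}
```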
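Embedded expressions
You can include Scala code inside an XML literal to compute element content dynamically.
When the code block yields a node sequence, the nodes are simply added to the XML. All other values are placed in
an Atom[T], a container for values of type T. In this way you can store any value in XML,
and retrieve it again through the data property of the Atom node.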
scala> item
res65: scala.xml.NodeBuffer = ArrayBuffer(<li>apple</li>, <li>banana</li>, <li>pear</li>, <li>orange</li>)
scala> val tmp= <fruit>{for(i<-item) yield i}</fruit>
tmp: scala.xml.Elem = <fruit><li>apple</li><li>banana</li><li>pear</li><li>orange</li></fruit>
scala> arr
res69: Array[String] = Array(banana, apple, orange, pear)
// XML can contain Scala code, and Scala code can in turn contain XML literals.
// The fruit element contains Scala code, <fruit>{...}</fruit>; the literal <li>{i}</li>
// contains another Scala code block, {i}.
scala> val tmp= <fruit>{for(i<-arr) yield <li>{i}</li>}</fruit>
tmp: scala.xml.Elem = <fruit><li>banana</li><li>apple</li><li>orange</li><li>pear</li></fruit>
scala> val tmp= <fruit>{for(i<-arr) yield <li>i</li>}</fruit>
tmp: scala.xml.Elem = <fruit><li>i</li><li>i</li><li>i</li><li>i</li></fruit>
scala> val tmp= <fruit>{arr}</fruit>
tmp: scala.xml.Elem = <fruit>banana apple orange pear</fruit>
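The Atom wrapping mentioned above can be checked directly. This is a small sketch with an illustrative object name (AtomDemo): an Int placed in braces comes back out through the data property of the first child.

```scala
import scala.xml._

object AtomDemo {
  // A non-node value in braces is wrapped in an Atom.
  val answer: Elem = <answer>{42}</answer>

  // Recover the original value through the Atom's data property.
  val data: Option[Any] = answer.child.headOption.collect {
    case a: Atom[_] => a.data
  }

  def main(args: Array[String]): Unit = {
    println(answer)
    println(data)
  }
}
```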
Expressions in attributes
scala> url
res80: String = www.baidu.com
// An embedded code block can also yield a node sequence. If the block yields null or None,
// the attribute is not set.
scala> val bb= <a href={url}> the <em>scala</em> language</a>
bb: scala.xml.Elem = <a href="www.baidu.com"> the <em>scala</em> language</a>
scala> val bb= <a href={url}>{for(i<- 0 to 2) yield <num>{i}</num>}</a>
bb: scala.xml.Elem = <a href="www.baidu.com"><num>0</num><num>1</num><num>2</num></a>
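The None case is not shown in the transcript, so here is a minimal sketch of it. The helper link and the object name OptionalAttrDemo are hypothetical. The href attribute is set only when a URL is supplied, and looking it up afterwards yields None otherwise.

```scala
import scala.xml._

object OptionalAttrDemo {
  // Hypothetical helper: emit href only when a URL is present.
  def link(url: Option[String]): Elem = {
    val href: Option[Seq[Node]] = url.map(u => Text(u))
    <a href={href}>home</a>
  }

  val withHref = link(Some("www.baidu.com"))
  val without  = link(None)

  def main(args: Array[String]): Unit = {
    println(withHref)
    println(without.attribute("href"))
  }
}
```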
Special node types (not yet explored)
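These notes leave the topic open. As one tentative illustration of a special node type, the sketch below contrasts an ordinary Text node with a PCData node, which serializes its content as a CDATA section instead of escaping it. The object name SpecialNodesDemo and the sample string are made up for this example.

```scala
import scala.xml._

object SpecialNodesDemo {
  val code = "if (a < b) run()"

  // Text escapes markup characters when serialized.
  val escaped: Elem = <js>{Text(code)}</js>

  // PCData keeps the characters verbatim inside a CDATA section.
  val cdata: Elem = <js>{new PCData(code)}</js>

  def main(args: Array[String]): Unit = {
    println(escaped)
    println(cdata)
  }
}
```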
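XPath-like expressions
The NodeSeq class provides operator methods similar to the / and // of XPath. In Scala they are written \ and \\ (since // starts a comment in Scala).
The \ operator locates the direct children of a node or node sequence.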
scala> tmp
res90: scala.xml.Elem = <fruit><li>banana</li><li>apple</li><li>orange</li><li>pear</li></fruit>
scala> tmp \ "li"
res91: scala.xml.NodeSeq = NodeSeq(<li>banana</li>, <li>apple</li>, <li>orange</li>, <li>pear</li>)
scala> for (i<- tmp \"li") println(i)
<li>banana</li>
<li>apple</li>
<li>orange</li>
<li>pear</li>
The wildcard "_" matches any element.
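A short sketch of the wildcard, using a made-up body element and an illustrative object name (WildcardDemo): the "_" step stands for any child element, so the li items are found whether they sit inside ul or ol.

```scala
import scala.xml._

object WildcardDemo {
  val body: Elem = <body><ul><li>apple</li></ul><ol><li>pear</li></ol><p>text</p></body>

  // "_" matches any child element, so this finds li inside ul and ol alike.
  val items: NodeSeq = body \ "_" \ "li"

  // Labels of all direct child elements.
  val childLabels = (body \ "_").map(_.label)

  def main(args: Array[String]): Unit = {
    items.foreach(println)
    println(childLabels.mkString(", "))
  }
}
```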
\\ locates descendants at any depth.
scala> val tmp2= <fruit nm="shuiguo"><li>banana</li><li><li>green apple</li><li>red apple</li></li><li>orange</li><li>pear</li></fruit>
tmp2: scala.xml.Elem = <fruit nm="shuiguo"><li>banana</li><li><li>green apple</li><li>red apple</li></li><li>orange</li><li>pear</li></fruit>
scala> tmp2 \ "li"
res108: scala.xml.NodeSeq = NodeSeq(<li>banana</li>, <li><li>green apple</li><li>red apple</li></li>, <li>orange</li>, <li>pear</li>)
scala> for(i<-tmp2 \"li")println(i)
<li>banana</li>
<li><li>green apple</li><li>red apple</li></li>
<li>orange</li>
<li>pear</li>
scala> tmp2 \\ "li"
res109: scala.xml.NodeSeq = NodeSeq(<li>banana</li>, <li><li>green apple</li><li>red apple</li></li>, <li>green apple</li>, <li>red apple</li>, <li>orange</li>, <li>pear</li>)
scala> for(i<-tmp2 \\"li")println(i)
<li>banana</li>
<li><li>green apple</li><li>red apple</li></li>
<li>green apple</li>
<li>red apple</li>
<li>orange</li>
<li>pear</li>
A string starting with @ locates attributes.
scala> val tmp1= <fruit nm="shuiguo"><li>banana</li><li>apple</li><li>orange</li><li>pear</li></fruit>
tmp1: scala.xml.Elem = <fruit nm="shuiguo"><li>banana</li><li>apple</li><li>orange</li><li>pear</li></fruit>
scala> tmp1 \ "@nm"
res105: scala.xml.NodeSeq = shuiguo
scala> tmp1 \\ "@nm"
res110: scala.xml.NodeSeq = NodeSeq(shuiguo)
scala> (tmp2 \ "li").text
res120: String = bananagreen applered appleorangepear
Pattern matching
You can use an XML pattern to match an element with a single child:
scala> def xmlmatch(node:Node){node match {case <li>{_}</li> =>println(node.text);case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> tmp
res38: scala.xml.Elem = <fruit><li>banana</li><li>apple</li><li>orange</li><li>pear</li></fruit>
scala> xmlmatch(tmp)
not match
scala> items
res40: scala.xml.NodeBuffer = ArrayBuffer(<li>apple</li>, <li>banana</li>, <li>orange</li>)
scala> xmlmatch(items)
<console>:21: error: type mismatch;
found : scala.xml.NodeBuffer
required: scala.xml.Node
xmlmatch(items)
^
scala> xmlmatch(items(0))
apple
scala> def xmlmatch(node:Node){node match {case <li>{_}</li> => println(node);case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(items(0))
<li>apple</li>
If the li element has more than one child:
scala> aa
res49: scala.xml.Elem = <li><a>green apple</a><b>red apple</b></li>
scala> def xmlmatch(node:Node){node match {case <li>{_}</li> => println(node) ;case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(aa)
not match
scala> def xmlmatch(node:Node){node match {case <li>{_*}</li> => println(node) ;case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(aa)
<li><a>green apple</a><b>red apple</b></li>
Inside an XML pattern, { } introduces a code pattern, not code to be evaluated.
Instead of the wildcard, you can also use a variable name.
scala> items
res64: scala.xml.NodeBuffer = ArrayBuffer(<li>apple</li>, <li>banana</li>, <li>orange</li>)
scala> def xmlmatch(node:Node){node match {case <li>{child}</li> => println(child) ;case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(items(0))
apple
scala> def xmlmatch(node:Node){node match {case <li>{child}</li> => println(node) ;case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(items(0))
<li>apple</li>
To match a text node:
scala> def xmlmatch(node:Node){node match {case <li>{Text(child)}</li> => println(child) ;case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(items(0))
apple
scala> def xmlmatch(node:Node){node match {case <li>{Text(child)}</li> => println(node) ;case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(items(0))
<li>apple</li>
scala> def xmlmatch(node:Node){node match {case <li>{Text(node)}</li> => println(node) ;case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(items(0))
apple
scala> def xmlmatch(node:Node){node match {case <li>{Text(_)}</li> => println(node) ;case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(items(0))
<li>apple</li>
Binding the child nodes to a variable:
scala> def xmlmatch(node:Node){node match { case <li>{sub @ _*}</li> =>println(sub);case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> aa
res75: scala.xml.Elem = <li><a>green apple</a><b>red apple</b></li>
scala> xmlmatch(aa)
ArrayBuffer(<a>green apple</a>, <b>red apple</b>)
scala> def xmlmatch(node:Node){node match { case <li>{sub @ _*}</li> =>for(i<-sub) println(i);case _=>println("not match")}}
xmlmatch: (node: scala.xml.Node)Unit
scala> xmlmatch(aa)
<a>green apple</a>
<b>red apple</b>
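An XML pattern in a case clause can match only a single node.
XML patterns cannot contain attributes; to match on an attribute, use a guard.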
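The guard technique could look roughly like this. The function describe, the object name GuardDemo, and the "starts with http" rule are all invented for the illustration. The pattern itself ignores attributes, and the guard inspects href.

```scala
import scala.xml._

object GuardDemo {
  // Hypothetical helper: classify links, using a guard to test the href attribute.
  def describe(node: Node): String = node match {
    case link @ <a>{_*}</a> if (link \ "@href").text.startsWith("http") =>
      "external: " + link.text
    case <a>{_*}</a> => "other link"
    case _           => "not a link"
  }

  def main(args: Array[String]): Unit = {
    println(describe(<a href="http://example.org">Example</a>))
    println(describe(<a href="/local">Local</a>))
    println(describe(<b>bold</b>))
  }
}
```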
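Modifying elements and attributes
In Scala, XML nodes and node sequences are immutable. To edit a node, you must create a copy of it, making the needed changes and copying what has not changed.
The old and new elements (aa and bb below) share their children.
To copy an Elem node, use the copy method. Its named parameters include label, attributes, and child, plus prefix and scope, which are used for namespaces.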
scala> aa
res83: scala.xml.Elem = <li><a>green apple</a><b>red apple</b></li>
scala> val bb=aa.copy(label="app")
bb: scala.xml.Elem = <app><a>green apple</a><b>red apple</b></app>
Adding a child:
scala> val cc=aa.copy(child=aa.child ++ <c>yellow apple</c>)
cc: scala.xml.Elem = <li><a>green apple</a><b>red apple</b><c>yellow apple</c></li>
To add or modify an attribute, use the % operator:
scala> elm
res94: scala.xml.Elem = <a href="http://salca-lang.org"> the <em>scala</em> language</a>
// In Attribute(null,"href","baidu.com",Null) the first argument is the namespace prefix and the last is the list of further metadata
scala> val elm100=elm % Attribute(null,"href","baidu.com",Null) // modify
elm100: scala.xml.Elem = <a href="baidu.com"> the <em>scala</em> language</a>
scala> val elm100=elm % Attribute(null,"href1","baidu.com",Null) // add
elm100: scala.xml.Elem = <a href1="baidu.com" href="http://salca-lang.org"> the <em>scala</em> language</a>
// Modifying and adding can be chained in a single operation
scala> val elm100=elm % Attribute(null,"href1","baidu.com",Attribute(null,"href","sohu.com",Null))
elm100: scala.xml.Elem = <a href="sohu.com" href1="baidu.com"> the <em>scala</em> language</a>
scala> val elm100=elm % Attribute(null,"href1","baidu.com",Attribute(null,"href","sohu.com",Attribute(null,"href2","sohu.com",Null)))
elm100: scala.xml.Elem = <a href2="sohu.com" href="sohu.com" href1="baidu.com"> the <em>scala</em> language</a>
XML transformations
(Not yet verified to work.)
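One possible way to approach this, offered as a sketch and not as what was originally attempted, is to use RewriteRule and RuleTransformer from scala.xml.transform. The rule below renames every ul element to ol. The names TransformDemo, ulToOl, and root are made up for the example.

```scala
import scala.xml._
import scala.xml.transform.{RewriteRule, RuleTransformer}

object TransformDemo {
  // Rule that renames every ul element to ol and leaves other nodes alone.
  val ulToOl: RewriteRule = new RewriteRule {
    override def transform(n: Node): Seq[Node] = n match {
      case e: Elem if e.label == "ul" => e.copy(label = "ol")
      case other                      => other
    }
  }

  val root: Elem = <div><ul><li>apple</li></ul><p>text</p></div>

  // RuleTransformer applies the rule throughout the tree below root.
  val result: Seq[Node] = new RuleTransformer(ulToOl).transform(root)

  def main(args: Array[String]): Unit =
    result.foreach(println)
}
```

The original root is left untouched; the transformer returns a new tree.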
Loading and saving
import scala.xml._
scala> val xml=XML.loadFile("/root/tmpdata/xml.txt")
xml: scala.xml.Elem =
<breakfast_menu>
<food><name>Belgian Waffles</name>
............................
scala> println(xml)
<breakfast_menu>
<food><name>Belgian Waffles</name>
<price>$5.95</price>
<description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>$7.95</price>
<description>Light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>Berry-Berry Belgian Waffles</name>
<price>$8.95</price>
<description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>French Toast</name>
<price>$4.50</price>
<description>Thick slices made from our homemade sourdough bread</description>
<calories>600</calories>
</food>
<food>
<name>Homestyle Breakfast</name>
<price>$6.95</price>
<description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
<calories>950</calories>
</food>
</breakfast_menu>
scala> xml \\ "name"
res13: scala.xml.NodeSeq = NodeSeq(<name>Belgian Waffles</name>, <name>Strawberry Belgian Waffles</name>, <name>Berry-Berry Belgian Waffles</name>, <name>French Toast</name>, <name>Homestyle Breakfast</name>)
scala> for(x<- xml \\ "name") println(x.text)
Belgian Waffles
Strawberry Belgian Waffles
Berry-Berry Belgian Waffles
French Toast
Homestyle Breakfast
scala> (xml \ "food").size
res18: Int = 5
scala> for(i<-xml.child if i.child.size>0) println ((i \ "name").text->(i \ "price").text)
(Belgian Waffles,$5.95)
(Strawberry Belgian Waffles,$7.95)
(Berry-Berry Belgian Waffles,$8.95)
(French Toast,$4.50)
(Homestyle Breakfast,$6.95)
scala> val yy=for(i<-xml.child if i.child.size>0) yield ((i \ "name").text->(i \ "price").text)
yy: Seq[(String, String)] = List((Belgian Waffles,$5.95), (Strawberry Belgian Waffles,$7.95), (Berry-Berry Belgian Waffles,$8.95), (French Toast,$4.50), (Homestyle Breakfast,$6.95))
scala> yy.toMap
res72: scala.collection.immutable.Map[String,String] = Map(Strawberry Belgian Waffles -> $7.95, Belgian Waffles -> $5.95, French Toast -> $4.50, Berry-Berry Belgian Waffles -> $8.95, Homestyle Breakfast -> $6.95)
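The transcript covers loading only. The sketch below also saves, and it turns the price strings into numbers. It uses XML.loadString with a two-item stand-in for the menu so that it does not depend on the file above. The object name LoadSaveDemo, the helper roundTrip, and the temporary file are assumptions made for the example.

```scala
import scala.xml._
import java.io.File

object LoadSaveDemo {
  // A small stand-in for the menu file shown above.
  val menu: Elem = XML.loadString(
    "<breakfast_menu><food><name>French Toast</name><price>$4.50</price></food>" +
    "<food><name>Belgian Waffles</name><price>$5.95</price></food></breakfast_menu>")

  // Strip the leading "$" and parse each price as a number.
  val prices: Map[String, BigDecimal] =
    (menu \ "food").map { f =>
      (f \ "name").text -> BigDecimal((f \ "price").text.stripPrefix("$"))
    }.toMap

  // Save to a temporary file (with an XML declaration) and load it back.
  def roundTrip(): Elem = {
    val tmp = File.createTempFile("menu", ".xml")
    tmp.deleteOnExit()
    XML.save(tmp.getPath, menu, "UTF-8", true)
    XML.loadFile(tmp)
  }

  def main(args: Array[String]): Unit = {
    println(prices)
    println(roundTrip() \\ "name")
  }
}
```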