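Robot Framework -- Learning the Built-in XML Library (Part 1)
To learn the built-in XML library, I think you need to cover the following points:
First: what is a built-in library? Which built-in libraries exist, and roughly what keywords do they provide? Do they differ between versions, and do they depend on the RF version? Why do some built-in libraries need to be imported while others do not?
Second: which Python standard library module does the built-in XML library use, and what basics should you know about it?
Third: how is the built-in library put together, and can you use its basic keywords flexibly?
Fourth: sometimes you may need to modify a built-in library slightly, for example to add keywords. How do you do that?
1. The concept of built-in libraries
The official site calls the built-in libraries standard libraries. Others, such as SeleniumLibrary and AndroidLibrary, are called external libraries there, i.e. third-party libraries.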

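(1) Standard libraries are bundled with Robot Framework itself. You can find them under \lib\site-packages\robot\libraries in the Python installation directory, so there is nothing extra to download.
(2) External libraries have to be downloaded, installed, and imported before use, depending on what you or your company need.
Standard libraries again fall into two groups. BuiltIn is loaded automatically by Robot Framework, so after installation you can press F5 and use its keywords directly without importing anything, whereas the XML library has to be imported explicitly before it works. BuiltIn provides many frequently used keywords, such as Should Be Equal and Convert To Integer, which is why RF loads it automatically.
Different RF versions support different built-in libraries, and the keywords inside the same library can differ between versions. Take RF 3.0 as an example (run robot --version to check your RF version). 3.0 was the latest RF release at the time of writing and ships many built-in libraries. Looking at the .py files under \Lib\site-packages\robot\libraries in the Python installation directory, you can see: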

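Nearly all of the standard libraries listed on the official site have a corresponding .py file here: BuiltIn, Collections, DateTime, Dialogs, Process, OperatingSystem, Remote, Screenshot, String, Telnet, and XML. That is 11 libraries, or 10 if Remote is left out because it has no keywords of its own. Some of them have existed since RF 2.0. The most recent ones, DateTime, Process, and XML, were only bundled as of RF 2.8 or later, so with an RF version older than 2.8 you cannot simply import XML and use it; you have to download and install it first. Keep in mind that the same standard library can also differ slightly between RF versions, so check the library documentation for the exact version you use.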

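2. Overview of the 11 standard libraries

The table comes from the official site, whose user guide already describes the libraries in great detail. Refer to the official documentation while studying.
3. Learning the built-in XML library
The source of the built-in XML library shows that RF parses XML with ElementTree (ETree). Part of the source code is shown below: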
    import copy
    import re
    import os

    try:
        from lxml import etree as lxml_etree
    except ImportError:
        lxml_etree = None

    from robot.api import logger
    from robot.libraries.BuiltIn import BuiltIn
    from robot.utils import (asserts, ET, ETSource, is_string, is_truthy,
                             plural_or_not as s)
    from robot.version import get_version


    should_be_equal = asserts.assert_equal
    should_match = BuiltIn().should_match


    class XML(object):
        ROBOT_LIBRARY_SCOPE = 'GLOBAL'
        ROBOT_LIBRARY_VERSION = get_version()
        _xml_declaration = re.compile('^<\?xml .*\?>')

        def __init__(self, use_lxml=False):
            use_lxml = is_truthy(use_lxml)
            if use_lxml and lxml_etree:
                self.etree = lxml_etree
                self.modern_etree = True
                self.lxml_etree = True
            else:
                self.etree = ET
                self.modern_etree = ET.VERSION >= '1.3'
                self.lxml_etree = False
            if use_lxml and not lxml_etree:
                logger.warn('XML library reverted to use standard ElementTree '
                            'because lxml module is not installed.')

        def parse_xml(self, source, keep_clark_notation=False):
            with ETSource(source) as source:
                tree = self.etree.parse(source)
            if self.lxml_etree:
                strip = (lxml_etree.Comment, lxml_etree.ProcessingInstruction)
                lxml_etree.strip_elements(tree, *strip, **dict(with_tail=False))
            root = tree.getroot()
            if not is_truthy(keep_clark_notation):
                NameSpaceStripper().strip(root)
            return root
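The optional-lxml pattern in this excerpt can be tried outside Robot Framework. The following is only a minimal sketch of the same idea using the standard library; pick_etree is a made-up helper name, not part of RF, and the small XML string is just sample data.

```python
import xml.etree.ElementTree as ET

try:
    from lxml import etree as lxml_etree  # optional third-party dependency
except ImportError:
    lxml_etree = None


def pick_etree(use_lxml=False):
    """Hypothetical helper: return (module, using_lxml).

    Falls back to the standard ElementTree when lxml is not installed,
    mirroring the fallback in the excerpt above.
    """
    if use_lxml and lxml_etree is not None:
        return lxml_etree, True
    return ET, False


# Ask for lxml; the code still works when only the standard library exists.
etree, using_lxml = pick_etree(use_lxml=True)
root = etree.fromstring("<example><first id='1'>text</first></example>")
print(using_lxml, root.tag, root.find("first").text)
```

Both modules expose fromstring, find and the tag/text attributes, which is why the calling code does not need to know which one was picked.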
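Python ships several standard library modules that can parse XML. I used DOM before, but since RF is built on ETree I started working through the ETree documentation. To work with XML files you also need a basic understanding of XML itself: what it is used for, its tree structure, node types (DOM), and XML with namespaces. Some of the key points are summarized below.
XML is an extensible markup language. Tags must come in pairs, although an empty element can be abbreviated as <b/>. A typical XML document looks like this: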
    <example>
      <first id="1">text</first>
      <second id="2">
        <child/>
      </second>
      <third>
        <child>more text</child>
        <second id="child"/>
        <child><grandchild/></child>
      </third>
    </example>
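A. The whole XML document is a document node with a single root. In the document above, <example> is the root node. An XML file can have only one root node; otherwise parsing fails with an error.
B. Every XML tag is an element node. <first>, <second>, and <third> are all element nodes, and all of them are children of <example>.
C. Attribute value: the value of an attribute on an element. first has an attribute id with the value 1, second also has an id attribute with the value 2, and third has no attributes.
D. Text value: the text content of an element. The text of first is "text" (1 is its id, not its text); second and third have no text content of their own.
An XML document can contain other things as well, such as processing instructions and comments. Python's standard etree module simply drops both of these while parsing. If you are interested, work through the commonly used methods in the official documentation. The two classes you will use most are Element and ElementTree (see the Element Objects and ElementTree Objects sections).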
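Points A to D can be checked with a few lines of ElementTree. This is a small sketch on the sample document, with a comment added to the XML purely to show that the default parser drops it.

```python
import xml.etree.ElementTree as ET

XML = """
<example>
  <!-- comments are dropped by the default parser -->
  <first id="1">text</first>
  <second id="2">
    <child/>
  </second>
  <third>
    <child>more text</child>
    <second id="child"/>
    <child><grandchild/></child>
  </third>
</example>
"""

root = ET.fromstring(XML)
first = root.find("first")
third = root.find("third")

print(root.tag)                       # the single root element
print([child.tag for child in root])  # element nodes directly under the root
print(first.attrib, first.text)       # attribute dict and text content
print(third.attrib)                   # empty dict: third has no attributes
```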
4. Learning from the official XML library documentation
XML

Library version: 3.0.2
Library scope: global
Named arguments: supported

Introduction

Robot Framework test library for verifying and modifying XML documents.
As the name implies, XML is a test library for verifying contents of XML files. In practice it is a pretty thin wrapper on top of Python's ElementTree XML API.
The library has the following main usages:
Parsing an XML file, or a string containing XML, into an XML element structure and finding certain elements from it for further analysis (e.g. Parse XML and Get Element keywords).
Getting text or attributes of elements (e.g. Get Element Text and Get Element Attribute).
Directly verifying text, attributes, or whole elements (e.g. Element Text Should Be and Elements Should Be Equal).
Modifying XML and saving it (e.g. Set Element Text, Add Element and Save XML).

Table of contents
Parsing XML
Using lxml
Example
Finding elements with xpath
Element attributes
Handling XML namespaces
Boolean arguments
Shortcuts
Keywords
Parsing XML
XML can be parsed into an element structure using the Parse XML keyword. It accepts both paths to XML files and strings that contain XML. The keyword returns the root element of the structure, which then contains other elements as its children and their children. Possible comments and processing instructions in the source XML are removed.
XML is not validated during parsing even if it has a schema defined. How possible doctype elements are handled otherwise depends on the used XML module and on the platform. The standard ElementTree strips doctypes altogether, but when using lxml they are preserved when XML is saved. With IronPython, parsing XML with a doctype is not supported at all.
The element structure returned by Parse XML, as well as elements returned by keywords such as Get Element, can be used as the source argument with other keywords. In addition to an already parsed XML structure, other keywords also accept paths to XML files and strings containing XML, similarly to Parse XML. Notice that keywords that modify XML do not write those changes back to disk even if the source is given as a path to a file. Changes must always be saved explicitly using the Save XML keyword.
When the source is given as a path to a file, the forward slash character (/) can be used as the path separator regardless of the operating system. On Windows the backslash also works, but in the test data it needs to be escaped by doubling it (\\). Using the built-in variable ${/} naturally works too.
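The point that modifications stay in memory until they are saved explicitly also holds for plain ElementTree, which the library wraps. A minimal sketch, using a temporary file and made-up sample content rather than the RF keywords themselves:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.xml")
    with open(path, "w", encoding="utf-8") as f:
        f.write("<example><first id='1'>text</first></example>")

    tree = ET.parse(path)
    tree.getroot().find("first").text = "changed"  # in-memory change only

    # The file on disk is still untouched at this point.
    with open(path, encoding="utf-8") as f:
        on_disk_before = f.read()

    # Writing the tree is the explicit save step (compare Save XML).
    tree.write(path, encoding="utf-8")
    on_disk_after = ET.parse(path).getroot().find("first").text

print("changed" in on_disk_before, on_disk_after)
```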
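Using lxml
By default this library uses Python's standard ElementTree module for parsing XML, but it can be configured to use the lxml module instead when importing the library. The resulting element structure has the same API regardless of which module is used for parsing.
The main benefits of using lxml are that it supports richer xpath syntax than the standard ElementTree and enables using the Evaluate Xpath keyword. It also preserves the doctype and possible namespace prefixes when saving XML.
The lxml support is new in Robot Framework 2.8.5.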
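To get a feel for what "richer xpath syntax" means, the sketch below checks which expressions the standard ElementTree accepts. The helper stdlib_supports is made up for this example, and it assumes the CPython behavior I know of, where an unsupported predicate such as contains() is rejected with a SyntaxError. The lxml part only runs if lxml happens to be installed.

```python
import xml.etree.ElementTree as ET

DOC = "<example><third><child>more text</child><child/></third></example>"
root = ET.fromstring(DOC)


def stdlib_supports(xpath):
    """Hypothetical helper: True if standard ElementTree accepts the xpath."""
    try:
        root.findall(xpath)
    except SyntaxError:
        return False
    return True


simple = stdlib_supports("third/child")
with_function = stdlib_supports("third/child[contains(text(), 'more')]")
print(simple, with_function)

try:
    from lxml import etree as lxml_etree  # optional
except ImportError:
    lxml_etree = None

if lxml_etree is not None:
    # lxml implements xpath 1.0, so the function call is evaluated.
    matches = lxml_etree.fromstring(DOC).xpath(
        "third/child[contains(text(), 'more')]")
    print(len(matches))
```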
Example
The following simple example demonstrates parsing XML and verifying its contents both using keywords in this library and in BuiltIn and Collections libraries. How to use xpath expressions to find elements and what attributes the returned elements contain are discussed, with more examples, in Finding elements with xpath and Element attributes sections.
In this example, as well as in many other examples in this documentation, ${XML} refers to the following example XML document. In practice ${XML} could either be a path to an XML file or it could contain the XML itself.

    <example>
      <first id="1">text</first>
      <second id="2">
        <child/>
      </second>
      <third>
        <child>more text</child>
        <second id="child"/>
        <child><grandchild/></child>
      </third>
      <html>
        <p>
          Text with <b>bold</b> and <i>italics</i>.
        </p>
      </html>
    </example>

    ${root} =    Parse XML    ${XML}
    Should Be Equal    ${root.tag}    example
    ${first} =    Get Element    ${root}    first
    Should Be Equal    ${first.text}    text
    Dictionary Should Contain Key    ${first.attrib}    id
    Element Text Should Be    ${first}    text
    Element Attribute Should Be    ${first}    id    1
    Element Attribute Should Be    ${root}    id    1    xpath=first
    Element Attribute Should Be    ${XML}    id    1    xpath=first

Notice that in the example the last three lines are equivalent. Which one to use in practice depends on which other elements you need to get or verify. If you only need to do one verification, using the last line alone would suffice. If more verifications are needed, parsing the XML with Parse XML only once would be more efficient.
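The equivalence of the last three lines comes from keywords accepting either a parsed element or raw XML as the source. The sketch below imitates that idea with plain ElementTree; get_element is a hypothetical helper written for this example, not the library's implementation, and it only handles elements and XML strings (not file paths).

```python
import xml.etree.ElementTree as ET

XML = "<example><first id='1'>text</first></example>"


def get_element(source, xpath="."):
    """Hypothetical helper: accept a parsed element or an XML string."""
    root = source if isinstance(source, ET.Element) else ET.fromstring(source)
    return root.find(xpath)


root = get_element(XML)              # parse once...
first = get_element(root, "first")   # ...and reuse the parsed structure

checks = [
    first.get("id"),                       # element already in hand
    get_element(root, "first").get("id"),  # root element plus xpath
    get_element(XML, "first").get("id"),   # parses the string again
]
print(checks)
```

The third form is the most convenient for a single check, but it parses the XML on every call, which is why parsing once pays off when several verifications follow.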
Finding elements with xpath
ElementTree, and thus also this library, supports finding elements using xpath expressions. ElementTree does not, however, support the full xpath syntax, and what is supported depends on its version. ElementTree 1.3, which is distributed with Python 2.7, supports richer syntax than earlier versions.
The supported xpath syntax is explained below and ElementTree documentation provides more details. In the examples ${XML} refers to the same XML structure as in the earlier example.
If lxml support is enabled when importing the library, the whole xpath 1.0 standard is supported. That includes everything listed below but also a lot of other useful constructs.

Tag names
When just a single tag name is used, xpath matches all direct child elements that have that tag name.

    ${elem} =    Get Element    ${XML}    third
    Should Be Equal    ${elem.tag}    third
    @{children} =    Get Elements    ${elem}    child
    Length Should Be    ${children}    2

Paths
Paths are created by combining tag names with a forward slash (/). For example, parent/child matches all child elements under parent element. Notice that if there are multiple parent elements that all have child elements, parent/child xpath will match all these child elements.

    ${elem} =    Get Element    ${XML}    second/child
    Should Be Equal    ${elem.tag}    child
    ${elem} =    Get Element    ${XML}    third/child/grandchild
    Should Be Equal    ${elem.tag}    grandchild

Wildcards
An asterisk (*) can be used in paths instead of a tag name to denote any element.

    @{children} =    Get Elements    ${XML}    */child
    Length Should Be    ${children}    3

Current element
The current element is denoted with a dot (.). Normally the current element is implicit and does not need to be included in the xpath.

Parent element
The parent element of another element is denoted with two dots (..). Notice that it is not possible to refer to the parent of the current element. This syntax is supported only in ElementTree 1.3 (i.e. Python/Jython 2.7 and newer).

    ${elem} =    Get Element    ${XML}    */second/..
    Should Be Equal    ${elem.tag}    third

Search all sub elements
Two forward slashes (//) mean that all sub elements, not only the direct children, are searched. If the search is started from the current element, an explicit dot is required.

    @{elements} =    Get Elements    ${XML}    .//second
    Length Should Be    ${elements}    2
    ${b} =    Get Element    ${XML}    html//b
    Should Be Equal    ${b.text}    bold

Predicates
Predicates allow selecting elements using also other criteria than tag names, for example, attributes or position. They are specified after the normal tag name or path using syntax path[predicate]. The path can have wildcards and other special syntax explained above.
What predicates ElementTree supports is explained in the table below. Notice that predicates in general are supported only in ElementTree 1.3 (i.e. Python/Jython 2.7 and newer).

Predicate | Matches | Example
@attrib | Elements with attribute attrib. | second[@id]
@attrib="value" | Elements with attribute attrib having value value. | *[@id="2"]
position | Elements at the specified position. Position can be an integer (starting from 1), expression last(), or relative expression like last() - 1. | third/child[1]
tag | Elements with a child element named tag. | third/child[grandchild]

Predicates can also be stacked like path[predicate1][predicate2]. A limitation is that a possible position predicate must always be first.
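Since the library is a thin wrapper over ElementTree, the tag name, path, wildcard, parent and sub element examples can be reproduced with find and findall. This is a sketch against the sample document using plain Python instead of the RF keywords; the variable names are my own.

```python
import xml.etree.ElementTree as ET

XML = """<example>
  <first id="1">text</first>
  <second id="2">
    <child/>
  </second>
  <third>
    <child>more text</child>
    <second id="child"/>
    <child><grandchild/></child>
  </third>
  <html>
    <p>
      Text with <b>bold</b> and <i>italics</i>.
    </p>
  </html>
</example>"""

root = ET.fromstring(XML)

third_children = root.find("third").findall("child")  # tag name
grandchild = root.find("third/child/grandchild")      # path
any_child = root.findall("*/child")                   # wildcard
parent = root.find("*/second/..")                     # parent element
all_second = root.findall(".//second")                # all sub elements
bold = root.find("html//b")

print(len(third_children), grandchild.tag, len(any_child))
print(parent.tag, len(all_second), bold.text)
```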
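The predicate forms (attribute, attribute value, position, child tag, and stacking with the position first) can be tried with plain ElementTree as well. A short sketch on a trimmed copy of the sample document; the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<example>
  <first id="1">text</first>
  <second id="2"><child/></second>
  <third>
    <child>more text</child>
    <second id="child"/>
    <child><grandchild/></child>
  </third>
</example>""")

with_id = root.findall("second[@id]")                  # @attrib
id_two = root.findall('*[@id="2"]')                    # @attrib="value"
first_child = root.find("third/child[1]")              # position
last_child = root.find("third/child[last()]")          # last()
has_grandchild = root.findall("third/child[grandchild]")  # child tag
stacked = root.findall("third/child[2][grandchild]")   # position comes first

print(len(with_id), id_two[0].tag, first_child.text)
print(last_child.find("grandchild") is not None)
print(len(has_grandchild), len(stacked))
```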
