Today I ran into a pitfall in lxml.
Suppose we have a node:
<id>123</id>
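If two parent nodes both need this node, you have to create it twice! Reusing the same element object does not work: an lxml element can have only one parent, so appending it to a second parent moves it out of the first.
Failing example: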
#!/usr/bin/env python3
# encoding: utf8
from lxml import etree

if __name__ == "__main__":
    root1 = etree.Element("root1")  # root node 1
    root2 = etree.Element("root2")  # root node 2
    ver_node = etree.Element("id")  # child node
    ver_node.text = "123"
    root1.append(ver_node)  # the same child object is appended to both roots
    root2.append(ver_node)
    # tostring() returns bytes when an encoding is given, so decode before printing
    print(etree.tostring(root1, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
    print(etree.tostring(root2, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
Result:
<?xml version='1.0' encoding='UTF-8'?>
<root1/>

<?xml version='1.0' encoding='UTF-8'?>
<root2>
  <id>123</id>
</root2>
Only the second root has the child node; the first one ends up empty!
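To see what happened to the node, it helps to check its parent after each append. The following is a minimal sketch (the variable names parent_after_first and parent_after_second are mine, introduced only for illustration) that uses getparent() to show the element being moved rather than shared:

```python
from lxml import etree

root1 = etree.Element("root1")
root2 = etree.Element("root2")
node = etree.Element("id")
node.text = "123"

root1.append(node)
parent_after_first = node.getparent().tag   # the node now lives under root1

root2.append(node)                          # same object: it is moved, not copied
parent_after_second = node.getparent().tag  # the node now lives under root2

# root1 has lost its only child, root2 has gained it
print(parent_after_first, parent_after_second, len(root1), len(root2))
```

So no exception is raised; the first parent just silently loses the child.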
Correct way:
#!/usr/bin/env python3
# encoding: utf8
from lxml import etree
import copy

if __name__ == "__main__":
    root1 = etree.Element("root1")
    root2 = etree.Element("root2")
    ver_node1 = etree.Element("id")
    ver_node1.text = "123"
    ver_node2 = copy.deepcopy(ver_node1)  # deep copy!
    root1.append(ver_node1)
    root2.append(ver_node2)
    # tostring() returns bytes when an encoding is given, so decode before printing
    print(etree.tostring(root1, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
    print(etree.tostring(root2, pretty_print=True, xml_declaration=True, encoding='UTF-8').decode('utf-8'))
Result:
<?xml version='1.0' encoding='UTF-8'?>
<root1>
  <id>123</id>
</root1>

<?xml version='1.0' encoding='UTF-8'?>
<root2>
  <id>123</id>
</root2>
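Another option, if you would rather not copy anything, is to build a fresh element every time one is needed. This is only a sketch of that idea: make_id_node is a hypothetical helper of my own, not something lxml provides. It also checks that the two children really are independent objects.

```python
from lxml import etree


def make_id_node(text):
    """Hypothetical helper: build a brand-new <id> element on every call."""
    node = etree.Element("id")
    node.text = text
    return node


root1 = etree.Element("root1")
root2 = etree.Element("root2")
root1.append(make_id_node("123"))
root2.append(make_id_node("123"))

# Changing one child must not affect the other, since they are separate objects
root1[0].text = "456"

xml1 = etree.tostring(root1).decode("utf-8")
xml2 = etree.tostring(root2).decode("utf-8")
print(xml1)
print(xml2)
```

Either way the point is the same: each parent needs its own element object.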
