Using ElementTree in Python to manipulate XML: getting nodes, reading attributes, and beautifying XML

  • 2020-04-02 13:15:28
  • OfStack

1. Import the library
You need the ElementTree and Element classes, plus the SubElement factory function, which creates a child element and attaches it to a parent:
from xml.etree.ElementTree import ElementTree
from xml.etree.ElementTree import Element
from xml.etree.ElementTree import SubElement as SE

2. Read in and parse
tree = ElementTree(file=xmlfile)
root = tree.getroot()
After reading, tree is an ElementTree object; call its getroot() method to get the XML root element. Here xmlfile is the name of the XML file (an open file object also works).

XML sample file:


<item sid='1712' name=' big CC'>
<a id='1'></a>
<a id='2'></a>
</item>
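
As a rough illustration of step 2 applied to this sample, the sketch below parses the document from an in-memory buffer instead of a file on disk. The SAMPLE_XML constant and the use of io.BytesIO are assumptions made for the example; with a real file you would pass its name as xmlfile.

```python
import io
from xml.etree.ElementTree import ElementTree

# The sample document from above, held in memory for this sketch.
SAMPLE_XML = b"""<item sid='1712' name=' big CC'>
<a id='1'></a>
<a id='2'></a>
</item>"""

# ElementTree(file=...) accepts a file name or a file-like object.
tree = ElementTree(file=io.BytesIO(SAMPLE_XML))
root = tree.getroot()

print(root.tag)          # item
print(root.get('sid'))   # 1712
print(len(root))         # 2 (the two <a> children)
```

Here the root element is the item element itself, so root and item refer to the same node in the snippets that follow.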

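3. Get child nodes
Find all direct children of an Element that have a given tag:


AArry = item.findall('a')
You can also iterate over the element itself to visit every direct child. Older code called item.getchildren() for this, but that method was deprecated and then removed in Python 3.9; list(item) is the replacement:
childs = list(item)
for subItem in childs:
    print(subItem.get('id'))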
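
The difference between the two approaches can be sketched as follows. The extra b element is hypothetical and was added only so that the two results differ.

```python
from xml.etree.ElementTree import fromstring

# Sample item with one extra, hypothetical <b> child for contrast.
item = fromstring(
    "<item sid='1712' name=' big CC'>"
    "<a id='1'/><a id='2'/><b id='3'/>"
    "</item>"
)

# findall('a') returns only the direct children whose tag is 'a'.
a_ids = [a.get('id') for a in item.findall('a')]

# Iterating over the element yields every direct child, whatever its tag.
all_ids = [child.get('id') for child in item]

print(a_ids)    # ['1', '2']
print(all_ids)  # ['1', '2', '3']
```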

4. Insert a child node
Method 1:


item = Element("item", {'sid': '1713', 'name': 'ityouhui'})
root.append(item)

Method 2:

SE(root,'item',{'sid':'1713','name':'ityouhui'})

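With method 1, the item variable is still available after insertion, so you can keep modifying the new element. Method 2 is shorter to write; SE is the alias for SubElement imported in section 1. SubElement also returns the new element, so you can assign its result if you need to keep working with it.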
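
A minimal sketch of both methods side by side, assuming a fresh root element named root (hypothetical, created here only so the example is self-contained) and a made-up checked attribute and sid value of 1714:

```python
from xml.etree.ElementTree import Element, SubElement, tostring

root = Element('root')  # hypothetical parent element for this sketch

# Method 1: build the element first, then append it to the parent.
item = Element('item', {'sid': '1713', 'name': 'ityouhui'})
root.append(item)
item.set('checked', 'yes')  # the reference stays usable after insertion

# Method 2: create and attach in one call; the new child is returned.
other = SubElement(root, 'item', {'sid': '1714', 'name': 'ityouhui'})

print(len(root))                            # 2
print(tostring(root, encoding='unicode'))   # serialized tree on one line
```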

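5. Work with attributes
Get one attribute value of an Element (e.g. the name attribute of item):


print(item.get('name'))
Note that root.find('item/name').text does something different: it returns the text of a <name> child element under item, not the name attribute, and it fails if no such child exists.

Get all the attributes of an Element:

print(item.items())      # [('sid', '1712'), ('name', ' big CC')]
print(item.attrib)       # {'sid': '1712', 'name': ' big CC'}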
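
To make the distinction between an attribute and a child element concrete, here is a small sketch. The wrapping root element, the name child element, and the color attribute are hypothetical and exist only for this illustration.

```python
from xml.etree.ElementTree import fromstring

# Hypothetical document: item has a 'name' attribute AND a <name> child.
root = fromstring(
    "<root><item sid='1712' name=' big CC'>"
    "<name>child text</name>"
    "</item></root>"
)
item = root.find('item')

attr_name = item.get('name')               # attribute value
missing = item.get('color', 'none')        # default for an absent attribute
child_text = root.find('item/name').text   # text of the <name> child element

item.set('sid', '1800')                    # add or overwrite an attribute

print(repr(attr_name))   # ' big CC'
print(missing)           # none
print(child_text)        # child text
print(item.get('sid'))   # 1800
```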

6. Beautify the XML
Before writing, pass root to the indent function defined below so that the XML file is written with consistent line breaks and indentation:


indent(root)
tree.write(xmlfile, encoding='utf-8')


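# Pretty-print helper: rewrites whitespace-only text and tail in place
def indent(elem, level=0):
    i = "\n" + level * "  "            # newline plus indentation for this level
    if len(elem):                      # elem has child elements
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "       # put the first child on its own line
        for e in elem:
            indent(e, level + 1)
        if not e.tail or not e.tail.strip():
            e.tail = i                 # dedent before elem's closing tag
    if level and (not elem.tail or not elem.tail.strip()):
        elem.tail = i                  # start the next sibling on a new line
    return elem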
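
On Python 3.9 and later, the standard library provides a built-in xml.etree.ElementTree.indent() function that serves the same purpose as the helper above. The sketch below assumes Python 3.9+ and serializes to a string rather than a file, just to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<item sid='1712'><a id='1'/><a id='2'/></item>")

# Requires Python 3.9+: inserts newlines and two-space indentation in place.
ET.indent(root, space="  ")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Like the hand-written helper, ET.indent() modifies the tree in place, so call it right before tree.write().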

