Overview
This post covers three tasks: parsing an XML file, appending new elements and writing them out, and updating the value of a node in an existing XML file. It uses Python's xml.dom.minidom package; see the official xml.dom.minidom documentation for details. All examples work on the following customer.xml:
<?xml version="1.0" encoding="utf-8" ?>
<!-- This is list of customers -->
<customers>
<customer ID="C001">
<name>Acme Inc.</name>
<phone>12345</phone>
<comments>
<![CDATA[Regular customer since 1995]]>
</comments>
</customer>
<customer ID="C002">
<name>Star Wars Inc.</name>
<phone>23456</phone>
<comments>
<![CDATA[A small but healthy company.]]>
</comments>
</customer>
</customers>
CDATA: a section of character data in XML that the parser does not interpret as markup.
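To make the CDATA remark concrete, here is a minimal sketch. The one-element document and its content are made up for illustration; only parseString and the node attributes come from xml.dom.minidom. It shows that characters such as < and & inside a CDATA section are kept as-is and arrive as a CDATA section node.

```python
from xml.dom.minidom import parseString

# Hypothetical one-element document. The CDATA body contains "<" and "&",
# which would have to be escaped in ordinary element text.
doc = parseString("<comments><![CDATA[price < 10 & rising]]></comments>")
cdata = doc.documentElement.firstChild

# minidom keeps the section as its own node type
is_cdata = cdata.nodeType == cdata.CDATA_SECTION_NODE
cdata_text = cdata.data

print(is_cdata, cdata_text)
```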
Note: this article uses two terms for "node" interchangeably. Treat them as the same concept wherever they appear; I don't think the difference matters much, and you can also put it down to my typing.
1. Parse XML file
When XML is parsed, all text is stored in text nodes, and a text node is a child of the element node that contains it. For example, for an element whose content is 2005, the element node holds a text node whose value is "2005"; "2005" is not the value of the element itself. The most commonly used method is getElementsByTagName(); from the elements it returns, you navigate further according to the document structure.
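The point about text nodes can be checked with a short sketch. The tag name year is an assumption used only for this illustration; the 2005 value is the one from the text above.

```python
from xml.dom.minidom import parseString

# "year" is a hypothetical tag name chosen for this illustration.
doc = parseString("<year>2005</year>")
element = doc.documentElement

element_value = element.nodeValue   # element nodes have no value of their own
text_node = element.firstChild      # the text lives in a child text node
is_text = text_node.nodeType == text_node.TEXT_NODE
text_value = text_node.data

print(element_value, is_text, text_value)
```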
Theory alone only goes so far. With the XML file above and the code below, the approach should be clear. The following code prints every node name and its content:
# -*- coding: utf-8 -*-
"""
@Author : LiuZhian
@Time : 2019/4/24 9:19 AM
@Comment :
"""
from xml.dom.minidom import parse


def readXML():
    domTree = parse("./customer.xml")
    # Document root element
    rootNode = domTree.documentElement
    print(rootNode.nodeName)
    # All customers
    customers = rootNode.getElementsByTagName("customer")
    print("**** All customer information ****")
    for customer in customers:
        if customer.hasAttribute("ID"):
            print("ID:", customer.getAttribute("ID"))
        # name element
        name = customer.getElementsByTagName("name")[0]
        print(name.nodeName, ":", name.childNodes[0].data)
        # phone element
        phone = customer.getElementsByTagName("phone")[0]
        print(phone.nodeName, ":", phone.childNodes[0].data)
        # comments element: its first child is a whitespace text node,
        # so pick out the CDATA section explicitly
        comments = customer.getElementsByTagName("comments")[0]
        cdata = [n.data for n in comments.childNodes
                 if n.nodeType == n.CDATA_SECTION_NODE]
        print(comments.nodeName, ":", "".join(cdata))


if __name__ == '__main__':
    readXML()
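One detail worth watching when reading the comments element: in customer.xml the CDATA section sits on its own line, so the first child of comments is a whitespace-only text node. A small helper can collect the text regardless of layout. This is a sketch only; get_text and the inline SAMPLE string are hypothetical names introduced here, not part of minidom.

```python
from xml.dom.minidom import parseString

# Inline copy of one customer from customer.xml, so the sketch runs on its own.
SAMPLE = """<customer ID="C001">
  <name>Acme Inc.</name>
  <comments>
    <![CDATA[Regular customer since 1995]]>
  </comments>
</customer>"""


def get_text(element):
    """Join the data of all text and CDATA children, then trim whitespace."""
    wanted = (element.TEXT_NODE, element.CDATA_SECTION_NODE)
    parts = [child.data for child in element.childNodes
             if child.nodeType in wanted]
    return "".join(parts).strip()


customer = parseString(SAMPLE).documentElement
comments = customer.getElementsByTagName("comments")[0]

# The first child is only the line break and indentation before the CDATA
first_child_is_blank = comments.childNodes[0].data.strip() == ""
comment_text = get_text(comments)
name_text = get_text(customer.getElementsByTagName("name")[0])

print(repr(comments.childNodes[0].data), "->", comment_text)
```

The same helper works for name and phone, where the only child is a plain text node.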
2. Write to XML file
When writing, I think there are two cases:
- Create a new XML file
- Add new nodes to an existing XML file
In both cases the method for creating element nodes is similar: create or get a DOM object, then create new nodes from it.
In the first case, you can create the DOM object with dom = minidom.Document(). In the second case, you get it directly by parsing the existing XML file, for example dom = parse("./customer.xml").
When creating element and text nodes, you will usually follow a four-step sequence: create the element node with createElement(), create the text node with createTextNode(), append the text node to the element node, and append the element node to its parent element.
Now I need to add a new customer node with the following information:
<customer ID="C003">
<name>kavin</name>
<phone>32467</phone>
<comments>
<![CDATA[A small but healthy company.]]>
</comments>
</customer>
The code is as follows:
from xml.dom.minidom import parse


def writeXML():
    domTree = parse("./customer.xml")
    # Document root element
    rootNode = domTree.documentElement
    # Create a new customer node
    customer_node = domTree.createElement("customer")
    customer_node.setAttribute("ID", "C003")
    # Create the name node and set its text value
    name_node = domTree.createElement("name")
    name_text_value = domTree.createTextNode("kavin")
    name_node.appendChild(name_text_value)  # attach the text node to name_node
    customer_node.appendChild(name_node)
    # Create the phone node and set its text value
    phone_node = domTree.createElement("phone")
    phone_text_value = domTree.createTextNode("32467")
    phone_node.appendChild(phone_text_value)  # attach the text node to phone_node
    customer_node.appendChild(phone_node)
    # Create the comments node; its content is CDATA
    comments_node = domTree.createElement("comments")
    cdata_text_value = domTree.createCDATASection("A small but healthy company.")
    comments_node.appendChild(cdata_text_value)
    customer_node.appendChild(comments_node)
    rootNode.appendChild(customer_node)
    # Open the file as UTF-8 so it matches the declared encoding
    with open('added_customer.xml', 'w', encoding='utf-8') as f:
        # addindent sets the indentation; encoding goes into the XML declaration
        domTree.writexml(f, addindent=' ', encoding='utf-8')


if __name__ == '__main__':
    writeXML()
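The code above covers the case of adding to an existing file. For the other case, a brand-new document, the same four steps apply once a DOM object has been created with Document(). The following is a minimal sketch rather than a full rewrite of the example; build_customers is a hypothetical helper name, and only the name child is created to keep it short.

```python
from xml.dom.minidom import Document


def build_customers():
    """Build a small customers document from scratch (illustrative helper)."""
    dom = Document()
    # A new document has no root yet, so create one and attach it
    root = dom.createElement("customers")
    dom.appendChild(root)

    customer = dom.createElement("customer")
    customer.setAttribute("ID", "C003")
    root.appendChild(customer)

    # The four steps: element, text node, text -> element, element -> parent
    name = dom.createElement("name")
    name_text = dom.createTextNode("kavin")
    name.appendChild(name_text)
    customer.appendChild(name)
    return dom


dom = build_customers()
xml_text = dom.toprettyxml(indent="  ")
print(xml_text)
```

To save the result, the string can be written to a file opened with encoding="utf-8", or the document can be passed to writexml() as in the code above.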
3. Update XML file
When updating XML, we first find the relevant element node, update the value of the text node or attribute under it, and then save the result to a file. I won't go into more detail; the comments in the code make the idea clear:
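from xml.dom.minidom import parse


def updateXML():
    domTree = parse("./customer.xml")
    # Document root element
    rootNode = domTree.documentElement
    names = rootNode.getElementsByTagName("name")
    for name in names:
        if name.childNodes[0].data == "Acme Inc.":
            # Get the parent node of the name node
            pn = name.parentNode
            # The parent's phone node, which is in fact a sibling of name
            # There may be a sibling-node method; I haven't tried it, you can google it
            phone = pn.getElementsByTagName("phone")[0]
            # Update the value of phone (text node data should be a string)
            phone.childNodes[0].data = "99999"
    # Open the file as UTF-8 so it matches the declared encoding
    with open('updated_customer.xml', 'w', encoding='utf-8') as f:
        # addindent sets the indentation; encoding goes into the XML declaration
        domTree.writexml(f, addindent=' ', encoding='utf-8')


if __name__ == '__main__':
    updateXML()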
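Two things mentioned above but not shown are reaching phone through a sibling link and updating an attribute instead of a text node. The sketch below illustrates both on an inline copy of one customer. next_element_sibling is a hypothetical helper written for this example (minidom nodes expose nextSibling, which may return whitespace text nodes), and the new ID value C100 is made up.

```python
from xml.dom.minidom import parseString

# Inline sample so the sketch runs without customer.xml
SAMPLE = """<customers>
  <customer ID="C001">
    <name>Acme Inc.</name>
    <phone>12345</phone>
  </customer>
</customers>"""


def next_element_sibling(node):
    """Return the next sibling that is an element, skipping text nodes."""
    sibling = node.nextSibling
    while sibling is not None and sibling.nodeType != sibling.ELEMENT_NODE:
        sibling = sibling.nextSibling
    return sibling


dom = parseString(SAMPLE)
name = dom.getElementsByTagName("name")[0]

# Reach phone as the sibling of name and update its text node
phone = next_element_sibling(name)
phone.firstChild.data = "99999"

# Update an attribute on the parent customer element
customer = name.parentNode
customer.setAttribute("ID", "C100")  # hypothetical new ID

updated = dom.documentElement.toxml()
print(updated)
```

The sibling walk relies on phone directly following name in the document; getElementsByTagName on the parent, as in the code above, does not depend on element order.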
If anything here is wrong, please let me know.