Reading and Writing XML Files in Python

Introduction:

In this tutorial, we learn about reading and writing the XML Files in Python. XML or Extensible Markup Language is a special language designed to be easily interpreted by humans and computers. A language describes a process for encoding data in a specific format. XML files are widely used in data communication between many computers. Python comes with various built-in libraries for processing and manipulating XML files. This tutorial explains how to read and write XML files in Python.

Generally, the process of reading data from an XML file and analyzing its content is called parsing. So, when we mean reading an xml file, we mean parsing an XML file. In this tutorial, we will look at two libraries that can be used for xml parsing, which are given as follows:

  1. Using the BeautifulSoup alongside with the lxml parser
  2. Using the Elementtree library

Now we learn about these two libraries in below -

1. Using the BeautifulSoup alongside with the lxml parser:

We will use the BeautifulSoup Python library to read and write xml files. Type the following command in the terminal to install the library.

The Beautiful Soup supports the HTML parser included in the Python standard library, but it also supports many third-party Python parsers. One of them is the lxml parser (for parsing XML or HTML files). The lxml can be installed by running the following command in your system:

First, we will learn how to read XML files. We will also analyze the data stored in it. Next, we will learn how to create an XML file and write information to it.

Reading XML files in Python:

Parsing the xml file requires two steps, which are given below -

  1. Finding Tags
  2. Extracting from tags

Program Code:

Here, we give a program code for Parsing or reading the xml file in Python by using the BeautifulSoup alongside the lxml parser. The code is given below -

Output:

Now, we run the above code and find the output from it. The output is given below -

[<unique>

   Add a video URL in here

  </unique>, <unique>

   Add a workbook URL here

 </unique>]

<child name="Frank" test="0">

    FRANK likes EVERYONE

  </child>

0

Writing XML files in Python:

Writing an XML file is an important process because it is not encoded in a special way. To change various aspects of the XML file, parsing is required first. In the code below, we will replace some of the above XML files.

Program Code:

Here, we give a program code for Parsing or writing the xml file in Python Python by using the BeautifulSoup alongside the lxml parser. The code is given below -

Output:

Now, we run the above code and find the output from it. The output is given below -

[<unique>

   Add a video URL in here

  </unique>, <unique>

   Add a workbook URL here

 </unique>]

<child name="Frank" test="0">

    FRANK likes EVERYONE

  </child>

0

Writing XML files in Python:

Writing an XML file is an important process because it is not encoded in a special way. To change various aspects of the XML file, parsing is required first. In the code below, we will replace some of the above XML files.

Program Code:

Here, we give a program code for Parsing or writing the xml file in Python Python by using the BeautifulSoup alongside the lxml parser. The code is given below -

Output:

Now we run the above code and find the output from it. The output is given in below -

<?xml version="1.0" encoding="utf-8"?>
<saranghe>
   <child name="Frank" test=" Hello"> 
     FRANK likes EVERYONE
   </child>
   <unique>
     Add a video URL in here
   </unique>
   <child name="Texas" test="1">
     TEXAS is a PLACE
   </child>
   <child name="Frank" test=" Hello"> 
     Exclusively
   </child>
   <unique>
     Add a workbook URL here
   </unique>
   <data>
     Add the content of your article here
     <family>
       Add the font family of your text here
     </family>
     <size>
       Add the font size of your text here
     </size>
  </data>
</saranghe>

2. Using the Elementtree library:

The Elementtree module provides us with many tools for processing XML files. The best part is that it includes standard Python's built-in libraries. So, you do not need to install any external modules for this purpose. Since xmlformat is a hierarchical file, it is easier to represent it as a tree. The ElementTree provided by this model provides a way to represent the entire XML document as a tree.

In the examples below, we will look at discrete methods for reading data from an XML file and writing data to the XML file.

Reading XML files in Python:

To read XML files using ElementTree, we first import the ElementTree class from the xml library with the name E. Then pass the xml file name to the ElementTree.parse() method to enable parsing of the xml file. Then, use the getroot() function to get the r (parent tag) of the xml file. Then, view (print) the root tag (not clear) of our xml file. Then, use r[0].attrib to display the attributes of the parent or child tags. r[0] is used to get the initial tag of the parent root, and attrib is used to get its attributes. We will display the text in the 1st subtag of the 5th subtag of the root tag.

Program Code:

Here, we give a program code for Parsing or reading the xml file in Python by using the Elementtree library. The code is given below -

Output:

Now, we run the above code and find the output from it. The output is given below -

<Element 'saranghe' at 0x7fe97c2cee08>
{'name': 'Frank', 'test': '0'}
    Add the font family of your text here

Writing XML files in Python:

Now, let us look at some of the methods for writing information to XML files. In this example, we will create an XML file from scratch.

Similarly, we first create a root tag (parent) under the name of chess by using the E.Element('chess') command. All tags will be under this tag, so once the root tag is defined other sub-elements can be created under it. We then use E.SubElement() to create a subtag or subelement called chess tag. Then, we create two more subtags named E4 and D4 open tags. Then, we use the set() method to add new attributes to the tags E4 and D4. The set() is a method used to check the attributes of tags in SubElement().

We will add text between the E4 and D4 tags using the text property in the SubElement function. Finally, we use E.tostring() to convert the contents of the element we created from "xml.etree.ElementTree.Element" into a bytes object for use (although sometimes the function name is tostring()). It is written as "bytes" instead of "str". Finally, we export the file to a file called gameofsquares.xml, which opens in "wb" mode to allow binaries to be written. Finally, we save the data into our file.

Program Code:

Here, we give a program code for writing the xml file in Python by using the Elementtree library. The code is given below -

Output:

Now, we run the above code and find the output from it. The output is given below -

<chess><Opening><E4 type="Accepted">Hello Coders</E4><D4
type="Declined">Welcome to javatpoint</ D4></Opening></chess>