Pretty Printing XML in Python

Introduction:

In this tutorial we are learning about how to Pretty Printing XML in Python. When processing XML data in Python, making it readable and structured can improve the understandability and manageability of your code. Printing the XML nicely or formatting it with appropriate indentation and line breaks is an important process to achieve these goals. Basically, the XML stands for Extensible Markup Language. It is used to encode information that humans and machines can understand. Data stored in XML format is easy to understand and manipulate. There are three important things to remember when working with XML which are simplicity, versatility, and usability.

In this tutorial, we will look at two ways to print XML using Python: xml.dom.minidom and xml.etree.ElementTree. By understanding this process, developers can effectively present XML data in an organized and visual way, making it easier to analyze and manage.

How do you print XML in Python?

There are two methods to do pretty printing XML in Python, which are given below -

Method 1: Using the xml.dom.minidom

Here, we are given some steps to do pretty printing XML using the xml.dom.minidom in Python. The steps are as follows -

  1. Import all the required modules: Firstly, we need to import the 'xml.dom.minidom' module, which makes deep use of the Document Object Model (DOM) API for XML parsing.
  2. Defining the 'pretty_print_xml_minidom' function:The function accepts an XML string as input and is responsible for parsing and printing the pretty XML using the "xml.dom .minidom" module.
  3. Parsing the string in XML: In the 'pretty_print_xml_minidom' function, we use the 'xml.dom.minidom.parseString()' method to parse the string in XML and then create a DOM object.
  4. Print the pretty XML: Next, we use the "toprettyxml()" method of the DOM object to create a pretty XML string. We pass the "indent" parameter with the value `" "` to specify the desired indentation level (in this case, two spaces).
  5. Remove blank lines: By default, the 'toprettyxml()' function adds blank or empty lines to the output. To remove these blank lines, we nicely split the XML string with newlines (`\n`), remove the leading or white spaces of each line, and then the non-empty lines join back together.
  6. Pretty print the XML: Lastly, we print the result of the Pretty XML string in Python.

Program Code:

Here, we give the program code to print the pretty XML string using the xml.dom.minidom in Python. It is parsing the DOM API. The code is given below -

Output:

Now, we compile and run the above program and find the pretty printing XML string in Python. The pretty printing XML output is given below -

<?xml version="1.0" ?>
<jtp>
  <element attribute="value">
    <Name>Priyanka Adhikary</Name>
  </element>
</jtp>

Method 2: Using the xml.etree.ElementTree

Here, we have given some steps to do pretty printing XML using the xml.etree.ElementTree in Python. The steps are -

  1. The required modules imported: Firstly, we import the 'xml.etree.ElementTree' module, which provides a fast and efficient API for parsing and processing XML.
  2. Definition of the 'Indent' function: This is a special function that recursively adds indents to XML elements. It must have an "elem" parameter (XML element) and an optional "level" parameter that indicates the current indentation level (default is 0).
  3. Indent XML: In the "indent" function, we add indentation by changing the "text" and "tail" attributes of the XML element. The "text" character represents the text immediately after the open label, and the "tail" character represents the text immediately before the closed label. By adding indentations to these products, we achieve better printing.
  4. Defining the 'pretty_print_xml_elementtree' function: This function takes an XML string as input and is responsible for parsing the XML and printing it nicely using the 'xml.etree.ElementTree' module in Python.
  5. Parsing the XML string: In the 'pretty_print_xml_elementtree' function, we use the `ET.fromstring()` method to parse the XML string and create the ElementTree object.
  6. Indent XML:We call the 'indent()' function of the XML element to recursively indent all elements.
  7. Convert the XML content back to a string: We use the ET.tostring() method to convert the XML content back to a string representation. To ensure the correct encoding of the resulting string, we pass the "encoding" parameter with the value "unicode".
  8. Printing the pretty XML:Finally, we print the pretty XML string in Python.

Program Code:

Here we give the program code to print the pretty XML string using the xml.etree.ElementTree module in Python. It defines a custom function that recursively indents XML elements. The code is given below -

Output:

Now, we compile and run the above program and find the pretty printing XML string in Python. The pretty printing XML output is given below -

<jtp>
  <element attribute="value">
    <Name>Priyanka Adhikary</Name>
  </element>
</jtp>

Conclusion:

In this tutorial, we are learning about Printing Pretty XML in Python. As a result, printing pretty XML in Python is crucial to improving the readability and structure of your XML files. Developers can easily format XML with quality, whether using the xml.dom.minidom or xml.etree.ElementTree modules. By using these techniques, programmers can improve code understanding, simplify debugging, and support better collaboration when working with XML files in Python projects.