How to Extract a Substring from Inside a String in Python?

In the following tutorial, we will learn how to use Python to extract a substring from within a string.

There are several ways to extract a substring from a string. One of them is to use regular expressions.

Let us see how to use regular expressions for extraction with the help of some examples.

Using Regular Expression

A regex, or regular expression, is a sequence of characters that defines a search pattern. You can use a regex to check whether a text contains a match for that pattern.

Using the re.search() function from Python's re module, we will look for the first part of the string that matches the regular expression and extract it. re.search() returns a match object if the pattern is found and None otherwise.
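As a minimal sketch of how re.search() behaves, the snippet below looks for the first run of digits in a short string. The sample text and variable names are made up for illustration and are not part of the tutorial's examples.

```python
import re

# Hypothetical sample text, used only to illustrate re.search().
text = "Order 66 shipped on day 12"

# re.search() scans the string and stops at the first match.
match = re.search(r"[0-9]+", text)

if match:
    first_number = match.group()  # the matched text
    span = match.span()           # (start, end) positions of the match
    print(first_number, span)

# When nothing matches, re.search() returns None.
no_match = re.search(r"[0-9]+", "no digits here")
print(no_match)
```

Because the result can be None, it is safer to test the match object before calling group() on it.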

Example 1:

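In this example, we take a string as input and use a regular expression such as '(\$[0-9,]*\.[0-9]{2})' to extract the dollar amount from the text. Note that the shorter pattern '(\$[0-9,]*)' would stop at the decimal point and match only '$15,745', so the pattern must also cover the decimal point and the two digits for cents.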
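The listing for this example is not reproduced here, so the following is a sketch reconstructed from the description and the output. The variable name str1 comes from the explanation. The exact pattern is an assumption: it matches a dollar sign, digits and commas, a decimal point, and two digits for cents.

```python
import re

str1 = 'The phone is priced at $15,745.95 and has a camera.'

print("The given string is")
print(str1)
print("The numeric substring is:")

# Assumed pattern: "$", then digits/commas, then "." and exactly two digits.
res = re.search(r'(\$[0-9,]*\.[0-9]{2})', str1)

# re.search() returns None when nothing matches, so check before using it.
if res:
    print(res.group(1))
```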

Output:

 
The given string is
The phone is priced at $15,745.95 and has a camera.
The numeric substring is:
$15,745.95   

Explanation:

This Python program uses the re module to search the text str1 for a dollar amount. The pattern matches a dollar sign, followed by any number of digits and commas, a decimal point, and exactly two digits that stand for cents. If a match is found, the matching substring is printed. In this instance, '$15,745.95' is extracted from the text 'The phone is priced at $15,745.95 and has a camera.' and printed.

Example 2:

To extract a substring from within a string, you can use capturing groups in regular expressions. You need to know the format of the substring you want to extract and, where relevant, the characters that surround it. For instance, you can use this approach if you have a line of text and want to extract a monetary amount in the format $xxx,xxx.xx.
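A minimal sketch of group capturing follows. The input string is the one quoted in the explanation. Anchoring on the surrounding words "priced at " is an assumption made here to show how a group separates the wanted substring from its context.

```python
import re

text = 'The phone is priced at $15,745.95 and features a camera.'

# The parentheses form a capturing group around the amount only.
# The literal "priced at " is context that must be present but is not captured.
pattern = r'priced at (\$[0-9,]*\.[0-9]{2})'

match = re.search(pattern, text)
if match:
    whole_match = match.group(0)  # context plus amount
    amount = match.group(1)       # only the captured amount
    print(amount)
```

Here group(0) holds the entire match, including the context, while group(1) holds only the text inside the parentheses.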

Output:

 
$15,745.95   

Explanation:

This Python script uses the re module to look for a dollar amount in the provided text. The regular expression matches a dollar sign, any number of digits and commas, a decimal point, and exactly two digits that indicate cents. If a match is found, the script prints the matching substring. Here, '$15,745.95' is extracted from the original string 'The phone is priced at $15,745.95 and features a camera.' and printed.

Note: The exact regex depends on the format of the text you need to match.