In this article, we are going to learn regular expressions with the “re” module. This module allows you to search for a pattern within a string and collect the matches in a list.
This lesson is suitable for the intermediate level. If you are unfamiliar with classes and functions, it is not recommended to start with this article; you can prepare with the article here.
What is the Re Module?
Re, short for regular expression, is a Python module that ships with the standard library. With the help of this module, you can develop programs that search for and recognize patterns inside a sentence.
import re

# match any single letter from a to e
x = re.compile("[a-e]")
print(x.findall("Hello, Mr. Mark"))
Output: ['e', 'a']
As you can see, every letter listed in the brackets was searched for in the sentence and added to the list when found.
The range “a-e” that we entered stands for the letters a, b, c, d, and e, so we can write a range such as “a-e” or “b-d” instead of typing every letter one by one.
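To make the range idea concrete, here is a minimal sketch that compares a range with its spelled-out form. The variable names are only illustrative, and the second pattern assumes you want to combine two ranges inside one pair of brackets, which is done without any separator.

```python
import re

text = "Hello, Mr. Mark"

# A range and its spelled-out equivalent match the same characters.
range_matches = re.findall("[a-e]", text)
explicit_matches = re.findall("[abcde]", text)

# Several ranges can share one pair of brackets, with no separator between them.
both_cases = re.findall("[a-eA-M]", text)

print(range_matches)     # ['e', 'a']
print(explicit_matches)  # ['e', 'a']
print(both_cases)        # ['H', 'e', 'M', 'M', 'a']
```

Note that a comma or a space inside the brackets would be treated as a character to match, not as a separator.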
Special Search Commands
The re module offers you many shortcuts. Now let’s look at the shortcut used for matching digits.
import re

# raw strings (r"...") keep the backslash intact for the re module
x = re.compile(r"\d")
print(x.findall("Year 2021"))
y = re.compile(r"\d+")
print(y.findall("Year 2021"))
Output: ['2', '0', '2', '1']
Output: ['2021']
If you do not put +, each digit becomes a separate item in the list. If you put +, consecutive digits are combined into a single item.
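The difference is easier to see on a sentence with more than one number. The sketch below uses a made-up sample string, and the conversion to integers at the end is just one possible follow-up step, not something the re module does for you.

```python
import re

text = "Order 66 shipped in 2021"

# Without +, every digit is its own match.
single_digits = re.findall(r"\d", text)

# With +, each run of consecutive digits is one match.
whole_numbers = re.findall(r"\d+", text)

# findall always returns strings, so convert them if you need numbers.
as_integers = [int(n) for n in whole_numbers]

print(single_digits)  # ['6', '6', '2', '0', '2', '1']
print(whole_numbers)  # ['66', '2021']
print(as_integers)    # [66, 2021]
```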
There are many shortcuts like this one. Let’s check the other shortcuts. You can use all shortcuts in the same way.
import re

x = re.compile(r"\w")   # one word character (letter, digit, or underscore)
y = re.compile(r"\w+")  # one or more word characters in a row
z = re.compile(r"\W")   # one character that is not a word character
print(x.findall("Hi, Welcome"))
print(y.findall("Hi, Welcome"))
print(z.findall("Hi, Welcome..."))
Output: ['H', 'i', 'W', 'e', 'l', 'c', 'o', 'm', 'e']
Output: ['Hi', 'Welcome']
Output: [',', ' ', '.', '.', '.']
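A few more shortcuts follow the same pattern. This is a small sketch rather than a complete list: it shows \s (whitespace), \S (anything that is not whitespace), and \D (anything that is not a digit) on a sample string chosen for illustration.

```python
import re

text = "Hi, Welcome 2021"

# \s matches a single whitespace character.
spaces = re.findall(r"\s", text)

# \S+ matches runs of non-whitespace characters.
chunks = re.findall(r"\S+", text)

# \D+ matches runs of characters that are not digits.
non_digits = re.findall(r"\D+", text)

print(spaces)      # [' ', ' ']
print(chunks)      # ['Hi,', 'Welcome', '2021']
print(non_digits)  # ['Hi, Welcome ']
```

In each pair, the uppercase letter is the opposite of the lowercase one.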
The “*” operator means that the character just before it may repeat zero or more times, so a single pattern can match a string that ends with any number of copies of that last letter.
import re x = re.compile('xyz*') print(x.findall("xyzxyzzzzz"))
Output: ['xyz', 'xyzzzzz']
As you can see, * lets the z repeat, so all the consecutive z letters are added to the match. Let’s move on to a new function now.
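Before that, a quick comparison may help. The sketch below puts * next to two related operators, + (one or more) and ? (zero or one), on a sample string made up for this purpose.

```python
import re

text = "xy xyz xyzzz"

# * allows zero or more z, so "xy" alone also matches.
star = re.findall("xyz*", text)

# + requires at least one z.
plus = re.findall("xyz+", text)

# ? allows at most one z.
optional = re.findall("xyz?", text)

print(star)      # ['xy', 'xyz', 'xyzzz']
print(plus)      # ['xyz', 'xyzzz']
print(optional)  # ['xy', 'xyz', 'xyz']
```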
The first function we learned, findall, found the matches for us. The next function, split, finds the matches and cuts the string apart at each one.
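from re import split

# \W+ splits on runs of non-word characters
print(split(r"\W+", "Hi, my name is ..."))
# \d+ splits on runs of digits
print(split(r"\d+", "this year is 2021"))
Output: ['Hi', 'my', 'name', 'is', '']
Output: ['this year is ', '']
When a match sits at the very end of the string, split leaves an empty string as the last item.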
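If the empty strings are not wanted, they can be filtered out afterwards. The sketch below shows one way to do that, plus the optional maxsplit argument of split, which limits how many cuts are made. The variable names are only illustrative.

```python
import re

sentence = "Hi, my name is ..."

# A match at the end of the string produces a trailing empty string.
parts = re.split(r"\W+", sentence)

# Keep only the non-empty pieces.
words = [p for p in parts if p]

# maxsplit=1 cuts at the first match only and keeps the rest untouched.
first_cut = re.split(r"\W+", sentence, maxsplit=1)

print(parts)      # ['Hi', 'my', 'name', 'is', '']
print(words)      # ['Hi', 'my', 'name', 'is']
print(first_cut)  # ['Hi', 'my name is ...']
```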
The sub function is used to replace an expression with another expression. It is very similar to split, but instead of deleting the matches from the string, it returns a modified version of it.
import re  # needed for re.IGNORECASE
from re import sub

print(sub("th", "--", "this book"))  # case sensitive by default
print(sub("th", "**", "This book", flags=re.IGNORECASE))
Output: --is book
Output: **is book
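Two other options of sub are worth a short sketch: the count argument, which limits the number of replacements, and a backreference such as \1 in the replacement, which reuses the text captured by a group. The sample strings are made up for illustration.

```python
import re

# count=1 replaces only the first match.
first_only = re.sub("th", "--", "this thing there", count=1)

# Without count, every match is replaced.
all_matches = re.sub("th", "--", "this thing there")

# \1 in the replacement refers to what the first group captured.
wrapped = re.sub(r"(\d+)", r"<\1>", "year 2021")

print(first_only)   # --is thing there
print(all_matches)  # --is --ing --ere
print(wrapped)      # year <2021>
```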
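The escape function is an adjustment tool: it puts a “\” in front of every character that could have a special meaning in a pattern, such as spaces. It also works with escape characters; for example, a tab that you type gets a “\” in front of it too.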
from re import escape

print(escape("this is program"))
# each escaped tab prints as a backslash followed by blank space
print(escape("this \t is \t program"))
Output: this\ is\ program
Output: this\ \ \ is\ \ \ program
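A typical reason to use escape is to search for text that contains special characters exactly as written. The sketch below assumes a hypothetical price string; without escape, the “.” would match any character and the “$” would be read as an end-of-string anchor.

```python
import re

text = "It costs $5.00, not 5500"
price = "$5.00"

# escape turns the special characters into literal ones.
safe_pattern = re.escape(price)
exact = re.findall(safe_pattern, text)

# Unescaped, "." matches any character, so "5500" matches as well.
loose = re.findall("5.00", text)

print(safe_pattern)  # \$5\.00
print(exact)         # ['$5.00']
print(loose)         # ['5.00', '5500']
```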