Regex 101

Welcome to the "Regex 101" lesson! In this lesson, we'll dive into the fundamentals of regular expressions (Regex) and explore their syntax and usage in Python. Regex patterns are powerful tools for pattern matching and string manipulation, allowing you to search, validate, and extract specific patterns from text data. Let's get started!

First, I want to make you aware of an invaluable tool for learning and visualizing Regex patterns: my favorite regex website, regex101.com. When you open the website, make sure you set the flavor to Python before using it to understand how these patterns work. I regularly use this tool to build and test regex patterns quickly, and I think you will find it useful too!

Now that you know that, let's begin by understanding the basic syntax of Regex. A Regex pattern is composed of literal characters and metacharacters that together describe the text to match in a string. Let's consider a simple example where we want to match the word "apple" in a text. We can define a Regex pattern and search for it using the re module in Python as follows:

import re

text = "I love apples!"
pattern = r"apple"

matches = re.findall(pattern, text)
print(matches)

Let's break down what is happening in the code above. We:

  • import the re module,
  • define a sample text that contains the word "apple",
  • construct a Regex pattern r"apple",
  • call the re.findall() function to find all non-overlapping occurrences of the pattern in the given text.

The r before the pattern string denotes a raw string, which is commonly used with Regex patterns so that backslashes reach the regex engine unchanged instead of being interpreted by Python as escape characters.

The function returns a list of matches. In this case, it will print ['apple'], indicating that the pattern was found once in the text. Note that the pattern matched the letters "apple" inside the word "apples"; a pattern does not have to line up with whole words.

Try running the code above in your Python editor, then try changing it to extract the word "love" from the text variable.
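To make the raw string point from the first example more concrete, here is a small sketch. It relies on \b, a sequence this lesson has not introduced: to the regex engine \b means "word boundary", but in a regular Python string "\b" is a backspace character. The variable names are made up for illustration.

```python
import re

text = "I love apples!"

# Regular string: Python turns "\b" into a single backspace character
# before the regex engine ever sees the pattern.
plain_pattern = "\bapple"

# Raw string: the backslash is kept, so the regex engine receives \b
# and treats it as a word boundary.
raw_pattern = r"\bapple"

plain_matches = re.findall(plain_pattern, text)
raw_matches = re.findall(raw_pattern, text)

print(len(plain_pattern), plain_matches)  # 6 []
print(len(raw_pattern), raw_matches)      # 7 ['apple']
```

The two patterns look almost identical in the source code, yet only the raw one finds a match, which is why the r prefix is a good habit for every pattern.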

Now, let's explore some common metacharacters used in Regex. One of the most widely used metacharacters is the dot (.), which by default matches any single character except a newline. For example, consider the following code:

pattern = r"lov."

In this code snippet, the pattern r"lov." will match any four-character sequence that consists of "lov" followed by any one character (other than a newline). Applied to the text from the first example, the re.findall() function will return ['love'], indicating that the pattern was found once in the text.
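As a runnable version of the dot example, the sketch below reuses the text from the first example and adds a second string, sample, which is invented here only to show what the dot will and will not accept.

```python
import re

text = "I love apples!"
pattern = r"lov."

love_matches = re.findall(pattern, text)
print(love_matches)  # ['love']

# Hypothetical extra sample: the dot accepts any one character,
# but there must be a character there for it to match.
sample = "love lova lov3 lov"
dot_matches = re.findall(pattern, sample)
print(dot_matches)  # ['love', 'lova', 'lov3']
```

The final "lov" in sample is not returned because nothing follows it, so the dot has no character to consume.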

Here are some other commonly used metacharacters:

  • ^ matches the beginning of a string.
  • $ matches the end of a string.
  • * matches zero or more occurrences of the preceding character or group.
  • + matches one or more occurrences of the preceding character or group.
  • ? matches zero or one occurrence of the preceding character or group.
  • | acts as an OR: it matches either the pattern on its left or the pattern on its right.
  • - is used inside square brackets to specify a range (0-9, a-z, etc.).
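The metacharacters listed above can each be tried in isolation. The following is a minimal sketch; every sample string and variable name in it is made up for illustration.

```python
import re

text = "I love apples!"

# ^ anchors the match to the beginning of the string.
start_matches = re.findall(r"^I", text)
# $ anchors the match to the end of the string.
end_matches = re.findall(r"!$", text)

# * : "l" followed by zero or more "o" characters.
star_matches = re.findall(r"lo*", "l lo loo looo")
# + : "l" followed by one or more "o" characters.
plus_matches = re.findall(r"lo+", "l lo loo looo")
# ? : "apple" followed by zero or one "s".
optional_matches = re.findall(r"apples?", "apple apples")

# - inside square brackets defines a range, here any single digit.
digit_matches = re.findall(r"[0-9]", "room 42")

print(start_matches)     # ['I']
print(end_matches)       # ['!']
print(star_matches)      # ['l', 'lo', 'loo', 'looo']
print(plus_matches)      # ['lo', 'loo', 'looo']
print(optional_matches)  # ['apple', 'apples']
print(digit_matches)     # ['4', '2']
```

Comparing the * and + results shows the difference between "zero or more" and "one or more": the lone "l" is only matched by the * version.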

Additionally, we can use square brackets ([]) to define character classes in Regex. For instance:

pattern = r"[aeiou]"

In this code snippet, the pattern r"[aeiou]" matches any single lowercase vowel. Applied to the text "I love apples!", the re.findall() function will return ['o', 'e', 'a', 'e']. The capital "I" is not included because the character class only lists lowercase letters.
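Here is the vowel example as a complete script, assuming the same "I love apples!" text as before. The second pattern, which also lists the uppercase vowels, is an addition for comparison and not part of the original example.

```python
import re

text = "I love apples!"

# Only lowercase vowels are listed, so the capital "I" is skipped.
lower_vowels = re.findall(r"[aeiou]", text)
print(lower_vowels)  # ['o', 'e', 'a', 'e']

# Listing the uppercase vowels as well picks up the "I".
all_vowels = re.findall(r"[aeiouAEIOU]", text)
print(all_vowels)  # ['I', 'o', 'e', 'a', 'e']
```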

An example of the pipe (|), which works as an OR between two alternatives, would be something like the following:

import re

text = "I love apples and you love oranges!"
pattern = r"love apples|love oranges"

matches = re.findall(pattern, text)
print(matches)

Here we are using the pattern "love apples|love oranges" to state that we should match either alternative A or alternative B, with the two alternatives separated by the pipe. This will return the following:

['love apples', 'love oranges']
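The repeated "love " in that pattern can be factored out with a group. This is a sketch of one possible variation rather than part of the original lesson. It uses (?:...), a non-capturing group, because re.findall() returns only the group's contents when a pattern contains a single capturing group.

```python
import re

text = "I love apples and you love oranges!"

# Non-capturing group: findall() still returns the whole match.
whole_matches = re.findall(r"love (?:apples|oranges)", text)
print(whole_matches)  # ['love apples', 'love oranges']

# Capturing group: findall() returns just the captured part.
group_matches = re.findall(r"love (apples|oranges)", text)
print(group_matches)  # ['apples', 'oranges']
```

Which form is preferable depends on whether you want the full phrase or only the part that varies.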

Let's take a look at selecting all lowercase words in the following string: "I love apples and you love oranges!"

pattern = r"[a-z]+"

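This pattern uses a character class (the part inside the [ ] characters) covering the lowercase letters a through z, and the + after it makes the pattern match each run of one or more such letters. The capital "I", the spaces, and the "!" fall outside the class, so they are not matched.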
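Run against the full sentence, the pattern behaves as sketched below. The second pattern, [A-Za-z]+, is an extra comparison added here to show how a wider range also picks up the capital "I".

```python
import re

text = "I love apples and you love oranges!"

# Runs of one or more lowercase letters.
lower_words = re.findall(r"[a-z]+", text)
print(lower_words)
# ['love', 'apples', 'and', 'you', 'love', 'oranges']

# Two ranges in one class: uppercase and lowercase letters.
all_words = re.findall(r"[A-Za-z]+", text)
print(all_words)
# ['I', 'love', 'apples', 'and', 'you', 'love', 'oranges']
```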
