A regular expression (often shortened to regex or regexp) is a sequence of characters that defines a search pattern. It's a powerful tool used for matching, locating, and manipulating text.
Think of a regular expression as a template that describes what you're looking for in a piece of text. The pattern can be a single literal character, a sequence of characters, or a mix of literal characters and special characters (metacharacters) that stand for more general patterns.
Pattern | Meaning | Example Match |
---|---|---|
. | A dot matches any single character (except a newline, by default). | a, 5, ? |
* | An asterisk matches zero or more of the preceding character or group. | ab* matches a, ab, abb, etc. |
+ | A plus matches one or more of the preceding character or group. | ab+ matches ab, abb, but not a |
? | A question mark matches zero or one of the preceding character or group. | ab? matches a, ab |
[] | Square brackets define a character set and match any single character within the brackets. | [abc] matches a, b, or c |
() | Parentheses create a capturing group, allowing you to extract specific parts of the matched text. | (ab)c captures "ab" in the match "abc" |
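
The rows of the table above can be checked with a few lines of Python. This is a minimal sketch using the standard `re` module; the helper name `matches` and the sample strings are made up for illustration.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True only if the whole string fits the pattern."""
    return re.fullmatch(pattern, text) is not None

star = [s for s in ["a", "ab", "abb", "b"] if matches(r"ab*", s)]
plus = [s for s in ["a", "ab", "abb"] if matches(r"ab+", s)]
optional = [s for s in ["a", "ab", "abb"] if matches(r"ab?", s)]
char_set = [s for s in ["a", "b", "c", "d"] if matches(r"[abc]", s)]

# Parentheses capture part of the match; group(1) is the first group.
captured = re.fullmatch(r"(ab)c", "abc").group(1)

print(star)      # ['a', 'ab', 'abb']
print(plus)      # ['ab', 'abb']
print(optional)  # ['a', 'ab']
print(char_set)  # ['a', 'b', 'c']
print(captured)  # ab
```

`re.fullmatch` is used here so that each test string must fit the pattern from start to end; `re.search` would instead report a match anywhere inside the string.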
Let's say you want to find all words in a text that start with the letter "a". You could use the following regular expression:
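\ba\w+\b

Here \b marks a word boundary, a is the literal letter, and \w+ matches one or more word characters. Because \w+ requires at least one character after the "a", the one-letter word "a" itself is not matched; \ba\w*\b would include it.

Most programming languages (including JavaScript, Python, and Java) and many text editors and tools provide ways to use regular expressions. The syntax varies slightly between tools, but the core concepts remain the same.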
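
As a rough illustration of the word-search pattern, here is how it might be tried in Python with the standard `re` module. The sample sentence is invented, and the `\w*` variant and the case-insensitive flag are additions for comparison, not part of the original pattern.

```python
import re

text = "An apple a day keeps anxiety away"

# \w+ needs at least one character after the "a", so the lone word "a" is skipped.
two_or_more = re.findall(r"\ba\w+\b", text)

# \w* also accepts the one-letter word "a".
any_length = re.findall(r"\ba\w*\b", text)

# re.IGNORECASE additionally picks up the capitalized "An".
case_insensitive = re.findall(r"\ba\w*\b", text, flags=re.IGNORECASE)

print(two_or_more)       # ['apple', 'anxiety', 'away']
print(any_length)        # ['apple', 'a', 'anxiety', 'away']
print(case_insensitive)  # ['An', 'apple', 'a', 'anxiety', 'away']
```

Note that "day" is not returned: its "a" sits in the middle of a word, so there is no word boundary in front of it.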
Pattern | Meaning | Example Match |
---|---|---|
. | A dot matches any single character (except newline). | a, 5, ? |
\d | Any digit (0-9). | 3, 7 |
\w | Any word character (a-z, A-Z, 0-9, _). In other words, \w matches any single character that is typically considered part of a word. | b, E, 8, _ |
\s | Any whitespace character (space, tab, newline). | (space) |
\D | Any non-digit character. | a, ! |
\W | Any non-word character. | #, $ |
\S | Any non-whitespace character. | a, 3 |
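
The character classes above can be compared side by side on one short string. This is a small sketch in Python; the sample string is arbitrary and ASCII-only. In Python 3, \d, \w, and \s are Unicode-aware by default for text patterns, so they can match more than the ASCII ranges listed in the table.

```python
import re

sample = "ab_1 2-c"

digits = re.findall(r"\d", sample)      # digits only
word_chars = re.findall(r"\w", sample)  # letters, digits, underscore
spaces = re.findall(r"\s", sample)      # whitespace
non_digits = re.findall(r"\D", sample)  # everything except digits
non_word = re.findall(r"\W", sample)    # everything \w rejects
non_space = re.findall(r"\S", sample)   # everything except whitespace

# The dot skips the newline unless the re.DOTALL flag is passed.
dot_default = re.findall(r".", "a\nb")

print(digits)       # ['1', '2']
print(word_chars)   # ['a', 'b', '_', '1', '2', 'c']
print(spaces)       # [' ']
print(non_digits)   # ['a', 'b', '_', ' ', '-', 'c']
print(non_word)     # [' ', '-']
print(non_space)    # ['a', 'b', '_', '1', '2', '-', 'c']
print(dot_default)  # ['a', 'b']
```

The underscore is returned by \w while the hyphen is returned by \W, which is an easy detail to get backwards.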
Pattern | Meaning | Example Match |
---|---|---|
* | Zero or more of the preceding character/group. | ab* matches a, ab, abb, etc. |
+ | One or more of the preceding character/group. | ab+ matches ab, abb, but not a |
? | Zero or one of the preceding character/group. In ab?, the quantifier applies only to b: a must be present, while b is optional and may appear zero or one time. | ab? matches a, ab |
{n} | Exactly n occurrences of the preceding character/group. | a{3} matches aaa |
{n,} | n or more occurrences. | a{2,} matches aa, aaa, etc. |
{n,m} | At least n and at most m occurrences. | a{2,4} matches aa, aaa, aaaa |
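
The counted quantifiers can be verified in the same way. The sketch below assumes Python's `re` module; `full` is a hypothetical helper and the candidate strings are chosen only to show where each quantifier stops matching.

```python
import re

def full(pattern: str, text: str) -> bool:
    """True if the entire string matches the pattern."""
    return re.fullmatch(pattern, text) is not None

candidates = ["a", "aa", "aaa", "aaaa", "aaaaa"]

exactly_three = [s for s in candidates if full(r"a{3}", s)]
two_or_more = [s for s in candidates if full(r"a{2,}", s)]
two_to_four = [s for s in candidates if full(r"a{2,4}", s)]

# A quantifier applies to the single preceding character unless a group is used.
char_repeated = full(r"ab{2}", "abb")     # b is repeated
group_repeated = full(r"(ab){2}", "abab")  # the whole group is repeated
group_vs_char = full(r"(ab){2}", "abb")

print(exactly_three)   # ['aaa']
print(two_or_more)     # ['aa', 'aaa', 'aaaa', 'aaaaa']
print(two_to_four)     # ['aa', 'aaa', 'aaaa']
print(char_repeated, group_repeated, group_vs_char)  # True True False
```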
Pattern | Meaning |
---|---|
^ | Start of string. |
$ | End of string. |
\b | Word boundary (between \w and \W). |
\B | Non-word boundary. |
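
Anchors match positions rather than characters, which is easiest to see in a short experiment. The following Python sketch uses invented sample strings.

```python
import re

# ^ and $ tie the pattern to the start and end of the string.
starts_with_cat = re.search(r"^cat", "catalog") is not None
not_at_start = re.search(r"^cat", "bobcat") is not None
ends_with_cat = re.search(r"cat$", "bobcat") is not None

# \b requires a word boundary on both sides, so only standalone "cat" matches.
whole_words = re.findall(r"\bcat\b", "cat concat cat. category")

# \B is the opposite: "cat" must be surrounded by other word characters.
inner = re.findall(r"\Bcat\B", "cat concatenate")

print(starts_with_cat, not_at_start, ends_with_cat)  # True False True
print(whole_words)  # ['cat', 'cat']
print(inner)        # ['cat']
```

In Python specifically, `$` also matches just before a trailing newline, and the `re.MULTILINE` flag makes `^` and `$` match at the start and end of every line; other tools may behave differently.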
Pattern | Meaning | Example Match |
---|---|---|
[abc] | Any single character from the set. | a, b, or c |
[^abc] | Any single character NOT in the set. | d, 1, ! |
[a-z] | Any lowercase letter. | a, b, c, etc. |
(abc) | Group of characters (can be used with quantifiers and for capturing). | abc |
(?:abc) | Non-capturing group (grouping without capturing the match). | abc |
\1, \2, etc. | Backreferences (match the same text as a previous capturing group). | (a)\1 matches aa |
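
Capturing groups, non-capturing groups, and backreferences are easier to tell apart in running code. This is an illustrative Python sketch; the date-like string and the repeated-word sentence are made-up inputs.

```python
import re

# Capturing groups: each pair of parentheses is numbered from the left.
m = re.search(r"(\d{4})-(\d{2})", "Released 2024-05 in beta")
year, month = m.group(1), m.group(2)

# A non-capturing group still groups for the quantifier but is not numbered,
# so the only captured group here is (c).
m2 = re.fullmatch(r"(?:ab)+(c)", "ababc")
captured_groups = m2.groups()

# Backreference: \1 must repeat exactly what group 1 matched,
# which finds doubled words.
repeated = re.findall(r"\b(\w+) \1\b", "this is is a test test case")

print(year, month)      # 2024 05
print(captured_groups)  # ('c',)
print(repeated)         # ['is', 'test']
```

When a pattern has exactly one capturing group, `re.findall` returns the text of that group rather than the full match, which is why the last line prints single words.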
Pattern | Meaning | Example Match |
---|---|---|
| | Alternation (OR). | cat|dog matches "cat" or "dog". |
\n | Newline character. | Matches a newline character in a string. |
\t | Tab character. | Matches a tab character in a string. |
\\ | Literal backslash. | \\ matches a single backslash. |
\., \*, etc. | Escape special characters to match them literally. | \. matches a literal dot, \* matches a literal asterisk. |
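
Alternation and escaping can be sketched in Python as follows. The inputs are invented; `re.escape` is a standard-library helper that escapes a whole string, shown here as a convenience rather than something the table above describes.

```python
import re

# Alternation: either side of | may match.
pets = re.findall(r"cat|dog", "cat, dog, bird")

# An unescaped dot matches any character; an escaped dot matches only ".".
loose = re.fullmatch(r"a.c", "abc") is not None
strict = re.fullmatch(r"a\.c", "abc") is not None
literal = re.fullmatch(r"a\.c", "a.c") is not None

# \\ in the pattern matches one backslash in the text.
backslashes = re.findall(r"\\", r"C:\temp\file")

# \t and \n match the tab and newline characters themselves.
fields = re.split(r"\t", "name\tage")
lines = re.split(r"\n", "one\ntwo")

# re.escape builds a pattern that matches the given text literally.
escaped = re.escape("1+1=2?")
round_trip = re.fullmatch(escaped, "1+1=2?") is not None

print(pets)                    # ['cat', 'dog']
print(loose, strict, literal)  # True False True
print(len(backslashes))        # 2
print(fields, lines)           # ['name', 'age'] ['one', 'two']
print(round_trip)              # True
```

The patterns are written as raw strings (`r"..."`) so that Python passes the backslashes through to the regex engine unchanged.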