 

List of Regular Expressions and How to Use Them

 

What is a Regular Expression?

A regular expression (often shortened to regex or regexp) is a sequence of characters that defines a search pattern. It's a powerful tool used for matching, locating, and manipulating text.

How It Works:

Think of a regular expression as a template that describes what you're looking for in a piece of text. The pattern can be a single literal character, a sequence of characters, or a mix of literal characters and special characters that stand for more complex patterns.

Common Uses:

  • Search and Replace: Find all occurrences of a specific word or phrase in a document and replace them with something else.
  • Data Validation: Ensure that user input (like an email address or phone number) follows a specific format.
  • Data Extraction: Pull out specific information (like numbers, dates, or names) from a larger text string.
  • Syntax Highlighting: In code editors, regexes are used to highlight different parts of code (keywords, comments, etc.).
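The first three uses above can be sketched with Python's standard re module. The sample strings, the helper name looks_like_email, and the deliberately simplified email pattern are assumptions made for illustration; this is not a production-grade validator.

```python
import re

# Search and replace: swap every whole-word "cat" for "dog".
replaced = re.sub(r"\bcat\b", "dog", "The cat sat near another cat.")

# Data validation: a simplified email check (an assumption for illustration;
# real-world address rules are considerably more permissive).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")


def looks_like_email(text: str) -> bool:
    # fullmatch succeeds only if the whole string fits the pattern.
    return EMAIL.fullmatch(text) is not None


# Data extraction: pull every run of digits out of a sentence.
numbers = re.findall(r"\d+", "Order 66 shipped 3 items on day 12.")

print(replaced)
print(looks_like_email("user@example.com"), looks_like_email("not an email"))
print(numbers)
```

re.sub, re.fullmatch, and re.findall roughly correspond to the replace, validate, and extract tasks; other languages offer similar functions under different names.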

Key Concepts:

  • Literal Characters: These are the characters you want to match exactly (e.g., "a", "1", "$").
  • Special Characters: These have special meanings in regular expressions and are used to create more complex patterns:
Pattern   Meaning                                                    Example Match
.         Matches any single character (except a newline).          a, 5, ?
*         Matches zero or more of the preceding character or group.  ab* matches a, ab, abb, etc.
+         Matches one or more of the preceding character or group.   ab+ matches ab, abb, but not a
?         Matches zero or one of the preceding character or group.   ab? matches a, ab
[]        Defines a character set; matches any one character in it.  [abc] matches a, b, or c
()        Creates a capturing group to extract part of the match.    (ab)c captures "ab" in the match "abc"
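To see these special characters in action, the following sketch checks a few patterns against whole strings in Python. The helper whole_match and the test strings are hypothetical, chosen only to mirror the examples in the table.

```python
import re


def whole_match(pattern: str, text: str) -> bool:
    """Return True when the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text, expected result)
checks = [
    (r"a.c", "abc", True),    # . stands in for any one character
    (r"ab*", "a", True),      # * allows zero b's
    (r"ab*", "abbb", True),   # ... or many b's
    (r"ab+", "a", False),     # + needs at least one b
    (r"ab?", "abb", False),   # ? allows at most one b
    (r"[abc]", "b", True),    # a set matches one of its members
    (r"[abc]", "d", False),   # d is not in the set
]
results = [whole_match(pattern, text) for pattern, text, _ in checks]

# Parentheses capture part of the match so it can be read back afterwards.
captured = re.fullmatch(r"(ab)c", "abc").group(1)

for (pattern, text, _), result in zip(checks, results):
    print(f"{pattern!r} vs {text!r}: {result}")
print("captured:", captured)
```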

Example:

Let's say you want to find all words in a text that start with the letter "a". You could use the following regular expression:

\ba\w+\b
  • \b: Word boundary (ensures we match whole words, not just parts)
  • a: The literal letter "a"
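  • \w+: One or more word characters (letters, numbers, or underscores). Because + requires at least one, the one-letter word "a" on its own is not matched; use \w* instead to include it.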
  • \b: Another word boundary
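A quick way to try a pattern like this is Python's re.findall. The sample sentence below is an assumption for illustration; the sketch also shows how the result changes with \w* (zero or more) and with case-insensitive matching.

```python
import re

text = "An apple and a banana are at the table."

# \w+ requires at least one more word character after the "a".
one_or_more = re.findall(r"\ba\w+\b", text)

# \w* allows zero, so the one-letter word "a" is accepted as well.
zero_or_more = re.findall(r"\ba\w*\b", text)

# re.IGNORECASE additionally picks up the capitalised "An".
any_case = re.findall(r"\ba\w*\b", text, flags=re.IGNORECASE)

print(one_or_more)   # ['apple', 'and', 'are', 'at']
print(zero_or_more)  # ['apple', 'and', 'a', 'are', 'at']
print(any_case)      # ['An', 'apple', 'and', 'a', 'are', 'at']
```

Note that "banana" and "table" are skipped even though they contain an "a": the leading \b only allows a match where a word begins.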

Using Regular Expressions:

Most programming languages (such as JavaScript, Python, and Java) and many text editors and command-line tools support regular expressions. The syntax varies slightly between tools, but the core concepts remain the same.



List of Regular Expressions

Basic Building Blocks:

Pattern   Meaning                                                                  Example Match
.         Any single character (except a newline).                                 a, 5, ?
\d        Any digit (0-9).                                                         3, 7
\w        Any word character: a letter, digit, or underscore (a-z, A-Z, 0-9, _).   b, E, 8, _
\s        Any whitespace character (space, tab, newline).                          (space)
\D        Any non-digit character.                                                 a, !
\W        Any non-word character.                                                  #, $
\S        Any non-whitespace character.                                            a, 3
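The building blocks above can be compared side by side on one string. This is a minimal sketch in Python; the sample text is made up for illustration.

```python
import re

sample = "Room 42_b,\tfloor-3!"

digits = re.findall(r"\d", sample)      # every single digit
non_digits = re.findall(r"\D+", sample) # runs of anything that is not a digit
words = re.findall(r"\w+", sample)      # runs of letters, digits, underscores
non_word = re.findall(r"\W", sample)    # everything \w rejects
whitespace = re.findall(r"\s", sample)  # the space and the tab
chunks = re.findall(r"\S+", sample)     # runs of non-whitespace

# The dot matches any character except a newline.
dot_matches_letter = re.fullmatch(r".", "x") is not None
dot_matches_newline = re.fullmatch(r".", "\n") is not None

print(digits)      # ['4', '2', '3']
print(words)       # ['Room', '42_b', 'floor', '3']
print(non_word)    # [' ', ',', '\t', '-', '!']
print(whitespace)  # [' ', '\t']
print(chunks)      # ['Room', '42_b,', 'floor-3!']
print(dot_matches_letter, dot_matches_newline)  # True False
```

In Python 3 these classes are Unicode-aware for text patterns by default; passing flags=re.ASCII restricts them to the ASCII ranges listed in the table.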

Quantifiers:

Pattern   Meaning                                                  Example Match
*         Zero or more of the preceding character/group.           ab* matches a, ab, abb, etc.
+         One or more of the preceding character/group.            ab+ matches ab, abb, but not a
?         Zero or one of the preceding character/group.            ab? matches a, ab
{n}       Exactly n occurrences of the preceding character/group.  a{3} matches aaa
{n,}      n or more occurrences.                                   a{2,} matches aa, aaa, etc.
{n,m}     At least n and at most m occurrences.                    a{2,4} matches aa, aaa, aaaa

In ab?, the a must be present because no quantifier follows it, while b? makes the b optional: it can appear zero times or once.
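The counted quantifiers can be checked by filtering candidate strings against each pattern. This is a sketch in Python; the helper whole_match and the candidate strings are hypothetical.

```python
import re


def whole_match(pattern: str, text: str) -> bool:
    """Return True when the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


exactly_three = [s for s in ("aa", "aaa", "aaaa") if whole_match(r"a{3}", s)]
two_or_more = [s for s in ("a", "aa", "aaaaa") if whole_match(r"a{2,}", s)]
two_to_four = [s for s in ("a", "aa", "aaaa", "aaaaa") if whole_match(r"a{2,4}", s)]

# A quantifier applies to the single character before it unless that
# character is part of a group.
group_repeated = whole_match(r"(ab){2}", "abab")  # repeats "ab"
char_repeated = whole_match(r"ab{2}", "abab")     # ab{2} means "abb"

print(exactly_three)  # ['aaa']
print(two_or_more)    # ['aa', 'aaaaa']
print(two_to_four)    # ['aa', 'aaaa']
print(group_repeated, char_repeated)  # True False
```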

Anchors:

Pattern   Meaning
^         Start of string.
$         End of string.
\b        Word boundary (between \w and \W, or between \w and the start or end of the string).
\B        Non-word boundary (any position where \b does not match).
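Anchors match positions rather than characters, which is easiest to see by searching the same word in different places. The following Python sketch uses made-up sample strings.

```python
import re

# ^ ties the pattern to the start of the string, $ to the end.
starts_with_cat = re.search(r"^cat", "cat flap") is not None
cat_not_first = re.search(r"^cat", "a cat") is not None
ends_with_cat = re.search(r"cat$", "a cat") is not None

# \b matches only where a word starts or ends, so "concat" and "cats"
# are skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")

# \B is the opposite: "cat" is found only where it continues a word.
inside_starts = [m.start() for m in re.finditer(r"\Bcat", "cat concat cats")]

print(starts_with_cat, cat_not_first, ends_with_cat)  # True False True
print(whole_words)    # ['cat', 'cat']
print(inside_starts)  # [7]
```

In Python, passing flags=re.MULTILINE makes ^ and $ match at the start and end of every line as well, not just the whole string.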

Character Classes and Groups:

Pattern       Meaning                                                                Example Match
[abc]         Any single character from the set.                                     a, b, or c
[^abc]        Any single character NOT in the set.                                   d, 1, !
[a-z]         Any lowercase letter.                                                  a, b, c, etc.
(abc)         Group of characters (can be used with quantifiers and for capturing).  abc
(?:abc)       Non-capturing group (grouping without capturing the match).            (?:ab)+ matches ab, abab
\1, \2, etc.  Backreferences (match the same text as a previous capturing group).    (a)\1 matches aa
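The difference between sets, capturing groups, non-capturing groups, and backreferences shows up in what a match reports back. A minimal Python sketch, with sample strings that are assumptions for illustration:

```python
import re

# Character sets match one character at a time.
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")
lower_runs = re.findall(r"[a-z]+", "Hello World 42")

# Capturing groups expose parts of the match by number.
date = re.fullmatch(r"(\d{4})-(\d{2})", "2024-05")
year, month = date.group(1), date.group(2)

# A non-capturing group still groups for the quantifier but stores nothing.
repeated = re.fullmatch(r"(?:ab)+", "ababab")
stored_groups = repeated.groups()

# \1 must repeat exactly the text that group 1 matched, which makes it
# handy for spotting doubled words.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")

print(vowels, non_vowels, lower_runs)  # ['e', 'e'] ['r', 'g', 'x'] ['ello', 'orld']
print(year, month)                     # 2024 05
print(stored_groups)                   # ()
print(doubled)                         # ['is', 'test']
```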

Alternation and Escaped Characters:

Pattern       Meaning                                             Example Match
|             Alternation (OR).                                   cat|dog matches "cat" or "dog".
\n            Newline character.                                  Matches a newline character in a string.
\t            Tab character.                                      Matches a tab character in a string.
\\            Literal backslash.                                  \\ matches a single backslash.
\., \*, etc.  Escape special characters to match them literally.  \. matches a literal dot, \* matches a literal asterisk.
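Alternation and escaping can be tried out as follows. This is a sketch in Python with made-up sample strings; raw string literals (the r"..." prefix) are used so that backslashes reach the regex engine unchanged.

```python
import re

# Alternation: either side of | may match.
pets = re.findall(r"cat|dog", "I have a cat and a dog.")

# | splits the whole pattern unless a group limits its reach.
grouped = re.fullmatch(r"gr(?:a|e)y", "grey") is not None
ungrouped = re.fullmatch(r"gra|ey", "grey") is not None  # means "gra" OR "ey"

# An unescaped dot matches anything; an escaped dot matches only a dot.
any_chars = re.findall(r".", "a.b")
literal_dots = re.findall(r"\.", "a.b.c")

# \n, \t and \\ match a newline, a tab, and a single backslash.
newline_count = len(re.findall(r"\n", "a\nb\nc"))
columns = re.split(r"\t", "name\tage")
backslash_count = len(re.findall(r"\\", "C:\\temp\\file.txt"))

# re.escape builds a pattern that matches arbitrary text literally.
literal = "1+1=2?"
escaped_ok = re.fullmatch(re.escape(literal), literal) is not None

print(pets)                # ['cat', 'dog']
print(grouped, ungrouped)  # True False
print(any_chars, literal_dots)
print(newline_count, columns, backslash_count, escaped_ok)
```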