Regular Expression
Regular Expression Cheat Sheet
Regular Expression Websites
https://regexone.com/ - A good set of quick exercises for learning more about regex
Basic Matching
Each symbol matches a single character
.
Any single character (in most flavors, anything except a newline by default)
\d
Digit in 0,1,2,3,4,5,6,7,8,9
\D
Non-digit
\w
"Word" (letters and digits and _)
\W
Non-word
\c
Control character (written \cX in flavors that support it, e.g. \cM for carriage return)
{SPACE_KEYPRESS}
space
\t
tab
\r
return
\n
new line
\s
whitespace
\S
non-whitespace
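\x
Hexadecimal digit (Vim); in most other flavors \xhh matches the character with hex code hh
\o
Octal digit (Vim; there the uppercase \O matches a non-octal digit)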
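The shorthand classes above can be tried out with a minimal sketch in Python's built-in re module. The sample strings are made up for illustration, and other regex flavors may treat some tokens differently.

```python
import re

text = "Order 66:\tship_by 2024\n"   # hypothetical sample text

digits = re.findall(r"\d", text)        # each digit separately
words = re.findall(r"\w+", text)        # runs of letters, digits, and _
blanks = re.findall(r"\s", text)        # space, tab, space, newline
letters = re.findall(r"\D+", "a1b22c")  # runs of non-digits

# "." matches any character except a newline unless re.DOTALL is set.
dot_default = re.findall(r"c.t", "cat cot c\nt")
dot_all = re.findall(r"c.t", "cat cot c\nt", flags=re.DOTALL)

print(digits, words, blanks, letters, dot_default, dot_all)
```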
Character Classes
[xyz]
Matches any one of the enclosed characters. [abcd] is the same as [a-d]. They match the "b" in "brisket" and the "c" in "chop".
[^xyz]
Matches any one character that is not enclosed. [^abc] is the same as [^a-c]. They first match the "o" in "bacon" and the "h" in "chop".
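The character-class examples can be checked with a short Python sketch, assuming the standard re module. re.search returns the first match found when scanning from the left.

```python
import re

# A set and the equivalent range find the same first character.
in_set = re.search(r"[abcd]", "brisket").group()
in_range = re.search(r"[a-d]", "chop").group()

# A negated class finds the first character NOT in the set.
not_in_set = re.search(r"[^abc]", "bacon").group()
not_in_range = re.search(r"[^a-c]", "chop").group()

print(in_set, in_range, not_in_set, not_in_range)
```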
Escape Sequences
"Escaping" is a way of treating characters which have a special meaning in regular expressions literally, rather than as special characters. The escape character is usually \
\
Escape following character
\Q
Begin literal sequence
\E
End literal sequence
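Python's re module does not support \Q...\E, so the sketch below swaps in re.escape, which backslash-escapes the special characters in a string so that it matches literally. The sample strings are hypothetical.

```python
import re

literal = "1+1=2 (true?)"                  # hypothetical text to find verbatim
haystack = "We know 1+1=2 (true?) holds."

# Unescaped, the + ( ) ? characters act as operators, so nothing matches.
unescaped = re.search(literal, haystack)

# re.escape plays the role of wrapping the text in \Q...\E.
escaped_pattern = re.escape(literal)
escaped = re.search(escaped_pattern, haystack)

print(unescaped, escaped.group())
```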
Boundaries
Boundary tokens "anchor" your pattern to an edge, but do not consume any characters themselves
\b
Word boundary (any position between a \w character and a \W character, or between a \w character and the start or end of the string)
\B
non-word boundaries
\A
Start of string
\Z
End of string
\<
Start of word
\>
End of word
^
the beginning of the line
$
The end of the line
Example: \bcat\b
finds a match in "the cat in the hat" but not in "locate"
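A minimal Python sketch of the anchors above. Python's re module does not support \< and \>, so they are left out. The two-line sample string is made up, and re.MULTILINE is needed for ^ and $ to match at every line rather than only at the ends of the string.

```python
import re

word = re.compile(r"\bcat\b")
in_sentence = word.search("the cat in the hat")   # standalone word: match
in_locate = word.search("locate")                 # "cat" inside a word: no match

lines = "first line\nsecond line"                 # hypothetical sample text
line_starts = re.findall(r"^\w+", lines, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", lines, flags=re.MULTILINE)

# \A anchors to the start of the whole string, even with re.MULTILINE.
string_start = re.findall(r"\A\w+", lines, flags=re.MULTILINE)

print(in_sentence.span(), in_locate, line_starts, line_ends, string_start)
```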
Quantifiers
By default, a quantifier applies only to the single character (or token) immediately before it. Use (...) to set the quantifier's scope explicitly
X*
0 or more repetitions of X
X+
1 or more repetitions of X
X?
0 or 1 instances of X
X{m}
Exactly m instances of X
X{m,}
At least m instances of X
X{m,n}
Between m and n (inclusive) instances of X
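The quantifiers can be compared side by side in a small Python sketch. re.fullmatch is used so that the whole string must match, and the helper name matches is hypothetical.

```python
import re

def matches(pattern: str, s: str) -> bool:
    """Return True if the entire string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

star = [matches(r"ab*", s) for s in ("a", "abbb")]             # 0 or more b
plus = [matches(r"ab+", s) for s in ("a", "abbb")]             # 1 or more b
optional = [matches(r"colou?r", s) for s in ("color", "colour")]
exact = [matches(r"\d{3}", s) for s in ("12", "123")]
at_least = [matches(r"\d{2,}", s) for s in ("1", "12345")]
between = [matches(r"\d{2,4}", s) for s in ("1234", "12345")]

# Scope: without parentheses, + applies only to the "b".
single_char_scope = matches(r"ab+", "abab")
group_scope = matches(r"(ab)+", "abab")

print(star, plus, optional, exact, at_least, between)
print(single_char_scope, group_scope)
```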
Disjunction
(X|Y)
X or Y
Example: \b(cat|dog)s\b
matches cats and dogs.
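The disjunction example can be run as a short Python sketch. The sample sentence is made up. Because the alternatives sit in a capturing group, group(0) gives the whole match and group(1) gives only the alternative that was chosen.

```python
import re

text = "cats and dogs chase birds; one cat, one dog"   # hypothetical sample
pattern = re.compile(r"\b(cat|dog)s\b")

whole_matches = [m.group(0) for m in pattern.finditer(text)]
chosen_alternatives = [m.group(1) for m in pattern.finditer(text)]

print(whole_matches, chosen_alternatives)
```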
Special Characters
The characters { } [ ] ( ) ^ $ . | * + ? \
(and - inside [...]) have special meaning in regex, so they must be "escaped" with \ to match them literally
Example: \.
matches the period .
and \\
matches the backslash \
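A minimal Python sketch of escaping special characters, using made-up sample strings. Raw strings (r"...") are used so that Python itself does not interpret the backslashes before the regex engine sees them.

```python
import re

any_char = re.findall(r".", "a.b")       # unescaped dot matches every character
only_dots = re.findall(r"\.", "a.b.c")   # escaped dot matches only "."

backslashes = re.findall(r"\\", r"C:\temp\file")   # \\ matches one backslash

# Inside [...], an unescaped - forms a range; an escaped one is literal.
hyphen_range = re.findall(r"[a-z]", "a-z m")
hyphen_literal = re.findall(r"[a\-z]", "a-z m")

print(any_char, only_dots, backslashes, hyphen_range, hyphen_literal)
```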
Backreferences
Count your opening parentheses (
from the left, starting with 1. Whatever is matched by parenthesis number n
can be referenced later by \n
Example: \b(\w+) \1\b
matches two identical words with a space in between
Example 2: \b(\w+)er\b
and replacing with more \1
will map "the taller man" -> "the more tall man" and "I am shorter" -> "I am more short"
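Both backreference examples can be reproduced with a short Python sketch, assuming the standard re module. The first sample sentence is made up. In re.sub, the replacement string can refer to group 1 as \1.

```python
import re

# \1 inside the pattern must repeat exactly what group 1 matched.
doubled = re.search(r"\b(\w+) \1\b", "it was was a typo").group()

# \1 inside the replacement re-inserts what group 1 matched.
comparative = re.compile(r"\b(\w+)er\b")
taller = comparative.sub(r"more \1", "the taller man")
shorter = comparative.sub(r"more \1", "I am shorter")

print(doubled)
print(taller)
print(shorter)
```

The substitution is purely textual: it rewrites any word ending in "er", so a word such as "never" would also be changed.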