RegEx

Regular expressions (regex or regexp) are patterns used to match character combinations in strings. They are powerful tools for pattern matching, validation, and text processing across many programming languages.

6 Categories 18 Sections 54 Examples

Getting Started

Fundamental regex concepts and basic pattern matching techniques.

Basic Syntax

Introduction to regex pattern matching, flags, and basic syntax.

Simple pattern matching

Tests if the string contains the literal pattern 'abc'.

Code
abc
Execution
/abc/.test('abcdef')
Input
abcdef
Output
true
  • Simple patterns match literal characters in order.
  • The test() method returns true if pattern is found.

Case-insensitive matching

The 'i' flag makes the pattern case-insensitive.

Code
/hello/i
Execution
/hello/i.test('HELLO world')
Input
HELLO world
Output
true
  • The 'i' flag ignores case when matching.
  • Useful for user input validation.

Global matching

The 'g' flag finds all matches, not just the first one.

Code
/a/g
Execution
'banana'.match(/a/g)
Input
banana
Output
['a', 'a', 'a']
  • Without 'g', only the first match is returned.
  • 'g' is essential for global replacements.
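The three basics above can be tried side by side. This is a minimal sketch; the variable names and sample strings are illustrative and not part of the original examples.

```javascript
// Literal pattern: matches 'abc' anywhere in the input.
const hasAbc = /abc/.test('abcdef');

// Flags can be combined: 'g' collects every match, 'i' ignores case.
const allAs = 'Banana BANDANA'.match(/a/gi);

// Without 'g', match() returns only the first match, plus its index.
const firstA = 'banana'.match(/a/);

console.log(hasAbc);                  // true
console.log(allAs.length);            // 6
console.log(firstA[0], firstA.index); // a 1
```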

Character Classes

Matching sets of characters using brackets, ranges, and negation.

Matching character sets

Matches any single vowel character from the set.

Code
[aeiou]
Execution
/[aeiou]/.test('hello')
Input
hello
Output
true
  • [abc] matches any one character from the set.
  • A class matches exactly one character, whichever member appears at that position.

Character ranges

Ranges match characters within specified inclusive boundaries.

Code
[a-z], [A-Z], [0-9], [a-zA-Z0-9]
Execution
/[a-z]+/.test('abc')
Input
abc
Output
true
  • [a-z] matches lowercase letters.
  • [0-9] matches numeric digits.

Negated character class

The '^' at the start means NOT, matching any non-digit character.

Code
[^0-9]
Execution
/[^0-9]/.test('abc7')
Input
abc7
Output
true
  • [^abc] matches any character except a, b, or c.
  • ^ must be the first character in the class.
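As a rough illustration of sets, ranges, and negation working together, the sketch below uses made-up inputs and a hypothetical helper name (isHex). It also borrows the ^ and $ anchors covered later so that the whole string is checked.

```javascript
// Set: collect every vowel.
const vowels = 'regular expression'.match(/[aeiou]/g);

// Ranges: every character must be a hex digit (anchors force a full match).
const isHex = (s) => /^[0-9a-fA-F]+$/.test(s);

// Negation: remove everything that is not a digit.
const digitsOnly = '(555) 123-4567'.replace(/[^0-9]/g, '');

console.log(vowels.length);   // 7
console.log(isHex('ff00A1')); // true
console.log(isHex('xyz'));    // false
console.log(digitsOnly);      // 5551234567
```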

Shorthand Classes

Using shorthand character class escapes like \d, \w, \s for common patterns.

Digit matching with \d

\d matches any digit and + means one or more, so \d+ extracts each run of digits. The dot in 25.99 is not a digit, so the price is split into two matches.

Code
\d+
Execution
'Price: $25.99'.match(/\d+/g)
Input
Price: $25.99
Output
['25', '99']
  • \d is equivalent to [0-9].
  • \D matches non-digits.

Word character matching

\w matches word characters (letters, digits, underscores).

Code
\w+
Execution
'hello_world123'.match(/\w+/g)
Input
hello_world123
Output
['hello_world123']
  • \w matches [a-zA-Z0-9_].
  • \W matches non-word characters.

Whitespace matching

\s matches any whitespace character (space, tab, newline).

Code
\s+
Execution
'hello world'.split(/\s+/)
Input
hello world
Output
['hello', 'world']
  • \s covers [ \t\n\r\f\v] and, in JavaScript, other Unicode whitespace such as the non-breaking space.
  • \S matches non-whitespace.
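The shorthand classes are often combined on the same input. The following sketch assumes an invented sample line and shows how \s, \d, and \w each slice it differently.

```javascript
const line = 'item_42   qty: 7\tprice: 19';

// \s+ splits on any run of spaces or tabs.
const fields = line.split(/\s+/);

// \d+ pulls out each run of digits.
const numbers = line.match(/\d+/g);

// \w+ pulls out runs of letters, digits, and underscores (the colons are dropped).
const words = line.match(/\w+/g);

console.log(fields);  // [ 'item_42', 'qty:', '7', 'price:', '19' ]
console.log(numbers); // [ '42', '7', '19' ]
console.log(words);   // [ 'item_42', 'qty', '7', 'price', '19' ]
```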

Anchors and Boundaries

Using anchors to match positions in strings and word boundaries.

Anchors

Using ^ and $ to match start and end of strings or lines.

Start of string anchor

^ anchors the pattern to the start of the string.

Code
^hello
Execution
/^hello/.test('hello world')
Input
hello world
Output
true
  • ^ at the start of the pattern anchors the match to the start of the string; inside [...] it means negation instead.
  • Only matches if 'hello' is at position 0.

End of string anchor

$ anchors the pattern to the end of the string.

Code
world$
Execution
/world$/.test('hello world')
Input
hello world
Output
true
  • Only matches if 'world' is at the very end.
  • Useful for validating complete strings.

Exact string matching

Both ^ and $ ensure the entire string matches exactly.

Code
^hello world$
Execution
/^hello world$/.test('hello world')
Input
hello world
Output
true
  • Useful for strict validation.
  • The string must be exactly 'hello world'.
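The difference between an unanchored and a fully anchored pattern matters most in validation. A small sketch, with illustrative variable names:

```javascript
// Unanchored: succeeds if digits appear anywhere in the input.
const containsDigits = /\d+/;

// Anchored at both ends: the whole input must be digits.
const onlyDigits = /^\d+$/;

console.log(containsDigits.test('abc123')); // true
console.log(onlyDigits.test('abc123'));     // false
console.log(onlyDigits.test('123'));        // true
```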

Word Boundaries

Detecting word boundaries with \b and \B.

Matching whole words only

\b ensures 'cat' is a whole word, not part of another word.

Code
\bcat\b
Execution
/\bcat\b/.test('concatenate')
Input
concatenate
Output
false
  • \b matches the position between a word character and a non-word character, including the start or end of the string.
  • Prevents partial matches within larger words.

Word boundary matching

Matches 'cat' only when it's a standalone word.

Code
\bcat\b
Execution
/\bcat\b/.test('the cat sat')
Input
the cat sat
Output
true
  • Works with word boundaries around punctuation too.
  • Very useful for word-based search and replace.

Non-word boundary

\B matches when NOT at a word boundary (inside a word).

Code
\Bcat\B
Execution
/\Bcat\B/.test('concatenate')
Input
concatenate
Output
true
  • \B is the opposite of \b.
  • Useful for finding patterns within words.
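A typical use of \b is replacing a whole word without touching longer words that contain it. The sample text below is invented for illustration.

```javascript
const text = 'cat concatenate cat, bobcat';

// \b on both sides: only standalone 'cat' is replaced.
const wholeWord = text.replace(/\bcat\b/g, 'dog');

// \B on both sides: only 'cat' strictly inside a longer word is found.
const inner = text.match(/\Bcat\B/g);

console.log(wholeWord); // dog concatenate dog, bobcat
console.log(inner);     // [ 'cat' ]  (the one inside 'concatenate')
```

The 'cat' at the end of 'bobcat' matches neither pattern: it has a word character before it and the end of the string after it.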

Multiline Matching

Using the multiline flag to match across multiple lines.

Single-line mode (default)

Without m flag, ^ only matches start of entire string.

Code
/^hello/
Execution
/^hello/.test('foo\nhello')
Input
foo
hello
Output
false
  • 'hello' on the second line doesn't match ^hello without the m flag.

Multiline mode with flag

With m flag, ^ matches after newlines too, not just string start.

Code
/^hello/m
Execution
/^hello/m.test('foo\nhello')
Input
foo
hello
Output
true
  • The 'm' flag makes ^ and $ line-aware.
  • Useful for multiline text processing.

Matching line patterns

Matches entire 'Error' line using multiline anchors.

Code
/^Error:.*/m
Execution
/^Error:.*/m.test('Info\nError: Failed')
Input
Info
Error: Failed
Output
true
  • Combine m flag with ^ and $ for line-based patterns.
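Combining the m flag with g extracts every matching line rather than only testing for one. A sketch with a made-up log string:

```javascript
const log = 'Info: started\nError: disk full\nInfo: retry\nError: timeout';

// With 'm', ^ and $ match at each line boundary; 'g' collects all lines.
const errors = log.match(/^Error:.*$/gm);

// Without 'm', ^ only matches at index 0, where the text is 'Info'.
const withoutM = log.match(/^Error:.*$/g);

console.log(errors);   // [ 'Error: disk full', 'Error: timeout' ]
console.log(withoutM); // null
```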

Quantifiers and Repetition

Specifying how many times elements should match.

Quantifiers

Using *, +, ?, and {n,m} to specify repetition counts.

Zero or more matches

'*' matches zero or more occurrences, so the pattern succeeds even when the input contains no 'a'.

Code
a*
Execution
/a*/.test('bbb')
Input
bbb
Output
true
  • a* matches '', 'a', 'aa', 'aaa', etc.
  • Always matches because * includes 0 occurrences.

One or more matches

'+' matches one or more occurrences. At least one required.

Code
a+
Execution
/a+/.test('aaa')
Input
aaa
Output
true
  • a+ requires at least one 'a'.
  • a+ doesn't match empty string.

Exact quantity with braces

'{3}' matches exactly 3 occurrences.

Code
a{3}
Execution
/a{3}/.test('aaaaaa')
Input
aaaaaa
Output
true
  • a{3} matches exactly 'aaa'; unanchored, it also succeeds inside a longer run such as 'aaaaaa'.
  • a{1,3} matches 1 to 3 occurrences.
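Quantifiers become strict length checks once the pattern is anchored. The sketch below uses illustrative names and also shows the ? quantifier, which makes the preceding element optional.

```javascript
const fiveDigits = /^\d{5}$/;  // exactly five digits
const oneToThree = /^a{1,3}$/; // between one and three 'a'
const optionalU = /^colou?r$/; // ? allows zero or one 'u'

console.log(fiveDigits.test('12345'));  // true
console.log(fiveDigits.test('123456')); // false
console.log(oneToThree.test('aaaa'));   // false
console.log(optionalU.test('color'));   // true
console.log(optionalU.test('colour'));  // true
```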

Greedy vs Lazy Matching

Understanding greedy and lazy (non-greedy) quantifiers.

Greedy matching

Greedy .* matches as much as possible, stopping at last 'b'.

Code
a.*b
Execution
'axxxbxxxb'.match(/a.*b/)
Input
axxxbxxxb
Output
['axxxbxxxb']
  • .* is greedy; it matches from first 'a' to last 'b'.
  • Quantifiers are greedy by default.

Lazy matching

Lazy .*? matches as little as possible, stopping at first 'b'.

Code
a.*?b
Execution
'axxxbxxxb'.match(/a.*?b/)
Input
axxxbxxxb
Output
['axxxb']
  • .*? is lazy; it matches from first 'a' to first 'b'.
  • Add ? after any quantifier to make it lazy.

Lazy with + quantifier

a+? matches minimally - just one 'a' instead of all.

Code
a+?
Execution
'aaaa'.match(/a+?/)
Input
aaaa
Output
['a']
  • Adding ? makes any quantifier lazy.
  • Lazy quantifiers match minimum instead of maximum.
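The greedy and lazy difference shows up clearly when extracting delimited pieces such as tags. This is only a sketch on an invented string; regex is not a general HTML parser.

```javascript
const html = '<b>bold</b> and <i>italic</i>';

// Greedy: runs from the first '<' to the last '>'.
const greedy = html.match(/<.+>/)[0];

// Lazy: stops at the nearest '>' each time.
const lazy = html.match(/<.+?>/g);

console.log(greedy); // <b>bold</b> and <i>italic</i>
console.log(lazy);   // [ '<b>', '</b>', '<i>', '</i>' ]
```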

Alternation

Using | to match one pattern from multiple choices.

Simple alternation

| means OR - matches either 'cat' or 'dog'.

Code
cat|dog
Execution
/cat|dog/.test('I have a cat')
Input
I have a cat
Output
true
  • cat|dog matches 'cat' or 'dog'.
  • Leftmost match wins if multiple alternatives match.

Multiple alternation options

Matches any of the three colors.

Code
red|green|blue
Execution
/red|green|blue/.test('the sky is blue')
Input
the sky is blue
Output
true
  • Use | to separate multiple alternatives.
  • Order matters; first matching option is used.

Alternation within groups

Parentheses group alternatives; must match before 'Smith'.

Code
(Mr|Ms|Mrs) Smith
Execution
/^(Mr|Ms|Mrs) Smith$/.test('Ms Smith')
Input
Ms Smith
Output
true
  • (cat|dog) applies alternation to grouped part only.
  • Without (), cat|dog box matches 'cat' or 'dog box'.
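The two notes about grouping and ordering can be checked directly. In this sketch the patterns and inputs are illustrative.

```javascript
// Grouped: the alternation covers only 'cat' or 'dog'.
const grouped = /^(cat|dog) box$/;

// Ungrouped: alternation splits the whole pattern into '^cat' OR 'dog box$'.
const ungrouped = /^cat|dog box$/;

console.log(grouped.test('cat box'));   // true
console.log(grouped.test('catalog'));   // false
console.log(ungrouped.test('catalog')); // true  (matches '^cat')

// Order matters: at the same position the first alternative that matches wins.
const firstAlt = 'JavaScript'.match(/Java|JavaScript/)[0];
console.log(firstAlt); // Java
```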

Groups and Capture

Using parentheses for grouping and capturing matched text.

Capturing Groups

Using parentheses to capture and reference matched text.

Basic capturing group

Parentheses create capture groups. Result includes full match and each group.

Code
(\w+) (\w+)
Execution
'hello world'.match(/(\w+) (\w+)/)
Input
hello world
Output
['hello world', 'hello', 'world']
  • Group 1 captures 'hello', Group 2 captures 'world'.
  • Array includes full match at index 0.

Backreference in pattern

\1 refers back to what the first group captured.

Code
(\w+) \1
Execution
/(\w+) \1/.test('hello hello')
Input
hello hello
Output
true
  • \1 references the first group.
  • Useful for matching repeated patterns.

Replace with capture groups

$1, $2, etc. reference captured groups in replacement.

Code
(\w+) (\w+)
Execution
'hello world'.replace(/(\w+) (\w+)/, '$2 $1')
Input
hello world
Output
world hello
  • $& inserts the full match; JavaScript does not support $0.
  • Useful for rearranging captured text.
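Capture groups, backreferences, and replacement references can be sketched together. The date string and sentence below are invented examples.

```javascript
// Reorder captured parts: YYYY-MM-DD to MM/DD/YYYY.
const us = '2024-03-15'.replace(/(\d{4})-(\d{2})-(\d{2})/, '$2/$3/$1');

// $& in the replacement stands for the whole match.
const wrapped = 'hello world'.replace(/\w+/g, '[$&]');

// \1 inside the pattern finds a word repeated immediately.
const doubled = 'this is is a test'.match(/\b(\w+) \1\b/);

console.log(us);         // 03/15/2024
console.log(wrapped);    // [hello] [world]
console.log(doubled[1]); // is
```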

Non-Capturing Groups

Using parentheses for grouping without capturing.

Non-capturing group syntax

(?:...) groups without capturing the match.

Code
(?:cat|dog)
Execution
/(?:cat|dog)/.test('I have a cat')
Input
I have a cat
Output
true
  • (?:...) works like (...) but doesn't capture.
  • Useful when you only need grouping, not extraction.

Non-capturing vs capturing

Non-capturing groups don't create extra array entries.

Code
(?:foo|bar) baz vs (foo|bar) baz
Execution
'foo baz'.match(/(?:foo|bar) baz/)
Input
foo baz
Output
['foo baz']
  • Capturing group would create index [1].
  • Non-capturing is slightly more efficient.

Complex non-capturing groups

Groups pattern without capturing each domain segment.

Code
\b(?:\w+\.)+com\b
Execution
/\b(?:\w+\.)+com\b/.test('example.com')
Input
example.com
Output
true
  • Useful for repeated grouping patterns.
  • Makes regex cleaner when captures not needed.
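To see the effect on the result array, the sketch below runs the same match with both group types and then mixes them so that only the interesting part is captured. The 'Released' string is a made-up input.

```javascript
const capturing = 'foo baz'.match(/(foo|bar) baz/);
const nonCapturing = 'foo baz'.match(/(?:foo|bar) baz/);

console.log(capturing.length);    // 2  (full match + group 1)
console.log(nonCapturing.length); // 1  (full match only)

// Mixed: group the month names without capturing, capture only the year.
const release = 'Released: Mar 2024'.match(/(?:Jan|Feb|Mar) (\d{4})/);
console.log(release[1]); // 2024
```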

Lookahead and Lookbehind

Using assertions to match patterns with conditional lookahead/lookbehind.

Positive lookahead

(?=...) matches only if followed by the pattern, but doesn't consume it.

Code
\w+(?=@)
Execution
'user@example.com'.match(/\w+(?=@)/)
Input
user@example.com
Output
['user']
  • Matches 'user' only if followed by @.
  • The @ is not included in the match.

Negative lookahead

(?!...) matches only if NOT followed by the pattern.

Code
\w+(?!@)
Execution
'example@'.match(/\w+(?!@)/)
Input
example@
Output
['exampl']
  • \w+ backtracks to 'exampl', the longest run of word characters not directly followed by @.
  • Useful for exclusion patterns.

Lookbehind assertion

(?<=...) matches only if preceded by the pattern.

Code
(?<=\$)\d+
Execution
'Price: $50'.match(/(?<=\$)\d+/)
Input
Price $50
Output
['50']
  • Matches digits only if preceded by $.
  • The $ is not included in the match.
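The assertions above can be compared on a few invented inputs. One point worth seeing in running code is that a negative lookahead after a greedy \w+ lets the engine backtrack to a shorter match; adding \b is one possible way to prevent that, shown here as a sketch.

```javascript
// Lookbehind: digits preceded by '$' only.
const dollars = 'Items: $50, EUR 30, $120'.match(/(?<=\$)\d+/g);

// Lookahead: word characters directly before '@'.
const user = 'user@example.com'.match(/\w+(?=@)/)[0];

// Negative lookahead backtracks: 'exampl' is followed by 'e', not '@'.
const partial = 'example@'.match(/\w+(?!@)/)[0];

// With \b the word cannot be shortened, so 'alice' is skipped entirely.
const whole = 'alice@ bob'.match(/\b\w+\b(?!@)/)[0];

console.log(dollars); // [ '50', '120' ]
console.log(user);    // user
console.log(partial); // exampl
console.log(whole);   // bob
```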

Escaped Characters and Special Sequences

Escaping special characters and using special sequences.

Escape Sequences

Using backslash to escape metacharacters and special characters.

Escaping metacharacters

Backslash escapes special characters so they match literally.

Code
\. \* \+ \?
Execution
/\./.test('end.')
Input
end.
Output
true
  • \. matches literal dot, not any character.
  • Any metacharacter needs a backslash when it should match literally.

Escaping brackets and parentheses

Escape brackets and parentheses to match them literally.

Code
\( \) \[ \] \{ \}
Execution
/\(test\)/.test('(test)')
Input
(test)
Output
true
  • \( matches literal ( not a group.
  • All bracket types need escaping.

Escaping dollar and caret

Escape $ and ^ when you need literal matches.

Code
\$ \^
Execution
/\$/.test('cost: $50')
Input
cost: $50
Output
true
  • \$ matches literal $.
  • \^ matches literal ^.
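When the text to match comes from a variable, escaping by hand is not practical. The helper below, escapeRegExp, is a hypothetical name for a commonly used approach: it puts a backslash in front of every metacharacter before the string is passed to the RegExp constructor.

```javascript
// Hypothetical helper: escape every regex metacharacter in a plain string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const needle = '$5.00 (approx)';
const literal = new RegExp(escapeRegExp(needle));

console.log(escapeRegExp(needle));                  // \$5\.00 \(approx\)
console.log(literal.test('Total: $5.00 (approx)')); // true
console.log(literal.test('Total: $5x00 (approx)')); // false (dot is literal)

// Unescaped parentheses form a group, so this matches plain 'test'.
console.log(/(test)/.test('test'));   // true
console.log(/\(test\)/.test('test')); // false
```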

Character Escape

Escaping specific characters and special sequences like tabs and newlines.

Tab and newline escapes

\t matches tab, \n matches newline, \r matches carriage return.

Code
\t, \n, \r
Execution
/\t/.test('name\tvalue')
Input
name value
Output
true
  • These are whitespace character escapes.
  • Useful for parsing structured data.

Matching whitespace patterns

Matches Windows-style line endings (CRLF).

Code
\r\n
Execution
/\r\n/.test('line1\r\nline2')
Input
line1
line2
Output
true
  • \r\n is Windows line ending.
  • \n is Unix line ending.

Null and other escapes

Special escapes for null, vertical tab, and form feed.

Code
\0, \v, \f
Execution
/\0/.test('null\0char')
Input
nullchar
Output
true
  • \0 matches null character.
  • \v is vertical tab, \f is form feed.

Unicode and Special Sequences

Matching Unicode characters and special named sequences.

Unicode hex escape

\u0041 represents Unicode character 'A' (U+0041).

Code
\uXXXX, \u0041
Execution
/\u0041/.test('ABC')
Input
ABC
Output
true
  • Unicode escapes use 4 hex digits.
  • Useful for matching international characters.

Unicode codepoint escape

\u{...} with u flag matches Unicode by codepoint with variable length.

Code
\u{XXXXX}, \u{1F600}
Execution
/\u{1F600}/u.test('😀')
Input
😀
Output
true
  • Requires 'u' flag for proper surrogate pair handling.
  • Supports emoji and beyond-BMP characters.

Unicode property escapes

\p{...} matches Unicode character properties with 'u' flag.

Code
\p{Letter}, \P{Number}
Execution
/\p{Letter}/u.test('café')
Input
café
Output
true
  • Requires 'u' flag.
  • Very powerful for international text.
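The effect of the u flag can be sketched as follows. String escapes are used for the non-ASCII characters so the example does not depend on file encoding; the sample words are illustrative.

```javascript
const grin = '\u{1F600}'; // the emoji U+1F600

// With 'u', \u{...} is a code point escape.
const withU = /\u{1F600}/u.test(grin);

// Without 'u', the same pattern is not treated as a code point escape.
const withoutU = /\u{1F600}/.test(grin);

// \p{Letter} covers accented letters; \w does not.
const letters = 'caf\u00E9 123 na\u00EFve'.match(/\p{Letter}+/gu);
const asciiWord = 'caf\u00E9'.match(/\w+/)[0];

console.log(withU);     // true
console.log(withoutU);  // false
console.log(letters);   // [ 'café', 'naïve' ]
console.log(asciiWord); // caf
```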

Practical Examples and Flags

Common real-world patterns, flags, and string operations.

Common Patterns

Real-world regex patterns for validation and matching.

Email validation pattern

Basic email pattern matching username@domain.extension.

Code
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Execution
/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test('user@example.com')
Input
user@example.com
Output
true
  • This is simplified; RFC 5322 is more complex.
  • Works for most common email formats.

URL matching pattern

Matches HTTP and HTTPS URLs with domain validation.

Code
^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$
Execution
/^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$/.test('https://example.com')
Input
https://example.com
Output
true
  • Simplified example; full URL regex is quite complex.
  • Use URL parsing libraries for production.

Phone number pattern (US format)

Matches various US phone number formats.

Code
^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$
Execution
/^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/.test('(555) 123-4567')
Input
(555) 123-4567
Output
true
  • Handles parentheses, dashes, dots, and spaces.
  • Captures area code, exchange, and line number.
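Because the pattern captures the three parts, it can also normalize input rather than only validate it. The function name normalizePhone and the output format are assumptions made for this sketch.

```javascript
const phone = /^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/;

// Hypothetical helper: return 'AAA-EEE-LLLL', or null if the input is invalid.
function normalizePhone(input) {
  const m = input.match(phone);
  return m ? `${m[1]}-${m[2]}-${m[3]}` : null;
}

console.log(normalizePhone('(555) 123-4567')); // 555-123-4567
console.log(normalizePhone('555.123.4567'));   // 555-123-4567
console.log(normalizePhone('5551234567'));     // 555-123-4567
console.log(normalizePhone('555-1234'));       // null
```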

Regex Flags

Using flags to modify regex behavior globally.

Global flag (g)

The 'g' flag finds all matches, not just the first.

Code
/pattern/g
Execution
'hello hello'.match(/hello/g)
Input
hello hello
Output
['hello', 'hello']
  • Without 'g', only first match is returned.
  • Essential for replace-all operations.

Case-insensitive flag (i)

The 'i' flag ignores case when matching.

Code
/pattern/i
Execution
/HELLO/i.test('hello')
Input
hello
Output
true
  • Useful for case-insensitive searches.
  • Letter case is ignored in both the pattern and the input.

Multiline and Dotall flags (m, s)

'm' makes ^ and $ match at the start and end of each line. 's' makes . match newline characters as well.

Code
/pattern/m, /pattern/s
Execution
/^test/m.test('\ntest')
Input
\ntest
Output
true
  • The 'm' flag makes ^ and $ work per line in multiline text.
  • The 's' flag lets . match newline characters too.
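The execution above only demonstrates 'm'. A short sketch of both flags on an invented three-line string:

```javascript
const block = 'start\nmiddle\nend';

// By default . stops at a newline, so the pattern cannot span lines.
const withoutS = /start.*end/.test(block);

// With 's', . also matches '\n'.
const withS = /start.*end/s.test(block);

// With 'm' (and 'g'), ^ and $ match around every line.
const lines = block.match(/^\w+$/gm);

console.log(withoutS); // false
console.log(withS);    // true
console.log(lines);    // [ 'start', 'middle', 'end' ]
```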

String Operations with Regex

Using regex with JavaScript string methods.

Test method

test() returns true if pattern matches, false otherwise.

Code
/pattern/.test(string)
Execution
/hello/.test('hello world')
Input
hello world
Output
true
  • Returns boolean only.
  • Usually the cheapest option when only a yes/no answer is needed.

Match method

match() returns an array of all matches when the 'g' flag is set.

Code
string.match(/pattern/g)
Execution
'hello world'.match(/\w+/g)
Input
hello world
Output
['hello', 'world']
  • Returns null if no match found.
  • Without 'g', returns match with capture groups.

Replace method

replace() replaces first match. Use 'g' flag for all matches.

Code
string.replace(/pattern/g, 'replacement')
Execution
'hello world'.replace(/world/, 'universe')
Input
hello world
Output
hello universe
  • Can use $1, $2 for capture group references.
  • Can be used with callback functions.
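The notes on replace() and match() can be illustrated in a few lines. The inputs below are made up for this sketch.

```javascript
// Without 'g' only the first match is replaced; with 'g' all are.
const first = 'a-b-c'.replace(/-/, '+');
const all = 'a-b-c'.replace(/-/g, '+');

// Callback form: receives the match and returns the replacement.
const shout = 'hello world'.replace(/\w+/g, (word) => word.toUpperCase());

// Callback with capture groups: (fullMatch, group1, group2).
const swapped = 'hello world'.replace(/(\w+) (\w+)/, (m, a, b) => `${b} ${a}`);

// match() returns null, not an empty array, when nothing matches.
const none = 'nothing here'.match(/\d+/g);

console.log(first);   // a+b-c
console.log(all);     // a+b+c
console.log(shout);   // HELLO WORLD
console.log(swapped); // world hello
console.log(none);    // null
```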