Regex Cheat Sheet: Regular Expressions for Developers

Regular expressions are one of those tools that developers either love or avoid. Once you understand the building blocks, a pattern that looks like line noise starts to read like plain English. This cheat sheet walks through everything you need — from the simplest character match to lookahead assertions — and closes with a table of real-world patterns you can copy straight into your code.

What Is a Regular Expression?

A regular expression (regex) is a sequence of characters that defines a search pattern. Virtually every programming language supports them: JavaScript, Python, Go, Java, Ruby, Rust — all ship with a regex engine. You use them to validate input, extract substrings, replace text, and split strings in ways that simple indexOf or split calls can't handle cleanly.

A regex like /^\d4-\d2-\d2$/ immediately communicates "four digits, a dash, two digits, a dash, two digits — nothing else." That intent is harder to express with plain string operations.

Character Classes

Character classes match a single character from a defined set.

. — Any character except a newline (use the s flag to include newlines).
\d — Any digit: [0-9].
\D — Any non-digit.
\w — Any word character: [a-zA-Z0-9_].
\W — Any non-word character.
\s — Any whitespace: space, tab, newline, carriage return, form feed.
\S — Any non-whitespace character.
[abc] — Exactly one of a, b, or c.
[^abc] — Any character that is not a, b, or c.
[a-z] — Any lowercase letter. Combine ranges: [a-zA-Z0-9].

// Match a single hex digit
/[0-9a-fA-F]/

// Match anything that is not a digit or whitespace
/[^\d\s]/

Anchors

Anchors don't match characters — they match positions in the string.

^ — Start of the string (or start of a line in multiline mode).
$ — End of the string (or end of a line in multiline mode).
\b — Word boundary: the position between a word character and a non-word character.
\B — Non-word boundary.

// Only match "cat" as a whole word, not "concatenate"
/\bcat\b/

// Only match strings that are exactly five digits
/^\d{5}$/

Forgetting anchors is one of the most common regex mistakes. /\d+/ matches any string that contains digits, not one that is digits. Add ^ and $ when you want a full-string match.

Quantifiers: Greedy vs. Lazy

Quantifiers control how many times the preceding element must match.

* — Zero or more times.
+ — One or more times.
? — Zero or one time (makes the preceding element optional).
{n} — Exactly n times.
{n,} — At least n times.
{n,m} — Between n and m times (inclusive).

By default, quantifiers are greedy — they match as much as possible. Add ? after a quantifier to make it lazy (match as little as possible).

const html = '<b>bold</b> and <i>italic</i>';

// Greedy: matches from the first < to the LAST >
html.match(/<.+>/);   // '<b>bold</b> and <i>italic</i>'

// Lazy: matches the shortest possible sequence
html.match(/<.+?>/);  // '<b>'

Greedy matching can silently over-consume input. If your pattern isn't capturing what you expect, try making the quantifiers lazy first.

Groups

Groups let you apply quantifiers to a sequence of characters, capture submatches, and create named references.

(abc) — Capturing group. Matches abc and stores the match. Accessible as $1, $2, etc. in replacements.
(?:abc) — Non-capturing group. Groups without storing the match — useful when you need grouping for a quantifier but don't need the submatch.
(?<name>abc) — Named capturing group. Like a capturing group, but accessible by name (e.g., match.groups.name in JavaScript).
abc|def — Alternation. Matches either abc or def. Often used inside a group: (jpg|png|gif).

// Named groups for date parsing
const re = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const { year, month, day } = '2026-05-21'.match(re).groups;
// year = '2026', month = '05', day = '21'

Lookahead and Lookbehind

Lookaround assertions let you match something only if it is (or isn't) followed or preceded by another pattern — without including that surrounding text in the match.

(?=...) — Positive lookahead. Match only if what follows fits the pattern.
(?!...) — Negative lookahead. Match only if what follows does not fit.
(?<=...) — Positive lookbehind. Match only if what precedes fits.
(?<!...) — Negative lookbehind. Match only if what precedes does not fit.

// Extract the number after "price: " without including "price: " in the match
/(?<=price:\s)\d+(\.\d{2})?/

// Match a word NOT followed by a colon (i.e., not a key in a key: value pair)
/\w+(?!:)/

// Match "USD" only when preceded by a dollar amount
/(?<=\$[\d,]+\s?)USD/

Flags

Flags modify how the regex engine interprets the pattern. In JavaScript, they go after the closing slash: /pattern/flags.

g — Global. Find all matches rather than stopping after the first.
i — Case-insensitive. /hello/i matches Hello, HELLO, etc.
m — Multiline. ^ and $ match the start and end of each line, not just the whole string.
s — Dot-all. Makes . match newline characters too.
u — Unicode. Enables full Unicode matching (including surrogate pairs and \u{...} escapes). Always use this flag in modern code.

Real-World Patterns

The patterns below cover the most common validation tasks. Copy them directly, but remember that ultra-strict validation belongs on the server — for UX purposes, a pattern that catches obvious typos is usually enough.

Use Case	Pattern
Email address	`/^[^\s@]+@[^\s@]+\.[^\s@]+$/`
URL (http/https)	`/https?:\/\/[\w.-]+(?:\/[\w./?=%&-]*)?/`
IPv4 address	`/^(\d{1,3}\.){3}\d{1,3}$/`
US phone number	`/^\+?1?\s?$?\d{3}$?[\s.-]\d{3}[\s.-]\d{4}$/`
Date (YYYY-MM-DD)	`/^\d{4}-(0[1-9]\|1[0-2])-(0[1-9]\|[12]\d\|3[01])$/`
Hex color	`/^#([0-9a-fA-F]{3}\|[0-9a-fA-F]{6})$/`
UUID v4	`/^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i`

Common Mistakes

Forgetting anchors. /\d+/ matches "abc123". If you want to reject that, use /^\d+$/.
Unescaped special characters. Characters like . * + ? ( ) [ ] { } ^ $ | \ have special meaning. If you mean a literal dot, write \. not ..
Greedy over-matching. If your pattern consumes more than you intended, add ? after the quantifier or restrict the character class.
Catastrophic backtracking. Patterns like /(a+)+b/ can cause exponential backtracking on long non-matching strings, hanging your application. Avoid nested quantifiers over the same characters. Use atomic groups or possessive quantifiers if your engine supports them.
Testing on too few examples. A pattern that works on your three test strings may fail on edge cases — empty strings, Unicode characters, very long input, or inputs with special characters. Always test with a dedicated tool before shipping to production.

Quick Reference Summary

# Anchors
^        start of string
$        end of string
\b       word boundary

# Character classes
.        any char (except newline)
\d \w \s digit, word char, whitespace
[abc]    one of: a, b, c
[^abc]   not a, b, or c
[a-z]    range

# Quantifiers (add ? to make lazy)
*        0 or more
+        1 or more
?        0 or 1
{n,m}    between n and m times

# Groups
(abc)       capturing group
(?:abc)     non-capturing group
(?<name>)   named group
a|b         alternation

# Lookaround
(?=abc)     positive lookahead
(?!abc)     negative lookahead
(?<=abc)    positive lookbehind
(?<!abc)    negative lookbehind

# Flags (JS)
g  global    i  case-insensitive
m  multiline s  dot-all   u  unicode

Regular expressions reward time spent learning them. Start with the character classes and quantifiers, write patterns for real problems you encounter, and test everything with an interactive tool. The patterns that look intimidating today will feel natural after a few hours of practice.