Regex Cheat Sheet: All Patterns Explained
A complete regex reference: every operator, the difference between flavors (ECMAScript, PCRE, Python, Go RE2), and 30 patterns that cover ~95% of real-world matching tasks. Each pattern is shown with input, output, and a flavor compatibility note. Use this as a working reference — bookmark it, search-in-page for what you need, copy and adapt.
Most regex tutorials over-explain syntax and under-explain the engine differences that bite you in production. This guide goes the other way: short syntax recap, long pattern library, and explicit flavor warnings.
Core syntax recap (in 90 seconds)
- . — any character except newline (use the s flag for “dotall” mode, where the dot matches newlines too).
- \d — digit. Equivalent to [0-9] in most flavors. Unicode-aware by default in Python 3 str patterns; always [0-9] in ECMAScript, even with the u flag.
- \w — word character. [a-zA-Z0-9_] in most flavors.
- \s — whitespace (space, tab, newline, etc.).
- \D \W \S — uppercase = negated.
- [abc] — character class: a, b, or c. [a-z] — range. [^abc] — negation.
- | — alternation: cat|dog matches cat OR dog.
- ? — 0 or 1 occurrences. * — 0 or more. + — 1 or more.
- {n} — exactly n. {n,} — n or more. {n,m} — between n and m.
Anchors and boundaries
- ^ — start of string. With the m flag, start of line.
- $ — end of string. With the m flag, end of line.
- \b — word boundary (between \w and \W). \bcat\b matches “cat” but not “catalog”.
- \B — non-word boundary. \Bcat\B matches “concatenate” but not “cat box”.
- \A — absolute start of string (Python, PCRE). Not in ECMAScript.
- \Z / \z — absolute end of string (Python, PCRE). Not in ECMAScript. In PCRE, \Z also matches just before a final newline and \z matches only at the very end; Python's \Z matches only at the very end.
Most common gotcha: ^ and $ default to string start/end, not line start/end. To match line by line, add the multiline m flag: /^foo$/m.
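A minimal sketch of the gotcha above, using a made-up three-line input. The variable names are illustrative only.

```javascript
// Hypothetical sample input: three lines, only the middle one is exactly "foo".
const text = "food\nfoo\nbar foo";

// Without the m flag, ^ and $ anchor to the whole string, so nothing matches.
const wholeString = /^foo$/.test(text);

// With the m flag, ^ and $ also match at line breaks.
const perLine = /^foo$/m.test(text);

// With g and m together, every line that is exactly "foo" is returned.
const lines = text.match(/^foo$/gm);

console.log(wholeString, perLine, lines);
```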
Quantifiers: greedy vs lazy vs possessive
Three quantifier strategies in modern regex engines (not all flavors support all three):
- Greedy (default): match as much as possible, then back off. .* on “abc” matches “abc”.
- Lazy / reluctant: .*?, .+?. Match as little as possible. Useful for “match between delimiters” patterns.
- Possessive: .*+, .++. Like greedy but never give back. Fail-fast on no-match. Available in PCRE, Java, Ruby, and Python re from 3.11; NOT in ECMAScript or older Python versions.
Worked example on <b>hello</b> <b>world</b>:
- Greedy <b>.*</b> → matches the entire string (one big match).
- Lazy <b>.*?</b> → matches each <b>...</b> separately.
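The same worked example can be run directly in any ECMAScript engine. This is a small sketch; the variable names are arbitrary.

```javascript
const html = "<b>hello</b> <b>world</b>";

// Greedy: .* runs to the end of the string, then backs off to the last </b>.
const greedy = html.match(/<b>.*<\/b>/g);

// Lazy: .*? stops at the first </b> it can reach, so each tag pair matches separately.
const lazy = html.match(/<b>.*?<\/b>/g);

console.log(greedy); // one match covering the whole string
console.log(lazy);   // two matches, one per <b>...</b> pair
```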
Character classes and shortcuts
- [abc] — one of a, b, or c.
- [a-zA-Z0-9] — alphanumeric ASCII.
- [^abc] — NOT a, b, or c (one char).
- [\d.-] — digit, dot, or hyphen. Inside [], most metacharacters lose special meaning. A hyphen goes at the start or end to be literal.
- \p{Letter} — Unicode property class: any letter (Greek, Cyrillic, etc.). Requires the u flag in ECMAScript.
- \p{Number} — any Unicode numeric character: digits in any script (Arabic-Indic, Devanagari, etc.), plus forms such as Roman numerals and fractions. Use \p{Nd} for decimal digits only.
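A short sketch of Unicode property classes in ECMAScript, assuming an engine with ES2018 support. \p{Nd} is the decimal-digit subset of \p{Number}; the sample strings are illustrative.

```javascript
// \p{...} needs the u flag in ECMAScript.
const letters = /^\p{Letter}+$/u;
const decimalDigits = /^\p{Nd}+$/u;
const anyNumber = /^\p{Number}+$/u;

const cyrillicWord = "Привет";               // Cyrillic letters
const arabicIndic = "\u0663\u0664\u0665";    // Arabic-Indic digits 3, 4, 5
const romanEight = "\u2167";                 // Roman numeral eight (a "letter number")

console.log(letters.test(cyrillicWord));     // letters in any script match
console.log(letters.test("abc123"));         // digits are not letters
console.log(decimalDigits.test(arabicIndic)); // decimal digits in another script
console.log(anyNumber.test(romanEight), decimalDigits.test(romanEight));
```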
Groups, captures, backreferences
- (abc) — capturing group. Accessible as $1 in replace, match[1] in code.
- (?:abc) — non-capturing group. Same grouping behavior, no capture overhead.
- (?<name>abc) — named capture. Accessible as match.groups.name.
- \1 \2 ... — backreference to captured group. (a)\1 matches “aa”.
- \k<name> — backreference by name.
Worked example: extract user and domain from email. Pattern: (?<user>\w+)@(?<domain>[\w.-]+). On “hello@example.com”: match.groups.user === "hello", match.groups.domain === "example.com".
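The worked example can be sketched as runnable code. The null guard and the replacement string are additions for illustration, not part of the original pattern.

```javascript
const emailParts = /(?<user>\w+)@(?<domain>[\w.-]+)/;

const match = "hello@example.com".match(emailParts);
// match is null when nothing matches, so guard before reading groups.
const user = match ? match.groups.user : null;
const domain = match ? match.groups.domain : null;

// Named groups can also be referenced in replacement strings as $<name>.
const swapped = "hello@example.com".replace(emailParts, "$<domain> (user: $<user>)");

console.log(user, domain, swapped);
```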
Lookahead and lookbehind
Zero-width assertions: they check whether a position has certain context, but don’t consume characters.
- (?=...) — positive lookahead. foo(?=bar) matches “foo” only if followed by “bar”.
- (?!...) — negative lookahead. foo(?!bar) matches “foo” not followed by “bar”.
- (?<=...) — positive lookbehind. (?<=foo)bar matches “bar” preceded by “foo”.
- (?<!...) — negative lookbehind. (?<!foo)bar matches “bar” NOT preceded by “foo”.
Flavor support: lookahead works in ECMAScript, PCRE, and Python. Python re requires fixed-width lookbehind; the third-party regex module supports variable-width. ECMAScript has supported lookbehind of any width since ES2018. Go RE2 has no lookaround at all (linear-time guarantee).
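A small sketch of lookaround in ECMAScript (ES2018 or later for lookbehind). The price list is a made-up input.

```javascript
// Hypothetical price list used only for illustration.
const text = "USD 40, EUR 55, USD 7";

// Lookbehind: digits preceded by "USD ", without including "USD " in the match.
const usdAmounts = text.match(/(?<=USD )\d+/g);

// Negative lookbehind: digits NOT preceded by "USD " or by another digit.
// The \d alternative stops the engine from matching the "0" inside "40".
const otherAmounts = text.match(/(?<!USD |\d)\d+/g);

// Lookahead: a currency code only when a two-digit amount follows.
const twoDigitCodes = text.match(/[A-Z]{3}(?= \d{2}\b)/g);

console.log(usdAmounts, otherAmounts, twoDigitCodes);
```

Because the assertions are zero-width, the matched text never includes the "USD " prefix or the amount that follows the currency code.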
Engine differences (ECMAScript, PCRE, Python, Go)
| Feature | ECMAScript | PCRE / Perl | Python re | Go RE2 |
|---|---|---|---|---|
| Lookbehind | ES2018+ (any width) | Yes (fixed width per alternative; bounded variable width in recent PCRE2 and Perl) | Fixed-width only | NO |
| Possessive quantifiers | NO | Yes | 3.11+ | NO |
| Recursion / subroutines | NO | Yes | NO | NO |
| Named groups | (?<name>) | (?P<name>) or (?<name>) | (?P<name>) | (?P<name>) |
| Backtracking | Yes | Yes | Yes | NO (linear time) |
| Unicode property classes | With u flag | Yes | NO (third-party regex module only) | Yes |
Practical implication: a pattern that works in regex101.com’s PCRE mode may fail in your JavaScript code. Always test in the engine you’ll deploy to. The browser regex tester uses ECMAScript exactly as your production code will.
Common patterns: validation
Each pattern is in ECMAScript flavor unless noted. Translate as needed.
Email (pragmatic)
/^[\w.+-]+@[\w-]+\.[\w.-]+$/
Don’t try to match RFC 5321 — the full spec regex is 6,425 characters. The above accepts ~99.9% of real emails and rejects most invalid input. For bullet-proof validation, send a confirmation email instead.
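To see what "pragmatic" means in practice, here is a sketch that runs the pattern above against a few invented addresses. The last sample shows a malformed address the pattern still accepts, which is why the confirmation email matters.

```javascript
const EMAIL = /^[\w.+-]+@[\w-]+\.[\w.-]+$/;

// Hypothetical sample inputs.
const samples = [
  "hello@example.com",
  "first.last+tag@mail.example.co.uk",
  "no-at-sign.example.com",
  "a@b",
  "two@@example.com",
  "user@example..com", // malformed, but the pattern lets it through
];

const results = Object.fromEntries(samples.map((s) => [s, EMAIL.test(s)]));
console.log(results);
```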
URL (HTTP/HTTPS)
/^https?:\/\/[\w.-]+(?::\d+)?(?:\/[^\s]*)?$/
US phone number
/^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/
Matches: (415) 555-1234, 415-555-1234, 415.555.1234, 4155551234.
IPv4 address
/^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$/
Strong password (8+ chars, mixed case, digit, special)
/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$/
Better approach: skip composition rules entirely, require length 12+, and check against breach databases (HIBP). Modern security guidance has moved away from composition requirements.
ISO 8601 date (YYYY-MM-DD)
/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/
Hex color
/^#?(?:[0-9a-f]{3}|[0-9a-f]{6})$/i
Slug (URL-safe identifier)
/^[a-z0-9]+(?:-[a-z0-9]+)*$/
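The validation patterns above check shape, not meaning. The ISO 8601 date pattern, for example, accepts 2023-02-30. One possible way to close that gap is to pair the regex with a calendar round-trip. The helper name isRealDate is hypothetical.

```javascript
const ISO_DATE = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: format check first, then a calendar check.
function isRealDate(s) {
  if (!ISO_DATE.test(s)) return false;
  const [y, m, d] = s.split("-").map(Number);
  // setUTCFullYear rolls impossible days over into the next month
  // (Feb 30 becomes Mar 1 or 2), so comparing the parts afterwards
  // detects dates that do not exist.
  const dt = new Date(0);
  dt.setUTCFullYear(y, m - 1, d);
  return dt.getUTCFullYear() === y && dt.getUTCMonth() === m - 1 && dt.getUTCDate() === d;
}

console.log(isRealDate("2024-02-29"), isRealDate("2023-02-29"));
```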
Common patterns: extraction
Match between delimiters (lazy)
/<title>(.*?)<\/title>/
Caveat: don’t parse HTML with regex for anything beyond the simplest cases. Use DOMParser instead.
All numbers in a string
/-?\d+(?:\.\d+)?/g
Quoted strings (handles escaped quotes)
/"((?:[^"\\]|\\.)*)"/g
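A usage sketch for the quoted-string pattern, assuming an engine with String.prototype.matchAll (ES2020). The input is invented; String.raw keeps its backslashes literal.

```javascript
const QUOTED = /"((?:[^"\\]|\\.)*)"/g;

// Hypothetical input containing one plain and one escaped-quote string.
const input = String.raw`say "hello" and "a \"nested\" quote"`;

// matchAll yields one match object per hit; index 1 is the capture group.
const contents = [...input.matchAll(QUOTED)].map((m) => m[1]);

console.log(contents);
```

The captured text still contains the backslashes; unescaping them is a separate step.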
Hashtags from a tweet
/#\w+/g
Markdown link
/\[([^\]]+)\]\(([^)]+)\)/g
Captures: $1 = link text, $2 = URL.
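A sketch of both uses of those captures: collecting links as objects and rewriting them as HTML anchors. The sample text is invented, and the replacement does no HTML escaping, so treat it as illustrative only.

```javascript
const MD_LINK = /\[([^\]]+)\]\(([^)]+)\)/g;

// Hypothetical Markdown input.
const md = "See [the docs](https://example.com/docs) and [the FAQ](https://example.com/faq).";

// Extraction: one object per link, built from capture groups 1 and 2.
const links = Array.from(md.matchAll(MD_LINK), (m) => ({ text: m[1], url: m[2] }));

// Replacement: $1 and $2 refer to the same groups.
const asHtml = md.replace(MD_LINK, '<a href="$2">$1</a>');

console.log(links, asHtml);
```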
CSV row (simple, no embedded commas)
/[^,\n]+/g
For real CSV with quoted fields and embedded commas, use a CSV parser library.
Common patterns: replacement
Strip HTML tags
text.replace(/<[^>]+>/g, '')
Collapse multiple spaces
text.replace(/\s+/g, ' ').trim()
Convert camelCase to snake_case
text.replace(/([a-z])([A-Z])/g, '$1_$2').toLowerCase()
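The one-liner above only splits where a lowercase letter meets an uppercase one, so acronyms and digits slip through. The sketch below wraps it as toSnake and adds a hypothetical variant, toSnakeAcronyms, that is one possible way to handle those cases.

```javascript
// The original one-liner, wrapped in a function.
function toSnake(s) {
  return s.replace(/([a-z])([A-Z])/g, "$1_$2").toLowerCase();
}

// Hypothetical variant: also split after digits and before the last
// capital of an acronym run ("HTTPResponse" -> "HTTP_Response").
function toSnakeAcronyms(s) {
  return s
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2")
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1_$2")
    .toLowerCase();
}

console.log(toSnake("parseHTTPResponse"));         // acronym runs into the next word
console.log(toSnakeAcronyms("parseHTTPResponse")); // acronym separated
```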
Mask email middle
email.replace(/^(.{2}).*?(@.*)$/, '$1***$2')
Output: he***@example.com
Convert phone to E.164
text.replace(/[^\d]/g, '').replace(/^/, '+1')
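The one-liner above prepends +1 unconditionally, so an input that already carries the country code ends up with it twice. A slightly more defensive sketch, assuming US/Canada numbers only; toE164US is a hypothetical helper name.

```javascript
// Assumption: input is a US/Canada number, with or without a leading country code 1.
function toE164US(input) {
  const digits = input.replace(/\D/g, ""); // keep digits only
  if (digits.length === 10) return "+1" + digits;
  if (digits.length === 11 && digits.startsWith("1")) return "+" + digits;
  return null; // not a recognizable 10- or 11-digit number
}

console.log(toE164US("(415) 555-1234"));
console.log(toE164US("+1 415 555 1234"));
console.log(toE164US("555-1234"));
```

Returning null for anything else keeps malformed input from being silently turned into a plausible-looking number.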
Catastrophic backtracking and ReDoS
Regular Expression Denial of Service (ReDoS) is a real attack class. Vulnerable patterns have nested quantifiers that produce exponential paths on adversarial input. The classic example:
/^(a+)+$/
On input “aaaaaaaaaaaaaaaaaaa!” (the trailing ! keeps $ from ever matching), the regex engine tries every possible way to split the a characters between the inner and outer quantifier before giving up. Time grows as 2^n: 30 a’s means roughly 1 billion paths, enough for a 30+ second hang.
Common ReDoS patterns to audit:
- (a+)+, (a*)* — nested quantifiers on overlapping classes.
- (a|aa)+ — alternation with overlap.
- Email regex ^([a-zA-Z0-9._-]+)+@ — nested group with permissive inner quantifier.
Defenses: (1) Use Go RE2 or RE2-compatible engines (Cloudflare, Google’s open-source RE2 library) for untrusted input — linear time guarantee. (2) Add timeouts when running user-supplied patterns. (3) Use static analysis tools (rxxr2, safe-regex) to flag risky patterns. (4) Avoid nested quantifiers; prefer atomic groups (?>...) or possessive quantifiers where supported.
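As a sketch of the "avoid nested quantifiers" defense: both risky examples from this section can be flattened without changing what they accept. The inputs below are invented and kept short on purpose, so the risky patterns finish quickly.

```javascript
// Nested quantifier (risky) and a flattened equivalent for each example.
const riskyA = /^(a+)+$/;
const safeA = /^a+$/;
const riskyEmailPrefix = /^([a-zA-Z0-9._-]+)+@/;
const safeEmailPrefix = /^[a-zA-Z0-9._-]+@/;

// Short inputs only: the risky patterns are run here just to compare results.
const shortInputs = ["a", "aaaa", "", "aab", "aaaaaaaaaa!", "user.name@host", "user name@host"];
const sameResults = shortInputs.every(
  (s) =>
    riskyA.test(s) === safeA.test(s) &&
    riskyEmailPrefix.test(s) === safeEmailPrefix.test(s)
);

// The flattened pattern rejects long adversarial input without blowing up.
const adversarial = "a".repeat(50000) + "!";
const rejected = safeA.test(adversarial) === false;

console.log(sameResults, rejected);
```

Never run the risky variants against the long input; that is exactly the exponential case described above.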
Performance tips
- Anchor your patterns: ^abc is dramatically faster than abc on long input where matches start at position 0.
- Prefer character classes to alternation: [abc] is faster than a|b|c.
- Compile once, reuse many times: in Python re.compile() and Java Pattern.compile(), save the compiled pattern for hot loops. ECMAScript engines cache regex literals automatically.
- Use non-capturing groups (?:...) when you don’t need the capture: this saves memory.
- Profile before optimizing: most regex performance issues are catastrophic backtracking, not micro-optimization. Use a regex profiler.
Don’t do these
- Parse HTML with regex. HTML is recursive; regex isn’t. Use DOMParser, BeautifulSoup, jsoup, or html.parser.
- Parse JSON with regex. Use JSON.parse or your language’s equivalent.
- Match RFC 5321 emails with one regex. The proper regex is 6,425 chars; nobody actually uses it. Validate format with a pragmatic pattern, then send a confirmation email.
- Validate SQL identifiers with permissive regex. Use parameterized queries; don’t hand-roll SQL injection prevention.
- Match balanced delimiters with regex. Recursion is required; most regex engines don’t support it. Use a stack-based parser.
- Trust user-supplied regex without timeouts. ReDoS will hang your process.
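For the balanced-delimiters item, a stack-based check is only a few lines. This is a minimal sketch with a hypothetical function name, and it ignores delimiters inside string literals or comments.

```javascript
// Hypothetical helper: true when (), [] and {} are properly nested.
function isBalanced(text) {
  const closers = new Map([[")", "("], ["]", "["], ["}", "{"]]);
  const openers = new Set(closers.values());
  const stack = [];
  for (const ch of text) {
    if (openers.has(ch)) {
      stack.push(ch);
    } else if (closers.has(ch)) {
      // A closer must match the most recently opened delimiter.
      if (stack.pop() !== closers.get(ch)) return false;
    }
  }
  return stack.length === 0; // anything left open is unbalanced
}

console.log(isBalanced("f(a[1], {b: 2})"), isBalanced("(]"));
```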
The 80/20 takeaway
Master 6 things and you can handle ~95% of real-world regex tasks: character classes, quantifiers (greedy and lazy), anchors, capture groups, alternation, backreferences. The rest (lookaround, possessive quantifiers, atomic groups) is situational. Test in the exact engine you’ll deploy to (the regex tester uses ECMAScript). Audit any pattern that handles untrusted input for ReDoS. And always have a non-regex fallback ready — HTML parsers, JSON parsers, real CSV libraries — for cases regex can’t handle correctly.
Frequently asked questions
What's the difference between regex flavors?
Major engines: ECMAScript (browsers, Node.js), PCRE (PHP) and Perl, Python re, Go RE2, Java's java.util.regex, Ruby. Differences include lookbehind support (Go RE2 has none), recursion (PCRE/Perl; not ECMAScript, Python re, or Go), Unicode handling, and possessive quantifiers (PCRE, Java, Ruby, and Python re from 3.11). The same pattern can match in one flavor and fail in another. Always test in the exact engine you'll deploy to.
Why is my regex pattern hanging or timing out?
Likely catastrophic backtracking — a ReDoS pattern. Common culprits: nested quantifiers like (a+)+, alternation with overlap like (a|aa)+, permissive nested groups like ([a-z]+)+. Time grows exponentially with input length. Defenses: (1) rewrite the pattern to remove nested quantifiers, (2) use Go RE2 or RE2-compatible engines for untrusted input (linear-time guaranteed), (3) add execution timeouts, (4) run static analyzers like safe-regex to flag risky patterns.
How do I write a regex for emails properly?
Don't try for RFC 5321 perfection — the canonical regex is 6,425 characters. Use a pragmatic pattern like /^[\w.+-]+@[\w-]+\.[\w.-]+$/ that catches ~99.9% of real emails and rejects most invalid input. For high-stakes validation (signup forms): pragmatic regex first to filter typos, then send a confirmation email — only the inbox owner can click the link, which proves both syntactic AND deliverable validity. Don't combine validation regex with deliverability checks; separate concerns.
What's the fastest regex engine?
Go RE2 is the fastest for guaranteed worst-case performance (linear time, no catastrophic backtracking). It's used by Cloudflare, Google, and many search engines. Trade-off: no lookbehind, no recursion. For feature-rich speed, PCRE2 with JIT compilation is fastest. ECMAScript engines (V8 in Node/Chrome) are fast for most patterns due to heavy optimization but vulnerable to ReDoS on adversarial input. Python re is generally among the slower major engines; the third-party 'regex' module adds features and may be faster for some patterns, so benchmark it on your own workload.