BytePane

Regex Cheatsheet for Developers 2026

Complete regex reference: character classes (\d, \w, \s), quantifiers (?+*{n,m}), anchors (^$\b), groups, lookarounds. JavaScript, Python, Java, Go, Rust syntax differences. Performance pitfalls + testing tools.

By BytePane Team · Updated April 25, 2026 · Tested against ECMAScript 2026 regex spec, Python 3.13 re module, Java 24 Pattern, Go 1.24 regexp

Quick reference: most-used regex patterns

Pattern    Meaning                         Example
.          Any char except newline         a.c → "abc", "a1c"
\d         Digit (0-9)                     \d+ → "42"
\w         Word char (A-Z, a-z, 0-9, _)    \w+ → "hello_world42"
\s         Whitespace                      \s+ → multiple spaces
[abc]      Set: a, b, or c                 [aeiou] → vowels
[^abc]     NOT a, b, c                     [^0-9] → non-digit
^          Start of string/line            ^Hello → starts with
$          End of string/line              world$ → ends with
\b         Word boundary                   \bcat\b → exact word
a*         0 or more 'a'                   a* → "", "aaa"
a+         1 or more 'a'                   a+ → "a", "aaa"
a?         0 or 1 'a'                      colou?r → US/UK
a{3,5}     3 to 5 'a'                      \d{4} → year
(...)      Capture group                   (\d+)px → captures num
(?:...)    Non-capture group               (?:abc)+ → group quantify
(?=...)    Positive lookahead              \d+(?=px) → digit before px
(?!...)    Negative lookahead              \d+(?!px) → digit not px
|          Alternation (or)                cat|dog → either

Frequently asked questions

What are regex character classes?

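Character classes match one character from a set at a single position. Common forms: [abc] matches a, b, or c. [a-z] matches any lowercase ASCII letter. [^abc] matches any character except a, b, or c. [0-9] matches an ASCII digit. Predefined classes: \d (digit), \D (non-digit), \w (word char, [A-Za-z0-9_] in ASCII mode), \W (non-word), \s (whitespace, [ \t\n\r\f\v] in ASCII mode), \S (non-whitespace), and . (any char except newline). Engines differ on Unicode: in JavaScript \d and \w are ASCII-only, while in Python 3 they also match Unicode digits and letters on str patterns unless you pass re.ASCII. Unicode property classes (modern engines): \p{L} (letter), \p{N} (number), \p{Lu} (uppercase letter). JavaScript needs the u or v flag for \p{...}; Java, Go, and Rust support it natively; Python's built-in re module does not support \p{...}, so use the third-party regex package there.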
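The difference between explicit sets and predefined classes is easiest to see by running them. The following is a minimal sketch using Python's built-in re module; the sample strings are made up for illustration, and the Unicode behavior shown is specific to Python 3 str patterns.

```python
import re

# Explicit set and negated set
vowels = re.findall(r"[aeiou]", "regex")
non_digits = re.findall(r"[^0-9]+", "a1b22c")

# Sample text (illustrative): contains ASCII digits, an accented letter,
# and U+0663 (ARABIC-INDIC DIGIT THREE)
text = "room 42, caf\u00e9_3, \u0663"

# In Python 3, \d and \w are Unicode-aware on str patterns by default
unicode_digits = re.findall(r"\d+", text)
ascii_digits = re.findall(r"\d+", text, re.ASCII)

unicode_words = re.findall(r"\w+", "caf\u00e9_3")
ascii_words = re.findall(r"\w+", "caf\u00e9_3", re.ASCII)

print(vowels, non_digits)
print(unicode_digits, ascii_digits)
print(unicode_words, ascii_words)
```

If you need \d to mean exactly [0-9] in Python (for example when validating IDs), either write [0-9] or pass re.ASCII.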

What are regex quantifiers?

Quantifiers control how many times the preceding token matches. ? = 0 or 1. * = 0 or more. + = 1 or more. {n} = exactly n times. {n,} = n or more. {n,m} = between n and m times. GREEDY (default): matches as much as possible. LAZY (add ? after the quantifier): matches as little as possible. Examples: a* matches "" or "a" or "aaa". a+ matches "a" or "aaa" but NOT "". a{3} matches exactly "aaa". \d{3,5} matches 3-5 digits. .*? is a lazy match (any chars, as few as possible). Common gotcha: <.*> on "<a><b>" greedily matches the whole string; <.*?> lazily matches just "<a>". A lazy quantifier, or better a negated class such as <[^>]*>, works for simple tag-like tokens. Neither can handle truly nested HTML/XML, which needs a parser.
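The greedy versus lazy gotcha described above can be reproduced in a few lines. This is a small sketch in Python's re module; the input strings are illustrative only.

```python
import re

html = "<a><b>"

greedy = re.findall(r"<.*>", html)       # .* grabs as much as it can
lazy = re.findall(r"<.*?>", html)        # .*? stops at the first ">"
negated = re.findall(r"<[^>]*>", html)   # never crosses a ">" at all

# Counted quantifier: {3,5} is greedy too, and it will happily match
# inside a longer run of digits unless you anchor it
counted = re.findall(r"\d{3,5}", "12 1234 1234567")

# a* accepts the empty string, a+ does not
star_on_empty = re.fullmatch(r"a*", "") is not None
plus_on_empty = re.fullmatch(r"a+", "") is not None

print(greedy, lazy, negated, counted, star_on_empty, plus_on_empty)
```

The negated class gives the same result as the lazy version here, but it does so without trial-and-error expansion, which is why it is usually the preferred form.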

How do regex anchors and word boundaries work?

Anchors match positions, not characters. ^ = start of string (or of each line with the m flag). $ = end of string (or of each line with the m flag). \b = word boundary (transition between \w and \W). \B = non-word boundary. \A = start of input and \Z = end of input (Python, Java, PCRE; not JavaScript). In Java and PCRE, \Z also matches just before a final line terminator, and \z is the absolute end. Examples: ^hello matches "hello world" but not "say hello". world$ matches "hello world". \bcat\b matches "cat" in "the cat sat" but not in "category". Common use: \b\w+\b finds whole words. Multi-line mode (m flag): ^ and $ match at the start/end of each line instead of only the whole string. Important: because . does not match newline by default, ^.*$ without flags cannot span multiple lines; on a multi-line string it fails unless you add the m flag (each line matches separately) or the s flag (dotall, where . matches newlines too, so the whole string matches).
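The interaction between ^, $, and the multiline/dotall flags is a frequent source of confusion, so here is a hedged sketch of the cases above in Python's re module. The sample text is invented for the example; flag names are Python's (re.MULTILINE corresponds to m, re.DOTALL to s).

```python
import re

text = "hello world\nhello again"

# Default mode: ^ matches only at the very start of the string
start_default = re.findall(r"^hello", text)
# Multiline mode: ^ also matches after each newline
start_multiline = re.findall(r"^hello", text, re.MULTILINE)

# Word boundaries: "cat" as a whole word vs. as a substring
sentence = "the cat sat in a category"
whole_word = re.findall(r"\bcat\b", sentence)
substring = re.findall(r"cat", sentence)

# ^.*$ on a two-line string
two_lines = "line one\nline two"
no_flags = re.search(r"^.*$", two_lines)                    # fails: . stops at \n
per_line = re.findall(r"^.*$", two_lines, re.MULTILINE)     # one match per line
whole = re.search(r"^.*$", two_lines, re.DOTALL).group()    # spans the newline

print(start_default, start_multiline, whole_word, substring)
print(no_flags, per_line, repr(whole))
```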

What are regex capture groups?

Groups (regex parentheses) serve 3 purposes: (1) GROUPING for quantifiers: (ab)+ matches "ab", "abab", "ababab". (2) CAPTURING for backreferences/extraction: (\d{4})-(\d{2})-(\d{2}) captures year/month/day separately. Access via $1, $2, $3 in JavaScript and Java replacement strings, \1 or \g<1> in Python, or by capture index on match objects. (3) NON-CAPTURING (?:...): groups for a quantifier without capturing, which skips storing the capture and keeps group numbering clean. NAMED GROUPS (?<name>...): give meaningful names to captures and access them by name. Supported in JavaScript ES2018+, Java 7+, Ruby, and PCRE; Python and Go use the (?P<name>...) spelling. BACKREFERENCES \1, \2 in the pattern: match the same text as an earlier group. Example: (\w+)\s+\1 matches duplicate words like "the the". Common in find-duplicates regexes. Most backtracking engines support both numbered and named groups; Go and Rust (regex crate) support groups but not backreferences; some engines support recursive groups (PCRE).
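To make the three roles of groups concrete, the sketch below uses Python's re module, which spells named groups as (?P<name>...) and replacement references as \g<name>. The date string and sentence are illustrative inputs.

```python
import re

# Named capture groups (Python spelling: (?P<name>...))
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = date_re.search("Released 2026-04-25.")

year_by_index = m.group(1)       # numbered access
month_by_name = m.group("month") # named access
parts = m.groupdict()            # all named groups as a dict

# Captures in a replacement string: reorder to day/month/year
swapped = date_re.sub(r"\g<day>/\g<month>/\g<year>", "2026-04-25")

# Backreference inside the pattern: find doubled words
dupes = re.findall(r"\b(\w+)\s+\1\b", "the the cat sat on on the mat")

# Capturing vs non-capturing: findall returns the group if there is one,
# and a repeated capture group only keeps its last iteration
capturing = re.findall(r"(ab)+", "ab abab")
non_capturing = re.findall(r"(?:ab)+", "ab abab")

print(year_by_index, month_by_name, parts, swapped)
print(dupes, capturing, non_capturing)
```

The last two lines show a practical reason to prefer (?:...) when you only need grouping: the capturing version changes what findall returns.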

What are lookaheads and lookbehinds?

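Lookarounds are zero-width assertions: they check a position without consuming characters. POSITIVE LOOKAHEAD (?=...): matches if the pattern follows. \d+(?=px) matches digits followed by "px". NEGATIVE LOOKAHEAD (?!...): matches if the pattern does NOT follow. \d+(?!px) matches digits NOT followed by "px"; note that on "100px" it still matches "10", because the engine backtracks to a shorter digit run, so write \d+(?!\d|px) to reject the whole number. POSITIVE LOOKBEHIND (?<=...): matches if the pattern precedes. (?<=\$)\d+ matches digits preceded by "$". NEGATIVE LOOKBEHIND (?<!...): matches if the pattern does not precede. Engine support 2026: JavaScript ES2018+ supports all four. Python's re module supports all four, but lookbehind must be fixed-width. Java has supported all four since 1.4, with bounded-length lookbehind. Go regexp supports NO lookarounds, neither lookahead nor lookbehind (RE2 syntax, by design). Rust's regex crate also supports no lookarounds (automata-based, inspired by RE2); the separate fancy-regex crate adds them. Use cases: password validation (must contain a digit AND a letter via lookaheads), URL parsing, conditional substitution.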
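The four lookaround forms, including the negative-lookahead backtracking gotcha, can be checked with a short sketch in Python's re module. The CSS-like string, price string, and the password rule (at least 8 characters, one digit, one ASCII letter) are assumptions chosen for illustration.

```python
import re

css = "width: 100px; height: 2em; margin: 30px"

# Positive lookahead: digits that are followed by "px"
px_values = re.findall(r"\d+(?=px)", css)

# Negative lookahead gotcha: the engine backtracks to a shorter run of digits
naive_not_px = re.findall(r"\d+(?!px)", css)
# Also forbidding a following digit rejects the whole number
strict_not_px = re.findall(r"\d+(?!\d|px)", css)

prices = "cost $15, was 20"
# Positive lookbehind: digits preceded by "$"
dollar_amounts = re.findall(r"(?<=\$)\d+", prices)
# Negative lookbehind: whole numbers not preceded by "$"
plain_numbers = re.findall(r"(?<!\$)\b\d+", prices)

# Stacked lookaheads: hypothetical rule of >= 8 chars, one digit, one letter
PASSWORD_RE = re.compile(r"(?=.*\d)(?=.*[A-Za-z]).{8,}")

def looks_strong(password):
    return PASSWORD_RE.fullmatch(password) is not None

print(px_values, naive_not_px, strict_not_px)
print(dollar_amounts, plain_numbers)
print(looks_strong("abc12345"), looks_strong("abcdefgh"))
```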

JavaScript vs Python vs Java regex syntax differences?

Major syntax differences across regex engines in 2026: JAVASCRIPT (V8, SpiderMonkey): /pattern/flags syntax. Flags: g (global), i (insensitive), m (multiline), s (dotall, ES2018+), u (unicode), v (unicodeSets, ES2024), y (sticky), d (indices, ES2022). PYTHON (re module): re.compile(r"pattern", re.IGNORECASE | re.MULTILINE | re.DOTALL). Raw strings r"" recommended. Verbose mode re.VERBOSE for readable patterns. JAVA (java.util.regex): Pattern.compile("pattern", Pattern.CASE_INSENSITIVE). Double-escape backslashes in string literals: "\\d+". GO (regexp package): implements RE2 syntax with linear-time matching; no backtracking, no lookarounds, no backreferences, no recursion. RUST (regex crate): similar automata-based design inspired by RE2, with the same limits. PCRE/PHP: most powerful, with full Perl features, lookbehinds, recursion, and conditional patterns. Recommendation: if you prototype in a more powerful engine such as PCRE, verify every feature you used on the target platform before deploying.
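As a rough illustration of how one engine spells the flags listed above, the sketch below shows the Python side only: combined flag constants, the equivalent inline (?im) form, and why raw strings remove the double-escaping that Java string literals need. The log text is a made-up example.

```python
import re

log = "Error: disk full\ninfo: ok\nERROR: timeout"

# Flags passed as constants; roughly what /^error: (.+)$/gim expresses in JavaScript
# (findall plays the role of the g flag here)
pattern = re.compile(r"^error: (.+)$", re.IGNORECASE | re.MULTILINE)
messages = pattern.findall(log)

# The same flags written inline at the start of the pattern
inline_messages = re.findall(r"(?im)^error: (.+)$", log)

# Raw string vs. escaped string: both spell the same two-character sequence \d
same_pattern = r"\d+" == "\\d+"

print(messages, inline_messages, same_pattern)
```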

What are common regex performance pitfalls?

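CATASTROPHIC BACKTRACKING: pattern (a+)+b on a long run of "a" with no "b" (e.g. "aaaaaaaa...") takes exponential time in backtracking engines, because the engine tries every way of splitting the run between the inner and outer quantifier before failing. Triggered by nested quantifiers and overlapping alternatives. FIX: avoid nested quantifiers. Use atomic groups (?>...) or possessive quantifiers (a++) where supported (Java, PCRE, Ruby, Python 3.11+; not JavaScript). RUNAWAY GREEDY: .* matching too much, then backtracking. FIX: prefer a specific character class such as [^>]* over .*; lazy .*? limits the match but still proceeds by trial and error. EXCESSIVE ALTERNATION: (a|b|c|d|e|f|g)+ is slow. FIX: use the character class [a-g]+ instead. UNICODE OVERHEAD: Unicode-aware matching can be slower than ASCII-only matching in some engines; measure before relying on it in hot paths. PRECOMPILE: use re.compile() in Python, and a regex literal or a single new RegExp() outside the loop in JS, so the compiled pattern is reused instead of rebuilt on each call. AVOID REGEX FOR: parsing HTML (use a DOM parser), parsing JSON (use JSON.parse), CSV (use a proper library). Regex is for SIMPLE patterns; complex grammars need real parsers. RE2-style engines (Go, Rust) prevent catastrophic backtracking by design: matching time is linear in the input length, at the cost of features such as backreferences and lookarounds.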
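To see the nested-quantifier problem without hanging your interpreter, you can time the risky pattern against an equivalent flat one on short failing inputs. This is a sketch only: seconds_to_search is a hypothetical helper written for this example, the input lengths are kept deliberately small, and actual timings depend on the machine and Python version, so no numbers are claimed here.

```python
import re
import time

nested = re.compile(r"(a+)+b")  # nested quantifiers: many ways to split a run of a's
flat = re.compile(r"a+b")       # accepts the same strings with a single quantifier

def seconds_to_search(pattern, text):
    """Return wall-clock seconds for one pattern.search(text) call."""
    start = time.perf_counter()
    pattern.search(text)
    return time.perf_counter() - start

# Both patterns agree on what matches
samples = ["aaab", "b", "aaa", "xaab"]
agree = all(bool(nested.search(s)) == bool(flat.search(s)) for s in samples)

# Failing input: a run of a's with no trailing b.
# Keep n small; each extra character can roughly double the nested time.
for n in (10, 14, 18):
    text = "a" * n
    print(n, seconds_to_search(nested, text), seconds_to_search(flat, text))
```

Running this with slightly larger n is a quick way to check whether a suspicious pattern of your own degrades the same way.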

How do I test and debug regex patterns?

Top regex testing tools 2026: BytePane Regex Tester (bytepane.com/regex-tester/): free, syntax highlighting, capture group display, multiple flavors (JS, PCRE, Python). Regex101 (regex101.com): most-used tool, debugger, replace mode, code generation for 9+ languages. RegExr (regexr.com): interactive learning, community library. Debuggex (debuggex.com): visual railroad diagrams of regex structure. UniRegex (uniregex.com): Unicode-focused.

DEBUGGING TIPS: (1) Build incrementally: start simple, add complexity. (2) Test edge cases: empty string, single char, pattern at start/end. (3) Use verbose mode (Python re.VERBOSE) for readable multi-line patterns. (4) Comment inside the regex with (?#this is a comment), supported in PCRE and Python but not in JavaScript. (5) Print or log captures to verify. (6) Beware of stateful regexes in JavaScript: a regex with the g or y flag remembers lastIndex, so reusing it with test() or exec() can skip matches. (7) Validate input AFTER the regex match: regex finds patterns but doesn't validate semantically (e.g., date "2026-13-45" matches \d{4}-\d{2}-\d{2} but isn't a valid date).
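Tips (3) and (7) combine naturally: write the pattern in verbose mode, then hand the captured pieces to a real validator. The sketch below assumes ISO-style YYYY-MM-DD input; parse_iso_date is a hypothetical helper name, and datetime.date does the semantic check.

```python
import re
from datetime import date

# Verbose mode: whitespace in the pattern is ignored and # starts a comment
DATE_RE = re.compile(
    r"""
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
    -
    (?P<day>\d{2})     # two-digit day
    """,
    re.VERBOSE,
)

def parse_iso_date(text):
    """Return a date if text is a real calendar date, else None."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        return None  # wrong shape
    try:
        # date() raises ValueError for month 13, day 45, Feb 30, etc.
        return date(int(m["year"]), int(m["month"]), int(m["day"]))
    except ValueError:
        return None  # right shape, impossible date

shape_only = DATE_RE.fullmatch("2026-13-45") is not None
print(shape_only, parse_iso_date("2026-13-45"), parse_iso_date("2026-04-25"))
```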

Test your regex

Free regex tester with syntax highlighting, capture groups display, and multiple engine flavors:

Open BytePane Regex Tester →
