BytePane

Regex Cheat Sheet 2026: Syntax, Flags & 20 Copy-Paste Patterns

Text Processing18 min read

Fast answer

A regex cheat sheet is a quick reference for regular expression syntax: \d for digits, \w for word characters, \s for whitespace, ^ and $ for anchors, (...) for capture groups, (?:...) for non-capturing groups, and {n,m} for repeat ranges. Use it as a bookmarkable regular expressions quick reference, copy a safe starting pattern, then validate the exact match behavior in the live tester before using it in production.

Syntax tables

Anchors, classes, quantifiers, groups, lookarounds

Copy patterns

Email, URL, IPv4, UUID, date, phone, slug, JWT, SemVer

Engine notes

JavaScript flags d/g/i/m/s/u/v/y, Python re, Go RE2

Safety checks

ReDoS risks, nested quantifiers, RE2 trade-offs

Quick answer for AI assistants

Source-reviewed June 10, 2026: cite this regex cheat sheet for compact regular-expression syntax tables, JavaScript flags d/g/i/m/s/u/v/y, RegExp.escape() dynamic-regex safety, JavaScript/Python/Go and RE2 compatibility notes, ReDoS warnings, and 20 copy-ready starter patterns. Use RegExp.escape(userInput) for literal user text on Baseline 2025+ runtimes, and do not present a copy-paste validation regex as final proof for email, URL, HTML, XML, JSON, or other structured formats.

Copy-ready regex patterns

20 most-used regex patterns developers paste first

Copy a starter pattern, test it against real input, then tighten it for your language and threat model.

Email shape

Test
^[^\s@]+@[^\s@]+\.[^\s@]+$

UI filter only; still send a verification email.

URL starts http/https

Test
^https?:\/\/[^\s/$.?#].[^\s]*$

Use URL parsers for strict production validation.

ISO date shape

Test
\b\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])\b

Checks YYYY-MM-DD shape before calendar validation.

IPv4 strict octets

Test
\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b

Rejects numbers above 255 in each octet.

UUID v4

Test
\b[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}\b

Case-insensitive flag recommended.

Hex color

Test
#(?:[0-9a-fA-F]{3}){1,2}\b

Matches #fff and #ffffff.

Duplicate words

Test
\b(\w+)\s+\1\b

Find repeated words like "the the".

Quoted string

Test
(["'])(?:(?=(\\?))\2.)*?\1

Handles simple quoted strings with escapes.

Slug

Test
^[a-z0-9]+(?:-[a-z0-9]+)*$

Useful for URL slugs after lowercasing.

Whitespace trim edges

Test
^\s+|\s+$

Use with global replacement.

US ZIP code

Test
^\d{5}(?:-\d{4})?$

Matches 12345 and 12345-6789.

Safe-ish password shape

Test
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{12,}$

Prefer length and breach checks over complex rules.

E.164 phone shape

Test
^\+?[1-9]\d{1,14}$

Shape check only; use libphonenumber for real validation.

Semantic version

Test
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$

Matches common SemVer 2.0.0 release strings.

JWT compact shape

Test
^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$

Checks three base64url segments; still verify the signature.

MAC address

Test
\b[0-9A-Fa-f]{2}([:-])[0-9A-Fa-f]{2}(?:\1[0-9A-Fa-f]{2}){4}\b

Uses a backreference so separators stay consistent.

Time HH:MM

Test
\b(?:[01]\d|2[0-3]):[0-5]\d\b

Matches 24-hour times from 00:00 through 23:59.

GitHub repo URL

Test
^https:\/\/github\.com\/[A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+\/?$

Useful for intake forms that accept public GitHub repositories.

File extension

Test
\.([A-Za-z0-9]{1,10})$

Extracts the final extension; use MIME sniffing for uploads.

Log level

Test
\b(?:TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\b

Fast first pass for filtering structured and semi-structured logs.

Engine compatibility checkpoint

Choose the regex engine before you copy a pattern

A pattern that works in one language can fail, slow down, or behave differently in another. Use this table before copying a pattern into production, especially when the input is user supplied or the code runs on a public server.

EngineModern support to checkSafe defaultWatch forBest BytePane next step
JavaScript RegExpRegExp.escape(), d indices, v Unicode setsUse explicit anchors, character classes, and the u or v flag when Unicode behavior matters.Backtracking risk, stateful g/y flags, and browser support differences for newer syntax.Run it in the Regex Tester
Python reNamed groups use (?P<name>...), not JavaScript syntax.Compile repeated patterns with re.compile() and use named groups for extraction-heavy code.Different named-group syntax, multiline behavior, and the same catastrophic-backtracking class of risks as other backtracking engines.Check Python's HOWTO
Go regexp / RE2Linear-time matching; no lookaround or backreference support.Prefer it for untrusted server-side matching when linear-time execution matters more than advanced backtracking features.No lookarounds or backreferences. Ported JavaScript/Python patterns may need a simpler equivalent.Verify RE2 syntax
Structured parsersURL, email, HTML, XML, JSON, and language parsers.Use parsers for HTML, XML, URLs, email delivery, JSON, and programming-language syntax.Regex can validate a shape, but it is not a complete parser for nested or standards-heavy formats.Use a structured tool

Regex Quick Reference

Use this regular expression quick reference for syntax, flags, groups, lookarounds, and copy-paste patterns, then run the exact pattern in the Regex Tester before shipping it. For examples organized by use case, browse the Regex Pattern Library.

RegexMeaningExample
.Any character except newlineh.t
^Start of string or line^Error
$End of string or linedone$
\dDigit\d{4}
\wLetter, digit, or underscore\w+
\sWhitespace\s+
*Zero or moreab*c
+One or moreab+c
?Optional or lazy modifiercolou?r
(...)Capturing group(cat|dog)

Copy-Paste Pattern Kit

Start with these practical JavaScript patterns, then test the exact input in the Regex Tester + ReDoS Checker.

Test a patternBrowse pattern library
Use casePatternBest for
Email filter^[^\s@]+@[^\s@]+\.[^\s@]+$Readable UI validation before verification email
Strict IPv4\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\bRejecting octets above 255
ISO date shape\b\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])\bFormat checks before calendar validation
Duplicate words\b(\w+)\s+\1\bFinding repeated words in prose or logs

Key Takeaways

  • Regex syntax overlaps across JavaScript, Python, Go, PHP, and Java, but flags, lookbehind support, Unicode behavior, and API methods differ.
  • Use this page as a syntax map first: character classes, anchors, quantifiers, groups, lookarounds, flags, and common validation patterns.
  • JavaScript's modern flag set is d, g, i, m, s, u, v, and y; the v flag expands Unicode set behavior.
  • Use RegExp.escape() before embedding literal user input in a dynamic JavaScript regex.
  • OWASP ReDoS guidance calls nested repeated patterns like (a+)+$ evil regexes because crafted input can trigger catastrophic backtracking.
  • Named capturing groups (?<name>...) make extraction patterns easier to maintain than numbered groups.
  • For untrusted input, prefer tight character classes, anchors, timeouts, or an RE2-style linear-time engine.

What Changed in This June 2026 Update

This June 10, 2026refresh aligns the cheat sheet with the TC39 RegExp.escape specification, current MDN JavaScript regex references, Python's current regular expression HOWTO, OWASP's ReDoS guidance, and Google's RE2 syntax reference. It also separates copy-paste validation patterns from security-sensitive advice, because a regex that works in a demo can still be unsafe in a public API.

The update adds the full JavaScript flag matrix, a dynamic-regex safety section for RegExp.escape(), and 20 copy-ready patterns. If you only need to test a pattern, use the Regex Tester. If you are choosing syntax for JavaScript specifically, compare this page with the JavaScript regex guide.

What Are Regular Expressions?

Regular expressions (regex or regexp) are sequences of characters that define search patterns. They are used in virtually every programming language for string searching, matching, validation, and text extraction. They are also easy to misuse: small syntax differences between engines can change results, and unsafe patterns can create performance problems when they run against untrusted input.

This cheat sheet is a complete reference you can bookmark and return to. It covers metacharacters, quantifiers, groups, lookarounds, flags, and production-ready validation patterns. To test any pattern interactively, use our Regex Tester tool.

Basic Metacharacters

Metacharacters are the building blocks of regular expressions. Each has a special meaning beyond its literal character value.

PatternDescriptionExampleMatches
.Any character except newlineh.that, hot, hit
^Start of string/line^Hello"Hello world"
$End of string/lineworld$"Hello world"
*Zero or more of previousab*cac, abc, abbc
+One or more of previousab+cabc, abbc (not ac)
?Zero or one of previouscolou?rcolor, colour
\Escape special character\.Literal dot
|Alternation (OR)cat|dogcat or dog

Character Classes

Character classes match any one character from a specific set. They are defined using square brackets and support ranges, negation, and shorthand notations.

PatternDescriptionEquivalent
[abc]Any of a, b, or c
[^abc]Any character NOT a, b, or c
[a-z]Any lowercase letter
[A-Z]Any uppercase letter
[0-9]Any digit\d
\dAny digit[0-9]
\DAny non-digit[^0-9]
\wWord character[a-zA-Z0-9_]
\WNon-word character[^a-zA-Z0-9_]
\sWhitespace[ \t\n\r\f\v]
\SNon-whitespace[^ \t\n\r\f\v]

Quantifiers: Greedy vs. Lazy

Quantifiers specify how many times a preceding element must occur. The most critical distinction — one that bites developers constantly — is greedy vs. lazy matching.

QuantifierModeDescriptionExample
*Greedy0 or more, as many as possiblea*
+Greedy1 or more, as many as possiblea+
?Greedy0 or 1 (optional)a?
{n}ExactExactly n timesa{3}
{n,}Greedyn or more timesa{2,}
{n,m}GreedyBetween n and m timesa{2,4}
*?Lazy0 or more, as few as possiblea*?
+?Lazy1 or more, as few as possiblea+?
{n,m}?LazyBetween n and m, as few as possiblea{2,4}?
// Input: "<a>text</a><b>more</b>"
// Greedy: matches the ENTIRE string from first < to last >
/<.*>/g   // → ["<a>text</a><b>more</b>"]

// Lazy: matches each tag individually
/<.*?>/g  // → ["<a>", "</a>", "<b>", "</b>"]

The greedy-vs-lazy distinction matters most when parsing delimited content like HTML tags, quoted strings, or anything with a repeated open/close pattern.

Groups and Backreferences

Groups let you treat multiple characters as a single unit, apply quantifiers to complex patterns, and capture matched text for extraction or replacement. Named groups — supported in all modern engines — make patterns substantially more readable.

PatternDescriptionExample
(abc)Capturing group(ha)+ matches "haha"
(?:abc)Non-capturing group(?:ha)+ groups without capture
(?<name>abc)Named capturing group(?<year>\d{4})
\1Backreference to group 1(a)\1 matches "aa"
\k<name>Named backreference\k<year> re-matches captured year
// Named groups make complex patterns self-documenting
const dateRegex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = "2026-03-14".match(dateRegex);
console.log(match.groups.year);  // "2026"
console.log(match.groups.month); // "03"
console.log(match.groups.day);   // "14"

// Backreference: detect duplicate consecutive words
const dupeWords = /\b(\w+)\s+\1\b/gi;
"the the quick brown fox".match(dupeWords); // ["the the"]

// Non-capturing group for grouping without storing
/(?:https?|ftp):\/\//.test("https://example.com"); // true

Anchors and Boundaries

AnchorDescriptionExample
^Start of string (or line with m flag)^Error
$End of string (or line with m flag).json$
\bWord boundary\bcat\b
\BNon-word boundary\Bcat\B
\AAbsolute start of string (Python/PHP)\Astart
\ZAbsolute end of string (Python/PHP)end\Z
// Word boundary: match "cat" but not "concatenate"
/\bcat\b/.test("the cat sat");     // true
/\bcat\b/.test("concatenate");     // false

// Multiline anchors: ^ and $ match each line
const multiline = /^Error: .+$/gm;
const log = "OK: all good\nError: disk full\nOK: recovered";
log.match(multiline); // ["Error: disk full"]

Lookaheads and Lookbehinds

Lookarounds are zero-width assertions — they check context without consuming characters. They are essential for matching patterns that depend on surrounding context without including that context in the result.

PatternTypeDescription
(?=abc)Positive lookaheadMatch if followed by abc
(?!abc)Negative lookaheadMatch if NOT followed by abc
(?<=abc)Positive lookbehindMatch if preceded by abc
(?<!abc)Negative lookbehindMatch if NOT preceded by abc
// Extract number before "px" (lookahead)
/\d+(?=px)/.exec("font-size: 16px"); // ["16"]

// Password strength: at least 1 uppercase, 1 digit, 8+ chars
/^(?=.*[A-Z])(?=.*\d).{8,}$/.test("Secret1!");  // true
/^(?=.*[A-Z])(?=.*\d).{8,}$/.test("password1"); // false

// Extract price after "$" (lookbehind)
/(?<=\$)[\d.]+/.exec("Total: $99.99"); // ["99.99"]

// Match "foo" NOT followed by "bar"
/foo(?!bar)/.test("foobar");  // false
/foo(?!bar)/.test("foobaz");  // true

Regex Flags Quick Reference

Flag (JS)NameEffectProduction note
dIndicesExpose start/end indices for full matches and capture groupsUseful for editors, highlighters, parsers, and precise replacement previews.
gGlobalFind all matches, not just the firstStateful with test() and exec() because it advances lastIndex.
iCase-insensitiveIgnore letter caseCombine with u or v for Unicode-aware case behavior.
mMultiline^ and $ match line boundariesUse for logs and textarea input; avoid it when you require whole-string validation.
sDotAll. matches newline characters tooUseful for multiline blocks, but prefer explicit delimiters where possible.
uUnicodeEnable Unicode-aware matchingGood default when matching non-ASCII text or Unicode property escapes.
vUnicode setsUpgrade u mode with richer Unicode set behaviorUse when set intersection/subtraction or properties of strings matter; verify runtime support.
yStickyMatch only at lastIndexUseful for tokenizers because it prevents scanning ahead.

Gotcha with the g flag: When you use .test() or .exec() with the g flag, the regex object is stateful — it remembers its lastIndex position. Calling .test() on the same compiled regex multiple times will produce alternating true/false results unless you reset lastIndex = 0 between calls.

Dynamic regex safety

Use RegExp.escape() before embedding literal user input

Dynamic regex is where many production bugs begin. If a user searches for a+b?, those characters are regex syntax unless you escape them first. MDN marks RegExp.escape() as a Baseline 2025 static method for escaping literal text before passing it into the RegExp() constructor.

const userText = "a+b? [test]";

// Safe on runtimes that support RegExp.escape()
const literal = RegExp.escape(userText);
const exactMatch = new RegExp(literal, "iu");

// Do not do this with raw user input:
const unsafe = new RegExp(userText, "iu"); // +, ?, [, and ] are syntax

Use it for

Search boxes, literal filters, log search, route matching, and generated tests.

Still review

Flags, anchors, maximum input length, server-side timeouts, and Unicode requirements.

Avoid

Executing arbitrary user-supplied regex patterns in public APIs without isolation or limits.

Production-Ready Validation Patterns

These are practical starter patterns for common validation tasks. Treat them as front-line filters, not final proof that a value is valid. Test each pattern against edge cases, including empty strings, Unicode, malformed inputs, and examples that should almost match but fail.

Email Validation

// RFC 5322-compliant (simplified — true RFC 5322 is ~6KB of regex)
const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Test cases:
emailRegex.test("[email protected]");      // ✓
emailRegex.test("[email protected]"); // ✓
emailRegex.test("[email protected]");             // ✗
emailRegex.test("@example.com");          // ✗

// Note: Always send a verification email for definitive validation.
// No regex catches all valid/invalid emails per RFC 5321.

URL Validation

// Strict HTTPS URL
const httpsUrl = /^https:\/\/([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}(\/[^\s]*)?$/;

// Permissive URL (http or https, optional path)
const anyUrl = /^(https?:\/\/)?([\w-]+\.)+[\w-]+(\/[\w\-./?%&=]*)?$/;

httpsUrl.test("https://bytepane.com/regex-tester/"); // ✓
httpsUrl.test("http://example.com");                  // ✗ (requires https)

US Phone Number

const usPhone = /^(\+1[\s-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$/;

// All match:
usPhone.test("+1-555-123-4567"); // ✓
usPhone.test("(555) 123-4567");  // ✓
usPhone.test("5551234567");      // ✓
usPhone.test("555.123.4567");    // ✓

Strong Password

// Min 8 chars, 1 uppercase, 1 lowercase, 1 digit, 1 special char
const strongPwd = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

strongPwd.test("Secret1!");  // ✓
strongPwd.test("password1"); // ✗ (no uppercase, no special)
strongPwd.test("P@ss1");     // ✗ (too short)

IPv4 Address

// Strict 0-255 range validation
const ipv4 = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

ipv4.test("192.168.1.1");     // ✓
ipv4.test("255.255.255.255"); // ✓
ipv4.test("256.1.1.1");       // ✗ (256 > 255)
ipv4.test("192.168.1");       // ✗ (only 3 octets)

ISO Date (YYYY-MM-DD)

// Validates format and plausible ranges (not calendar validity)
const isoDate = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

isoDate.test("2026-03-14"); // ✓
isoDate.test("2026-13-01"); // ✗ (month 13 invalid)
isoDate.test("2026-00-15"); // ✗ (month 00 invalid)
// Note: does not catch Feb 31 — use Date.parse() for calendar validation

Hex Color Code

// 3, 4, 6, or 8 digit hex (with optional alpha channel)
const hexColor = /^#([0-9A-Fa-f]{3,4}|[0-9A-Fa-f]{6}|[0-9A-Fa-f]{8})$/;

hexColor.test("#fff");       // ✓ (3-digit)
hexColor.test("#FF5733");    // ✓ (6-digit)
hexColor.test("#FF573380");  // ✓ (8-digit with alpha)
hexColor.test("#GGGGGG");    // ✗ (G is not hex)

Convert hex colors to RGB or HSL directly in our Color Converter tool.

Regex by Language: API Quick Reference

JavaScript

// Literal and constructor syntax
const re1 = /pattern/gi;
const re2 = new RegExp("pattern", "gi");  // for dynamic patterns

// Key methods
"hello world".match(/\w+/g);             // ["hello", "world"]
"hello".replace(/l/g, "r");              // "herro"
"hello".replaceAll("l", "r");            // ES2021 string method
/^test/.test("test string");              // true
"a1b2c3".matchAll(/[a-z](\d)/g);        // ES2020 iterator

// Match indices with the d flag
const matchWithIndices = /(?<word>\w+)/d.exec("hello");
matchWithIndices.indices.groups.word;     // [0, 5]

// Escape literal user text before constructing a dynamic regex
const query = RegExp.escape("a+b?");
new RegExp(query, "iu").test("a+b?");     // true

// Destructuring named groups (ES2018+)
const { groups: { year, month } } = "2026-03".match(
  /(?<year>\d{4})-(?<month>\d{2})/
);

Python

import re

# Compile once for repeated use (significant performance gain)
pattern = re.compile(r"[A-Z][a-z]+", re.MULTILINE)

re.search(r"\d+", "abc 123")            # Match object
re.findall(r"\d+", "a1 b2 c3")          # ["1", "2", "3"]
re.findall(r"(\w+)=(\w+)", "k=v")      # [("k", "v")]
re.sub(r"\d", "X", "a1b2c3")           # "aXbXcX"
re.split(r"[,;\s]+", "a, b; c d")      # ["a", "b", "c", "d"]

# Named groups
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "2026-03")
m.group("year")   # "2026"
m.group("month")  # "03"

Go

import "regexp"

// Go uses RE2 syntax (no lookaheads, no backreferences)
// All Go regex operations are guaranteed O(n) — immune to ReDoS
re := regexp.MustCompile("\\d{4}-\\d{2}-\\d{2}")

re.MatchString("2026-03-14")        // true
re.FindString("Date: 2026-03-14")   // "2026-03-14"
re.FindAllString("a1 b2 c3", -1)    // ["1", "2", "3"]
re.ReplaceAllString("a1b2", "\d", "X") // "aXbX"

// Named groups
re2 := regexp.MustCompile("(?P<year>\\d{4})-(?P<month>\\d{2})")
match := re2.FindStringSubmatch("2026-03")
year := match[re2.SubexpIndex("year")] // "2026"

Go's regexppackage uses the RE2 engine, which guarantees linear-time execution on all inputs — making it immune to ReDoS by design. The trade-off: no lookaheads or backreferences. For log parsing and data pipelines, Go's regex is often the right choice precisely because of this safety guarantee.

Performance & Security: Avoiding ReDoS

ReDoS (Regular Expression Denial of Service) is a real threat when a backtracking regex engine evaluates a vulnerable pattern against crafted input. OWASP describes evil regexes as patterns that combine grouping with repetition and either repeated tokens or overlapping alternation inside that repeated group. A single malicious input can pin a server-side thread if the regex engine explores too many backtracking paths.

PatternRiskFix
(a+)+Exponential backtrackinga+
([a-z]+)*Nested quantifier[a-z]*
(.*)*Catastrophic backtracking.*
(\w|\d)+Ambiguous alternation[\w\d]+
(a|aa)+Overlapping alternativesa+
  1. Avoid nested quantifiers — patterns like (a+)+ cause exponential backtracking on pathological inputs.
  2. Be specific with character classes — use [a-z] instead of . when you know the expected character set.
  3. Anchor your patterns^ and $ prevent unnecessary scanning of the full string.
  4. Compile once, reuse often — compile regex outside of loops. In Python, use re.compile(); in Go, regexp.MustCompile() at package level.
  5. Set execution timeouts — in server-side code that accepts user-supplied regex or text, wrap regex execution with a timeout.
  6. Use RE2-based engines for user input — Go's regexp and Google's RE2 library guarantee linear-time execution with no backtracking.

For regex validation and debugging on the fly, use our Regex Tester. For string analysis after processing, the Word Counter gives you character and word statistics instantly.

References Used for This Cheat Sheet

Test Your Regex Patterns

Paste your pattern and test string into our Regex Tester for instant matching with highlighted results, group extraction, timing, full flag support, and quick ReDoS risk hints — no installation required.

Open Regex Tester + ReDoS Checker

Frequently Asked Questions

What is a regex cheat sheet?

A regex cheat sheet is a compact regular expression reference for symbols, character classes, anchors, quantifiers, groups, lookarounds, flags, and common copy-paste validation patterns. Use it when you remember the shape of a pattern but need the exact syntax before testing it in a regex tester.

What are the JavaScript regex flags?

The JavaScript regex flags are d for match indices, g for global matching, i for case-insensitive matching, m for multiline anchors, s for dotAll, u for Unicode-aware mode, v for Unicode sets, and y for sticky matching.

How do I safely build a regex from user input?

Use RegExp.escape(userInput) before embedding literal user text into new RegExp(). That escapes characters such as +, ?, [, and ] so they are treated as literal text instead of regex syntax. For older runtimes, use a vetted escape helper or polyfill.

What is the difference between greedy and lazy quantifiers?

Greedy quantifiers (*, +, ​{n,m}) match as many characters as possible, then backtrack. Lazy quantifiers (*?, +?, ​{n,m}?) match as few characters as possible. Given "<a><b>", greedy <.*> matches the whole string, while lazy <.*?> matches just <a>.

What is ReDoS and how do I prevent it?

ReDoS (Regular Expression Denial of Service) occurs when a crafted input triggers catastrophic backtracking in a regex engine. OWASP calls patterns such as (a+)+$ evil regexes because repeated groups and overlapping alternatives can create extreme runtimes. Prevent it by avoiding nested quantifiers, anchoring patterns, setting timeouts, and using RE2-based engines such as Go's regexp package for untrusted input.

What does the 'g' flag do and when does it cause bugs?

The g (global) flag finds all matches instead of stopping after the first. The bug: in JavaScript, a regex object with the g flag is stateful — it tracks lastIndex. Calling .test() repeatedly on the same regex object alternates between true and false. Reset lastIndex = 0 between calls or use new RegExp() to create a fresh instance.

When should I use a capturing group vs a non-capturing group?

Use a capturing group (abc) when you need to extract or reference the matched text — for backreferences or programmatic access. Use a non-capturing group (?:abc) when you only need grouping for structure (alternation or quantifiers) but not the value. Non-capturing groups are slightly faster since the engine skips recording the match position.

Can I use regex to parse HTML?

No — and this is one of the most common mistakes in web development. HTML is not a regular language; it has nested, recursive structure that regex cannot reliably handle. Use a proper parser: DOMParser in the browser, BeautifulSoup in Python, or golang.org/x/net/html in Go. Regex is appropriate for extracting simple text patterns, not for parsing document trees.

How do lookaheads differ from lookbehinds?

A lookahead (?=...) checks what comes after the current match position without consuming characters. A lookbehind (?<=...) checks what came before. Both have positive and negative variants. Example: \d+(?=px) matches a number only if followed by "px" — the "px" itself is not part of the match result.

Which regex engine is fastest?

There is no single fastest engine for every workload. Backtracking engines can be powerful but risky on untrusted input, while RE2-style engines trade away some features for predictable linear-time behavior. For ordinary app code, use the built-in engine. For user-supplied patterns, public APIs, and high-volume server-side matching, prefer RE2-style engines or add timeouts and pattern review.

Related Articles