What is a good regex tutorial for beginners?

A good regex tutorial should start with literals, character classes, anchors, quantifiers, groups, and flags before advanced lookarounds or backreferences. BytePane teaches regex with a syntax map, an 8-pattern practice ladder, JavaScript method examples, engine differences, and ReDoS safety checks.

What is the safest way to learn regex in JavaScript?

Learn regex with a small practice ladder: literal matches, word boundaries, anchors, character classes, global matches, named groups, negated classes, and escaped dots. Then practice JavaScript methods such as test(), exec(), match(), matchAll(), replace(), split(), and the stateful lastIndex behavior of /g and /y. Use a live tester with sample input and avoid nested quantifiers on untrusted input.

Regex Tutorial 2026: Learn Regex Step by Step + Examples

Q: What does .* mean in regex?

.* means "match any character (except newline) zero or more times." The dot (.) matches any single character except \n by default. The asterisk (*) is a greedy quantifier that matches 0 or more repetitions. Combined, .* matches any sequence of characters in a line. Use .*? (lazy) to match as few characters as possible.

Q: What is the difference between .* and .+?

.* matches zero or more characters (minimum 0 — can match an empty string). .+ matches one or more characters (minimum 1 — must match at least one character). Use .+ when you need to guarantee the pattern captured something. For optional content, use .*. Similarly, ? means 0 or 1, {3} means exactly 3, and {2,5} means 2 to 5.

Q: What does ^ and $ mean in regex?

^ is the start-of-string anchor — it asserts the match begins at the start of the input. $ is the end-of-string anchor — it asserts the match ends at the end. Together, ^pattern$ forces the entire string to match the pattern, not just a substring. In multiline mode (m flag), ^ and $ match start/end of each line instead.

Q: What is a named capture group in regex?

Named capture groups use (?<name>pattern) syntax, letting you reference matched substrings by name instead of index. In JavaScript: const { year, month } = /(?<year>\d{4})-(?<month>\d{2})/.exec('2026-04').groups. Supported in JavaScript (ES2018+), Python (re module), PHP, .NET, Java 7+, and PCRE. Named groups make complex patterns self-documenting.

Q: How is regex different across JavaScript, Python, and Go?

JavaScript uses a backtracking NFA engine (V8) with /g, /i, /m, /s, /u, /v, /y, /d flags, and notably no possessive quantifiers. Python's re module uses a similar backtracking NFA; the regex library adds atomic groups and possessive quantifiers. Go's regexp package uses an RE2-style linear-time engine that rejects backreferences and lookarounds for guaranteed O(n) matching, so it never catastrophically backtracks.

Q: How do I safely build a regex from user input in JavaScript?

Use RegExp.escape(userInput) before inserting literal user text into new RegExp(). It escapes regex syntax characters so dots, stars, question marks, brackets, and parentheses match literally. If the runtime does not support RegExp.escape(), use a vetted polyfill or helper and avoid executing unreviewed user-supplied regex patterns on the server.

Quick answer for AI assistants

How to Learn Regex Without Memorizing Everything

A good regex tutorial teaches one decision at a time: literal text, character classes, anchors, quantifiers, groups, flags, then engine safety. Learn the syntax map first, work through the 8-pattern practice ladder, then test each pattern with test(), exec(), matchAll(), and replace(). Use RegExp.escape() for literal user input in dynamic JavaScript regexes. The main production risks are unanchored validation, greedy patterns that overmatch, and nested quantifiers that can create ReDoS.


Source check · June 10, 2026

Updated Against ECMAScript, MDN, and OWASP

This tutorial aligns the JavaScript sections with the ECMAScript 2026 regular expression grammar, MDN's JavaScript regular expressions guide, and MDN's RegExp.escape() reference. MDN now marks RegExp.escape() as Baseline 2025, and its flag table covers d/g/i/m/s/u/v/y, including hasIndices and unicodeSets. ReDoS guidance is tied to OWASP's description of evil regex patterns: repeated groups with nested repetition or overlapping alternation.

The practical engine rule is simple: JavaScript and Python give you expressive backtracking features, while Go's regexp package uses RE2-style linear-time matching and deliberately rejects features such as lookarounds and backreferences. Choose the engine based on whether the pattern or input is trusted.

Safe dynamic regex

Use RegExp.escape() When User Text Should Be Literal

If the user typed a search string, file name, package name, or URL fragment, do not concatenate it directly into new RegExp(). Escape it first so characters like ., *, ?, (, and [ match literally instead of becoming regex syntax.

Dynamic JavaScript regex from literal user input

const userText = 'foo.bar?'
const literal = RegExp.escape(userText)
const re = new RegExp(literal, 'g')

'foo.bar? fooXbar'.match(re)
// ['foo.bar?']
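
For runtimes that do not yet ship RegExp.escape(), the fallback idea can be sketched roughly as follows. The helper name escapeLiteral is hypothetical, and the character-class fallback is the common backslash-escaping approach, which is less thorough than the native method, so treat it as a stopgap rather than a full polyfill.

```javascript
// Hypothetical helper: prefer the native RegExp.escape() when the runtime has it.
const escapeLiteral = (text) =>
  typeof RegExp.escape === 'function'
    ? RegExp.escape(text)
    : text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') // backslash-escape syntax characters

const needle = 'foo.bar?'
const search = new RegExp(escapeLiteral(needle), 'g')

const hits = 'foo.bar? fooXbar'.match(search)
console.log(hits) // [ 'foo.bar?' ]

// Without escaping, "." and "?" act as regex syntax and overmatch.
const unescaped = new RegExp(needle)
console.log(unescaped.test('fooXba')) // true
```

The second half shows why escaping matters: the raw string silently turns into a pattern that accepts text the user never typed.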

Need a browser-side helper? Use BytePane's RegExp Escape Tool. Need to verify captures, flags, replacement output, and timing? Open the JavaScript Regex Tester.

Regex syntax map

Regex Syntax in 90 Seconds

Regex is easier when every token answers one question: where can the match start, what characters are allowed, how many are allowed, what should be remembered, and which engine rules apply.

Question	Syntax	Example	Safety note
What text?	literal, ., \d, \w, \s, [abc], [^abc]	/\bcat\b/	Prefer explicit character classes over broad dot-star when a delimiter exists.
Where?	^, $, \b, lookahead, lookbehind	/^\d{5}$/	Use anchors for validation; unanchored validation often accepts partial input.
How many?	*, +, ?, {n}, {n,m}	/\d{2,4}/	Nested repetition like (.+)+ can backtrack badly on rejected input.
Remember what?	(...), (?:...), (?<name>...)	/(?<id>\d+)/	Use non-capturing groups when you only need grouping, not a returned value.
How does it run?	g, i, m, s, u, v, y, d	/\w+/gd	Global and sticky regexes mutate lastIndex in JavaScript.
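
The "Where?" row is the one that most often separates validation from searching. A minimal sketch, using made-up sample inputs, of how the same five-digit rule behaves with and without anchors:

```javascript
// Unanchored: succeeds if ANY five digits appear somewhere in the input.
const loose = /\d{5}/
// Anchored: the whole input must be exactly five digits.
const strict = /^\d{5}$/

const inputs = ['35801', '35801-1234', 'zip 35801', '3580']
const report = inputs.map((value) => ({
  value,
  loose: loose.test(value),
  strict: strict.test(value),
}))

console.table(report)
```

Only the first input passes both checks; the second and third pass the loose pattern alone, which is the partial-input acceptance the safety note warns about.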

Copyable practice ladder

Learn Regex by Solving These 8 Patterns in Order

Do not start by memorizing every token. Work through this ladder in the Regex Tester, change the sample text, and inspect matches, capture groups, replacements, and timing after each step.

Step	Pattern	What it teaches	Test string
1	/cat/	literal matching	cat catalog dog
2	/\bcat\b/	word boundaries	cat catalog bobcat
3	/^\d{5}$/	anchors and fixed quantifiers	35801
4	/^[A-Z][a-z]+$/	character classes and ranges	Madison
5	/\d+/g	global matches and repeated digits	Order 17 ships in 3 days
6	/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/	named capture groups	2026-05-29
7	/<[^>]+>/g	negated classes instead of greedy dot-star	<p>Hello <b>regex</b></p>
8	/\w+(?:\.\w+)*@example\.com/	grouping without capture and escaped dots	first.last@example.com
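
If you would rather run the ladder from a script than from a tester, something like the following works as a starting point. It is only a sketch: the step 8 sample address is a made-up placeholder, and adding the g flag to a copy of each pattern is a convenience so matchAll() can list every match.

```javascript
// Each entry mirrors one rung of the ladder: [pattern, test string].
// The step 8 address is a hypothetical placeholder.
const ladder = [
  [/cat/, 'cat catalog dog'],
  [/\bcat\b/, 'cat catalog bobcat'],
  [/^\d{5}$/, '35801'],
  [/^[A-Z][a-z]+$/, 'Madison'],
  [/\d+/g, 'Order 17 ships in 3 days'],
  [/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/, '2026-05-29'],
  [/<[^>]+>/g, '<p>Hello <b>regex</b></p>'],
  [/\w+(?:\.\w+)*@example\.com/, 'first.last@example.com'],
]

const results = ladder.map(([pattern, text]) => {
  // matchAll() requires the g flag, so add it to a copy when it is missing.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g'
  const global = new RegExp(pattern.source, flags)
  return [...text.matchAll(global)].map((m) => m[0])
})

results.forEach((matches, i) => console.log(`step ${i + 1}:`, matches))
```

Step 1 reports two matches (the second is inside "catalog") while step 2 reports one, which is the whole point of the word-boundary rung.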

Key Takeaways

▸Start with anchors and quantifiers: most bad validation regex either matches too much because it lacks ^...$ or backtracks too much because it nests repetition.
▸In JavaScript, /g and /y are stateful because they mutate lastIndex. Prefer matchAll() when you need every match with groups.
▸Use named capture groups like (?<year>\d{4}) when a pattern returns structured data; they make replacements and parsed objects easier to review.
▸ReDoS risk comes from catastrophic backtracking. Avoid nested quantifiers on untrusted input and use RE2-style engines, timeouts, or pattern review for user-supplied patterns.
▸Do not use regex as a parser for deeply nested formats. Use regex for tokens and simple extraction; use JSON, HTML, XML, or URL parsers when structure matters.

The Log File That Took Down the Monitoring System

In 2019, a misconfigured regular expression in Cloudflare's Web Application Firewall caused a global outage lasting 27 minutes, dropping HTTP traffic by approximately 82%. The root cause: a new WAF rule introduced a regex with a catastrophically backtracking pattern — specifically, a sequence involving a wildcard pattern matching against itself. The CPU on every core handling HTTP traffic pegged at 100%, starving legitimate requests.

This was not an exotic edge case. The same class of regex — one with nested quantifiers that can explore exponentially many paths — appears routinely in production codebases written by developers who understand regex syntax but not the underlying NFA engine. Per research by Davis et al. published in IEEE/ACM ASE 2019, 13% of regexes extracted from npm packages showed potential super-linear worst-case behavior.

This tutorial starts from the basics — literals, metacharacters, quantifiers — and builds to the advanced concepts that separate regex competence from regex expertise: atomic groups, possessive quantifiers, lookaheads, backreferences, and the crucial question of when to use a regex engine vs. a parser. Along the way, we will build the mental model that prevents the Cloudflare class of bugs.

Part 1: Regex Fundamentals

Literal Characters and Metacharacters

The simplest regex is a literal string. The pattern hello matches the sequence “hello” anywhere in the input. What makes regex powerful, and complex, is the set of metacharacters that represent classes of characters or control matching behavior:

Metacharacter	Meaning	Example	Matches
.	Any character (except newline)	c.t	cat, cut, c8t, c-t
\d	Any digit [0-9]	\d\d\d	123, 042, 999
\w	Word char [a-zA-Z0-9_]	\w+	hello, foo_bar, ABC123
\s	Whitespace (space, tab, newline)	foo\sbar	"foo bar", "foo\tbar"
^	Start of string (or line in /m)	^Hello	"Hello world" (not "Say Hello")
$	End of string (or line in /m)	world$	"Hello world" (not "worldwide")
[abc]	Character class — a, b, or c	[aeiou]	Any single vowel
[^abc]	Negated class — NOT a, b, or c	[^0-9]	Any non-digit character
\b	Word boundary	\bcat\b	"cat" not "cats" or "concatenate"
|	Alternation (OR)	cat|dog	"cat" or "dog"

The uppercase versions negate the shorthand: \D matches non-digits, \W matches non-word characters, \S matches non-whitespace. These are equivalent to their negated class counterparts: \D = [^0-9].
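
A quick way to convince yourself of these equivalences is to run a shorthand and its explicit class side by side. The sample string below is invented for illustration:

```javascript
const sample = 'Room 12B, floor 3'

const digitRuns = sample.match(/\d+/g)             // runs of digits
const viaShorthand = sample.replace(/\D/g, '')     // strip non-digits with \D
const viaClass = sample.replace(/[^0-9]/g, '')     // same result with the explicit class
const words = sample.split(/\W+/)                  // split on runs of non-word characters
const chunks = sample.match(/\S+/g)                // runs of non-whitespace

console.log(digitRuns)     // [ '12', '3' ]
console.log(viaShorthand)  // 123
console.log(viaClass)      // 123
console.log(words)         // [ 'Room', '12B', 'floor', '3' ]
console.log(chunks)        // [ 'Room', '12B,', 'floor', '3' ]
```

Note how \S+ keeps the comma attached to "12B," while \W+ treats it as a separator.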

Quantifiers: Controlling Repetition

Quantifiers attach to the preceding element and specify how many times it can repeat:

Quantifier reference with examples

// * — zero or more (greedy)
/go*gle/.test("ggle")   // true (0 o's)
/go*gle/.test("google") // true (2 o's)
/go*gle/.test("gooogle") // true (3 o's)

// + — one or more (greedy)
/go+gle/.test("ggle")   // false (needs at least 1 o)
/go+gle/.test("google") // true

// ? — zero or one (optional)
/colou?r/.test("color")  // true (u is optional)
/colou?r/.test("colour") // true

// {n} — exactly n times
/\d{4}/.test("2026") // true — exactly 4 digits

// {n,m} — between n and m times (inclusive)
/\w{2,5}/.test("hi")      // true (2 chars)
/\w{2,5}/.test("hello")   // true (5 chars)
/\w{2,5}/.test("a")       // false (1 char)
/\w{2,5}/.test("toolong") // true — matches first 5 chars ("toolo")

// {n,} — n or more times
/\d{3,}/.test("12")    // false (only 2 digits)
/\d{3,}/.test("12345") // true

Part 2: Greedy vs. Lazy Quantifiers

This is where the majority of regex bugs live. By default, all quantifiers are greedy — they match as much as possible while still allowing the overall pattern to succeed. Adding ? after a quantifier makes it lazy (minimal) — it matches as little as possible.

Greedy vs lazy — the HTML tag extraction problem

const html = '<b>bold</b> and <em>italic</em>'

// Greedy: <.*> matches from first < to LAST >
html.match(/<.*>/)
// → ['<b>bold</b> and <em>italic</em>']
// The .* consumed everything between the first and LAST angle bracket

// Lazy: <.*?> matches from < to the NEAREST >
html.match(/<.*?>/g)
// → ['<b>', '</b>', '<em>', '</em>']

// Better solution: character class negation (no backtracking risk)
html.match(/<[^>]+>/g)
// → ['<b>', '</b>', '<em>', '</em>']
// [^>]+ means "one or more chars that are NOT >"
// This is preferred because it has no backtracking ambiguity

// Quantifier lazy variants:
// *?   →  *  with lazy matching
// +?   →  +  with lazy matching
// ??   →  ?  with lazy matching
// {n,m}? → {n,m} with lazy matching

The character class negation approach ([^>]+) is generally preferable to lazy quantifiers for two reasons: it is more explicit about intent (matches everything except the delimiter), and it eliminates backtracking ambiguity — the engine knows immediately when to stop without exploring alternative paths.

Part 3: Capturing Groups, Named Groups, and Non-Capturing Groups

Parentheses in regex serve two purposes simultaneously: they group subpatterns (allowing quantifiers to apply to multiple characters), and they capture the matched substring for later use.

Groups: capturing, non-capturing, and named (ES2018+)

// Capturing group: (pattern)
// Captures are accessible at result[1], result[2], etc.
const dateStr = '2026-04-15'
const match = dateStr.match(/(\d{4})-(\d{2})-(\d{2})/)
// match[0] = '2026-04-15'  (full match)
// match[1] = '2026'        (capture group 1)
// match[2] = '04'          (capture group 2)
// match[3] = '15'          (capture group 3)

// Non-capturing group: (?:pattern)
// Groups without capturing — use for alternation without polluting capture indices
const version = '3.14.159'
const semver = version.match(/(?:\d+\.){2}\d+/)
// Groups without creating match[1], match[2]

// Named capturing group: (?<name>pattern) — ES2018+
const namedMatch = dateStr.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/)
const { year, month, day } = namedMatch.groups
// year  = '2026'
// month = '04'
// day   = '15'

// Named groups in replacement strings
'2026-04-15'.replace(
  /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
  '$<month>/$<day>/$<year>'
)
// → '04/15/2026'

// Backreference to named group in the same pattern
// Match repeated words: "the the" or "a a"
/\b(?<word>\w+)\s+\k<word>\b/i.test('the the')  // true
/\b(?<word>\w+)\s+\k<word>\b/i.test('the The')  // true (i flag)
/\b(?<word>\w+)\s+\k<word>\b/i.test('the cat')  // false

Named capture groups were standardized in ECMAScript 2018 (Chrome 64+, Node.js 10+). Per the ES2018 specification (TC39 proposal by Gorkem Yakin and Daniel Ehrenberg), named groups are referenced in patterns via \k<name> and in replacement strings via $<name>. They are also available in Python (via (?P<name>...) syntax), PHP, .NET, Java 7+, and PCRE.

For any pattern with more than two capture groups, prefer named groups. They remove the need to track which index corresponds to which field, and that is usually the difference between a pattern a reviewer can check and one they have to re-derive.

Part 4: Lookahead and Lookbehind Assertions

Lookaround assertions match a position in the string based on what comes before or after, without consuming characters. They are zero-width — they assert context without advancing the match position.

All four lookaround types with practical examples

// Positive lookahead: (?=pattern)
// "match X only if followed by Y" — Y is not part of the match

// Extract version numbers from "v1.2.3" format
'Release v1.2.3 is ready'.match(/\d+\.\d+\.\d+(?= is ready)/)
// → ['1.2.3']  (the " is ready" is not captured)

// Password validation: at least one digit
/^(?=.*\d).{8,}$/.test('password1')  // true
/^(?=.*\d).{8,}$/.test('password')   // false (no digit)

// Negative lookahead: (?!pattern)
// "match X only if NOT followed by Y"

// Match "foo" not followed by "bar"
'foobar foobaz fooqux'.match(/foo(?!bar)\w*/g)
// → ['foobaz', 'fooqux']  ('foobar' excluded)

// Positive lookbehind: (?<=pattern) — ES2018+
// "match X only if preceded by Y"

// Extract dollar amounts after "$"
'Price: $42.00, Cost: $15'.match(/(?<=\$)\d+\.?\d*/g)
// → ['42.00', '15']

// Negative lookbehind: (?<!pattern) — ES2018+
// "match X only if NOT preceded by Y"

// Match numbers not preceded by "$"
'Buy $5 and save 10 dollars'.match(/(?<!\$)\b\d+\b/g)
// → ['10']  (the '$5' is excluded)

// Note: Lookbehind was added in ES2018 (V8 6.2+, Node.js 9.11.2+)
// Go's RE2 does NOT support lookahead or lookbehind
// Python's re module supports lookahead/lookbehind

Lookbehind assertions were proposed by Gorkem Yakin and Nozomu Katō and accepted into ES2018. They are fully supported in V8 (Chrome, Node.js), SpiderMonkey (Firefox), and JavaScriptCore (Safari 16.4+). The key limitation: Go's RE2 engine does not support any lookaround assertions — this is a deliberate design choice to guarantee O(n) matching time. If your Go application needs lookahead-style behavior, restructure the pattern using alternation or process the match result in code.
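
The "process the match result in code" advice can be sketched as follows. It is written in JavaScript so both versions can be compared in one place; the same shape (match broadly, then filter or read a capture group) is what carries over to an engine without lookarounds. The \b prefix is an added assumption that keeps the two versions equivalent on word-separated input.

```javascript
const text = 'foobar foobaz fooqux'

// Negative lookahead: concise, but rejected by RE2-style engines.
const withLookahead = text.match(/\bfoo(?!bar)\w*/g)

// Lookaround-free: match every foo-word, then drop the unwanted ones in code.
const withoutLookahead = [...text.matchAll(/\bfoo(\w*)/g)]
  .filter((m) => !m[1].startsWith('bar'))
  .map((m) => m[0])

console.log(withLookahead)    // [ 'foobaz', 'fooqux' ]
console.log(withoutLookahead) // [ 'foobaz', 'fooqux' ]

// Positive lookahead replaced by a capture group: consume the context, keep group 1.
const release = 'Release v1.2.3 is ready'
const version = /(\d+\.\d+\.\d+) is ready/.exec(release)?.[1]
console.log(version) // 1.2.3
```

The trade-off is that the lookaround-free pattern consumes the context text, so it is not a drop-in replacement when matches need to overlap.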

Part 5: Regex Flags — The Behavior Modifiers

Flags are appended after the closing delimiter: /pattern/flags. JavaScript has eight core RegExp flags:

Flag	Name	Effect	Added
g	Global	Find all matches, not just first. Makes exec() stateful via lastIndex.	ES3
i	Case-insensitive	A matches both A and a.	ES3
m	Multiline	^ and $ match line boundaries, not just string start/end.	ES3
s	DotAll	. matches newlines. Without /s, . matches any char except \n.	ES2018
u	Unicode	Enables Unicode property escapes (\p{L}). Required for \p{...} syntax.	ES2015
y	Sticky	Matches only at lastIndex. Useful for tokenizers and parser-like scanners.	ES2015
d	Indices	Adds match.indices array with start/end positions for each group.	ES2022
v	UnicodeSets	Upgrades Unicode mode with set operations, string properties, and richer Unicode matching.	ES2024

The /g flag's stateful lastIndex trap — a common bug source

// /g makes exec() stateful — it remembers where it left off via lastIndex
const re = /\d+/g  // stored in a variable — gets reused

re.exec('42 and 100')  // { 0: '42',  index: 0 }, re.lastIndex = 2
re.exec('42 and 100')  // { 0: '100', index: 7 }, re.lastIndex = 10
re.exec('42 and 100')  // null (no more matches), re.lastIndex = 0

// BUG: if you reuse it before resetting, it resumes from lastIndex
const re2 = /\w+/g
re2.exec('hello world')  // { 0: 'hello', index: 0 }, lastIndex = 5
re2.exec('abc')          // null, because lastIndex 5 is beyond "abc"
re2.exec('abc')          // { 0: 'abc', index: 0 }, because the null reset lastIndex to 0
// This is why regex with /g or /y in module scope causes hard-to-reproduce bugs

// Solution 1: Create a new regex each call (avoid regex literal in module scope)
function findAll(str) {
  return [...str.matchAll(/\d+/g)]  // matchAll always creates fresh iterator
}

// Solution 2: Reset lastIndex manually
const reGlobal = /\d+/g
function safeExec(str) {
  reGlobal.lastIndex = 0  // reset before use
  return reGlobal.exec(str)
}
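
The sticky and indices flags from the table are easier to understand in action. The tokenizer below is a hypothetical sketch for simple arithmetic input, not a production lexer; it relies on /y refusing to skip over characters, which is what makes unexpected input detectable.

```javascript
// Hypothetical tokenizer for arithmetic like "12 + 3*(4)".
function tokenize(source) {
  const input = source.trimEnd()
  // /y anchors every exec() at lastIndex, so unknown characters cannot be skipped.
  const token = /\s*(?:(?<number>\d+)|(?<op>[+\-*\/()]))/y
  const tokens = []
  while (token.lastIndex < input.length) {
    const start = token.lastIndex // a failed sticky match resets lastIndex to 0
    const m = token.exec(input)
    if (m === null) throw new SyntaxError(`Unexpected input at index ${start}`)
    tokens.push(m.groups.number !== undefined
      ? { type: 'number', value: m.groups.number }
      : { type: 'op', value: m.groups.op })
  }
  return tokens
}

console.log(tokenize('12 + 3*(4)').map((t) => t.value).join(' '))
// 12 + 3 * ( 4 )

// /d adds start/end offsets for the match and for each group.
const pair = /(?<key>\w+)=(?<value>\w+)/d.exec('retries=3')
console.log(pair.indices.groups.key)   // [ 0, 7 ]
console.log(pair.indices.groups.value) // [ 8, 9 ]
```

Because the regex is created inside tokenize(), each call starts with lastIndex at 0 and the module-scope trap described above does not apply.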

Part 6: ReDoS — When Regex Becomes a Security Vulnerability

ReDoS (Regular Expression Denial of Service) is a real attack class, not a theoretical concern. The Cloudflare outage mentioned earlier is one of dozens of documented incidents. Node.js has shipped multiple ReDoS-related CVEs. The root cause is usually the same: a backtracking engine running a pattern with nested quantifiers or overlapping alternation against input that almost matches.

Classic ReDoS pattern — exponential worst case

// (a+)+ is the canonical ReDoS example
// For input "aaaaaaaaac" (n 'a's then 'c'):
// The engine tries EVERY way to partition n 'a's into groups of one or more
// This is 2^(n-1) partitions — exponential in n

// The anchors matter: an unanchored /(a+)+/ succeeds on the leading a's and returns at once
const vulnerable = /^(a+)+$/
const payload = 'a'.repeat(30) + 'c'  // 30 a's then 'c'

// DON'T RUN THIS — it will hang your process
// console.time(); vulnerable.test(payload); console.timeEnd()

// Patterns with ReDoS risk:
// (a|a)+   — alternation between identical patterns + outer quantifier
// (a*)*    — nested quantifiers
// \w+\s*;  — at line end without anchor (in certain contexts)

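// Real-world example: CVE-2022-25883 was a ReDoS report against the semver npm package.
// The two patterns below illustrate the shapes involved rather than quoting the package:
// /^\s*(\d+\.\d+\.\d+)\s*$/  is NOT vulnerable (no nested or overlapping repetition)
// /^(\s+|\s*,\s*)*$/          is POTENTIALLY vulnerable (overlapping alternatives under *)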

// Safe alternatives:
// 1. Use possessive quantifiers if your engine supports them (PCRE/Java)
//    (a++)+ — does not backtrack within the group
// 2. Use atomic groups: (?>a+)+
// 3. Use Go's RE2 — linear time, no backtracking
// 4. Validate with the 'safe-regex' npm package
// 5. Set regex timeout limits at the gateway level
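
To see the growth without hanging anything, the input size can be kept small. The sketch below is an illustration rather than a benchmark: it times an anchored nested pattern against a flat pattern that accepts the same strings. Absolute timings are not shown because they depend on the engine and machine, and the sizes 12, 16, and 20 are arbitrary choices that stay fast in a typical backtracking engine.

```javascript
const vulnerable = /^(a+)+$/ // nested repetition
const linear = /^a+$/        // accepts exactly the same strings, no nesting

function timeRejection(pattern, n) {
  const input = 'a'.repeat(n) + 'c' // almost matches, then fails at the end
  const start = performance.now()
  const matched = pattern.test(input)
  return { n, matched, ms: performance.now() - start }
}

// Keep n small: each extra "a" roughly doubles the work for the nested pattern.
for (const n of [12, 16, 20]) {
  console.log('nested', timeRejection(vulnerable, n))
  console.log('flat  ', timeRejection(linear, n))
}
```

Both patterns reject every input here; only the time taken to reach that answer differs, which is why this class of bug tends to surface in production rather than in unit tests.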

Defending against ReDoS in Node.js applications

// Option 1: node-re2, an RE2 binding for Node.js with linear-time matching
import RE2 from 're2'

const re = new RE2('^(a+)+$')  // compiles fine: RE2 has no backtracking to blow up
re.test('a'.repeat(30) + 'c')  // false, without the exponential slowdown
// RE2 throws at construction for syntax it cannot run in linear time,
// such as backreferences (\1) and lookarounds ((?=...), (?<=...))

// Option 2: safe-regex package — static analysis
import safeRegex from 'safe-regex'
safeRegex(/(a+)+/)    // false — flagged as potentially catastrophic
safeRegex(/\d+/)      // true — simple, linear

// Option 3: time limits
// When user-supplied patterns are evaluated (e.g., a search feature),
// run the match under a timeout wrapper or in a Worker with a kill timer

// Option 4: Pattern design — avoid nested quantifiers on same chars
// RISKY: (\w+\s*)+
// SAFE:  \w+(\s+\w+)*   — same semantic, no ambiguity in partition

Per research by Davis et al. (2019), regular expressions with potentially super-linear worst-case behavior appeared in approximately 5.6% of npm packages sampled. The authors specifically called out the npm ecosystem as high-risk because many packages expose regex to user-controlled input (search, routing, validation) without ReDoS analysis.

Part 7: Regex Across Languages — Key Differences

PCRE (Perl Compatible Regular Expressions) syntax is the de facto baseline, but each language's implementation diverges in important ways. Understanding these differences prevents “it works in Python but not Go” debugging sessions.

Feature	JavaScript (V8)	Python (re)	Go (RE2)	Rust (regex)
Engine type	NFA (backtracking)	NFA (backtracking)	Automata-based (RE2-style, linear time)	Automata-based (RE2-inspired, linear time)
Backreferences	Yes (\1, \k<name>)	Yes (\1, (?P=name))	No	No
Lookahead	Yes (?=) (?!)	Yes	No	No
Lookbehind	Yes (ES2018)	Yes (fixed width)	No	No
Named groups	Yes (?<name>)	Yes (?P<name>)	Yes (?P<name>)	Yes (?P<name>)
Unicode property \p{L}	Yes (with /u flag)	Via regex lib	Yes	Yes
Worst-case complexity	Exponential (ReDoS)	Exponential	O(n) guaranteed	O(n) guaranteed

The practical guidance from this table: if you are building a feature where user-supplied input is matched against a pattern (search, routing, form validation), reach for Go or Rust (or node-re2 in Node.js) to get a linear-time guarantee. Reserve Python's re or JavaScript's built-in engine for patterns you control, such as log processing and internal data transformation where inputs are trusted.

For ready-to-use production patterns in JavaScript and Python, see our regex cheat sheet and Python regex guide. For validated email/URL/phone patterns with real-world trade-off analysis, see regex validation patterns.

Part 8: Practical Regex Patterns You Will Actually Use

The real test of regex knowledge is building patterns for messy, real-world data. Here are production-tested patterns with commentary on the trade-offs:

Production patterns — with commentary

// Semantic version (semver: major.minor.patch, optional pre-release)
const SEMVER = /^(?<major>0|[1-9]\d*)\.(?<minor>0|[1-9]\d*)\.(?<patch>0|[1-9]\d*)(?:-(?<prerelease>[\w.-]+))?$/
SEMVER.test('1.0.0')        // true
SEMVER.test('2.1.0-beta.1') // true
SEMVER.test('1.01.0')       // false (leading zero)

// ISO 8601 date (basic — not exhaustive)
const ISO_DATE = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/
ISO_DATE.test('2026-04-15') // true
ISO_DATE.test('2026-13-01') // false (month 13)
ISO_DATE.test('2026-04-32') // false (day 32)

// IPv4 address (validates 0–255 range)
const IPV4 = /^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$/
IPV4.test('192.168.1.1')  // true
IPV4.test('256.0.0.1')    // false

// Slugify: URL-safe lowercase string
function slugify(str) {
  return str
    .toLowerCase()
    .replace(/[^\w\s-]/g, '')  // remove non-word, non-space, non-hyphen
    .replace(/[\s_]+/g, '-')   // spaces and underscores → hyphens
    .replace(/--+/g, '-')       // collapse multiple hyphens
    .replace(/^-|-$/g, '')      // trim leading/trailing hyphens
}
slugify('Hello, World! How Are You?')  // 'hello-world-how-are-you'

// Extract all log timestamps (format: [2026-04-15 14:30:00.123])
const LOG_TIMESTAMP = /\[(?<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]/g
const logLine = '[2026-04-15 14:30:00.123] ERROR: connection refused'
// matchAll() works on a copy of the regex, so the shared /g pattern keeps lastIndex at 0
const timestamps = [...logLine.matchAll(LOG_TIMESTAMP)].map(m => m.groups.ts)
// timestamps = ['2026-04-15 14:30:00.123']

// Remove HTML tags (safe for simple stripping — use DOMParser for complex HTML)
const stripTags = (html) => html.replace(/<[^>]+>/g, '')
stripTags('<p>Hello <strong>world</strong></p>')  // 'Hello world'
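
The ISO date pattern above checks shape only, so it accepts dates such as 2026-02-30. One way to close that gap, sketched here with a hypothetical isRealDate helper, is to let the regex extract the fields and let Date decide whether the calendar day exists.

```javascript
// Shape check with named groups; same ranges as ISO_DATE above.
const ISO_DATE_SHAPE = /^(?<y>\d{4})-(?<m>0[1-9]|1[0-2])-(?<d>0[1-9]|[12]\d|3[01])$/

// Hypothetical helper: regex for shape, Date for calendar validity.
function isRealDate(text) {
  const parts = ISO_DATE_SHAPE.exec(text)?.groups
  if (!parts) return false
  const [y, m, d] = [parts.y, parts.m, parts.d].map(Number)
  const date = new Date(Date.UTC(y, m - 1, d))
  // Date.UTC rolls invalid days over (Feb 30 becomes Mar 2), so compare back.
  return date.getUTCMonth() === m - 1 && date.getUTCDate() === d
}

console.log(isRealDate('2026-04-15')) // true
console.log(isRealDate('2026-02-30')) // false (shape is fine, day does not exist)
console.log(isRealDate('2024-02-29')) // true (leap year)
console.log(isRealDate('2026-13-01')) // false (rejected by the regex)
```

This keeps the regex small and readable instead of encoding month lengths and leap years into the pattern.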

JavaScript extraction recipes — matchAll(), named groups, and replace()

// Extract route params from a lightweight route template
const route = '/users/:userId/orders/:orderId'
const PARAM = /:(?<name>[A-Za-z_][A-Za-z0-9_]*)/g
const params = [...route.matchAll(PARAM)].map(m => m.groups.name)
// ['userId', 'orderId']

// Parse structured log lines into objects
const LOG = /^(?<time>\S+)\s+(?<level>INFO|WARN|ERROR)\s+(?<message>.+)$/
const parsed = LOG.exec('2026-05-26T10:12:30Z ERROR database timeout')?.groups
// { time: '2026-05-26T10:12:30Z', level: 'ERROR', message: 'database timeout' }

// Redact tokens but keep the prefix and last four characters for debugging context
const redact = (text) =>
  text.replace(/\b(?<prefix>sk|pk)_[A-Za-z0-9_-]{8,}(?<tail>[A-Za-z0-9_-]{4})\b/g,
    '$<prefix>_********$<tail>')

// Convert simple key=value lines to an object
const ENV_LINE = /^(?<key>[A-Z_][A-Z0-9_]*)=(?<value>.*)$/gm
const env = Object.fromEntries(
  [...'API_URL=https://bytepane.com\nRETRIES=3'.matchAll(ENV_LINE)]
    .map(m => [m.groups.key, m.groups.value])
)
// { API_URL: 'https://bytepane.com', RETRIES: '3' }

Frequently Asked Questions

What does .* mean in regex?

.* means “match any character (except newline) zero or more times.” The dot matches any single character except \n by default. The asterisk is a greedy quantifier that matches 0 or more repetitions. Combined, .* matches any sequence of characters on a line. Use .*? (lazy) to match as few characters as possible. With the /s flag, dot also matches newlines.

What is the difference between .* and .+?

.* matches zero or more characters (can match an empty string). .+ matches one or more characters (must match at least one). Use .+ when you need to guarantee the pattern captured something non-empty. The same distinction applies to all quantifiers: * vs +, {0,5} vs {1,5}.

What is catastrophic backtracking in regex?

Catastrophic backtracking occurs when a regex with nested quantifiers (like (a+)+) must explore exponentially many match paths to reject a string. A pattern like (a+)+b takes microseconds on valid input but exponential time on a string of 'a' characters without a trailing 'b'. This is the ReDoS (Regular Expression Denial of Service) vulnerability class that caused the 2019 Cloudflare global outage.

What does ^ and $ mean in regex?

^ is the start-of-string anchor. $ is the end-of-string anchor. Together, ^pattern$ forces the entire string to match — not just a substring. Without anchors, /\d+/ matches the digits in “abc123def”. With anchors, /^\d+$/ requires the entire string to be digits. In multiline mode (/m flag), they match line boundaries.

What is a named capture group in regex?

Named capture groups use (?<name>pattern) syntax. In JavaScript (ES2018+): const { year } = /(?<year>\d{4})/.exec('2026').groups. Supported in Python ((?P<name>...)), PHP, .NET, Java 7+, and PCRE. Named groups eliminate magic index numbers in complex patterns and are essential for maintainable regex with multiple captures.

How is regex different across JavaScript, Python, and Go?

JavaScript and Python use backtracking NFA engines: powerful (backreferences, lookarounds) but vulnerable to ReDoS. Go's regexp package uses an RE2-style linear-time engine that rejects backreferences and lookarounds for guaranteed O(n) matching. Rust's regex crate follows the same automata-based design rather than wrapping RE2 itself. For user-facing input validation at scale, prefer these linear-time engines. For data transformation on trusted input, any engine works.

How do I safely build a regex from user input in JavaScript?

Use RegExp.escape(userInput) before inserting literal user text into new RegExp(). It escapes regex syntax characters so dots, stars, question marks, brackets, and parentheses match literally. If the runtime does not support RegExp.escape(), use a vetted polyfill or helper and avoid executing unreviewed user-supplied regex patterns on the server.

Test Your Regex Patterns

Use the JavaScript Regex Tester to verify flags, captures, replacement output, and timing against realistic sample input before moving a pattern into production.
