BytePane

Regex Examples: 30 Common Patterns for Everyday Use

Reference · 18 min read

Key Takeaways

  • Per Stack Overflow's 2025 Developer Survey, 78% of developers use regex at least monthly: it's a core competency, not a niche skill.
  • Only 17% of regex patterns in production code are tested, per ASE'19 empirical research, which makes copy-paste patterns especially risky without understanding them.
  • Email regex can't confirm a mailbox exists, so always pair it with a verification email for real-world auth.
  • Avoid nested quantifiers like (a+)+, which cause ReDoS on adversarial input (affects ~20% of production patterns per OWASP).
  • All 30 patterns below are tested against real-world data and annotated with their limitations.

How to Use This Guide

Each pattern below includes the regex, language-specific usage in JavaScript and Python, matching examples, and a "gotcha" note — the edge case that catches developers off guard. Patterns are organized by category. Start with the one you need, read the gotcha, then adapt.

The patterns use standard PCRE/ECMAScript syntax unless noted. Test any pattern against your actual data with our Regex Tester before shipping to production. According to empirical research published at ASE'19 by Davis et al., 94% of developers re-use regex patterns — which means bugs in common patterns propagate widely.

// Quick usage reference:

// JavaScript — test (boolean):
/pattern/flags.test(string)

// JavaScript — extract matches:
string.match(/pattern/g)           // all matches, no capture groups
string.matchAll(/pattern/g)        // all matches with capture groups

// JavaScript — replace:
string.replace(/pattern/g, 'replacement')

# Python (test, boolean):
import re
bool(re.search(r'pattern', string))

# Python (extract first match; re.search returns None when nothing matches):
m = re.search(r'pattern', string)
first = m.group() if m else None

# Python (extract all matches):
re.findall(r'pattern', string)

1. Validation Patterns

Email Address

// Pattern:
/^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/

// JavaScript:
const emailRegex = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/
emailRegex.test('user@example.com')                  // true
emailRegex.test('first.last+tag@mail.example.co.uk') // true
emailRegex.test('not-an-email')                      // false

# Python:
import re
pattern = r'^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$'
bool(re.match(pattern, 'user@example.com'))  # True

Gotcha: This passes any well-formed address, including ones whose mailbox or domain does not exist. For auth, always send a verification email. The W3C HTML spec deliberately uses a simplified pattern for the same reason: full RFC 5321 compliance is impractical.

URL (http/https)

// Pattern:
/^https?:\/\/[^\s/$.?#].[^\s]*$/

// JavaScript:
const urlRegex = /^https?:\/\/[^\s/$.?#].[^\s]*$/
urlRegex.test('https://bytepane.com/regex-tester/')  // true
urlRegex.test('http://localhost:3000')                // true
urlRegex.test('ftp://example.com')                   // false (no ftp)

// For a permissive check, use the URL constructor instead:
function isValidUrl(str) {
  try {
    const url = new URL(str)
    return url.protocol === 'http:' || url.protocol === 'https:'
  } catch {
    return false
  }
}

Gotcha: For URL validation in JS, the new URL() constructor is more reliable than regex — it handles edge cases like IDNs and IPv6 addresses.
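The same idea carries over to Python, where the standard library's urllib.parse can stand in for the URL constructor. The sketch below is an assumption about how you might mirror isValidUrl(); the function name is_http_url is illustrative, and urlsplit is lenient, so treat this as a permissive check rather than strict validation.

```python
from urllib.parse import urlsplit

def is_http_url(text: str) -> bool:
    """Permissive check: http/https scheme plus a non-empty host."""
    try:
        parts = urlsplit(text)
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs (e.g. bad brackets)
        return False
    return parts.scheme in ("http", "https") and bool(parts.hostname)
```

Like the JavaScript version, this accepts IPv6 hosts such as http://[::1]:8080/ that the regex above has no special handling for.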

Phone Number (E.164 International)

// E.164 format (+15551234567):
/^\+[1-9]\d{7,14}$/

// US format only (accepts multiple formats):
/^(\+1)?[\s.-]?\(?[2-9]\d{2}\)?[\s.-]?\d{3}[\s.-]?\d{4}$/

// Examples that match US pattern:
// +1 (555) 123-4567
// 555.123.4567
// 5551234567
// (555) 123-4567

Gotcha: Phone number formats differ by country. E.164 is the safest universal format. For user-facing inputs, normalize to E.164 server-side using a library like libphonenumber-js rather than validating arbitrary formats with regex.

IPv4 Address

// Pattern (validates 0-255 per octet):
/^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)$/

// Matches:
// 192.168.1.1    ✓
// 0.0.0.0        ✓
// 255.255.255.0  ✓
// 999.0.0.1      ✗ (octet > 255)
// 192.168.1      ✗ (incomplete)

Gotcha: Simple patterns like /(\d{1,3}\.){3}\d{1,3}/ accept 999.999.999.999. Always validate the 0 to 255 range per octet.

IPv6 Address

// Full form and the most common compressed forms (not every RFC 4291 form):
/^(([0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|(([0-9a-fA-F]{1,4}:){1,5}|:)(:[0-9a-fA-F]{1,4}){1,2}|::1|::)$/

// Matches:
// 2001:db8::1              ✓
// ::1                      ✓ (loopback)
// 2001:db8::1:2:3          ✗ (valid IPv6, but three groups after :: are not covered)
// fe80::1%eth0             ✗ (zone IDs not covered)

// Practical note: use your platform's built-in validation:
// Python: import ipaddress; ipaddress.ip_address(s)
// Node.js: require('net').isIPv6(s)

Gotcha: IPv6 regex is notoriously complex. Use net.isIPv6() in Node.js or Python's ipaddress module — they handle all RFC 4291 forms correctly.
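As a minimal sketch of the built-in route in Python, the helper below wraps ipaddress.ip_address(). The name is_ipv6 is illustrative, not part of the standard library.

```python
import ipaddress

def is_ipv6(text: str) -> bool:
    """Return True only if text parses as an IPv6 address."""
    try:
        return isinstance(ipaddress.ip_address(text), ipaddress.IPv6Address)
    except ValueError:
        # Raised for anything that is neither valid IPv4 nor valid IPv6
        return False
```

This accepts compressed forms with any number of groups after the ::, which is where hand-written patterns tend to fall short.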

Credit Card Number (Luhn-format)

// Format check only (16 digits in groups of four, optional spaces/dashes):
/^[0-9]{4}([\s-]?[0-9]{4}){3}$/

// Visa (starts with 4, 13 or 16 digits):
/^4[0-9]{12}(?:[0-9]{3})?$/

// Mastercard (starts with 51-55 or 2221-2720, 16 digits; both branches anchored):
/^(?:5[1-5][0-9]{14}|(?:222[1-9]|22[3-9]\d|2[3-6]\d{2}|27[01]\d|2720)[0-9]{12})$/

// Amex (starts with 34 or 37, 15 digits):
/^3[47][0-9]{13}$/

Gotcha: Regex only checks format. Always run a Luhn checksum to verify the number is structurally valid. For live cards, only a payment processor can confirm the account exists.
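The Luhn checksum mentioned above is short enough to sketch directly. The function name luhn_valid and the 13 to 19 digit length bound are assumptions made for this example; the test numbers are the widely published processor test cards, not real accounts.

```python
import re

def luhn_valid(card_number: str) -> bool:
    """Return True if the digits pass the Luhn checksum."""
    digits = re.sub(r"[\s-]", "", card_number)  # drop spaces and dashes
    if not re.fullmatch(r"[0-9]{13,19}", digits):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0
```

A reasonable order is to run the cheap regex format check first and the checksum second, and to leave everything else to the payment processor.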

Strong Password

// Requires: 8+ chars, uppercase, lowercase, digit, special char
/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&\-_#])[A-Za-z\d@$!%*?&\-_#]{8,}$/

// How lookaheads work here:
// (?=.*[a-z])      — must contain at least one lowercase
// (?=.*[A-Z])      — must contain at least one uppercase
// (?=.*\d)         — must contain at least one digit
// (?=.*[...])      — must contain at least one special character
// [A-Za-z...]{8,}  — total length 8+, only allowed chars

// Matches:
// MyP@ssw0rd!  ✓
// weakpass     ✗ (no uppercase/digit/special)

Gotcha: NIST SP 800-63B (2025 update) recommends checking passwords against breach databases (HIBP API) over enforcing composition rules. Complexity rules drive users to predictable patterns like Password1!.

UUID v4

// UUID v4 (random):
/^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i

// Note the version bit: 4[0-9a-f]{3} (3rd group starts with 4)
// And variant bit: [89ab] (4th group starts with 8, 9, a, or b)

// Matches:
// 550e8400-e29b-41d4-a716-446655440000  ✓
// 550e8400-e29b-41d4-c716-446655440000  ✗ (invalid variant bit)
// 550e8400e29b41d4a716446655440000      ✗ (no hyphens)

Gotcha: If you only need to check "is this a UUID-shaped string," the simpler /^[0-9a-f-]{36}$/i works. The full pattern above validates UUID v4 specifically.

2. Data Extraction Patterns

Hex Color Code

// 3 or 6 digit hex with optional alpha:
/^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3}|[A-Fa-f0-9]{8}|[A-Fa-f0-9]{4})$/

// Matches:
// #fff       ✓ (3-digit)
// #FF5733    ✓ (6-digit)
// #FF573380  ✓ (8-digit with alpha)
// #xyz       ✗

// Extract all hex colors from a CSS file:
const css = 'color: #ff5733; background: #333;'
const colors = css.match(/#[A-Fa-f0-9]{3,8}/g)  // ["#ff5733", "#333"]

Gotcha: CSS also accepts rgb(), hsl(), and named colors. This pattern only catches hex notation. For full CSS color extraction, consider a CSS parser. Convert between formats with our Hex to RGB converter.

Date (YYYY-MM-DD / ISO 8601)

// ISO 8601 date:
/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/

// US format MM/DD/YYYY:
/^(0[1-9]|1[0-2])\/(0[1-9]|[12]\d|3[01])\/\d{4}$/

// Extract dates from a string:
const text = 'Created 2026-04-22, updated 2026-04-24'
const dates = text.match(/\d{4}-\d{2}-\d{2}/g)  // ["2026-04-22", "2026-04-24"]

// Full ISO 8601 datetime:
/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?$/

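Gotcha: Regex cannot validate calendar logic: 2026-02-31 passes the pattern. Always parse the value as well. In Python, datetime.strptime() raises ValueError for impossible dates. In JavaScript, new Date(str) may return Invalid Date or silently roll over into March depending on the engine, so also compare the parsed year, month, and day against the input.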
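A minimal Python sketch of the two-step check: the regex confirms the shape, then the datetime module confirms the date exists. The names ISO_DATE and is_real_date are illustrative.

```python
import re
from datetime import datetime

# Shape check: same ISO 8601 date pattern as above
ISO_DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

def is_real_date(text: str) -> bool:
    """True if text is YYYY-MM-DD and names a real calendar day."""
    if not ISO_DATE.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")  # raises ValueError for Feb 31 etc.
    except ValueError:
        return False
    return True
```

The regex step is optional here, since strptime rejects malformed input anyway, but it keeps the accepted format strict (zero-padded month and day).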

Time (HH:MM and HH:MM:SS)

// 24-hour time HH:MM:
/^([01]\d|2[0-3]):[0-5]\d$/

// 24-hour HH:MM:SS:
/^([01]\d|2[0-3]):[0-5]\d:[0-5]\d$/

// 12-hour with AM/PM:
/^(0?[1-9]|1[0-2]):[0-5]\d\s?(AM|PM|am|pm)$/

// Matches:
// 23:59       ✓
// 00:00:00    ✓
// 25:00       ✗
// 12:30 PM    ✓

Gotcha: Timezone offsets (+05:30, Z) need additional handling. For parsing, new Date('1970-01-01T' + timeStr) is safer in JavaScript.

HTML Tags (Extract or Strip)

// Strip all HTML tags (use with extreme caution):
/(<([^>]+)>)/gi

// Extract src attributes from img tags:
/<img[^>]+src=["']([^"']+)["']/gi

// Extract href from anchor tags:
/<a[^>]+href=["']([^"']+)["']/gi

// JavaScript — strip HTML tags:
function stripHtml(html) {
  return html.replace(/(<([^>]+)>)/gi, '')
}

Gotcha: Regex cannot parse HTML — it breaks on nested tags, attributes with > in values, and malformed markup. For HTML parsing in Node.js use cheerio; in browsers use DOMParser. Use the above patterns only for simple, controlled HTML strings.
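In Python, the standard library's html.parser fills the role that cheerio and DOMParser fill in JavaScript. The sketch below is one assumed way to strip tags with it; TextExtractor and strip_html are illustrative names, and the regex line is included only to show the failure mode described above.

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_html(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)

tricky = '<a title="a > b">link</a>'
regex_result = re.sub(r"<[^>]+>", "", tricky)   # cuts the tag at the first ">"
parser_result = strip_html(tricky)
```

Note that this sketch keeps the text inside script and style elements, so it is a tag stripper, not a sanitizer.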

Extract Numbers from String

// All integers:
/-?\d+/g

// All floats (including negatives):
/-?\d+\.?\d*/g

// Currency amounts ($1,234.56):
/\$\d+(?:,\d{3})*(?:\.\d+)?/g

// JavaScript example:
const text = 'Order of 3 items for $42.99, shipped in 2 days'
const numbers = text.match(/-?\d+\.?\d*/g)  // ["3", "42.99", "2"]
// A looser class like [\d,.]+ would also swallow the comma after 42.99
const currency = text.match(/\$\d+(?:,\d{3})*(?:\.\d+)?/g)  // ["$42.99"]

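Gotcha: These patterns match numbers inside larger strings: version2.1 yields 2.1. Word-boundary assertions (\b) help only partly, because \b\d+ still matches the 1 after the dot. If you only want standalone numbers, use lookarounds that reject adjacent word characters and dots.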
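Here is a small Python sketch of the standalone-number case using lookarounds. The sample text and variable names are made up for illustration.

```python
import re

text = "version2.1 shipped 3 fixes in 2.5 days"

# Plain pattern: also picks up the number glued to "version"
embedded = re.findall(r"-?\d+\.?\d*", text)

# Lookarounds: reject matches touching a word character or a dot
standalone = re.findall(r"(?<![\w.])-?\d+(?:\.\d+)?(?![\w.])", text)
```

One trade-off of this assumed pattern: a number directly followed by a sentence-ending period (as in "costs 5.") is rejected too, so loosen the lookahead if your data needs that case.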

Hashtags

// Extract hashtags from social media text:
/#[\w\u0080-\uFFFF]+/g

// JavaScript:
const post = 'Building cool tools #webdev #regex #bytepane'
const tags = post.match(/#[\w]+/g)  // ["#webdev", "#regex", "#bytepane"]

# Python:
import re
tags = re.findall(r'#\w+', post)  # ['#webdev', '#regex', '#bytepane']

Gotcha: The \u0080-\uFFFF range enables matching Unicode hashtags (#日本語). Twitter's actual hashtag algorithm is more complex — it excludes purely numeric tags and has length limits by language.

3. String Transformation Patterns

URL Slug (sanitize for SEO)

// Validate a URL slug (lowercase, letters, digits, hyphens only):
/^[a-z0-9]+(?:-[a-z0-9]+)*$/

// Generate a slug from a title:
function slugify(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^\w\s-]/g, '')     // remove non-word chars except spaces/hyphens
    .replace(/[\s_]+/g, '-')      // spaces and underscores → hyphens
    .replace(/^-+|-+$/g, '')      // trim leading/trailing hyphens
    .replace(/-{2,}/g, '-')       // collapse multiple hyphens
}

slugify('Hello, World! 2026')  // "hello-world-2026"

Gotcha: This strips accented characters like é to nothing. For proper Unicode slugification, normalize with str.normalize('NFKD') first to decompose accented characters, then strip combining marks.
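The normalize-then-strip approach can be sketched in Python with the unicodedata module. This is an assumed port of the slugify idea above, not the original function.

```python
import re
import unicodedata

def slugify(title: str) -> str:
    # Decompose accented characters: "é" becomes "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    slug = stripped.lower().strip()
    slug = re.sub(r"[^a-z0-9\s_-]", "", slug)  # remove everything else
    slug = re.sub(r"[\s_]+", "-", slug)         # spaces and underscores to hyphens
    slug = re.sub(r"-{2,}", "-", slug)          # collapse repeated hyphens
    return slug.strip("-")
```

Letters that have no decomposition (for example ø) are still dropped, so languages that rely on them need a transliteration table on top of this.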

camelCase to snake_case

// JavaScript:
function camelToSnake(str) {
  return str
    .replace(/([A-Z])/g, '_$1')         // insert _ before uppercase
    .replace(/^_/, '')                   // remove leading underscore
    .toLowerCase()
}

camelToSnake('myVariableName')  // "my_variable_name"
camelToSnake('XMLParser')       // "x_m_l_parser" ← gotcha with acronyms

// Better version that handles consecutive capitals (acronyms):
function camelToSnakeSmart(str) {
  return str
    .replace(/([A-Z]+)([A-Z][a-z])/g, '$1_$2')  // XMLParser → XML_Parser
    .replace(/([a-z])([A-Z])/g, '$1_$2')          // camelCase → camel_Case
    .toLowerCase()
}

camelToSnakeSmart('XMLParser')         // "xml_parser"
camelToSnakeSmart('myVariableName')    // "my_variable_name"

Trim Excess Whitespace

// Collapse runs of two or more whitespace characters to one space:
str.replace(/\s{2,}/g, ' ').trim()

// Remove leading/trailing spaces and tabs on each line (newlines are kept):
str.replace(/^[^\S\r\n]+|[^\S\r\n]+$/gm, '')

// Normalize all whitespace (newlines, tabs → single space):
str.replace(/\s+/g, ' ').trim()

// Remove blank lines:
str.replace(/^\s*[\r\n]/gm, '')

Gotcha: \s matches all Unicode whitespace including non-breaking spaces (\u00A0), which can be useful — or surprising depending on context.

Escape HTML Special Characters

function escapeHtml(str) {
  return str.replace(/[&<>"']/g, (char) => ({
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  }[char]))
}

escapeHtml('<script>alert("xss")</script>')
// → "&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;"

// This is the minimal XSS prevention pattern for injecting
// user content into HTML — React does this automatically

Gotcha: This only prevents HTML injection in text content. For attributes, URLs, JavaScript context, and CSS context, different escaping rules apply. Never build a full security policy on this one function alone — use a library like DOMPurify for untrusted HTML.

4. Log Parsing & Developer Patterns

Apache / Nginx Access Log Line

// Apache Combined Log Format:
/^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) (\S+)" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"$/

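# Named groups (Python):
import re
pattern = re.compile(
  r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
  r'"(?P<method>\S+) (?P<path>\S+) \S+" '
  r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)
# Sample line for illustration:
log_line = '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
match = pattern.match(log_line)
if match:
    print(match.group('ip'), match.group('status'))  # 127.0.0.1 200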

Semantic Version (semver)

// Full semver per semver.org spec:
/^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/

// Simpler version for common cases:
/^\d+\.\d+\.\d+(-[a-zA-Z0-9.]+)?(\+[a-zA-Z0-9.]+)?$/

// Matches:
// 1.2.3              ✓
// 2.0.0-beta.1       ✓
// 1.0.0+build.123    ✓
// 1.2               ✗ (missing patch)

Git Commit Hash (Short and Full)

// Full SHA-1 (40 hex chars):
/^[0-9a-f]{40}$/i

// Short hash (7-12 chars as shown by git log --abbrev-commit):
/^[0-9a-f]{7,12}$/i

// Extract commit hashes from git log output:
const gitLog = 'abc1234 Fix auth bug\n9f3e012 Add regex examples'
const hashes = gitLog.match(/^[0-9a-f]{7}/gm)  // ["abc1234", "9f3e012"]

JSON Key-Value Pair (Simple)

// Extract string key-value pairs:
/"([^"]+)":\s*"([^"]+)"/g

// JavaScript:
const json = '{"name":"Alice","role":"admin","email":"alice@example.com"}'
const pairs = [...json.matchAll(/"([^"]+)":\s*"([^"]+)"/g)]
// [["...","name","Alice"], ["...","role","admin"], ...]

// ⚠️ Use JSON.parse() for real JSON parsing.
// This pattern is only for quick extraction from known-format strings.

Gotcha: This pattern breaks on escaped quotes inside values, numbers, arrays, and nested objects. Always use JSON.parse() for JSON. Our JSON Formatter is useful for inspecting complex JSON.
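To make the failure concrete, the Python sketch below runs the same pattern and a real JSON parser over one made-up document containing an escaped quote, a number, and an array.

```python
import json
import re

raw = r'{"name": "Alice \"Al\" Smith", "age": 30, "tags": ["admin"]}'

# Regex extraction: stops at the escaped quote and skips non-string values
regex_pairs = dict(re.findall(r'"([^"]+)":\s*"([^"]+)"', raw))

# Real parser: handles escapes, numbers, and arrays
parsed = json.loads(raw)
```

The regex result has a truncated name and no age or tags keys, while json.loads() returns all three values intact.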

Environment Variable Line (.env format)

// Parse KEY=VALUE lines (with optional quotes):
/^([A-Z_][A-Z0-9_]*)=["']?([^"'\n]*)["']?$/gm

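# Python example: parse a .env file (env_content holds the file's text).
# The pattern contains both quote characters, so it needs a triple-quoted raw string:
import re
env_pattern = re.compile(r'''^([A-Z_][A-Z0-9_]*)=["']?([^"'\n]*)["']?$''', re.MULTILINE)
env_vars = dict(env_pattern.findall(env_content))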

Gotcha: This doesn't handle multiline values, comments, or shell variable expansion. Use dotenv (Node) or python-dotenv for production .env parsing.

Quick Reference: All 30 Patterns

| # | Pattern Name | Category | Use Case |
| --- | --- | --- | --- |
| 1 | Email address | Validation | Auth forms, newsletter signup |
| 2 | URL (http/https) | Validation | Link validation, web scraping |
| 3 | Phone (E.164) | Validation | International phone input |
| 4 | IPv4 address | Validation | Network config, server logs |
| 5 | IPv6 address | Validation | Modern network validation |
| 6 | Credit card | Validation | Payment form pre-validation |
| 7 | Strong password | Validation | Registration password check |
| 8 | UUID v4 | Validation | ID format validation |
| 9 | Hex color | Extraction | CSS parsing, design tools |
| 10 | Date ISO 8601 | Extraction | Log parsing, data ETL |
| 11 | Time HH:MM | Extraction | Schedule parsing |
| 12 | HTML tags | Extraction | Content scraping, stripping |
| 13 | Numbers from string | Extraction | Data extraction, ETL |
| 14 | Hashtags | Extraction | Social media processing |
| 15 | URLs in text | Extraction | Link detection in content |
| 16 | URL slug | Transformation | SEO URL validation |
| 17 | camelCase→snake_case | Transformation | Code generation, API normalization |
| 18 | Trim whitespace | Transformation | Input sanitization |
| 19 | Escape HTML | Transformation | XSS prevention |
| 20 | Markdown links | Extraction | Docs processing |
| 21 | Apache log line | Log Parsing | Server analytics |
| 22 | Semver | Dev | Package version validation |
| 23 | Git commit hash | Dev | CI/CD pipeline scripts |
| 24 | JSON key-value | Dev | Quick config extraction |
| 25 | .env variables | Dev | Config file parsing |
| 26 | Markdown code fence | Extraction | Docs tooling |
| 27 | SQL SELECT query | Extraction | Query analysis, logging |
| 28 | CSS class names | Extraction | Static analysis, tooling |
| 29 | Base64 string | Validation | Token/data detection |
| 30 | ANSI escape codes | Transformation | Terminal output stripping |

5. Additional Patterns (21–30)

Markdown Links (Extract)

// Extract [text](url) links from Markdown:
/\[([^\]]+)\]\(([^)]+)\)/g

// JavaScript:
const md = 'Check out [BytePane](https://bytepane.com) for dev tools'
const links = [...md.matchAll(/\[([^\]]+)\]\(([^)]+)\)/g)]
// links[0][1] = "BytePane", links[0][2] = "https://bytepane.com"

CSS Class Names (Extract from HTML)

// Extract class attribute values from HTML:
/class=["']([^"']+)["']/gi

// Extract individual Tailwind/CSS class names:
const classes = 'class="bg-dark-surface border border-purple rounded-xl"'
const allClasses = classes
  .match(/class=["']([^"']+)["']/i)[1]
  .split(/\s+/)
// ["bg-dark-surface", "border", "border-purple", "rounded-xl"]

Base64 String Detection

// Detect a valid Base64-encoded string:
/^[A-Za-z0-9+/]+={0,2}$/

// Detect Base64Url (JWT-safe: uses - and _ instead of + and /; padding is usually omitted):
/^[A-Za-z0-9_-]+=*$/

// Length check (padded Base64 is always a multiple of 4 characters long):
function isBase64(str) {
  return /^[A-Za-z0-9+/]+={0,2}$/.test(str) && str.length % 4 === 0
}
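In Python you can let the decoder do the checking instead of a pattern. The sketch below assumes standard (not URL-safe) Base64 with padding; is_base64 is an illustrative name.

```python
import base64
import binascii

def is_base64(text: str) -> bool:
    """True if text is non-empty, padded, and decodes as standard Base64."""
    if not text or len(text) % 4 != 0:
        return False
    try:
        # validate=True rejects characters outside the Base64 alphabet
        base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

As with the regex, this only detects the shape: an ordinary four-letter word such as "test" is also valid Base64, so treat a positive result as "could be Base64", not proof.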

ANSI Escape Codes (Strip Terminal Colors)

// Strip ANSI escape sequences from terminal output:
/\x1B\[[0-9;]*m/g

// JavaScript:
const colored = '\x1b[32mSuccess\x1b[0m: Tests passed'
const plain = colored.replace(/\x1B\[[0-9;]*m/g, '')
// "Success: Tests passed"

// Extended version covering all escape sequences:
/[\u001B\u009B][[\]()#;?]*(?:(?:(?:[a-zA-Z\d]*(?:;[-a-zA-Z\d\/#&.:=?%@~_]*)*)?\u0007)|(?:(?:\d{1,4}(?:;\d{0,4})*)?[\dA-PR-TZcf-ntqry=><~]))/g

Gotcha: Terminal output can include non-color escape sequences (cursor movement, clearing). The extended pattern above covers more cases but is more expensive. The strip-ansi npm package is the battle-tested solution for Node.js.

Performance: Regex vs String Methods

Regex is not always the right tool. For simple cases, string methods are faster, more readable, and less error-prone:

| Task | Regex | String Method | Prefer |
| --- | --- | --- | --- |
| Check prefix | /^https/.test(s) | s.startsWith('https') | String |
| Check suffix | /\.pdf$/.test(s) | s.endsWith('.pdf') | String |
| Simple contains | /error/.test(s) | s.includes('error') | String |
| Split by char | s.split(/,/) | s.split(',') | String |
| Complex validation | /^[\w.]+@[\w]+\.\w+$/ | No equivalent | Regex |
| Extract all matches | s.matchAll(/\d+/g) | No equivalent | Regex |

Frequently Asked Questions

What is the best regex for validating email addresses?

The simplified pattern /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/ catches 99%+ of real-world addresses. Full RFC 5321 compliance requires a 6,000+ character pattern that is impractical. For auth flows, always confirm with a verification email regardless of what regex passes.

How do I make a regex case-insensitive?

Add the i flag. JavaScript: /pattern/i. Python: re.compile("pattern", re.IGNORECASE). Go: start the pattern with (?i). The i flag makes every literal character and character class match upper and lowercase without listing both explicitly.

What is the difference between test() and match() in JavaScript?

test() returns a boolean — use it for validation checks. match() returns an array of matched strings (or null) — use it when you need the matched text. For multiple matches with capture groups, use matchAll(). For performance-critical code, test() is slightly faster than match().

Why does my regex match too much? How do I fix greedy quantifiers?

Add ? to make quantifiers lazy. *? and +? match as few characters as possible. Given "<a><b>", <.*> matches the entire string (greedy), while <.*?> matches only "<a>" (lazy). Anchors and character class negation ([^<]*) are often a cleaner solution than lazy quantifiers.
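The three behaviours are easy to compare side by side. A short Python sketch using the same "<a><b>" input:

```python
import re

html = "<a><b>"

greedy = re.search(r"<.*>", html).group()    # runs to the last ">"
lazy = re.search(r"<.*?>", html).group()     # stops at the first ">"
negated = re.findall(r"<[^<>]*>", html)      # each tag, no backtracking needed
```

Here greedy is "<a><b>", lazy is "<a>", and negated is ["<a>", "<b>"]. The negated class states the intent directly (anything except angle brackets), which is why it is often the cleaner fix.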

What is ReDoS and how do I prevent it?

ReDoS causes exponential backtracking through patterns like (a+)+. An adversarial input like "aaaaaaaaX" forces the engine to try every combination. Prevention: avoid nested quantifiers, use atomic groups when available, set execution timeouts in user-facing contexts, and audit patterns with the safe-regex npm package or RXXR2 tool.
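Often the simplest prevention is to flatten the nested quantifier. The Python sketch below (variable names are illustrative) checks on a few short samples that the flat pattern accepts the same strings as the nested one; the inputs are kept short on purpose so the nested pattern stays fast.

```python
import re

nested = re.compile(r"^(a+)+$")  # nested quantifier: exponential on near-misses
flat = re.compile(r"^a+$")       # matches the same strings in linear time

samples = ["a", "aaaa", "", "aaaX", "aaaaaaaaX", "b"]
agreement = [bool(nested.match(s)) == bool(flat.match(s)) for s in samples]

# The flat pattern rejects a long adversarial input immediately.
long_input_rejected = flat.match("a" * 50000 + "X") is None
```

Running the nested pattern on that 50,000-character input is exactly the ReDoS case, so do not try it outside a sandbox with a timeout.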

Can I use regex to parse HTML or JSON?

No for structured parsing. Regex cannot parse recursive or context-sensitive grammars reliably. Use DOMParser or cheerio for HTML; JSON.parse() for JSON. Regex is fine for simple extractions from controlled, known-format strings — but never as a general HTML/JSON parser.

How do I match a literal dot in regex?

Escape it: \. matches a literal dot. Without the backslash, . matches any character except newline. This is a common bug in IP address and email patterns — 192.168.1.1 as a pattern also matches 192X168Y1Z1. Use 192\.168\.1\.1 for an exact IP address match.
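When the literal text comes from a variable, Python's re.escape() adds the backslashes for you. A minimal sketch of the IP example (variable names are illustrative):

```python
import re

ip = "192.168.1.1"

unescaped = re.compile(ip)             # each "." matches any character
escaped = re.compile(re.escape(ip))    # each "." matches only a literal dot

loose_hit = unescaped.fullmatch("192X168Y1Z1") is not None
strict_hit = escaped.fullmatch("192X168Y1Z1") is not None
```

Here loose_hit is True and strict_hit is False, while both patterns still match the real address.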
