What Is Base64 Encoding? How It Works & When to Use It

Key Takeaways

• Base64 is encoding, not encryption: any Base64 string can be decoded by anyone without a key.
• It converts arbitrary binary data to printable ASCII by mapping every 3 input bytes to 4 output characters, inflating size by about 33% (a 4:3 ratio, plus any padding).
• Three standardized variants exist: standard Base64, URL-safe Base64url (used in JWTs), and MIME Base64 (used in email attachments).
• The right use cases: transmitting binary data over text-only channels (SMTP, JSON, HTTP Basic Auth), not general-purpose data compression or security.
• Every major language ships built-in Base64 support, but Unicode handling in JavaScript's btoa() requires extra care.

Let's Clear Up the Biggest Misconception First

A surprising number of developers — including experienced ones — treat Base64 as a form of light security. You'll see it in API responses where someone "hid" a password, or in localStorage where a token was "encoded for safety." This is a meaningful security mistake.

Base64 is not encryption. It requires no key. It has no secret. Any string encoded with Base64 can be decoded by anyone in under a second using any Base64 decoder on the planet — including the one built into every browser's developer console:

// "Hidden" password? Absolutely not.
atob("c3VwZXJzZWNyZXRwYXNzd29yZA==")
// => "supersecretpassword"

// Anyone can do this. There is no key.

With that settled: Base64 is a genuinely useful encoding scheme for specific, legitimate purposes. Understanding what it actually does — and what it does not do — is fundamental to working with APIs, authentication tokens, email, and binary data in web applications. To experiment with encoding and decoding, use our Base64 Encoder/Decoder tool.

What Is Base64? The Core Concept

Base64 is a binary-to-text encoding scheme that represents binary data using a set of 64 printable ASCII characters. The name comes directly from the number of characters in that alphabet: 64.

The problem Base64 solves is simple: many communication protocols and storage systems were designed to handle text — specifically printable ASCII characters (roughly the keyboard characters of a 1970s teletype machine). When you need to transmit binary data — an image, a PDF, an executable — across one of these text-only channels, you need a way to represent those arbitrary bytes using only printable characters.

The 64 characters in the standard Base64 alphabet are:

// Standard Base64 Alphabet (RFC 4648 Table 1)
// Values 0-25:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
// Values 26-51: a b c d e f g h i j k l m n o p q r s t u v w x y z
// Values 52-61: 0 1 2 3 4 5 6 7 8 9
// Value 62:     +
// Value 63:     /
// Padding:      = (used to pad output to a multiple of 4 characters)

Why 64? Because 64 is 2⁶ — each character represents exactly 6 bits of data. This clean power-of-two relationship is what makes Base64 efficient and unambiguous to encode and decode.

The earliest standardized version appeared in RFC 989, published in February 1987 by John Linn of BBN Corp, for Privacy Enhanced Mail (PEM). The current normative specification is RFC 4648 (published October 2006 by S. Josefsson), which defines standard Base64, Base64url, Base32, and Base16 encodings. RFC 4648 superseded the earlier RFC 3548.

How the Base64 Algorithm Works

The algorithm processes input in groups of 3 bytes (24 bits) at a time, splitting them into four 6-bit groups, each mapped to a Base64 character. Walk through the encoding of the string "Man" to see exactly how:

Step 1: Convert input bytes to binary

// Input: "Man" (3 bytes)
// ASCII values: M=77, a=97, n=110

// Binary representation:
M  =  77  =  0100 1101
a  =  97  =  0110 0001
n  = 110  =  0110 1110

// Concatenated 24 bits:
010011010110000101101110

Step 2: Split into four 6-bit groups

// Split 24 bits into 4 × 6-bit groups:
010011  010110  000101  101110
  19      22       5      46

// Look up in Base64 alphabet:
// 19 = T
// 22 = W
//  5 = F
// 46 = u

// Result: "TWFu"
btoa("Man") // => "TWFu"
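
The two steps above can also be written out as bit operations. The following is a minimal Python sketch, not a production encoder; the names ALPHABET and encode_triplet are made up for this illustration, and the function only handles a full 3-byte group.

```python
import base64

# Standard Base64 alphabet (RFC 4648 Table 1).
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_triplet(chunk: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters (illustrative helper)."""
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Pack the three bytes into one 24-bit integer.
    bits = (chunk[0] << 16) | (chunk[1] << 8) | chunk[2]
    # Peel off four 6-bit indices, most significant group first.
    indices = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


print(encode_triplet(b"Man"))  # TWFu
```

The shifts 18, 12, 6, and 0 select the same four 6-bit groups shown in Step 2 (19, 22, 5, 46).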

Step 3: Handle padding when input length isn't a multiple of 3

When input bytes don't divide evenly into groups of three, Base64 pads the output with = characters:

// 1 remaining byte → 2 Base64 chars + "=="
btoa("M")     // => "TQ=="

// 2 remaining bytes → 3 Base64 chars + "="
btoa("Ma")    // => "TWE="

// 3 bytes → 4 Base64 chars, no padding needed
btoa("Man")   // => "TWFu"

// For a full string:
btoa("Hello") // => "SGVsbG8="
// H=72, e=101, l=108, l=108, o=111
// Groups: "Hel" → "SGVs", "lo" → "bG8="  (1 pad char)
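
The padding rule can be sketched as a complete encoder. This is an illustrative Python version checked against the standard library; b64encode_manual is a hypothetical name, and in real code you would call base64.b64encode directly.

```python
import base64

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def b64encode_manual(data: bytes) -> str:
    """Encode bytes as standard Base64 with '=' padding (teaching sketch)."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short in the final group
        # Zero-fill a short final chunk so the bit math stays uniform.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that came only from fill bytes become "=".
        if missing:
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)


for sample in (b"M", b"Ma", b"Man", b"Hello"):
    # Each line prints the manual result next to the standard library's.
    print(sample, b64encode_manual(sample), base64.b64encode(sample).decode("ascii"))
```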

The size math

Every 3 input bytes produce exactly 4 output characters, so the padded output length is ⌈input_bytes / 3⌉ × 4. That is a size increase of about 33.3% (slightly more for short inputs, because the final group is always padded to 4 characters). For a 1 MB binary file, the Base64 output is approximately 1.33 MB. Per the MIME specification (RFC 2045), Base64-encoded content in email must also be broken into lines of at most 76 characters, each ending in CRLF. That adds about 2.6% to the encoded output, bringing the total to roughly 1.37 MB, or about 37% more than the original.
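
The size arithmetic is easy to check in code. Below is a small Python sketch; the function names are invented for this example, and the MIME figure assumes every line, including the last, is terminated with CRLF.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input."""
    return ((n_bytes + 2) // 3) * 4  # integer form of ceil(n / 3) * 4


def mime_base64_length(n_bytes: int, line_width: int = 76) -> int:
    """Encoded length plus one CRLF per line (assumes the last line gets one too)."""
    encoded = base64_length(n_bytes)
    lines = (encoded + line_width - 1) // line_width
    return encoded + 2 * lines


size = 1_000_000
print(base64_length(size) / size)       # ratio for plain Base64
print(mime_base64_length(size) / size)  # ratio with MIME line breaks
```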

Base64 Variants: Standard, Base64url, and MIME

RFC 4648 defines two Base64 alphabets, standard and URL-safe, which are identical except for two characters. MIME (RFC 2045) reuses the standard alphabet and adds line-length rules:

Variant	Char 62	Char 63	Padding	Line breaks	Used in
Standard	+	/	= (required)	No	Data URLs, binary blobs
Base64url	-	_	= (optional)	No	JWTs, URL params, filenames
MIME Base64	+	/	= (required)	CRLF every 76 chars	Email attachments (SMTP)

The Base64url variant (RFC 4648 §5) exists because standard Base64's + and / characters have special meaning in URLs: / separates path segments and + is commonly decoded as a space in query strings, so both must be percent-encoded (%2B and %2F) to survive intact. JSON Web Tokens (JWTs) use Base64url for all three of their dot-separated sections, and the JWT specifications (RFC 7515) require the trailing = padding to be omitted. You can inspect the Base64url-encoded sections of any JWT with our JWT Decoder.
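
To see the alphabet difference concretely, here is a short Python sketch. The helper names to_base64url and from_base64url are assumptions for this example; the three sample bytes were chosen because they map to Base64 values 62 and 63.

```python
import base64


def to_base64url(data: bytes) -> str:
    """Base64url without padding, as commonly used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def from_base64url(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


raw = b"\xfb\xff\xfe"  # encodes to values 62, 63, 63, 62
print(base64.b64encode(raw).decode("ascii"))  # +//+
print(to_base64url(raw))                      # -__-
print(from_base64url(to_base64url(b"Hello, World!")))  # b'Hello, World!'
```

Decoders generally need the length to be a multiple of 4, which is why the decode helper adds the padding back before calling the standard library.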

MIME Base64, specified in RFC 2045 (1996), requires encoded lines to be no longer than 76 characters. The limit reflects the line-oriented design of SMTP (the Simple Mail Transfer Protocol), which dates from an era of line-buffered communications. When your email client attaches a 2 MB PDF, it Base64-encodes the raw bytes and inserts a CRLF after every 76 characters before embedding the result in the message body.
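
The line-wrapping step can be sketched in a few lines of Python. This is an illustration of the 76-character rule only, not a full MIME implementation; mime_base64 is a hypothetical helper name.

```python
import base64


def mime_base64(data: bytes, width: int = 76) -> str:
    """Base64-encode data and end every line of at most `width` chars with CRLF."""
    encoded = base64.b64encode(data).decode("ascii")
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "".join(line + "\r\n" for line in lines)


wrapped = mime_base64(bytes(100))  # 100 zero bytes -> 136 Base64 characters
for line in wrapped.splitlines():
    print(len(line), line)

# Python's decoder skips the line breaks by default, so the round trip works:
print(base64.b64decode(wrapped) == bytes(100))  # True
```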

Real-World Use Cases for Base64

1. Email Attachments (SMTP)

SMTP was designed in 1982 for 7-bit ASCII text. Binary files — images, PDFs, executables — cannot pass through SMTP directly without corruption. Per RFC 2045 (MIME Part One), Base64 is the standard Content-Transfer-Encoding for binary email attachments. Every time you send a file via email, your email client Base64-encodes it first.
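
As a rough illustration of what a mail client does, Python's standard email package applies Base64 when you attach bytes. The addresses, filename, and payload below are placeholders invented for this sketch; nothing is sent.

```python
from email.message import EmailMessage

# Placeholder binary payload standing in for a real file's bytes.
payload = b"%PDF-1.4\n" + bytes(range(256)) * 2

msg = EmailMessage()
msg["Subject"] = "Report"
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg.set_content("See attached.")
msg.add_attachment(
    payload,
    maintype="application",
    subtype="pdf",
    filename="report.pdf",
)

attachment = next(msg.iter_attachments())
print(attachment["Content-Transfer-Encoding"])   # base64
print(attachment.get_payload().splitlines()[0])  # first encoded line

# Decoding the part gives back the original bytes.
print(attachment.get_payload(decode=True) == payload)  # True
```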

2. Data URLs in HTML and CSS

The data: URI scheme (defined in RFC 2397) lets you embed file content directly in HTML or CSS using Base64. This eliminates HTTP requests for small assets:

<!-- Embed a tiny icon directly in HTML — no HTTP request -->
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA..." alt="icon" />

/* Or in CSS: */
.icon {
  background-image: url('data:image/svg+xml;base64,PHN2ZyB4bWxu...');
}

This pattern is worth using for assets under roughly 5–10 KB. Above that, the 33% size overhead and loss of independent browser caching outweigh the benefit of saving an HTTP request.
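
If you generate data URLs in a build step, the construction is a one-liner. Here is a minimal Python sketch; make_data_url and the sample SVG are assumptions made up for this example.

```python
import base64


def make_data_url(data: bytes, mime_type: str) -> str:
    """Build a Base64 data: URL from raw bytes and a MIME type."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


svg = b'<svg xmlns="http://www.w3.org/2000/svg" width="8" height="8"/>'
url = make_data_url(svg, "image/svg+xml")
print(url)
print(len(svg), "bytes raw ->", len(url), "characters as a data URL")
```

Comparing the two lengths for your own assets is a quick way to judge whether inlining is worth it.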

3. HTTP Basic Authentication

RFC 7617 specifies HTTP Basic Authentication as a Base64-encoded username:password string in the Authorization header. This is purely a text-encoding convenience for the HTTP protocol — it provides zero security on its own. Basic Auth must always be used over HTTPS.

// Constructing a Basic Auth header:
const credentials = btoa("alice:mypassword123")
// => "YWxpY2U6bXlwYXNzd29yZDEyMw=="

fetch("https://api.example.com/resource", {
  headers: {
    "Authorization": `Basic ${credentials}`
  }
})

// The server decodes:
// atob("YWxpY2U6bXlwYXNzd29yZDEyMw==") => "alice:mypassword123"
// Split on first ":" to get username and password
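
The server side of that exchange might look like the following Python sketch. It is not taken from any particular framework; parse_basic_auth is a hypothetical helper, and it assumes credentials are UTF-8 encoded.

```python
import base64
import binascii


def parse_basic_auth(header_value: str) -> tuple[str, str]:
    """Return (username, password) from an Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic" or not token:
        raise ValueError("not a Basic Authorization header")
    try:
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError) as exc:
        raise ValueError("malformed credentials") from exc
    # Split on the first ":" only; passwords may contain colons.
    username, sep, password = decoded.partition(":")
    if not sep:
        raise ValueError("missing ':' separator")
    return username, password


print(parse_basic_auth("Basic YWxpY2U6bXlwYXNzd29yZDEyMw=="))
# ('alice', 'mypassword123')
```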

4. JSON Web Tokens (JWTs)

Every JWT consists of three Base64url-encoded sections separated by dots: header, payload, and signature. Base64url makes JWTs safe to include in URL parameters, HTTP headers, and cookie values without additional escaping. The header and payload are just encoded JSON objects — not encrypted — which is why you should never include sensitive data in a JWT payload. Our JWT Decoder shows this in action.
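
To show how little stands between a JWT and its contents, here is a Python sketch that reads the header and payload without any key. The token is built inside the example with a fake signature, and the claim values are invented; this does not verify anything and is not a substitute for a JWT library.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def peek_jwt(token: str) -> tuple[dict, dict]:
    """Decode header and payload WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


# Demo token assembled here; the third section is not a real signature.
token = ".".join([
    b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8")),
    b64url_encode(json.dumps({"sub": "user_42", "admin": False}).encode("utf-8")),
    "not-a-real-signature",
])

header, payload = peek_jwt(token)
print(header)
print(payload)
```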

5. Binary Data in JSON APIs

JSON has no binary data type. When an API needs to transmit binary content — a thumbnail, a cryptographic key, a signature — Base64 encoding the bytes and including the resulting string in a JSON field is the standard solution:

// API response with a Base64-encoded thumbnail
{
  "id": "user_42",
  "name": "Alice",
  "avatarData": "iVBORw0KGgoAAAANSUhEUgAAABAAAAAQ...",
  "avatarMime": "image/png"
}

// Decode and display in browser:
const img = document.createElement('img')
img.src = `data:${response.avatarMime};base64,${response.avatarData}`
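
On the producing side, the same pattern could look like this in Python. The field names mirror the sample response above; the byte values are placeholder data, not a real PNG.

```python
import base64
import json

thumbnail = bytes(range(256))  # placeholder bytes standing in for image data

# Serialize: bytes -> Base64 text -> JSON string field.
body = json.dumps({
    "id": "user_42",
    "name": "Alice",
    "avatarData": base64.b64encode(thumbnail).decode("ascii"),
    "avatarMime": "image/png",
})

# Deserialize: JSON string field -> Base64 text -> bytes.
received = json.loads(body)
restored = base64.b64decode(received["avatarData"], validate=True)
print(restored == thumbnail)  # True
```

Passing validate=True makes the decoder reject characters outside the alphabet instead of silently skipping them, which is usually what you want for API input.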

6. Kubernetes Secrets and Config Files

Kubernetes stores Secret values as Base64 in YAML manifests. This is explicitly not security — the Kubernetes documentation itself states that the Base64 encoding is merely for safe YAML representation of arbitrary byte strings, not for protecting the values. Actual encryption of Secret data at rest requires etcd encryption configuration.

# Kubernetes Secret — values are Base64-encoded, NOT encrypted
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
data:
  username: YWxpY2U=          # echo -n "alice" | base64
  password: bXlwYXNzd29yZA==  # echo -n "mypassword" | base64

# Anyone with kubectl get secret can decode these instantly:
# kubectl get secret my-secret -o jsonpath='{.data.username}' | base64 -d
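
The same decoding takes one line in any language. A Python sketch, using the data values from the manifest above (the dictionary stands in for the parsed YAML):

```python
import base64

# The "data" mapping from the Secret manifest, as it would look after parsing.
secret_data = {
    "username": "YWxpY2U=",
    "password": "bXlwYXNzd29yZA==",
}

decoded = {
    key: base64.b64decode(value).decode("utf-8")
    for key, value in secret_data.items()
}
print(decoded)  # {'username': 'alice', 'password': 'mypassword'}
```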

Base64 Encoding in Every Major Language

JavaScript / Browser

// Browser built-ins: btoa() and atob()
const encoded = btoa("Hello, World!")  // => "SGVsbG8sIFdvcmxkIQ=="
const decoded = atob("SGVsbG8sIFdvcmxkIQ==")  // => "Hello, World!"

// ⚠️ btoa() fails for non-Latin-1 characters (emoji, CJK, etc.)
// Fix: encode to UTF-8 bytes first, then build a binary string for btoa()
function btoaUnicode(str) {
  const bytes = new TextEncoder().encode(str)
  // Append per byte: String.fromCharCode(...bytes) can exceed the
  // engine's argument limit on large inputs
  let binStr = ""
  for (const byte of bytes) binStr += String.fromCharCode(byte)
  return btoa(binStr)
}

function atobUnicode(b64) {
  const binStr = atob(b64)
  const bytes = Uint8Array.from(binStr, c => c.charCodeAt(0))
  return new TextDecoder().decode(bytes)
}

btoaUnicode("Hello 🌍")  // Works correctly

Node.js

// Node.js: use Buffer (handles arbitrary bytes cleanly)
const encoded = Buffer.from("Hello, World!").toString("base64")
// => "SGVsbG8sIFdvcmxkIQ=="

const decoded = Buffer.from("SGVsbG8sIFdvcmxkIQ==", "base64").toString("utf8")
// => "Hello, World!"

// Base64url (for JWTs, URLs):
const urlSafe = Buffer.from("Hello, World!").toString("base64url")
// => "SGVsbG8sIFdvcmxkIQ" (no padding, - and _ instead of + and /)

// Encoding a binary file:
import { readFileSync } from 'fs'
const imageBytes = readFileSync("./photo.jpg")
const imageB64 = imageBytes.toString("base64")
const dataUrl = `data:image/jpeg;base64,${imageB64}`

Python

import base64

# Standard Base64
encoded = base64.b64encode(b"Hello, World!")
# => b"SGVsbG8sIFdvcmxkIQ=="

decoded = base64.b64decode(b"SGVsbG8sIFdvcmxkIQ==")
# => b"Hello, World!"

# Base64url (URL-safe variant; Python keeps the = padding)
url_safe = base64.urlsafe_b64encode(b"Hello, World!")
# => b"SGVsbG8sIFdvcmxkIQ=="  (identical here: this input produces no + or /)

# Stripping padding for use in JWTs:
no_pad = url_safe.rstrip(b"=")

# Working with strings (encode to bytes first):
text = "Hello 🌍"
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

Go

package main

import (
    "encoding/base64"
    "fmt"
)

func main() {
    input := []byte("Hello, World!")

    // Standard encoding
    encoded := base64.StdEncoding.EncodeToString(input)
    fmt.Println(encoded) // SGVsbG8sIFdvcmxkIQ==

    decoded, err := base64.StdEncoding.DecodeString(encoded)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(decoded)) // Hello, World!

    // URL-safe encoding (for JWTs, URLs)
    urlEncoded := base64.URLEncoding.EncodeToString(input)
    // Uses - and _ instead of + and / (neither occurs for this input)
    fmt.Println(urlEncoded) // SGVsbG8sIFdvcmxkIQ==

    // URL-safe without padding (most JWT libraries use this)
    rawEncoded := base64.RawURLEncoding.EncodeToString(input)
    fmt.Println(rawEncoded) // SGVsbG8sIFdvcmxkIQ  (no =)
}

Performance Considerations

Base64's 33% overhead is predictable and constant, but it compounds with other costs in high-traffic systems:

Network bandwidth: A 10 MB image embedded as a Base64 data URL in HTML becomes ~13.3 MB over the wire. At scale, this adds up. Serve images as separate assets whenever they exceed roughly 10 KB.
CSS parse time: Base64 data URLs embedded in CSS files are parsed synchronously on the main thread. Large embedded images can block rendering.
Memory: Base64 decoding allocates new byte arrays. In hot paths processing many requests, the GC pressure from repeated decode allocations is measurable. Reuse buffers where possible.
CPU: The encoding/decoding algorithm is computationally trivial — modern CPUs process hundreds of MB/s. CPU is rarely the bottleneck.

The dominant real-world trade-off is bandwidth vs. HTTP requests. Base64 data URLs eliminate a round-trip for each asset but pay a bandwidth penalty every time the parent document is fetched. For small assets such as SVG icons of a few KB, inlining as Base64 (or as raw SVG) is typically a net win. Above roughly 10 KB, a separate file with proper cache headers wins on repeat loads.
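
For the memory point, one common approach is to encode in fixed-size chunks rather than loading a whole file. The Python sketch below assumes a stream whose read(n) returns exactly n bytes until end of input (true for regular files and in-memory buffers, not guaranteed for sockets); encode_stream is a hypothetical helper.

```python
import base64
import io


def encode_stream(src, dst, chunk_size: int = 3 * 1024) -> None:
    """Base64-encode src into dst chunk by chunk.

    chunk_size must be a multiple of 3 so that no chunk except the last
    produces '=' padding in the middle of the output.
    """
    if chunk_size % 3 != 0:
        raise ValueError("chunk_size must be a multiple of 3")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(base64.b64encode(chunk))


data = bytes(range(256)) * 50
src, dst = io.BytesIO(data), io.BytesIO()
encode_stream(src, dst)
print(dst.getvalue() == base64.b64encode(data))  # True
```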

When Not to Use Base64

Base64 is often misapplied. Here are the patterns to avoid:

Not for security: As established, Base64 provides no confidentiality. If data is sensitive, encrypt it with AES-GCM or similar. Base64 is transparent encoding.
Not for compression: Base64 makes data larger, never smaller. If you want to reduce payload size, use gzip or Brotli compression.
Not for hashing: Base64 is reversible, so Base64-encoding a password before storing it protects nothing. Hash the password with bcrypt or Argon2id instead. The output of those functions is typically already a printable string that is safe to store as-is.
Not as a substitute for binary protocols: If you control both ends of a communication channel and can speak binary (e.g., WebSocket binary frames, Protocol Buffers, MessagePack), use that. Base64 overhead is only justified when the channel is text-only.
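
The compression point is easy to confirm. A small Python sketch with a made-up repetitive JSON payload, comparing Base64 alone, gzip alone, and gzip followed by Base64 for a text-only channel:

```python
import base64
import gzip

# Invented, highly repetitive payload so that compression has something to do.
payload = b'{"status": "ok", "items": []}' * 200

encoded = base64.b64encode(payload)
compressed = gzip.compress(payload)
compressed_then_encoded = base64.b64encode(compressed)

print("raw:          ", len(payload))
print("base64:       ", len(encoded))                  # larger than raw
print("gzip:         ", len(compressed))               # smaller than raw
print("gzip + base64:", len(compressed_then_encoded))  # gzip size plus ~33%

# Reversing the order of operations recovers the original bytes.
print(gzip.decompress(base64.b64decode(compressed_then_encoded)) == payload)
```

If a text-only channel forces Base64 on you, compressing first and encoding second keeps the overhead on the smaller compressed size.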

Frequently Asked Questions

Is Base64 a form of encryption?

No. Base64 is encoding, not encryption. It is fully reversible without any key. Anyone who receives a Base64 string can decode it instantly. Never use Base64 to protect sensitive data — use proper encryption like AES-256 instead. The confusion arises because the output looks scrambled, but it carries zero cryptographic protection.

Why does Base64 increase file size by 33%?

Base64 encodes every 3 bytes (24 bits) of input into 4 ASCII characters (4 × 6 bits = 24 bits). The output is 4/3 the size of the input, a 33.3% increase. The CRLF line breaks that MIME requires in email contexts push the total overhead to roughly 37% of the original binary size.

What is the difference between Base64 and Base64url?

Standard Base64 uses the + and / characters, which have special meaning in URLs and must be percent-encoded there. Base64url (RFC 4648 §5) replaces + with - and / with _, making it safe for URLs and filenames. JWTs use Base64url throughout.

When should I use Base64 to embed images in CSS?

Only for small assets under roughly 5–10 KB. Embedding larger images as Base64 data URLs bloats your CSS file, prevents browser caching of the image independently, and increases CSS parse time. Above 10 KB, serve the image as a separate file with appropriate cache headers.

Why does btoa() throw in JavaScript for Unicode strings?

btoa() only accepts Latin-1 characters (code points 0–255). Strings with characters above U+00FF, such as emoji or CJK characters, cause an "InvalidCharacterError". The fix is to encode the string to UTF-8 bytes with TextEncoder, convert those bytes to a binary string (one character per byte), and pass that string to btoa().

Can Base64 encode any type of data?

Yes. Base64 operates on raw bytes, so it can encode any binary data: images, PDFs, executables, encrypted ciphertext, audio files, or arbitrary byte streams. The output is always printable ASCII, regardless of what the input is.

Encode and Decode Base64 Online

Put the theory into practice. Our Base64 tool processes data client-side (nothing is sent to a server) and supports standard Base64, Base64url, and binary file input.
