What Is Base64 Encoding
Base64 is a binary-to-text encoding scheme that represents binary data using a set of 64 printable ASCII characters. It was originally designed to allow binary data to pass safely through text-based transport systems that might corrupt or reject raw binary bytes. The encoding is defined in RFC 4648 and has become a fundamental building block of the modern web.
The name "Base64" comes from the fact that it uses a 64-character alphabet to represent data. Each character in a Base64 string encodes exactly 6 bits of data (since 2^6 = 64). By comparison, hexadecimal is a Base16 encoding that uses 16 characters to represent 4 bits each, and regular decimal is a Base10 system.
You encounter Base64 constantly in web development, even if you do not always recognize it. Data URIs in HTML and CSS use Base64 to embed images directly in markup. JSON Web Tokens (JWTs) encode their header and payload sections in URL-safe Base64. Email attachments are transmitted as Base64-encoded MIME content. API responses often include Base64-encoded binary data in JSON payloads.
To encode or decode Base64 strings right now, use our Base64 Encoder/Decoder tool, which handles both standard and URL-safe variants.
How Base64 Works
The fundamental principle of Base64 encoding is simple: take binary data, split it into groups of 6 bits, and map each 6-bit group to a character from the Base64 alphabet. Since most computer systems work with 8-bit bytes, there is a mismatch between the 8-bit input chunks and the 6-bit output chunks. Base64 resolves this by working with groups of 3 bytes (24 bits) at a time, which divides evenly into four 6-bit groups.
Here is the math that makes it work:
- 3 input bytes = 24 bits
- 24 bits / 6 bits per character = 4 Base64 characters
- So every 3 bytes of input produce 4 characters of output
- This gives a size ratio of 4:3, meaning Base64 output is approximately 33% larger than the input
When the input length is not a multiple of 3 bytes, padding is added to the output. This is indicated by one or two = characters at the end of the encoded string. We will cover padding in detail in a later section.
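The 3-bytes-to-4-characters arithmetic above can be checked with a short Python sketch. The helper name `base64_encoded_length` is made up for this illustration; only `base64.b64encode` comes from the standard library.

```python
import base64


def base64_encoded_length(num_bytes: int) -> int:
    """Length of padded Base64 output for num_bytes input bytes.

    Every started group of 3 input bytes produces 4 output characters.
    """
    groups_of_three = (num_bytes + 2) // 3  # round up to whole groups
    return 4 * groups_of_three


# Compare the formula with what the standard library actually produces.
for n in (0, 1, 2, 3, 4, 10, 300):
    assert base64_encoded_length(n) == len(base64.b64encode(bytes(n)))
```

For large inputs the ratio of output to input length approaches 4/3, which is where the 33% figure comes from.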
The Base64 Alphabet
The standard Base64 alphabet consists of 64 characters plus the padding character:
- Uppercase letters: A through Z (values 0-25)
- Lowercase letters: a through z (values 26-51)
- Digits: 0 through 9 (values 52-61)
- Two special characters: + (value 62) and / (value 63)
- Padding character: = (used to fill incomplete groups)
| Value | Char | Value | Char | Value | Char | Value | Char |
|---|---|---|---|---|---|---|---|
| 0 | A | 16 | Q | 32 | g | 48 | w |
| 1 | B | 17 | R | 33 | h | 49 | x |
| 2 | C | 18 | S | 34 | i | 50 | y |
| 3 | D | 19 | T | 35 | j | 51 | z |
| 4 | E | 20 | U | 36 | k | 52 | 0 |
| 5 | F | 21 | V | 37 | l | 53 | 1 |
| 6 | G | 22 | W | 38 | m | 54 | 2 |
| 7 | H | 23 | X | 39 | n | 55 | 3 |
| 8 | I | 24 | Y | 40 | o | 56 | 4 |
| 9 | J | 25 | Z | 41 | p | 57 | 5 |
| 10 | K | 26 | a | 42 | q | 58 | 6 |
| 11 | L | 27 | b | 43 | r | 59 | 7 |
| 12 | M | 28 | c | 44 | s | 60 | 8 |
| 13 | N | 29 | d | 45 | t | 61 | 9 |
| 14 | O | 30 | e | 46 | u | 62 | + |
| 15 | P | 31 | f | 47 | v | 63 | / |
These 64 characters were chosen specifically because they are all printable ASCII characters that are safe to transmit through virtually any text-based system. They survive character encoding conversions, email gateways, and URL handling (with the exception of + and / in URLs, which is why the URL-safe variant exists).
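As a rough illustration, the alphabet table can be rebuilt in Python from the standard `string` constants. The names `BASE64_ALPHABET`, `char_for_value`, and `value_for_char` are invented for this sketch.

```python
import string

# Values 0-25, 26-51, 52-61, then 62 and 63, in the order the table lists them.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)


def char_for_value(value: int) -> str:
    """Return the Base64 character for a 6-bit value (0-63)."""
    return BASE64_ALPHABET[value]


def value_for_char(char: str) -> int:
    """Return the 6-bit value for a Base64 character."""
    return BASE64_ALPHABET.index(char)
```

With this lookup, value 16 maps to `Q` and the character `/` maps back to 63, matching the table.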
The Encoding Process Step by Step
Let us walk through encoding the word "Cat" to Base64 to understand exactly how the process works.
Encoding "Cat" to Base64
Step 1: Convert each character to its ASCII byte value.
C = 67, a = 97, t = 116
Step 2: Convert each byte value to 8-bit binary.
C = 01000011, a = 01100001, t = 01110100
Step 3: Concatenate all bits into one continuous stream.
010000110110000101110100
Step 4: Split the bit stream into groups of 6 bits.
010000 | 110110 | 000101 | 110100
Step 5: Convert each 6-bit group to its decimal value.
16 | 54 | 5 | 52
Step 6: Look up each value in the Base64 alphabet.
Q | 2 | F | 0
Result: "Cat" encodes to Q2F0
This example works out cleanly because "Cat" is exactly 3 bytes, which divides evenly into four 6-bit groups. When the input is not a multiple of 3 bytes, padding is required.
Let us encode "Ca" (2 bytes) to see how padding works:
Encoding "Ca" to Base64 (with padding)
Step 1: ASCII values: C = 67, a = 97
Step 2: Binary: 01000011 01100001
Step 3: Concatenate: 0100001101100001
Step 4: Split into 6-bit groups: 010000 | 110110 | 0001
Step 5: The last group has only 4 bits. Pad with zeros to make 6: 010000 | 110110 | 000100
Step 6: Decimal values: 16 | 54 | 4
Step 7: Base64 characters: Q | 2 | E
Step 8: Since we had 2 input bytes (not 3), add one = padding character.
Result: "Ca" encodes to Q2E=
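The two walkthroughs above can be mirrored almost step for step in Python. This is a teaching sketch that works on a string of bits for readability, not how production encoders are written; `encode_base64` and `ALPHABET` are names chosen for the example.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_base64(data: bytes) -> str:
    """Encode bytes to standard Base64 by following the manual steps."""
    # Write every byte as 8 bits and join them into one continuous stream.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Zero-fill the final group so the stream splits evenly into 6-bit groups.
    bits += "0" * (-len(bits) % 6)
    # Split into 6-bit groups and look each value up in the alphabet.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Add = characters until the output length is a multiple of 4.
    padding = "=" * (-len(chars) % 4)
    return "".join(chars) + padding


print(encode_base64(b"Cat"))  # Q2F0
print(encode_base64(b"Ca"))   # Q2E=
```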
Padding with the Equals Sign
The = padding character at the end of a Base64 string indicates that the original data was not a perfect multiple of 3 bytes. The number of padding characters tells you how many bytes were missing from the last group:
- No padding: the input length was a multiple of 3 (e.g., 3, 6, 9 bytes).
- One =: the input had 2 bytes remaining in the last group. Two 8-bit bytes (16 bits) produce three 6-bit groups (with 2 zero-padded bits).
- Two ==: the input had 1 byte remaining in the last group. One 8-bit byte (8 bits) produces two 6-bit groups (with 4 zero-padded bits).
Padding ensures that the Base64 output length is always a multiple of 4 characters. This lets a decoder process the input in fixed 4-character blocks and makes the end of the data explicit, which matters when several Base64 strings are concatenated. The zero bits added to fill the last 6-bit group are never mistaken for data, because the decoder keeps only complete 8-bit bytes and discards the leftover bits.
Some implementations strip padding characters since the original data length can also be inferred from the encoded string length modulo 4. URL-safe Base64 commonly omits padding. Both approaches work, but standard Base64 (RFC 4648 Section 4) specifies that padding should be present.
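The padding rules reduce to two small formulas, sketched here in Python. Both function names are hypothetical; the second one shows why the original length can still be recovered when the = characters have been stripped.

```python
def padding_needed(num_bytes: int) -> int:
    """Number of = characters standard Base64 appends for this input length."""
    return (3 - num_bytes % 3) % 3


def decoded_length(encoded: str) -> int:
    """Number of bytes an encoded string decodes to, with or without padding.

    Each character carries 6 bits; only complete 8-bit bytes are kept.
    """
    unpadded = encoded.rstrip("=")
    return len(unpadded) * 6 // 8
```

For example, `Q2E=` and `Q2E` both decode to 2 bytes, and `Q2F0` decodes to 3.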
The Decoding Process
Decoding is the exact reverse of encoding:
- Take the Base64 string and look up each character in the alphabet to get its 6-bit value.
- Concatenate all the 6-bit values into a continuous bit stream.
- Split the bit stream into groups of 8 bits (bytes).
- Discard any remaining bits that were added as padding during encoding.
- Convert each byte to its corresponding character or use the raw binary data.
Decoding "Q2F0" back to text
Step 1: Look up each character: Q=16, 2=54, F=5, 0=52
Step 2: Convert to 6-bit binary: 010000 110110 000101 110100
Step 3: Concatenate: 010000110110000101110100
Step 4: Split into 8-bit bytes: 01000011 01100001 01110100
Step 5: Convert to ASCII: 67=C, 97=a, 116=t
Result: Q2F0 decodes to "Cat"
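The decoding steps can be sketched the same way in Python. As with any hand-rolled codec, treat this as an illustration only: it assumes well-formed input and does no validation, and `decode_base64` is a name chosen for the example.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_base64(encoded: str) -> bytes:
    """Decode standard Base64 by following the manual steps."""
    # Drop padding, then look up the 6-bit value of every character.
    values = [ALPHABET.index(char) for char in encoded.rstrip("=")]
    # Concatenate the 6-bit values into one bit stream.
    bits = "".join(f"{value:06b}" for value in values)
    # Keep only complete bytes; leftover bits were zero fill from encoding.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


print(decode_base64("Q2F0"))  # b'Cat'
print(decode_base64("Q2E="))  # b'Ca'
```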
Try encoding and decoding strings yourself with our Base64 Encoder/Decoder tool.
URL-Safe Base64 Variant
Standard Base64 uses + and / as two of its 64 characters. Both of these have special meaning in URLs: + represents a space in query parameters (application/x-www-form-urlencoded), and / is the path separator. Including standard Base64 in a URL requires percent-encoding these characters (%2B and %2F), which increases the string length and can cause issues with URL length limits.
RFC 4648 Section 5 defines a URL-safe variant called "Base64url" that solves this by substituting two characters:
- + is replaced with - (hyphen)
- / is replaced with _ (underscore)
These substitutions use characters that are safe in URLs without percent-encoding. The rest of the alphabet (A-Z, a-z, 0-9) remains the same.
URL-safe Base64 also commonly omits the = padding characters, since the padding can be inferred and the equals sign would need percent-encoding in URLs as well. When you need to decode URL-safe Base64 with a standard decoder, you can add the padding back by appending = characters until the string length is a multiple of 4.
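A minimal Python sketch of unpadded Base64url, using the standard library's `base64.urlsafe_b64encode` and `base64.urlsafe_b64decode`. The wrapper names `b64url_encode` and `b64url_decode` are assumptions made for this example.

```python
import base64


def b64url_encode(data: bytes) -> str:
    """Encode to URL-safe Base64 and strip the trailing = padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode URL-safe Base64, restoring padding to a multiple of 4 first."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)


# The bytes 0xFB 0xFF encode to "+/8=" in standard Base64.
print(b64url_encode(b"\xfb\xff"))  # -_8
```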
JWTs are the most prominent use of URL-safe Base64. All three sections of a JWT (header, payload, and signature) are Base64url-encoded without padding and separated by periods. To inspect JWTs and see their decoded contents, use our JWT Decoder tool.
For encoding and decoding URL components beyond Base64, our URL Encoder/Decoder handles percent-encoding and decoding for URL-unsafe characters.
Common Use Cases
Data URIs in HTML and CSS
Data URIs let you embed files directly in HTML or CSS using the data: scheme. The file content is Base64-encoded and placed inline, eliminating the need for a separate HTTP request.
<!-- Embedding a small PNG image -->
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..." alt="icon">
/* Embedding a background image in CSS */
.icon {
  background-image: url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0i...);
}
Data URIs are best for small files (under 5-10 KB). For larger files, the 33% size overhead of Base64 and the inability to cache data URIs separately from the HTML/CSS document make them less efficient than serving the file directly. However, for small icons and decorative elements, data URIs reduce the number of HTTP requests, which can improve perceived page load speed.
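If you generate markup on the server, a data URI can be assembled with a few lines of Python. This is a sketch: `to_data_uri` is a hypothetical helper and the tiny SVG is a stand-in for a real asset.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Build a data: URI with a Base64-encoded body."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder asset for the example; a real page would read a file instead.
svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
uri = to_data_uri(svg, "image/svg+xml")
print(uri[:40])
```

The resulting string can be dropped into an `src` attribute or a CSS `url(...)` value as in the snippets above.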
To convert images to Base64 data URIs, use our Image to Base64 converter.
Email Attachments (MIME)
The MIME (Multipurpose Internet Mail Extensions) standard uses Base64 to encode email attachments. Email was originally designed for 7-bit ASCII text, which cannot represent binary file data directly. Base64 encoding transforms attachments into a text representation that can be safely transmitted through email infrastructure.
When you attach a file to an email, your email client Base64-encodes the file and includes it in the message body with a Content-Transfer-Encoding: base64 header. The recipient's email client decodes the Base64 back to the original file. This process is transparent to the user, but it affects attachment size limits: when a provider caps the encoded message at 25 MB, the 33% Base64 overhead leaves room for roughly 18.75 MB of actual attachment data (25 × 3/4), and slightly less once MIME headers and line breaks are counted.
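Python's standard `email` package shows the same mechanism from the sending side. The sketch below assumes the default email policy behaviour for binary attachments; the subject, body text, and file name are placeholders.

```python
from email.message import EmailMessage


def build_message_with_attachment(data: bytes, filename: str) -> EmailMessage:
    """Create a message whose binary attachment is carried as Base64 MIME."""
    message = EmailMessage()
    message["Subject"] = "Report attached"           # placeholder subject
    message.set_content("The file is attached.")     # plain-text body
    message.add_attachment(
        data,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return message


original = bytes(range(256))
message = build_message_with_attachment(original, "data.bin")
attachment = next(message.iter_attachments())

print(attachment["Content-Transfer-Encoding"])
# get_content() reverses the transfer encoding and returns the raw bytes.
assert attachment.get_content() == original
```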
JWT (JSON Web Tokens)
JWTs consist of three parts separated by periods: header, payload, and signature. The header and payload are JSON objects encoded with Base64url (URL-safe Base64 without padding), and the signature is raw bytes encoded the same way. This encoding makes the token safe to include in HTTP headers, URL query parameters, and form fields without additional escaping.
// JWT structure:
// header.payload.signature
// Example header (Base64url decoded):
{
"alg": "HS256",
"typ": "JWT"
}
// Example payload (Base64url decoded):
{
"sub": "1234567890",
"name": "John Doe",
"iat": 1516239022
}
// The encoded token looks like:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.
eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6
IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIy
fQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
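To see that the header and payload are only encoded, the example token above can be taken apart with the Python standard library. This sketch decodes the two JSON sections and deliberately does not verify the signature; `decode_segment` is a helper name invented here.

```python
import base64
import json

# The example token from above, joined back into a single string.
TOKEN = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ."
    "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
)


def decode_segment(segment: str) -> dict:
    """Decode one Base64url JWT section into a Python dict."""
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


header_b64, payload_b64, signature_b64 = TOKEN.split(".")
header = decode_segment(header_b64)
payload = decode_segment(payload_b64)

print(header)
print(payload)
```

No key is involved at any point, which is why a JWT payload should be treated as readable by whoever holds the token.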
API Payloads with Binary Data
REST APIs typically use JSON for data exchange, but JSON is a text format that cannot natively represent binary data. When an API needs to include binary content (images, files, certificates) in a JSON response, it Base64-encodes the binary data and includes it as a string value.
{
"filename": "document.pdf",
"content_type": "application/pdf",
"data": "JVBERi0xLjQKJeLjz9MKMSAwIG9iago8PCAvVH..."
}
This approach works but increases the payload size by 33%. For large files, it is more efficient to use a separate binary endpoint or multipart form data. Base64 in JSON is most appropriate for small to medium files (thumbnails, signatures, QR codes, etc.) where the convenience of a single request outweighs the size overhead.
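A minimal Python sketch of both sides of such an exchange, reusing the field names from the JSON example above. The function names and the sample bytes are assumptions for illustration.

```python
import base64
import json


def build_payload(filename: str, content_type: str, data: bytes) -> str:
    """Serialize binary data into a JSON document as a Base64 string."""
    return json.dumps({
        "filename": filename,
        "content_type": content_type,
        "data": base64.b64encode(data).decode("ascii"),
    })


def extract_data(payload_json: str) -> bytes:
    """Recover the original bytes from the JSON document."""
    return base64.b64decode(json.loads(payload_json)["data"])


pdf_bytes = b"%PDF-1.4\n"  # stand-in for real file contents
payload_json = build_payload("document.pdf", "application/pdf", pdf_bytes)
assert extract_data(payload_json) == pdf_bytes
```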
Storing Binary Data in XML
XML, like JSON, is a text format. Binary data in XML must be encoded. Base64 is the standard encoding for binary content in XML documents, SOAP messages, and SAML assertions. The XML Schema specification includes xs:base64Binary as a built-in data type specifically for Base64-encoded binary content.
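The same pattern applies to XML. In this Python sketch the element name `attachment` is hypothetical; a real schema would declare the element with the xs:base64Binary type.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_in_xml(data: bytes) -> str:
    """Place binary data inside an XML element as Base64 text."""
    element = ET.Element("attachment")  # hypothetical element name
    element.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def unwrap_from_xml(xml_text: str) -> bytes:
    """Read the element text back and decode it to the original bytes."""
    element = ET.fromstring(xml_text)
    return base64.b64decode(element.text or "")


print(wrap_in_xml(b"Cat"))  # <attachment>Q2F0</attachment>
```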
Base64 in Programming Languages
JavaScript (Browser)
// Encode a string to Base64
const encoded = btoa('Hello, World');
// Result: "SGVsbG8sIFdvcmxk"
// Decode a Base64 string
const decoded = atob('SGVsbG8sIFdvcmxk');
// Result: "Hello, World"
// For Unicode strings (btoa only handles Latin-1):
function encodeUnicode(str) {
  return btoa(encodeURIComponent(str).replace(
    /%([0-9A-F]{2})/g,
    (_, p1) => String.fromCharCode(parseInt(p1, 16))
  ));
}
function decodeUnicode(b64) {
  return decodeURIComponent(
    Array.from(atob(b64), c =>
      '%' + c.charCodeAt(0).toString(16).padStart(2, '0')
    ).join('')
  );
}
JavaScript (Node.js)
// Encode
const encoded = Buffer.from('Hello, World').toString('base64');
// Result: "SGVsbG8sIFdvcmxk"
// Decode
const decoded = Buffer.from('SGVsbG8sIFdvcmxk', 'base64').toString();
// Result: "Hello, World"
// URL-safe Base64
const urlSafe = Buffer.from('Hello, World').toString('base64url');
// Result: "SGVsbG8sIFdvcmxk" (no special chars in this example)
Python
import base64
# Encode
encoded = base64.b64encode(b'Hello, World').decode('ascii')
# Result: "SGVsbG8sIFdvcmxk"
# Decode
decoded = base64.b64decode('SGVsbG8sIFdvcmxk').decode('utf-8')
# Result: "Hello, World"
# URL-safe variant
url_safe = base64.urlsafe_b64encode(b'Hello, World').decode('ascii')
Command Line
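# Encode (macOS/Linux); printf avoids the trailing newline that a plain echo would add
printf '%s' 'Hello, World' | base64
# Result: SGVsbG8sIFdvcmxk
# Decode
echo 'SGVsbG8sIFdvcmxk' | base64 --decode
# Result: Hello, World
# Encode a file (reading from stdin works with both GNU and macOS base64)
base64 < input.png > output.txt
# Decode a file
base64 --decode < input.txt > output.png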
Performance Considerations
Base64 encoding introduces a consistent 33% size overhead. For small data like icons or short strings, this overhead is negligible. For large files, it becomes significant. A 10 MB image becomes approximately 13.3 MB when Base64-encoded. A 100 MB file becomes roughly 133 MB.
Beyond raw size, there are several performance implications to consider:
Memory usage: a naive implementation loads the entire input into memory before encoding or decoding it, and the resulting Base64 string then takes up additional memory on top of that. For very large files, this can cause memory pressure in browser-based applications. Base64 can instead be processed in chunks: when encoding files in JavaScript, read the file in slices with the FileReader API or a ReadableStream, and keep each slice a multiple of 3 bytes so that padding appears only at the very end.
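The chunking idea is not specific to JavaScript. Here is the same approach sketched in Python, assuming a buffered binary stream that returns full chunks until end of file; the generator name and chunk size are choices made for this example.

```python
import base64
import io
from typing import BinaryIO, Iterator

CHUNK_SIZE = 3 * 1024  # must be a multiple of 3 bytes


def iter_base64_chunks(
    stream: BinaryIO, chunk_size: int = CHUNK_SIZE
) -> Iterator[bytes]:
    """Yield Base64 output piece by piece without loading the whole input."""
    if chunk_size % 3 != 0:
        raise ValueError("chunk_size must be a multiple of 3")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Full chunks are multiples of 3 bytes, so only the final,
        # shorter chunk can produce = padding.
        yield base64.b64encode(chunk)


data = bytes(range(256)) * 40
streamed = b"".join(iter_base64_chunks(io.BytesIO(data)))
assert streamed == base64.b64encode(data)
```

Because every full chunk is a whole number of 3-byte groups, the pieces concatenate into exactly the string a single-pass encoder would produce.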
CPU cost: the encoding and decoding operations themselves are computationally inexpensive. Modern hardware can Base64-encode data at several gigabytes per second. The CPU cost is rarely the bottleneck.
Caching: Base64-encoded data embedded in HTML or CSS cannot be cached independently. If you embed a 100 KB image as a data URI in your CSS file, that image data is downloaded every time the CSS file is fetched. A separately served image file can be cached by the browser and CDN, potentially saving significant bandwidth over time.
Compression: Base64 text compresses reasonably well with gzip or Brotli. Each Base64 character carries only 6 bits of information in an 8-bit byte, so a compressor can win back much of the overhead, and the size penalty after compression is typically well below the raw 33%. However, the original binary data usually compresses even better than its Base64 representation, so you still benefit from serving binary files directly when possible.
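One way to get a feel for this is to measure it. The Python sketch below uses zlib (the DEFLATE algorithm behind gzip) on random bytes, which stand in for already-compressed content such as PNG or JPEG data; results for real files will differ, and `overhead_after_compression` is a name made up for the example.

```python
import base64
import os
import zlib


def overhead_after_compression(raw: bytes) -> float:
    """Size of the compressed Base64 text relative to the raw input, minus 1."""
    encoded = base64.b64encode(raw)
    compressed = zlib.compress(encoded, 9)
    return len(compressed) / len(raw) - 1


sample = os.urandom(30_000)  # incompressible stand-in data
print(f"raw Base64 overhead:        {len(base64.b64encode(sample)) / len(sample) - 1:.1%}")
print(f"overhead after compression: {overhead_after_compression(sample):.1%}")
```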
Guideline: Use Base64 data URIs only for small images, roughly 5 KB or less (10 KB at the outside). Above that threshold, serve images as separate files to benefit from independent caching. For CSS sprites and icon fonts, consider SVG as an alternative to Base64-encoded PNGs since SVG is a text format that does not require Base64 encoding and compresses extremely well.
Base64 Is Not Encryption
This point deserves its own section because it is a common source of confusion. Base64 is an encoding, not encryption. The distinction is critical:
Encoding transforms data from one format to another for compatibility. Anyone can decode Base64 without any secret information. The algorithm is public and deterministic. There is no key, no password, and no secret. Base64 provides exactly zero security.
Encryption transforms data using a key so that only someone with the correct key can read it. AES, RSA, and ChaCha20 are encryption algorithms. They provide confidentiality.
Hashing transforms data into a fixed-size digest that cannot be reversed. SHA-256 and bcrypt are hashing algorithms. They provide integrity and password verification. For generating cryptographic hashes, use our Hash Generator tool.
Important: Never use Base64 to "hide" or "secure" sensitive data like passwords, API keys, or personal information. Base64 is trivially reversible. Any developer can decode a Base64 string in seconds using any programming language, command-line tool, or online decoder. If you need to protect data, use proper encryption.
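A few lines of Python make the point concrete. The "secret" below is a made-up placeholder; note that the decoding step needs nothing except the encoded string itself.

```python
import base64

secret = "api_key=not-actually-protected"  # placeholder value for the demo

# "Hiding" the value with Base64...
encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")

# ...is undone by anyone, with no key or password involved.
recovered = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(recovered)
assert recovered == secret
```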
A common misconception arises from JWT tokens. JWTs use Base64url encoding for their header and payload, which can make the token appear opaque. In reality, anyone can decode the header and payload of a JWT. The signature (the third section) is what provides integrity verification, not the Base64 encoding. The payload of a JWT should never contain sensitive information that the client should not be able to read.
Frequently Asked Questions
Is Base64 a form of encryption?
No. Base64 is an encoding scheme, not encryption. It transforms binary data into a text representation that can be reversed by anyone without a key. There is no secret involved. Base64 provides zero security. Anyone who has the encoded string can decode it instantly. Never use Base64 as a way to hide or protect sensitive data.
Why does Base64 increase data size by about 33%?
Base64 represents every 3 bytes (24 bits) of input as 4 characters (each encoding 6 bits). Since each output character takes 1 byte in ASCII, 3 input bytes become 4 output bytes. That is a 4/3 ratio, which equals approximately 33% size increase. Additional padding characters and line breaks can add slightly more overhead.
What is the difference between standard and URL-safe Base64?
Standard Base64 uses the characters + and / which have special meaning in URLs. URL-safe Base64 (also called Base64url, defined in RFC 4648) replaces + with - and / with _ to avoid URL encoding issues. It also typically omits the = padding characters. JWTs and many web APIs use URL-safe Base64.
When should I use Base64?
Use Base64 when you need to embed binary data in a text-only context: data URIs in HTML/CSS, JSON payloads containing binary files, email attachments via MIME, storing binary data in XML, or passing binary data through systems that only handle ASCII text. Avoid Base64 when the transport already supports binary data natively, as the 33% overhead adds unnecessary size.
How do I encode and decode Base64 in JavaScript?
In browsers, use btoa() to encode a string to Base64 and atob() to decode. For Unicode strings, first convert the string to UTF-8 bytes, for example with the encodeUnicode helper shown in the JavaScript (Browser) section; the older one-liner btoa(unescape(encodeURIComponent(str))) also works but relies on the deprecated unescape() function. In Node.js, use Buffer.from(str).toString('base64') to encode and Buffer.from(b64, 'base64').toString() to decode.