How to use Base64 encoding
Why Base64 exists (binary in text channels), size overhead (~33%), URL-safe vs standard alphabet, padding, and when to use Base64 vs hex.
Base64 is the glue that lets binary data travel through text-only pipes. Email bodies, JSON payloads, URL parameters, HTML data attributes, and OAuth tokens were all designed for ASCII, which means a raw image or an encryption key can’t be dropped in verbatim—the random bytes will collide with control characters, get mangled by character-set conversions, or confuse a parser expecting newlines. Base64 rewrites bytes using a 64-character alphabet that survives every reasonable text transport, at the cost of roughly 33% size inflation. It is an encoding, not encryption—anyone can decode it. This guide covers the alphabet, how the three-byte-to-four-character math works, padding rules, the URL-safe variant, data URLs, and the decisions you’ll make around when the size tradeoff is worth it.
Why encode binary as text
Text-based protocols only reserve a safe subset of bytes for payload. SMTP was specified for 7-bit ASCII; any byte with the high bit set could be stripped or corrupted by a mail relay. HTTP headers cannot contain newlines. JSON strings cannot contain raw control characters. URL query strings have reserved characters like ?, &, and =. If you want to ship a PNG through any of these, you either escape every non-safe byte individually (ugly and verbose) or you re-encode the whole payload using an alphabet that’s universally safe. Base64 picked 64 symbols that every text-handling system accepts.
The alphabet
Standard Base64 uses A–Z, a–z, 0–9, plus + and /—sixty-four distinct symbols. The equals sign = is reserved as padding. Together that’s 65 characters, all of which pass through ASCII, UTF-8, EBCDIC mail gateways, and printer drivers without change.
    Index  Char    Index  Char    Index  Char    Index  Char
    0      A       16     Q       32     g       48     w
    1      B       17     R       33     h       49     x
    ...    ...     ...    ...     ...    ...     ...    ...
    25     Z       41     p       57     5       (62    +)
    26     a       42     q       58     6       (63    /)

    Padding: =
How the encoding math works
Base64 takes three bytes (24 bits) and splits them into four 6-bit groups, then looks up each 6-bit value in the alphabet. Three bytes in, four characters out—always. That’s where the 33% expansion comes from: 4 / 3 = 1.333.
    Input bytes:   M        a        n
    ASCII:         77       97       110
    Binary:        01001101 01100001 01101110
    Split into 6:  010011 010110 000101 101110
    Values:        19     22     5      46
    Base64:        T      W      F      u

    "Man" → "TWFu"
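The grouping step above can be sketched in a few lines of Python. This is a minimal illustration that handles exactly three bytes at a time; the names `ALPHABET` and `encode_group` are hypothetical helpers introduced here, and the result is checked against the standard library's `base64.b64encode`.

```python
import base64

# Standard Base64 alphabet: index 0 is "A", index 63 is "/".
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes as four Base64 characters (illustrative helper)."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    # Pack the three bytes into one 24-bit integer.
    bits = int.from_bytes(three_bytes, "big")
    # Peel off four 6-bit values, most significant group first.
    indices = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)

print(encode_group(b"Man"))  # TWFu

# Cross-check against the standard library.
assert encode_group(b"Man") == base64.b64encode(b"Man").decode("ascii")
```

A real encoder also has to handle inputs whose length is not a multiple of three, which is what padding is for.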
Padding
When the input length isn’t a multiple of three, you pad the output with = so the length is a multiple of four. One leftover byte produces two characters plus ==; two leftover bytes produce three characters plus =. Padding keeps every encoded group exactly four characters long, so a decoder can work in fixed-size blocks and can tell where one value ends when several Base64 strings are concatenated.

    "M"   → "TQ=="  (1 byte → 2 chars + ==)
    "Ma"  → "TWE="  (2 bytes → 3 chars + =)
    "Man" → "TWFu"  (3 bytes → 4 chars, no pad)
Some parsers tolerate missing padding; others reject it. If you’re producing Base64 for a strict decoder (RFC 4648), always include the padding. If you’re producing it for a URL query string where = has special meaning, you may strip padding and add it back during decode.
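As a rough sketch of the strip-and-restore approach, the Python below re-adds padding before decoding. `add_padding` is a hypothetical helper name; `base64.b64encode` and `base64.b64decode` are the standard-library calls.

```python
import base64

def add_padding(text: str) -> str:
    """Append '=' until the length is a multiple of four (illustrative helper)."""
    return text + "=" * (-len(text) % 4)

# The three padding cases: 1, 2, and 3 input bytes.
for raw in (b"M", b"Ma", b"Man"):
    print(raw, base64.b64encode(raw).decode("ascii"))

# Simulate a sender that strips padding, then restore it before decoding.
stripped = base64.b64encode(b"M").decode("ascii").rstrip("=")
assert stripped == "TQ"
assert base64.b64decode(add_padding(stripped)) == b"M"
```

An unpadded length that leaves a remainder of one when divided by four can never be valid Base64, so the decoder will still reject that case after padding is added.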
URL-safe Base64
The + and / characters collide with URL syntax: + means “space” in form-encoded data, and / is a path separator. RFC 4648 defines a URL-safe alternative that swaps them for - and _. Padding = is usually dropped because it’s a reserved URL character too. JWTs use this variant, without padding.
    Standard:  SGVsbG8gd29ybGQ+
    URL-safe:  SGVsbG8gd29ybGQ-

    Standard uses: + / =
    URL-safe uses: - _ (no padding)
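In Python, the two alphabets map to `base64.b64encode` and `base64.urlsafe_b64encode`. The sketch below reproduces the strings above, assuming the input is the 12-byte text "Hello world>"; `decode_url_safe` is a hypothetical helper that tolerates stripped padding.

```python
import base64

data = b"Hello world>"

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")
print(standard)  # SGVsbG8gd29ybGQ+
print(url_safe)  # SGVsbG8gd29ybGQ-

def decode_url_safe(text: str) -> bytes:
    """Decode URL-safe Base64 whether or not padding was stripped (illustrative)."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

# Round trip, including the common case where the sender dropped the padding.
assert decode_url_safe(url_safe) == data
assert decode_url_safe("TQ") == b"M"
```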
Data URLs
A data URL embeds an entire file inside a URL, with the payload typically Base64-encoded. The format is data:[mime];base64,[payload]. This is why you sometimes see a 40KB blob of garbage in an HTML src attribute: it’s an inline image, which saves an HTTP round trip but makes the page larger and prevents the browser from caching the image separately. Use data URLs for small icons and critical-path images; anything over ~10KB is usually better served as a real file.

    <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..." />
    data:text/plain;base64,SGVsbG8gd29ybGQh → "Hello world!"
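Building and parsing a Base64 data URL takes only a little string handling. The following is a simplified sketch: `to_data_url` and `from_data_url` are hypothetical helpers, and the parser assumes the plain data:[mime];base64,[payload] shape with no extra parameters such as a charset.

```python
import base64

def to_data_url(payload: bytes, mime: str) -> str:
    """Wrap raw bytes in a Base64 data URL (illustrative helper)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"

def from_data_url(url: str) -> tuple[str, bytes]:
    """Split a simple Base64 data URL into (mime, bytes)."""
    header, _, encoded = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a Base64 data URL")
    mime = header[len("data:"):-len(";base64")]
    return mime, base64.b64decode(encoded)

url = to_data_url(b"Hello world!", "text/plain")
print(url)  # data:text/plain;base64,SGVsbG8gd29ybGQh
print(from_data_url(url))
```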
The 33% tax is real
Base64 inflates payloads by a factor of about 4/3, slightly more once padding or MIME line breaks are counted. A 1MB image becomes a 1.33MB string. Over the wire this is partially offset by gzip, which recovers much of the overhead, but the CPU cost of encoding, decoding, and decompressing adds up. For inline thumbnails and fonts, it’s fine. For user-uploaded photos in a mobile app, it’s wasteful compared to a multipart upload.
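To sanity-check the overhead on a payload of your own, something like the following works. It is a sketch using random bytes as a stand-in for an already-compressed image; `encoded_length` is a hypothetical helper, and the gzip figure depends on the data, so no specific output is claimed for it.

```python
import base64
import gzip
import os

def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input: 4 * ceil(n / 3)."""
    return 4 * ((n_bytes + 2) // 3)

# Random bytes stand in for an image or other incompressible binary.
payload = os.urandom(300_000)
encoded = base64.b64encode(payload)

assert len(encoded) == encoded_length(len(payload))

print("raw bytes:     ", len(payload))
print("Base64 bytes:  ", len(encoded))
print("inflation:     ", len(encoded) / len(payload))
print("gzipped Base64:", len(gzip.compress(encoded)))
```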
Base64 is not encryption
Anyone can decode Base64. It hides nothing—it is an encoding transformation, same category as hex. Storing a password “encrypted” in Base64 is equivalent to storing it in plain text. Use real cryptography (AES, libsodium, bcrypt for passwords). Base64 is only for moving bytes through text channels.
Common uses in practice
Basic HTTP auth sends Authorization: Basic base64(user:pass). JWTs encode header, payload, and signature with URL-safe Base64. Email attachments are Base64 in MIME bodies. Cryptographic key exports such as PEM certificates and private keys wrap raw bytes in Base64 between BEGIN/END markers, and OpenSSH public keys carry a Base64 blob on a single line. Webhook signatures and HMAC results are often Base64-encoded. API responses sometimes Base64-encode binary blobs like PDFs or images to keep the response as a single JSON document.
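The Basic auth case makes a compact example, and it also shows why Base64 hides nothing: one call recovers the credentials. This sketch uses the sample user "Aladdin" and password "open sesame" from RFC 7617; `basic_auth_header` is a hypothetical helper name.

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header (illustrative)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

header = basic_auth_header("Aladdin", "open sesame")
print(header)  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

# Anyone who sees the header can reverse it without a key.
recovered = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
print(recovered)  # Aladdin:open sesame
```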
Common mistakes
Double-encoding. Encoding already-encoded Base64 produces a legal but longer string that decodes back to Base64, not your bytes. If your decoded output looks like more Base64, you ran the encoder twice.
Missing padding. A stripped = in transit will cause strict decoders to throw “invalid length” errors. Either use a lenient decoder or re-add = so the length is a multiple of four.
Mixing up URL-safe and standard alphabets. Feeding a URL-safe string into a standard decoder fails on - and _. Know which variant you produced and decode accordingly, or normalize before decoding.
Treating Base64 as encryption. It isn’t. If the goal is secrecy, you need real cryptography. Base64 only provides transport safety.
Using Base64 for large binary payloads when multipart would work. The 33% tax compounds with serialization and logging costs. If the transport supports binary (gRPC, raw HTTP bodies, multipart form uploads), skip Base64.
Assuming the output is line-wrapped. RFC 2045 (MIME) wraps at 76 characters; RFC 4648 does not. If you’re comparing Base64 strings for equality, strip whitespace first.
Encoding text without agreeing on the character set. Base64 operates on bytes, not characters. Encoding “café” as Latin-1 gives a different result than UTF-8. Always agree on the byte encoding before Base64’ing.
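Several of these mistakes are easy to reproduce and guard against in Python. The sketch below is illustrative only: `normalize` and `strict_decode` are hypothetical helpers, and the normalization rules (drop whitespace, map - and _ back to + and /, restore padding) are one reasonable choice rather than a standard.

```python
import base64

# Character set matters: the same text yields different bytes, so different Base64.
utf8 = base64.b64encode("café".encode("utf-8")).decode("ascii")
latin1 = base64.b64encode("café".encode("latin-1")).decode("ascii")
print(utf8)    # Y2Fmw6k=
print(latin1)  # Y2Fm6Q==

# Double-encoding: decoding once gives back Base64 text, not the original bytes.
once = base64.b64encode(b"Man")
twice = base64.b64encode(once)
assert base64.b64decode(twice) == once == b"TWFu"

def normalize(text: str) -> str:
    """Convert line-wrapped or URL-safe Base64 to padded standard form."""
    compact = "".join(text.split())  # remove spaces and line breaks
    compact = compact.replace("-", "+").replace("_", "/")
    return compact + "=" * (-len(compact) % 4)

def strict_decode(text: str) -> bytes:
    """Normalize, then decode while rejecting any non-alphabet characters."""
    return base64.b64decode(normalize(text), validate=True)

print(strict_decode("SGVsbG8g\nd29ybGQ-"))  # b'Hello world>'
```

Normalizing before comparing or decoding also covers the line-wrapping case, since MIME-style output and unwrapped output reduce to the same string.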
Run the numbers
Encode and decode instantly, with optional URL-safe mode, in our Base64 encoder and decoder. Pair it with the data size converter when you’re sanity-checking the size inflation on an image or payload, and the regex tester to validate the shape of a Base64 string before handing it to a decoder.