How to URL encode correctly
Reserved vs unreserved characters, encodeURI vs encodeURIComponent, URLSearchParams, path vs query encoding, double-encoding traps, UTF-8 handling.
URL encoding — percent-encoding — is one of those fundamentals that quietly breaks things when it’s wrong. A space that should be %20 ends up as +; an ampersand in a query value splits the URL; a Unicode character comes through as mojibake. This guide covers what the encoding actually does, which characters need it, the critical difference between encodeURI and encodeURIComponent, path vs query encoding, double-encoding traps, form encoding, and the Unicode handling that catches people who only tested with ASCII.
What URL encoding is
URLs are limited to a specific set of ASCII characters by RFC 3986. Anything outside that set — spaces, most punctuation, non-ASCII — must be represented as %XX where XX is the byte’s hex value.
Example: space (0x20) → %20. At sign (0x40) → %40. Non-ASCII characters are encoded as their UTF-8 byte sequence, one %XX per byte: é (U+00E9) → %C3%A9 (two bytes).
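The byte-by-byte rule can be sketched by hand. This is a minimal illustration, not a replacement for the built-in encoders: percentEncode is a hypothetical helper, and it assumes that only letters, digits, and - _ . ~ pass through unencoded.

```javascript
// Hypothetical helper for illustration: convert to UTF-8 bytes, then emit %XX per byte.
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;

function percentEncode(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      out += ch; // assumed-safe ASCII passes through untouched
    } else {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode(" "));      // %20
console.log(percentEncode("@"));      // %40
console.log(percentEncode("\u00E9")); // %C3%A9 (é is two UTF-8 bytes)
```

In real code, prefer encodeURIComponent or URLSearchParams; the sketch only makes the mechanism visible.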
Reserved vs unreserved characters
RFC 3986 divides URL characters into categories:
Unreserved: A-Z, a-z, 0-9, and - _ . ~. Never need encoding.
Reserved (gen-delims): : / ? # [ ] @. Have structural meaning in URLs. Encode if they’re part of data, not structure.
Reserved (sub-delims): ! $ & ' ( ) * + , ; =. Encode when used as data within query values — they have meaning in forms and query syntax.
Everything else: spaces, Unicode, unusual punctuation — always encode.
encodeURI vs encodeURIComponent — the JavaScript fork
JavaScript’s two URL encoders have different purposes and mixing them up is the #1 URL bug.
encodeURI: encodes a full URL. Preserves reserved structural characters (: / ? # & =). Use when you have a complete URL that needs to be made safe for transmission.
encodeURI("https://x.com/a b?q=1&r=2") → https://x.com/a%20b?q=1&r=2. Notice the space got encoded but ? and & did not.
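encodeURIComponent: encodes a value that will be placed inside a URL component (one segment of a path, one value in a query). Encodes the reserved characters that carry structural meaning, including / ? # & = + and @; beyond the unreserved set it leaves only ! ' ( ) * as-is. Use when building URLs from parts.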
encodeURIComponent("a&b=c") → a%26b%3Dc. Encodes both & and = so they can safely appear as a value.
Rule: use encodeURIComponent on each part when assembling, not encodeURI on the whole.
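A short sketch of why the rule matters. The search URL and the userInput value here are made up for illustration.

```javascript
// Hypothetical user input that happens to contain query-structure characters.
const userInput = "rock&roll=fun";

// encodeURI on the whole URL leaves & and = alone, so the value leaks into the structure.
const wrong = encodeURI("https://example.com/search?q=" + userInput);

// encodeURIComponent on just the value escapes them.
const right = "https://example.com/search?q=" + encodeURIComponent(userInput);

console.log(wrong); // https://example.com/search?q=rock&roll=fun
console.log(right); // https://example.com/search?q=rock%26roll%3Dfun

// Parsing shows the damage: the wrong URL now carries two parameters instead of one.
console.log([...new URL(wrong).searchParams.keys()]); // [ 'q', 'roll' ]
console.log(new URL(right).searchParams.get("q"));    // rock&roll=fun
```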
The URLSearchParams approach — safer, modern
Skip manual encoding entirely for query strings:
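// example.com is a placeholder base URL
const url = new URL("https://example.com/search");
const params = new URLSearchParams();
params.append("q", "hello world");
params.append("tag", "a&b");
url.search = params.toString();
Produces q=hello+world&tag=a%26b with correct encoding and no hand-rolled escaping to get wrong. Use this pattern whenever possible.
Note: URLSearchParams uses + for spaces (legacy form encoding), while encodeURIComponent uses %20. Query-string parsers that follow form-decoding rules read both as a space, but + stays a literal plus in paths and in decodeURIComponent, so pick one style per URL rather than mixing them.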
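The + versus %20 difference is easy to check directly. A small sketch with made-up values:

```javascript
const value = "hello world";

const viaParams = new URLSearchParams({ q: value }).toString();
const viaComponent = "q=" + encodeURIComponent(value);

console.log(viaParams);    // q=hello+world
console.log(viaComponent); // q=hello%20world

// A form-style query parser reads both spellings as a space.
console.log(new URLSearchParams(viaParams).get("q"));    // hello world
console.log(new URLSearchParams(viaComponent).get("q")); // hello world

// decodeURIComponent does not treat + as a space.
console.log(decodeURIComponent("hello+world")); // hello+world

// A literal plus sign therefore has to travel as %2B in a query value.
console.log(new URLSearchParams({ q: "1+1" }).toString()); // q=1%2B1
```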
Path encoding vs query encoding
The rules differ subtly between the parts of a URL.
Path segments: encode / if it appears in data (otherwise it creates a new path segment). Space as %20.
Query strings: + means space in traditional form encoding (application/x-www-form-urlencoded). In modern URL parsers, %20 works too and is unambiguous. & and = have structural meaning and must be encoded in values.
Fragments (#): similar rules to query; encode characters that break parsing.
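For path data, one way to keep a stray / from creating extra segments is to encode each segment on its own. buildPath is a hypothetical helper and the segment values are invented for the example.

```javascript
// Hypothetical helper: encode every segment separately, then join with "/".
function buildPath(segments) {
  return "/" + segments.map((s) => encodeURIComponent(s)).join("/");
}

console.log(buildPath(["files", "reports/2024", "q1 summary.pdf"]));
// /files/reports%2F2024/q1%20summary.pdf

// The encoded slash survives a trip through the URL object as data, not structure.
const url = new URL("https://example.com");
url.pathname = buildPath(["files", "a/b"]);
console.log(url.href); // https://example.com/files/a%2Fb

// Splitting on "/" and decoding each piece recovers the original two segments.
const decoded = url.pathname.split("/").slice(1).map(decodeURIComponent);
console.log(decoded); // [ 'files', 'a/b' ]
```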
Double-encoding — the classic trap
Encoding an already-encoded URL turns %20 into %2520 (because % becomes %25). The result looks URL-valid but the destination server gets the literal string “%20” instead of a space.
Symptoms: product pages showing “Item %26 Part” in the title, search queries returning no results for simple terms, 404s on URLs with special characters.
Fix: track whether a string is encoded or decoded as it flows through your code. Don’t encode on the way in and again on the way out. Libraries that build URLs should always take decoded strings and encode once at the edge.
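The trap is easy to reproduce. In this sketch the product title is made up; the point is that one decode no longer undoes two encodes.

```javascript
const raw = "Item & Part";

const once = encodeURIComponent(raw);   // encoded once, as intended
const twice = encodeURIComponent(once); // encoded again by mistake

console.log(once);  // Item%20%26%20Part
console.log(twice); // Item%2520%2526%2520Part

// The receiver decodes once and is left with literal percent sequences.
console.log(decodeURIComponent(twice)); // Item%20%26%20Part
console.log(decodeURIComponent(once));  // Item & Part
```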
UTF-8 and non-ASCII characters
Modern URL encoding is defined on UTF-8 bytes, not Unicode code points directly. A character like é is first encoded as UTF-8 (two bytes: 0xC3 0xA9), then each byte becomes %XX (%C3%A9).
encodeURIComponent("café") → caf%C3%A9.
IDN domains (example: münchen.de): the domain portion uses Punycode (xn--mnchen-3ya.de), not percent-encoding. Percent-encoding is only for the path/query/fragment. Browsers handle the conversion; servers usually see Punycode.
Old systems: some servers expect Windows-1252 or ISO-8859-1 encoding of non-ASCII (so é would be %E9, not %C3%A9). Almost always wrong in 2026 but you’ll still meet it when integrating with legacy systems. Always check what the receiver expects.
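If a legacy receiver really does expect single-byte encoding, the built-in encoders will not produce it. The sketch below is a hypothetical helper that assumes ISO-8859-1 specifically (Windows-1252 differs in the 0x80 to 0x9F range) and refuses anything that does not fit in one byte.

```javascript
// Hypothetical helper for a legacy endpoint that expects ISO-8859-1 bytes.
function encodeLatin1Component(text) {
  let out = "";
  for (const ch of text) {
    const code = ch.codePointAt(0);
    if (code > 0xff) {
      throw new RangeError(`Cannot represent ${ch} in ISO-8859-1`);
    }
    if (/[A-Za-z0-9\-_.~]/.test(ch)) {
      out += ch; // unreserved characters pass through
    } else {
      // In ISO-8859-1 the byte value equals the code point for U+0000..U+00FF.
      out += "%" + code.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(encodeURIComponent("caf\u00E9"));    // caf%C3%A9 (UTF-8)
console.log(encodeLatin1Component("caf\u00E9")); // caf%E9 (ISO-8859-1)

// The mismatch is visible on decode: %E9 alone is not valid UTF-8.
try {
  decodeURIComponent("caf%E9");
} catch (err) {
  console.log(err.name); // URIError
}
```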
Form encoding — related but distinct
HTML forms submitted with application/x-www-form-urlencoded use a variant of URL encoding:
Spaces become + (not %20).
Line breaks in textareas become %0D%0A (CRLF).
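Everything else is percent-encoded, with small differences from encodeURIComponent in which punctuation is left alone: the form serializer behind URLSearchParams, for example, encodes ~ but leaves * as-is.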
multipart/form-data is different — used for file uploads, it wraps values in boundary-separated parts and doesn’t need URL encoding at all.
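URLSearchParams serializes with the same form-encoding rules, so it can stand in for a form body in a quick check. The field names and values below are invented, and the CRLF is written explicitly because URLSearchParams does not normalize line breaks the way a browser form submission does.

```javascript
// Hypothetical form fields; a browser would send this body with
// Content-Type: application/x-www-form-urlencoded.
const form = new URLSearchParams({
  name: "Ada Lovelace",
  comment: "line one\r\nline two",
});

const body = form.toString();
console.log(body); // name=Ada+Lovelace&comment=line+one%0D%0Aline+two

// Form encoding and encodeURIComponent disagree on a few punctuation characters.
console.log(new URLSearchParams({ v: "~*" }).toString()); // v=%7E*
console.log(encodeURIComponent("~*"));                    // ~*
```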
Server-side decoding
Most web frameworks decode automatically: req.query.q in Express or @QueryParam in Java (JAX-RS) gives you the decoded string. You rarely call decoder functions yourself.
Watch: if you need to re-emit a value you received (in a redirect, or when building a URL from stored data), encode it again at the point of output. Prefer storing the decoded value over an encoded string: when downstream code assumes decoded input and encodes what it is given, an already-encoded string ends up double-encoded.
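A sketch of the decode-once, encode-once flow using the standard URL object in place of a framework. The host, paths, and parameter names are made up.

```javascript
// Hypothetical incoming request URL.
const incoming = new URL(
  "https://example.com/login?next=%2Fdocs%2Fa%20b%3Fx%3D1%26y%3D2"
);

// The parser hands back the decoded value, much as a framework would.
const next = incoming.searchParams.get("next");
console.log(next); // /docs/a b?x=1&y=2

// Encode exactly once when building the outgoing URL.
const redirect = new URL("https://example.com/welcome");
redirect.searchParams.set("return_to", next);
console.log(redirect.toString());
// https://example.com/welcome?return_to=%2Fdocs%2Fa+b%3Fx%3D1%26y%3D2
```

Passing the still-encoded string to searchParams.set instead would turn every % into %25, which is the double-encoding failure in practice.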
Common mistakes
Using encodeURI when you needed encodeURIComponent. The most common bug: putting user input in a query value without encoding the & or =.
Encoding too early. Encoding a value when inserting it into storage, then again when emitting it in a URL, gives you double-encoding.
Concatenating URL pieces as strings. Code like const url = base + "?q=" + query; leaves query unencoded. Use URL/URLSearchParams objects, which encode properly.
Assuming UTF-8 everywhere. Very old endpoints may demand a different byte encoding. Read the API spec.
Encoding slashes in paths without meaning to. If you have a/b/c and call encodeURIComponent, you get a%2Fb%2Fc, which is one path segment, not three. Split first, encode per segment, rejoin with /.
Run the numbers
Encode and decode URLs instantly with the URL encoder/decoder. Pair with the Base64 encoder/decoder when you need to embed binary payloads inside URL-safe text, and the slug generator when turning titles into URL path segments.