How to parse query strings
Percent encoding, repeated keys, array syntax, +-vs-%20 for spaces, URLSearchParams API, and decoding oddities across frameworks.
Query strings are the piece of a URL after the ? — a &-separated list of key-value pairs that carries search terms, filters, pagination, campaign tags, and session hints. The syntax looks like a one-week project someone never finished: no standard for repeated keys, no official way to express arrays or nested objects, and three subtly different encoding rules between URLs, HTML forms, and application/x-www-form-urlencoded. This guide covers the modern URLSearchParams API, the encoding rules that actually apply, how repeated keys and bracket notation handle arrays, the PHP-style nested-key hack, URL length limits across servers and CDNs, and how Unicode and emoji flow through the whole pipeline.
Anatomy of a query string
In the URL https://example.com/search?q=hello&page=2, the query string is q=hello&page=2 — everything after ? and before #. Pairs are joined with &; keys and values are separated by =.
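To see those boundaries in practice, here is a small sketch using the standard URL class. The #results fragment is an assumption added for illustration; it is not part of the example URL above.

```javascript
// Split a URL into its query string and fragment with the WHATWG URL class.
// The '#results' fragment is a made-up addition to show where the query ends.
const url = new URL('https://example.com/search?q=hello&page=2#results');

const rawQuery = url.search;            // '?q=hello&page=2' (keeps the leading ?)
const queryString = rawQuery.slice(1);  // 'q=hello&page=2'
const fragment = url.hash;              // '#results' (never part of the query)

console.log(queryString);
console.log(fragment);
```

The fragment is handled entirely by the client, so it is never sent to the server along with the query string.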
The spec that defines how URLs work is RFC 3986. How query strings are used for data — repeated keys, brackets, booleans — is convention, not standardization.
URLSearchParams — the modern API
All evergreen browsers and Node 10+ have URLSearchParams for parsing and building.
const params = new URLSearchParams('q=hello&page=2&tag=red&tag=blue');
params.get('q'); // 'hello'
params.get('tag'); // 'red' (first match)
params.getAll('tag'); // ['red', 'blue']
params.has('page'); // true
params.keys(); // iterator
[...params.entries()]; // [['q','hello'],['page','2'],['tag','red'],['tag','blue']]
The constructor accepts a string, an object, or an array of pairs. It automatically URL-decodes values, so ?q=hello%20world gives you the string “hello world”.
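As a quick illustration of the three constructor forms, the sketch below builds the same parameters each way. The variable names are arbitrary.

```javascript
// Three ways to construct the same parameters.
const fromString = new URLSearchParams('q=hello%20world&page=2');
const fromObject = new URLSearchParams({ q: 'hello world', page: '2' });
const fromPairs = new URLSearchParams([['q', 'hello world'], ['page', '2']]);

console.log(fromString.get('q'));   // 'hello world' (decoded for you)
console.log(fromObject.toString()); // 'q=hello+world&page=2'
console.log(fromPairs.toString() === fromObject.toString()); // true

// A leading '?' is ignored, so location.search can be passed in directly.
const withPrefix = new URLSearchParams('?q=hello%20world');
console.log(withPrefix.get('q'));   // 'hello world'

// Missing keys come back as null, not undefined.
console.log(fromString.get('missing')); // null
```

The object form cannot express a repeated key, so reach for the array-of-pairs form when the same key appears more than once.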
Encoding rules in query strings
Three character classes need encoding in query values:
Reserved sub-delims: ! $ & ' ( ) * + , ; =. Most importantly, & and = are the structural separators — a literal & in a value must be %26, a literal = must be %3D.
Space: encoded as + in the application/x-www-form-urlencoded variant (used by HTML form submissions and URLSearchParams output), or as %20 in strict RFC 3986 encoding. Form-style parsers such as URLSearchParams decode both to a space, but decodeURIComponent leaves a + untouched, so mixing the two in one URL invites bugs as well as looking sloppy.
Non-ASCII: encoded as UTF-8 bytes, each byte as %XX. Emoji “😀” (U+1F600) is four UTF-8 bytes (F0 9F 98 80) and encodes as %F0%9F%98%80.
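The three rules above can be checked directly with the built-in encoders. This is a small sketch; the sample values are made up for illustration.

```javascript
// Structural characters inside a value must be escaped.
console.log(encodeURIComponent('a&b=c'));       // 'a%26b%3Dc'

// Space: %20 from encodeURIComponent, + from URLSearchParams.
console.log(encodeURIComponent('hello world')); // 'hello%20world'
const params = new URLSearchParams({ q: 'hello world', note: 'a&b=c' });
console.log(params.toString());                 // 'q=hello+world&note=a%26b%3Dc'

// Non-ASCII: each UTF-8 byte becomes %XX.
console.log(encodeURIComponent('😀'));          // '%F0%9F%98%80'

// Decoding differs: URLSearchParams reads + as a space, decodeURIComponent does not.
console.log(new URLSearchParams('q=a+b').get('q'));   // 'a b'
console.log(decodeURIComponent('a+b'));               // 'a+b'
console.log(new URLSearchParams('q=a%2Bb').get('q')); // 'a+b' (a literal plus)
```

A literal plus sign in a value therefore has to travel as %2B, or a form-style parser will hand you a space.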
Repeated keys — the array convention
The spec says nothing about repeating a key. In practice, three conventions are common:
Plain repetition: ?tag=red&tag=blue&tag=green. Read with getAll('tag'). This is the approach URLSearchParams expects and most modern servers handle natively.
Bracket notation (PHP, Rails): ?tag[]=red&tag[]=blue. Bracket characters themselves need encoding: tag%5B%5D=red. PHP parses this into the array $_GET['tag']. Not understood by URLSearchParams — it returns the raw key “tag[]”.
Comma-separated: ?tag=red,blue,green. Simplest for logs, needs manual splitting, breaks if a value contains a comma.
Pick one convention per API and document it. Mixing tag and tag[] is the kind of bug that gets found during a midnight deploy.
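Here is how each of the three conventions looks from the URLSearchParams side. It is a sketch of the parsing behavior only, not a recommendation to accept all three in one API.

```javascript
// 1. Plain repetition: getAll() returns every value.
const repeated = new URLSearchParams('tag=red&tag=blue&tag=green');
console.log(repeated.getAll('tag')); // ['red', 'blue', 'green']

// 2. Bracket notation: the brackets are just part of the key name.
const bracket = new URLSearchParams('tag%5B%5D=red&tag%5B%5D=blue');
console.log(bracket.get('tag'));      // null
console.log(bracket.getAll('tag[]')); // ['red', 'blue']

// 3. Comma-separated: one value that you split yourself.
const comma = new URLSearchParams('tag=red,blue,green');
const tags = (comma.get('tag') ?? '').split(',').filter(Boolean);
console.log(tags);                    // ['red', 'blue', 'green']
```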
Nested keys — PHP-style
Bracket notation also expresses nested objects:
?user[name]=jay&user[role]=admin&user[prefs][theme]=dark
// Decodes (in PHP / qs library) to:
// { user: { name: 'jay', role: 'admin', prefs: { theme: 'dark' } } }
The qs npm library is the most common implementation outside PHP. It supports nesting, arrays, and various array-format options. Express 4 uses it for req.query by default.
URLSearchParams does not understand nesting at all; it sees the whole bracket-y string as a flat key. If you are on a modern API, strongly consider pushing complex structures to the request body as JSON instead of serializing them into the query string.
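If you only need to read a few bracketed keys and cannot pull in a library, the expansion can be sketched on top of URLSearchParams. parseNested is a hypothetical helper, not part of any library; it assumes object keys only (no [] array segments) and lets the last value win on duplicates.

```javascript
// Hypothetical helper: expand PHP-style bracket keys into nested objects.
// Assumptions: no `[]` array segments, and later duplicates overwrite earlier ones.
function parseNested(queryString) {
  // Prototype-less objects keep keys like '__proto__' from touching Object.prototype.
  const result = Object.create(null);
  for (const [rawKey, value] of new URLSearchParams(queryString)) {
    // 'user[prefs][theme]' -> ['user', 'prefs', 'theme']
    const path = rawKey.split(/\[|\]/).filter((part) => part !== '');
    if (path.length === 0) continue;
    let node = result;
    for (const segment of path.slice(0, -1)) {
      if (typeof node[segment] !== 'object' || node[segment] === null) {
        node[segment] = Object.create(null);
      }
      node = node[segment];
    }
    node[path[path.length - 1]] = value;
  }
  return result;
}

const parsed = parseNested('user[name]=jay&user[role]=admin&user[prefs][theme]=dark');
console.log(JSON.stringify(parsed));
// {"user":{"name":"jay","role":"admin","prefs":{"theme":"dark"}}}
```

A maintained library such as qs also handles arrays, depth limits, and malformed keys, which this sketch deliberately skips.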
URL length limits
There is no official URL length limit in the HTTP spec, but practical ceilings matter.
Browsers: Chrome, Firefox, Safari all handle URLs up to around 32,000 characters reliably; older IE capped at 2,083.
Web servers: Apache 8 KiB by default (LimitRequestLine), Nginx 8 KiB (large_client_header_buffers), IIS 16 KiB.
CDNs and load balancers: Cloudflare 16 KiB, AWS ALB 16 KiB headers (including request line).
Keep query strings under 2 KiB to be safe across all infrastructure. If you are anywhere close, move the data to a POST body.
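A guard for that budget can be sketched in a few lines. queryFits and MAX_QUERY_BYTES are hypothetical names, and the 2 KiB default simply mirrors the guideline above.

```javascript
// Hypothetical guard: check a serialized query string against a byte budget.
const MAX_QUERY_BYTES = 2048; // the 2 KiB guideline from this guide

function queryFits(params, limit = MAX_QUERY_BYTES) {
  // Measure bytes rather than characters; the serialized form is what goes on the wire.
  const bytes = new TextEncoder().encode(params.toString()).length;
  return bytes <= limit;
}

const small = new URLSearchParams({ q: 'hello', page: '2' });
const large = new URLSearchParams({ ids: 'x'.repeat(3000) });

console.log(queryFits(small)); // true
console.log(queryFits(large)); // false, so send this one as a POST body instead
```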
Boolean conventions
Booleans in URLs are entirely convention. Pick a pattern:
Presence only: ?includeArchived means true, absence means false. Compact, but easy to confuse with a key that has an empty value.
Explicit value: ?includeArchived=true. Requires the server to coerce the string “true”. Remember that “false”, “no”, and “0” are all non-empty strings, so they are truthy in JavaScript and Python (PHP is the exception for “0”).
0/1: ?archived=1. Unambiguous, concise, common in older APIs.
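Whichever pattern you pick, the coercion is worth writing once. readBoolean below is a hypothetical, deliberately lenient helper that accepts all three patterns; a stricter API might accept only one and reject the rest.

```javascript
// Hypothetical helper: read a boolean flag from URLSearchParams.
// Lenient on purpose: presence-only, true/false, and 0/1 are all accepted.
function readBoolean(params, key, defaultValue = false) {
  if (!params.has(key)) return defaultValue;
  const value = params.get(key).trim().toLowerCase();
  if (value === '') return true; // presence-only: ?includeArchived
  if (['1', 'true', 'yes', 'on'].includes(value)) return true;
  if (['0', 'false', 'no', 'off'].includes(value)) return false;
  return defaultValue;           // unrecognized value: fall back
}

console.log(readBoolean(new URLSearchParams('includeArchived'), 'includeArchived'));       // true
console.log(readBoolean(new URLSearchParams('includeArchived=false'), 'includeArchived')); // false
console.log(readBoolean(new URLSearchParams('archived=1'), 'archived'));                   // true
console.log(readBoolean(new URLSearchParams('q=hello'), 'archived'));                      // false

// The pitfall this avoids: any non-empty string is truthy in JavaScript.
console.log(Boolean('false')); // true
```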
Building query strings safely
Never concatenate strings with & and =. Always use URLSearchParams or a URL-building library.
const url = new URL('https://example.com/search');
url.searchParams.set('q', 'hello & goodbye');
url.searchParams.append('tag', 'red');
url.searchParams.append('tag', 'blue');
console.log(url.toString());
// https://example.com/search?q=hello+%26+goodbye&tag=red&tag=blue
Notice the & inside the value was correctly encoded to %26, and the space became +. Trying to do this by hand is where bugs live.
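When parameters start life as a plain object, a small wrapper keeps the same safety. buildUrl is a hypothetical helper; it assumes arrays should be emitted as repeated keys and that null or undefined values should be skipped.

```javascript
// Hypothetical helper: build a URL from a base and a plain object of parameters.
// Assumptions: arrays become repeated keys; null and undefined are skipped.
function buildUrl(base, query) {
  const url = new URL(base);
  for (const [key, value] of Object.entries(query)) {
    if (value === undefined || value === null) continue;
    const values = Array.isArray(value) ? value : [value];
    for (const item of values) {
      url.searchParams.append(key, String(item)); // append() handles the encoding
    }
  }
  return url.toString();
}

const built = buildUrl('https://example.com/search', {
  q: 'hello & goodbye',
  tag: ['red', 'blue'],
  page: 2,
  cursor: null,
});
console.log(built);
// https://example.com/search?q=hello+%26+goodbye&tag=red&tag=blue&page=2
```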
Unicode, normalization, and collation
Two equal-looking strings can have different byte sequences. “café” can be one composed character (U+00E9) or “e” + combining acute (U+0065 U+0301). Both encode to valid but different query strings. Normalize input with string.normalize('NFC') before comparing or using as a database key.
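The difference is easy to reproduce. The sketch below writes both spellings with explicit escapes so they survive copy and paste.

```javascript
// Two spellings of 'café' that render identically.
const composed = 'caf\u00E9';    // single code point U+00E9
const decomposed = 'cafe\u0301'; // 'e' followed by combining acute U+0301

console.log(composed === decomposed);        // false
console.log(encodeURIComponent(composed));   // 'caf%C3%A9'
console.log(encodeURIComponent(decomposed)); // 'cafe%CC%81'

// After NFC normalization they compare equal and encode identically.
const a = composed.normalize('NFC');
const b = decomposed.normalize('NFC');
console.log(a === b);                        // true
console.log(encodeURIComponent(b));          // 'caf%C3%A9'
```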
Common mistakes
Splitting by & and = manually. It works until a value contains an unencoded = (base64 padding, for example), a + that should be a space, or a percent-escape you forgot to decode. Use URLSearchParams.
Double encoding. Running encodeURIComponent over a string that is already encoded turns %20 into %2520. Track encoded-vs-decoded state carefully through your code.
Assuming key order is significant. Order is preserved by most parsers, but nothing obliges clients, proxies, or frameworks to emit parameters in the same order. Do not rely on it for caching keys, signatures, or equality checks; sort keys first.
Using .get() when there might be multiple values. get() returns only the first occurrence. getAll() returns every one. Use whichever matches your convention.
Mixing + and %20. They both decode to space but look inconsistent. Pick one (URLSearchParams always emits +; strict encoders emit %20) and stick with it.
Putting sensitive data in query strings. Query strings are logged by servers, proxies, and browser history. Tokens, passwords, and PII should go in headers or POST bodies, never in URLs.
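Two of these mistakes, double encoding and order-dependent comparison, are easy to demonstrate. canonicalQuery is a hypothetical helper name; it relies on the standard URLSearchParams sort() method.

```javascript
// Double encoding: encoding an already-encoded value escapes the % itself.
const once = encodeURIComponent('hello world'); // 'hello%20world'
const twice = encodeURIComponent(once);         // 'hello%2520world'
console.log(once, twice);

// URLSearchParams encodes for you, so always hand it the decoded value.
const right = new URLSearchParams();
right.set('q', 'hello world');
console.log(right.toString()); // 'q=hello+world'

const wrong = new URLSearchParams();
wrong.set('q', once);          // already encoded, so it gets encoded again
console.log(wrong.toString()); // 'q=hello%2520world'

// Hypothetical helper: an order-independent form for cache keys or equality checks.
function canonicalQuery(queryString) {
  const params = new URLSearchParams(queryString);
  params.sort(); // stable sort by key; repeated keys keep their relative order
  return params.toString();
}

console.log(canonicalQuery('q=hello&page=2') === canonicalQuery('page=2&q=hello')); // true
console.log(canonicalQuery('b=2&a=1&b=1')); // 'a=1&b=2&b=1'
```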
Run the numbers
Break any URL into its parameters with the query string parser. Pair with the URL parser for the full protocol/host/path breakdown around it, and the URL encoder/decoder when a single value is misbehaving and you need to see exactly what got encoded.