Free Tool Arena


JSON Format Rules Every Developer Should Know

Strict JSON spec rules, JSON5 vs JSONC, top 10 parser errors, JSON Schema validation, streaming for huge files, security: prototype pollution and DoS.

By FreeToolArena Staff · Updated May 2026 · 6 min read

JSON is the lingua franca of web APIs — and almost every developer learns it by osmosis from copy-pasted examples rather than the spec. That works until it doesn’t: a trailing comma, a single quote, or an unescaped backslash breaks a production payload at 2 AM, and the parser error message is unhelpful.

This is a complete reference for the rules of strict JSON (RFC 8259), the relaxed variants you’ll encounter (JSON5, JSONC), the encoding edges that catch people, and the patterns for handling huge documents, schemas, and security risks. It’s organized as a working reference — jump to what you need.


RFC 8259: the strict spec

JSON was introduced by Douglas Crockford in the early 2000s, first standardized as RFC 4627 in 2006, updated to RFC 7159 in 2014, and finalized as RFC 8259 in 2017 (also published as ECMA-404 and ISO/IEC 21778). RFC 8259 is the current authoritative spec; if your tool claims “JSON support” without specifying a variant, it should mean RFC 8259. The spec is short — about 16 pages including examples — and worth reading once.

Strict JSON is intentionally minimal. There are exactly six value types: object, array, string, number, boolean, null. No comments. No trailing commas. No unquoted keys. No single-quoted strings. No undefined. No date type. No binary type. No comments (yes, this is worth saying twice). The minimalism is a feature: it makes JSON safe to parse with simple parsers in every language.

The 6 hard rules

  1. Keys must be strings, double-quoted. {"name": "alice"} is valid; {name: "alice"} and {'name': 'alice'} are not.
  2. No trailing commas. {"a": 1, "b": 2,} is invalid. [1, 2, 3,] is invalid. The most common parser-error cause among humans copy-pasting JSON.
  3. No comments. Neither // line nor /* block */ are allowed in strict JSON. If you need comments, use JSON5/JSONC or convention like a __comment key.
  4. Strings use double quotes only. {"name": 'alice'} is invalid. ECMAScript and Python both allow single-quoted strings in source code, which makes this a frequent paste-mistake.
  5. Strings must escape certain characters. Within double quotes, you must escape: " as \", \ as \\, control chars (0-31) as \uXXXX or named escapes. Forward slashes may be escaped (\/) but don’t need to be.
  6. Numbers follow specific format. Decimal only (no hex, octal). No leading zeros except 0.x. No + sign. NaN and Infinity are NOT allowed (a major gotcha for JavaScript output).
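The six rules are easy to verify in any JavaScript REPL with the built-in JSON.parse (the sample strings below are illustrative, not from any real payload):

```javascript
// Strict JSON.parse accepts only rule-conforming input.
const samples = [
  '{"name": "alice"}',  // valid: double-quoted key and string
  "{name: 'alice'}",    // invalid: unquoted key, single quotes (rules 1, 4)
  '{"a": 1, "b": 2,}',  // invalid: trailing comma (rule 2)
  '{"n": NaN}',         // invalid: NaN is not a JSON value (rule 6)
];

for (const s of samples) {
  try {
    JSON.parse(s);
    console.log('valid:  ', s);
  } catch (e) {
    console.log('invalid:', s);
  }
}
```

Only the first sample parses; the other three throw a SyntaxError.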

String escaping rules

The full list of escapes inside a JSON string:

  • \" → double quote
  • \\ → backslash
  • \/ → forward slash (optional)
  • \b → backspace (U+0008)
  • \f → form feed (U+000C)
  • \n → newline (U+000A)
  • \r → carriage return (U+000D)
  • \t → tab (U+0009)
  • \uXXXX → any UTF-16 code unit as 4 hex digits

Code points above U+FFFF: use a UTF-16 surrogate pair. The laughing-with-tears emoji (U+1F602) is encoded as \uD83D\uDE02. Most parsers handle this transparently.

Single quotes don’t need escaping — they have no special meaning in JSON strings. "don't" is valid; "don\'t" is also valid (the escape is unnecessary but not forbidden).

Number representation gotchas

JSON numbers have no precision limit in the spec itself, but RFC 8259 notes that implementations widely treat them as IEEE 754 64-bit doubles, so that is the only safe interoperability assumption. Implications:

  • Integer precision loss above 2^53. JavaScript’s Number.MAX_SAFE_INTEGER is 9007199254740991. A 64-bit ID like 12345678901234567890 rounds to the nearest representable double. Workaround: send big IDs as JSON strings ("12345678901234567890"), not numbers.
  • Floating-point classics. 0.1 + 0.2 serializes as 0.30000000000000004. Money should be in cents (integer) or explicitly serialized as a string with fixed precision.
  • NaN and Infinity are NOT valid JSON. JavaScript’s JSON.stringify(NaN) returns "null". If you need to transmit special floats, use a string sentinel like "Infinity" and parse on the receiving side.
  • No leading + or trailing decimal point. +1 is invalid (write 1); 1. is invalid (write 1.0 or just 1).

JSON5 and JSONC: when to use each

Two relaxed variants are common in tooling but never in APIs:

  • JSONC (JSON with Comments): JSON + // line and /* block */ comments. Used by VS Code settings, tsconfig.json, ESLint configs. Simple to support; most JSON parsers can be extended to strip comments before parsing.
  • JSON5: JSONC plus unquoted keys, single quotes, trailing commas, hex numbers, leading/trailing decimals, and a few more conveniences. Used by some Babel configs, RollupJS, and config files where humans hand-edit. The spec is at json5.org. Use the json5 npm package or equivalent.

Rule of thumb: API payloads always use strict JSON. Hand-edited config files can use JSONC (most tools support it) or JSON5 (if you want the convenience). Never assume a parser handles JSON5 unless documented.
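If you can't take a JSONC-aware parser as a dependency, comments can be stripped with a small character scanner before calling JSON.parse. A sketch (a regex-only approach would wrongly match comment markers inside strings, which is why the scanner tracks string state):

```javascript
// Strip // and /* */ comments from JSONC, respecting string literals.
function stripJsonComments(input) {
  let out = '';
  let i = 0;
  let inString = false;
  while (i < input.length) {
    const ch = input[i];
    if (inString) {
      out += ch;
      if (ch === '\\') { out += input[i + 1] ?? ''; i += 2; continue; }
      if (ch === '"') inString = false;
      i++;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i++;
    } else if (ch === '/' && input[i + 1] === '/') {
      while (i < input.length && input[i] !== '\n') i++; // drop to end of line
    } else if (ch === '/' && input[i + 1] === '*') {
      i += 2;
      while (i < input.length && !(input[i] === '*' && input[i + 1] === '/')) i++;
      i += 2; // skip the closing */
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}
```

This handles comments only; trailing commas and the other JSON5 extensions still need a real JSON5 parser.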

Top 10 parser errors and fixes

  1. “Unexpected token X in JSON at position N”. Look at position N in your input. Almost always: trailing comma, single quote, unquoted key, or unescaped character in a string. Use the JSON formatter — it points at the exact offending character.
  2. “Unexpected end of JSON input”. Truncated input. Check the source: copied incomplete payload, network response cut off, file write not flushed.
  3. “Unterminated string”. Missing closing quote, or an unescaped backslash creating a broken escape sequence. Check for \ followed by a character that doesn’t form a valid escape.
  4. “Duplicate keys”. RFC 8259 says object names “SHOULD be unique”; it doesn’t require parsers to reject duplicates, so behavior is implementation-defined. Most parsers keep the last value silently. Don’t produce JSON with duplicate keys; if you receive it, that’s a bug at the source.
  5. BOM (Byte Order Mark) at start. Some Windows tools save UTF-8 with BOM (the bytes EF BB BF at the start). RFC 8259 forbids it. Strip it before parsing: text.replace(/^\uFEFF/, '').
  6. Single-quoted strings. Convert to double quotes. Not just the outer quotes — any " inside the string must be escaped.
  7. Comments left in. Strip them before parsing as strict JSON, or switch to a JSONC-aware parser. Don’t use regex to strip; comments inside strings will be matched incorrectly.
  8. Trailing commas. Strip them by replacing /,(\s*[}\]])/g with '$1'. Be careful: the same pattern inside a string literal would corrupt data. Use a JSON5 parser if you can.
  9. Wrong content-type. Server sends JSON with Content-Type: text/html. Browsers try to parse as HTML and fail. Check the headers; most fetch errors that look like “invalid JSON” are content-type bugs.
  10. NaN / Infinity in output. Code that produces these (uninitialized floats, division by zero) crashes downstream parsers. Sanitize before serializing: replace with null or a string sentinel.
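When an error message only gives “position N”, a small helper (hypothetical, not a library function) converts the offset into line:column for easier debugging:

```javascript
// Convert a 0-based character offset into 1-based line and column.
function positionToLineCol(text, pos) {
  const before = text.slice(0, pos);
  const line = before.split('\n').length;
  const col = pos - before.lastIndexOf('\n');
  return { line, col };
}

const badJson = '{\n  "a": 1,\n}'; // trailing comma before }
try {
  JSON.parse(badJson);
} catch (e) {
  // Error message wording varies by engine; extract the offset if present.
  const m = /position (\d+)/.exec(e.message);
  if (m) console.log(positionToLineCol(badJson, Number(m[1])));
}
```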

JSON Schema: validation beyond syntax

JSON.parse only catches SYNTAX errors. A valid JSON document can still have wrong shape: missing required fields, fields of the wrong type, values out of range. JSON Schema is the standard way to declare expected shape and validate programmatically.

Example schema for a user object:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["id", "email"],
  "properties": {
    "id": { "type": "integer", "minimum": 1 },
    "email": { "type": "string", "format": "email" },
    "name": { "type": "string", "maxLength": 100 },
    "age": { "type": "integer", "minimum": 0, "maximum": 150 }
  },
  "additionalProperties": false
}

Validators by language: Ajv (JavaScript/TypeScript — fastest and most popular), jsonschema (Python), NJsonSchema (.NET), Justify (Java), json_schemer (Ruby). Most support recent drafts, including Draft 2020-12; switching between languages is mostly mechanical.

OpenAPI relationship: OpenAPI 3.x uses JSON Schema for request / response shapes. Generate types and validators from your OpenAPI spec; never hand-write both.

Alternatives in TypeScript: Zod, io-ts, yup, valibot. Define schema in TypeScript code (gets type inference), validate at runtime, no separate schema file. Better DX for TS-heavy codebases; less interoperable with non-TS consumers.
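To make the idea concrete, here is a hand-rolled check mirroring the user schema above. This is exactly the boilerplate that a validator like Ajv automates (a sketch for illustration, not a substitute for a real validator — note it skips email format checking):

```javascript
// Manual shape validation equivalent to the schema: required fields,
// types, ranges, and additionalProperties: false.
function validateUser(data) {
  const errors = [];
  if (typeof data !== 'object' || data === null || Array.isArray(data)) {
    return ['expected an object'];
  }
  for (const field of ['id', 'email']) {
    if (!(field in data)) errors.push(`missing required field: ${field}`);
  }
  if ('id' in data && (!Number.isInteger(data.id) || data.id < 1)) {
    errors.push('id must be an integer >= 1');
  }
  if ('email' in data && typeof data.email !== 'string') {
    errors.push('email must be a string');
  }
  if ('name' in data && (typeof data.name !== 'string' || data.name.length > 100)) {
    errors.push('name must be a string of at most 100 chars');
  }
  if ('age' in data && (!Number.isInteger(data.age) || data.age < 0 || data.age > 150)) {
    errors.push('age must be an integer between 0 and 150');
  }
  const allowed = new Set(['id', 'email', 'name', 'age']);
  for (const key of Object.keys(data)) {
    if (!allowed.has(key)) errors.push(`unexpected property: ${key}`);
  }
  return errors;
}
```

Twenty-five lines for four fields — this is why schema validators exist.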

Handling huge JSON (streaming)

JSON.parse loads the entire document into memory. For files over ~100MB, this becomes prohibitive. Streaming parsers process one token at a time:

  • jq (CLI): jq '.users[]' huge.json emits each user (quote the filter so your shell doesn’t mangle it). For files too large for memory, jq --stream processes the input one token at a time.
  • Python ijson: for item in ijson.items(file, "users.item") iterates without loading the document.
  • Node JSONStream: fs.createReadStream(...).pipe(JSONStream.parse('users.*')).on('data', ...).
  • Go encoding/json Decoder: token-stream API for memory-efficient parsing.

NDJSON / JSONL: an alternative format where each line is a separate JSON object. Trivial to stream (read line by line, parse each). Standard for log aggregation, ML training data, and large data exports. File extension .ndjson or .jsonl.

UTF-8, BOMs, and binary data

RFC 8259 requires UTF-8 encoding (with optional UTF-16 / UTF-32 in older RFC 7159). Network JSON should always be UTF-8 without BOM.

Binary data doesn’t fit natively. Conventions:

  • Base64-encode: most common. Adds 33% size overhead. See the Base64 encoder.
  • Hex-encode: simpler but 100% size overhead.
  • Out-of-band: send a URL or signed reference; client fetches the binary separately. Best for files over ~10KB.

Security: prototype pollution, DoS, JSONP

Prototype pollution: untrusted JSON with keys like __proto__, constructor, prototype can modify the JavaScript Object prototype if you blindly merge into objects. Mitigate with safe-merge libraries (Lodash 4.17.21+ is safe), or use Object.create(null) for destination objects.
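The attack is easy to reproduce with a naive recursive merge, and the fix is simply to refuse the dangerous keys. Both functions below are illustrative sketches, not hardened library code:

```javascript
// Naive deep merge: recursing into a '__proto__' key walks up to
// Object.prototype, so the attacker's values land on every object.
function naiveMerge(target, src) {
  for (const key of Object.keys(src)) {
    if (src[key] && typeof src[key] === 'object') {
      if (!target[key]) target[key] = {};
      naiveMerge(target[key], src[key]);
    } else {
      target[key] = src[key];
    }
  }
  return target;
}

// Guarded merge: skip the keys that reach the prototype chain.
const BANNED = new Set(['__proto__', 'constructor', 'prototype']);
function safeMerge(target, src) {
  for (const key of Object.keys(src)) {
    if (BANNED.has(key)) continue;
    if (src[key] && typeof src[key] === 'object') {
      if (!Object.prototype.hasOwnProperty.call(target, key)) target[key] = {};
      safeMerge(target[key], src[key]);
    } else {
      target[key] = src[key];
    }
  }
  return target;
}

// JSON.parse creates '__proto__' as an ordinary own key, so the payload
// only becomes dangerous at merge time.
const payload = JSON.parse('{"__proto__": {"polluted": true}}');
safeMerge({}, payload);
console.log({}.polluted); // undefined — the guard held
```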

Parser DoS: very deep nesting (10,000+ levels) can blow the stack in some parsers. Limit input size and depth. Most modern parsers default to reasonable limits.

JSON hijacking (historical): a security issue with top-level JSON arrays in old browsers whose Array constructor could be overridden. Modern browsers fixed this years ago; don’t worry about it unless you must support browsers from the late 2000s.

JSONP: JSON wrapped in a callback function for cross-origin loading via <script>. Largely obsolete due to CORS support in all modern browsers. JSONP runs arbitrary JavaScript — if the data source is compromised, you have an XSS. Avoid for new code; use CORS or fetch with proper Access-Control-Allow-Origin headers.

Performance tips

  • Avoid double-parsing. JSON.parse(JSON.parse(s)) happens when someone serialized JSON, then JSON-encoded the result as a string. Parse once. Better fix: stop double-encoding upstream.
  • Consider message size. JSON over HTTP is gzip/brotli compressed in transit. Field-name length matters less than the spec implies. But for stored JSON (databases, files): shorter field names compound. Compromise: readable in APIs, abbreviated in hot-path internal storage.
  • Streaming for files over 10MB. JSON.parse on a 100MB string can take several seconds and may freeze the browser tab.
  • Use protobuf or CBOR for binary-heavy or high-frequency payloads. Both are 2-5x smaller and faster than JSON. Trade-off: not human-readable, requires schema.

Anti-patterns and footguns

  • Stringly-typed dates. JSON has no Date type, so dates become strings. Multiple formats compete: ISO 8601 (2026-01-15T12:00:00Z — preferred), Unix timestamp (number of seconds), JavaScript Date string format. Pick one in your API contract; don’t mix. ISO 8601 is the interoperable choice.
  • Money as floats. Use integer cents or string-encoded decimals. Floats lose precision; 0.1 + 0.2 = 0.30000000000000004.
  • Big integers as numbers. IDs over 2^53 lose precision. Send as strings.
  • Comments in JSON files served as JSON content-type. Strip them before serving, or change content-type to application/jsonc (not widely recognized) or convention application/x-jsonc.
  • Mutating during iteration. Parsers return ordinary mutable structures (Python dicts, JavaScript objects), and mutating one while iterating over its keys is an error or undefined behavior in most languages. Iterate over a copy of the keys if you need to mutate.
  • Trusting key order. RFC 8259 says objects are “unordered.” Most parsers preserve insertion order in practice (V8, Python 3.7+, Go); some don’t. Don’t depend on order for correctness.

The 80/20 takeaway

The six rules cover almost all parse errors you’ll encounter. The schema validation pattern (Ajv, Zod, JSON Schema) catches the rest at the contract level so bugs surface early. For huge documents, switch to streaming parsers or NDJSON. For binary data, Base64-encode or send out-of-band. Always validate untrusted JSON before merging into objects (prototype pollution).

Use the JSON formatter for one-off pretty-printing and validation; jq for command-line workflows; Ajv or Zod for runtime validation. Read RFC 8259 once. Bookmark this page. Most JSON headaches go away.


Frequently asked questions

Why does my JSON parser say 'Unexpected token'?

Five common causes, in order of frequency: (1) Trailing comma after the last item in an object or array — strict JSON forbids it. (2) Single-quoted strings instead of double-quoted. (3) Unquoted keys (valid in JavaScript object literals; invalid in JSON). (4) Unescaped backslash inside a string creating a broken escape sequence. (5) Comments — strict JSON has no comments. The 'position N' in the error message points at the exact character; paste your JSON into a formatter to see line:column.

What's the difference between JSON, JSON5, and JSONC?

JSON (RFC 8259) is the strict standard used in APIs: no comments, no trailing commas, double-quoted keys only. JSONC (JSON with Comments) adds // and /* */ comments and trailing commas; used by VS Code settings and tsconfig.json. JSON5 is JSONC plus unquoted keys, single-quoted strings, hex numbers, and a few other conveniences; used in some build tool configs. Rule of thumb: APIs always strict JSON; hand-edited config files can use JSONC or JSON5. Never assume a parser supports JSON5 unless documented.

How do I handle JSON files larger than 100MB?

JSON.parse and equivalent loaders read the entire document into memory — typically 2-5x the file size for object representation. For files >100MB, use streaming parsers: jq (CLI, handles 100GB easily), Python ijson, Node JSONStream, Go encoding/json Decoder. Alternative format: NDJSON (newline-delimited JSON) where each line is a separate JSON object — trivial to stream by reading line-by-line. NDJSON is standard for log aggregation, ML training data, and large data exports.

Why does my JSON have wrong numbers (precision loss)?

JSON numbers are IEEE 754 64-bit floats per spec. JavaScript's Number.MAX_SAFE_INTEGER is 2^53 - 1 = 9007199254740991. Bigger integers (long IDs from databases, blockchain transaction IDs, Twitter snowflake IDs) round to the nearest representable float — silent data corruption. Fix: send big integers as JSON strings, parse as BigInt or library-specific big-integer types on the receiving side. Money has the same problem: 0.1 + 0.2 = 0.30000000000000004. Store money as integer cents (or smaller units), or as strings with fixed precision; never as floats.


