How to convert YAML to JSON
YAML vs JSON mapping, the Norway problem, numeric precision, anchors, comments, YAML 1.1 vs 1.2 parsers, round-tripping.
YAML and JSON represent the same data but solve different problems. YAML is human-friendly (Kubernetes manifests, CI configs, Docker Compose). JSON is machine-friendly (APIs, tokens, logs). Converting between them sounds trivial but has gotchas: YAML is a superset of JSON, but YAML features like anchors, multi-line strings, and implicit typing don’t survive a round-trip without care. This guide covers the conversion rules, YAML features that need attention, numeric precision, comments, anchors/aliases, round-tripping, and when to use which format.
The basic mapping
YAML 1.2 is a strict superset of JSON, so any valid JSON is valid YAML. Going the other way drops features. Core mappings:
YAML mapping: key: value → JSON object: {"key": "value"}
YAML sequence: - item (with hyphens and indentation) → JSON array: ["item"]
YAML scalar: name: Alice → JSON string: "Alice"
The conversion of structure is mechanical. The interesting part is the scalars.
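A minimal sketch of the mechanical step in Python. It assumes PyYAML (a third-party package) is installed; the function name yaml_to_json and its load parameter are illustrative and not part of any library. Keep in mind that PyYAML follows YAML 1.1, so the scalar caveats in this guide still apply.

```python
import json
from typing import Any, Callable, Optional


def yaml_to_json(text: str, load: Optional[Callable[[str], Any]] = None) -> str:
    """Parse YAML text and re-serialize it as pretty-printed JSON."""
    if load is None:
        # Imported lazily so a different loader (e.g. a YAML 1.2 parser) can be passed in.
        import yaml  # PyYAML, third-party: pip install pyyaml

        load = yaml.safe_load  # safe_load does not construct arbitrary Python objects
    data = load(text)
    # allow_nan=False raises ValueError instead of emitting invalid JSON such as NaN.
    return json.dumps(data, indent=2, ensure_ascii=False, allow_nan=False)
```

Calling yaml_to_json("name: Alice\nroles:\n  - admin\n") should return a JSON object with a string and an array. The load hook exists only so the parser can be swapped without touching the serialization step.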
Implicit typing — the Norway problem
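YAML guesses types for unquoted scalars. true, false, null, and ~ resolve to booleans or null in every version; under YAML 1.1, yes, no, on, and off (in lower, Title, or UPPER case) also resolve to booleans. Numbers become int or float. This leads to the famous “Norway problem”: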
countries:
  - GB
  - NO  # Parsed as boolean false in YAML 1.1
YAML 1.1 (which PyYAML still implements) treats NO as false. The JSON output becomes ["GB", false] instead of ["GB", "NO"].
Fix: use YAML 1.2 parsers (ruamel.yaml, js-yaml 4+, go-yaml v3) or quote the value: - "NO".
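The difference between the two versions comes down to which plain scalars match the boolean pattern. The sketch below illustrates that rule with two regular expressions; it is not a parser, and the function name is_implicit_bool is made up for this example.

```python
import re

# Boolean spellings listed in the YAML 1.1 type repository.
# Some 1.1 parsers (PyYAML among them) leave out the single-letter y/n forms.
YAML11_BOOL = re.compile(
    r"(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)"
)
# YAML 1.2 core schema: only true/false spellings are booleans.
YAML12_BOOL = re.compile(r"(?:true|True|TRUE|false|False|FALSE)")


def is_implicit_bool(scalar: str, version: str = "1.2") -> bool:
    """Return True if an unquoted scalar would resolve to a boolean."""
    pattern = YAML11_BOOL if version == "1.1" else YAML12_BOOL
    return pattern.fullmatch(scalar) is not None


for code in ("GB", "NO"):
    print(code, is_implicit_bool(code, "1.1"), is_implicit_bool(code, "1.2"))
```

Under the 1.1 rule NO matches and GB does not; under the 1.2 rule neither matches, so both stay strings.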
Numbers and precision
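YAML parses 01 as 1, 0x1f as 31, 1.5e3 as 1500, .inf as infinity, .nan as NaN. Octal is version-dependent: 010 is 8 under YAML 1.1 but 10 under the YAML 1.2 core schema, which spells octal as 0o10. JSON doesn’t support infinity or NaN, so converters substitute null, raise an error, or in some cases emit non-standard tokens that strict JSON parsers reject.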
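In Python, for instance, json.dumps writes NaN and Infinity tokens by default, which is not valid JSON; passing allow_nan=False makes it raise ValueError instead. One possible policy, sketched here, is to replace non-finite floats with null before serializing. The helper name replace_non_finite is illustrative.

```python
import json
import math
from typing import Any


def replace_non_finite(value: Any) -> Any:
    """Recursively replace NaN and +/-Infinity with None (JSON null)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {k: replace_non_finite(v) for k, v in value.items()}
    if isinstance(value, list):
        return [replace_non_finite(v) for v in value]
    return value


# Stand-in for what a YAML parser returns for .inf and .nan.
parsed = {"timeout": float("inf"), "ratio": float("nan"), "retries": 3}
print(json.dumps(replace_non_finite(parsed), allow_nan=False))
# prints {"timeout": null, "ratio": null, "retries": 3}
```

Whether null is the right substitute depends on the consumer; raising an error is the safer default when a missing value would be misread.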
Leading zeros: version: 007 becomes 7 in JSON, so the zero-padding is lost. Quote it: version: "007".
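Large numbers: some YAML parsers support arbitrary-precision integers, and the JSON spec itself sets no numeric limit, but many JSON consumers (JavaScript in particular) store every number as a 64-bit double. Integers beyond Number.MAX_SAFE_INTEGER (2⁵³ − 1, about 9.007 × 10¹⁵) silently lose precision there. Output IDs as strings: "1234567890123456789".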
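A small sketch of the string-for-IDs approach, assuming the parsed data is made of plain dicts and lists. The helper name stringify_unsafe_ints is made up for this example.

```python
import json
from typing import Any

# Same value as JavaScript's Number.MAX_SAFE_INTEGER.
MAX_SAFE_INTEGER = 2**53 - 1


def stringify_unsafe_ints(value: Any) -> Any:
    """Recursively turn integers that a double cannot hold exactly into strings."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value) if abs(value) > MAX_SAFE_INTEGER else value
    if isinstance(value, dict):
        return {k: stringify_unsafe_ints(v) for k, v in value.items()}
    if isinstance(value, list):
        return [stringify_unsafe_ints(v) for v in value]
    return value


big_id = 1234567890123456789
# Round-tripping through a double changes the value.
print(int(float(big_id)) == big_id)  # prints False
print(json.dumps(stringify_unsafe_ints({"id": big_id, "count": 5})))
```

Python itself keeps big integers exact, so the loss only shows up in consumers that use doubles; that is why the conversion belongs on the producing side.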
Multi-line strings
YAML has several multi-line styles, all of which collapse to a JSON string with escaped newlines:
literal: |  # Keeps newlines
  Line 1
  Line 2
folded: >  # Newlines become spaces
  Long paragraph
  that wraps
quoted: "With \n escapes"
JSON output: "Line 1\nLine 2\n" for the literal block. By default a block scalar keeps one trailing newline; write |- or >- to strip it. The block's base indentation is removed, but any extra indentation beyond it becomes part of the content.
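To make the escaping concrete, the snippet below serializes the strings a YAML parser would be expected to return for a literal block, a folded block, and a double-quoted scalar. The values are written out by hand (no YAML parser is involved), including the single trailing newline that block scalars keep under default chomping.

```python
import json

# Hand-written stand-ins for parsed YAML values.
literal = "Line 1\nLine 2\n"            # literal block (|): line breaks kept
folded = "Long paragraph that wraps\n"  # folded block (>): inner line breaks become spaces
quoted = "With \n escapes"              # double-quoted: \n is a real newline after parsing

print(json.dumps({"literal": literal, "folded": folded, "quoted": quoted}, indent=2))
```

Every real newline in the parsed string comes out as the two-character escape \n in the JSON text, so the block structure is preserved in the value even though the layout is gone.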
Comments disappear
JSON has no comments. Every YAML comment is lost in conversion. If comments matter (Kubernetes manifests are heavily commented), either keep the YAML as the source of truth or use a JSON variant that allows comments, such as JSON5 or JSONC.
Anchors and aliases
YAML supports references to repeated content:
defaults: &defaults
  port: 8080
  host: localhost
server1:
  <<: *defaults
  name: primary
server2:
  <<: *defaults
  name: secondary
JSON has no anchors, so converters expand them and every alias becomes a full copy of the anchored object. If you go back to YAML, the shared reference is gone, and a large config that leaned on anchors can come back noticeably bigger.
Workaround: if the YAML is source-of-truth, keep it in YAML and generate JSON downstream. Don’t round-trip.
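The expansion can be illustrated without a YAML parser. The dict below is a hand-written approximation of what a parser builds for a config with one anchored defaults mapping merged into two servers; serializing it shows the duplication.

```python
import json

# The merge key (<<) copies the anchored mapping's entries into each server.
defaults = {"port": 8080, "host": "localhost"}
config = {
    "defaults": defaults,
    "server1": {**defaults, "name": "primary"},
    "server2": {**defaults, "name": "secondary"},
}

as_json = json.dumps(config, indent=2)
print(as_json)
# port and host are now written out three times, and nothing in the JSON
# records that they came from a single shared definition.
print(as_json.count('"port"'))  # prints 3
```

Changing the port later means editing three places in the JSON but one place in the YAML, which is the practical reason to generate JSON from YAML rather than the reverse.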
Keys can be any type in YAML
YAML allows non-string keys:
? [a, b]
: "tuple key"
42: "number key"
true: "boolean key"
JSON keys must be strings. Converters either stringify ("[a,b]", "42") or error. Most real-world YAML only uses string keys so this is rare, but watch for boolean/numeric keys that silently coerce.
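Python's json.dumps is one example of the mixed behavior: it silently turns int and bool keys into strings but raises TypeError for a tuple key. A sketch of making the coercion explicit, and failing loudly when two keys collapse into one, might look like this. The helper name stringify_keys is illustrative.

```python
import json
from typing import Any


def stringify_keys(value: Any) -> Any:
    """Recursively convert mapping keys to strings, rejecting collisions."""
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            # Non-string keys are rendered as their JSON text, e.g. 42 -> "42".
            new_key = key if isinstance(key, str) else json.dumps(key)
            if new_key in out:
                raise ValueError(f"key collision after stringifying: {new_key!r}")
            out[new_key] = stringify_keys(item)
        return out
    if isinstance(value, list):
        return [stringify_keys(v) for v in value]
    return value


# Hand-written stand-in for parsed YAML with a sequence, a number, and a boolean as keys.
parsed = {("a", "b"): "tuple key", 42: "number key", True: "boolean key"}
print(json.dumps(stringify_keys(parsed), indent=2))
```

The collision check matters because 1 and "1" are distinct keys in YAML but the same key once stringified.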
Choosing parsers
js-yaml (Node.js): YAML 1.2 by default in v4+. Fast, widely used.
ruamel.yaml (Python): YAML 1.2, preserves comments and formatting on round-trip. Best Python option.
PyYAML (Python): YAML 1.1 only. Has the Norway problem. Avoid for new projects.
go-yaml v3 (gopkg.in/yaml.v3): mostly YAML 1.2, supports preserving comments via its Node API.
yq: command-line tool — yq -o json config.yaml converts quickly.
When to use YAML vs JSON
YAML is better for: human-edited configs (CI/CD, Kubernetes, Docker Compose, Ansible), long files where comments help, anywhere DRY through anchors adds value.
JSON is better for: APIs, data interchange between services, log formats, anything machine-generated or parsed at scale. JavaScript parses it natively.
Common pattern: YAML source in version control, JSON at runtime. Convert on build or first load.
Round-tripping
YAML → JSON → YAML isn’t lossless. You lose:
Comments.
Original scalar styles (quoted vs plain vs block).
Anchor/alias references (expanded).
Non-string keys (stringified).
Ordering (JSON objects are unordered; re-serialized YAML may sort keys).
If you need to modify YAML programmatically and preserve formatting, use comment-aware parsers (ruamel.yaml in Python, yaml-edit-ts in JS) that keep the source AST, not the naive convert-modify-serialize loop.
Validating output
After conversion, validate the JSON against your expected schema. Most bugs are silent type changes (boolean that should’ve been string, number that should’ve been string to preserve zeros).
JSON Schema catches type drift. Also diff small samples manually to make sure keys survived.
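As a lightweight stand-in for a full JSON Schema validator, the sketch below checks top-level keys against expected Python types and reports any drift. The function name find_type_drift and the sample data are made up for illustration; a real project would more likely use a schema library.

```python
from typing import Any, Dict, List


def find_type_drift(data: Dict[str, Any], expected: Dict[str, type]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no drift."""
    problems = []
    for key, want in expected.items():
        if key not in data:
            problems.append(f"{key}: missing")
            continue
        got = data[key]
        # Compare exact types: bool is a subclass of int, so isinstance would hide drift.
        if type(got) is not want:
            problems.append(f"{key}: expected {want.__name__}, got {type(got).__name__}")
    return problems


# A converted document where NO became false and 007 became 7.
converted = {"country": False, "version": 7, "port": 8080}
expected = {"country": str, "version": str, "port": int}
print(find_type_drift(converted, expected))
# prints ['country: expected str, got bool', 'version: expected str, got int']
```

Running a check like this in CI right after the conversion step catches silent coercions before they reach a consumer.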
Common mistakes
Using a YAML 1.1 parser for modern YAML. Boolean coercion bugs everywhere. Upgrade to a 1.2 parser.
Not quoting version strings. version: 1.0 is parsed as a float and may come out as 1.0 or 1 depending on the serializer; version: 1.10 becomes 1.1. Use version: "1.0".
Assuming comments survive. They don’t unless your tooling explicitly preserves them in both directions.
Round-tripping as a refactor. YAML → JSON → YAML will expand anchors and reformat. If the YAML is human-maintained, don’t round-trip.
Exceeding JSON number precision. Large IDs and timestamps need string representation in JSON.
Trusting implicit types. Quote any string that looks like a boolean, number, date, or null.
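When generating YAML, or linting it, a conservative check for "would this string be re-typed if left unquoted?" can help. The pattern below is an approximation that covers the common YAML 1.1 and 1.2 cases; it is not a complete implementation of either spec, and the name needs_quotes is made up for this sketch.

```python
import re

AMBIGUOUS = re.compile(
    r"""(?:
        ~ | null | Null | NULL                                # null
      | y | Y | yes | Yes | YES | n | N | no | No | NO        # YAML 1.1 booleans
      | true | True | TRUE | false | False | FALSE
      | on | On | ON | off | Off | OFF
      | [-+]? \d [\d_]* (?: \. \d* )? (?: [eE] [-+]? \d+ )?   # ints and floats
      | [-+]? \. \d+ (?: [eE] [-+]? \d+ )?                    # floats like .5
      | 0x [0-9a-fA-F]+ | 0o [0-7]+                           # hex and 0o-style octal
      | [-+]? \. (?: inf | Inf | INF ) | \. (?: nan | NaN | NAN )
      | \d{4} - \d{2} - \d{2} .*                              # ISO-style dates
    )""",
    re.VERBOSE,
)


def needs_quotes(text: str) -> bool:
    """Return True if a string should be quoted to stay a string in YAML."""
    # An empty plain scalar resolves to null, so it needs quotes too.
    return text == "" or AMBIGUOUS.fullmatch(text) is not None


for sample in ("NO", "007", "1.0", "2024-01-15", "GB", "localhost"):
    print(sample, needs_quotes(sample))
```

The check errs on the side of quoting: an unnecessary pair of quotes is harmless, while a missed one changes the type.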
Run the numbers
Convert between YAML and JSON with the YAML ↔ JSON converter. Pair with the JSON formatter to pretty-print and validate the output, and the JSON schema generator to lock down the structure.