Skip to content
Free Tool Arena

Developers & Technical · Guide · Developer Utilities

How to format SQL for readability

Uppercase keywords, one-column-per-line, JOIN placement, CTE style, leading vs trailing commas, auto-formatters (SQLFluff, sql-formatter).

Updated April 2026 · 6 min read

A 300-line SELECT crammed onto three lines is the SQL equivalent of unreadable code. Format it — indent joins, align columns, uppercase keywords, break at logical boundaries — and the same query tells its story at a glance. This guide covers the conventions teams actually agree on, the stylistic decisions worth having an opinion about (leading comma vs trailing, river vs indent), auto-formatters, style guides from major projects, and the small rules that keep PRs readable a year from now.

Advertisement

Why SQL formatting matters more than most code formatting

SQL is declarative and often nested. A single query can be the logic of an entire business report. Unlike Python or JavaScript, linters and auto-formatters were uncommon in SQL workflows for years — teams picked up habits that clash.

Poorly formatted SQL hides: duplicate joins, wrong filter placement (WHERE vs HAVING vs ON), missed NULL handling, and unintended cross joins. Formatting exposes these by making the shape of the query readable.

The core rules most teams agree on

Keywords uppercase. SELECT, FROM, WHERE, JOIN, GROUP BY, ORDER BY — all caps. Table and column names stay lowercase (or follow your DB’s case convention).

One column per line in SELECT lists. Easier to add, remove, and comment individual columns in diffs.

One join per line. Each JOIN clause on its own line with the ON condition either inline or indented.

Logical indentation. Subqueries and CTEs indented from their parent; related clauses aligned.

Trailing commas. (Or leading — opinions differ; see below.) Consistent within your codebase.

Leading comma vs trailing comma — pick one

Trailing comma style (more common):

SELECT id, name, email, created_at FROM userswith each column on a new line and the comma at the end.

Leading comma style (catches bugs, polarizing):

SELECT id , name , email , created_at FROM userswith each column on a new line and the comma at the start.

The argument for leading comma: the last line doesn’t end with a comma, so you don’t forget to remove it when commenting out the last column. Also, the visual alignment makes it easy to see which columns are selected.

The argument against: looks odd compared to other languages. Modern SQL formatters handle trailing-comma commenting automatically.

Pick one, write it in your style guide, enforce it with a formatter.

The “river” style

Classic SQL formatting puts each major clause keyword on its own line, right-aligned to create a visual “river” of whitespace down the left edge:

SELECT id, name FROM users WHERE active = true ORDER BY created_at DESC

(With SELECT, FROM, WHERE, ORDER BY right-aligned in a column.)

Pros: very scannable; clauses jump out.

Cons: painful to maintain manually; not well supported by modern formatters.

In 2026, most teams use indent-based style (keywords left-aligned, arguments indented under them) because it auto-formats cleanly.

JOIN formatting

Every JOIN on its own line. ON condition either:

Inline when short: JOIN orders ON orders.user_id = users.id

Below with indentation when long or multi- condition:

JOIN orders ON orders.user_id = users.id AND orders.status = 'active' AND orders.created_at >= '2026-01-01'

Always use explicit JOIN syntax (INNER JOIN, LEFT JOIN) — never comma joins with WHERE conditions. Old syntax mixes filters and join conditions, hiding bugs.

CTEs (WITH clauses) — the readability superpower

Break complex queries into named CTEs. Each CTE is a labeled, reusable subquery. Even if the query planner doesn’t need it, humans do.

WITH active_users AS (SELECT id FROM users WHERE active), recent_orders AS (SELECT user_id, SUM(total) FROM orders WHERE created_at > CURRENT_DATE - 30) SELECT au.id, ro.total FROM active_users au LEFT JOIN recent_orders ro ON au.id = ro.user_id

Indentation rule: each CTE body indented two spaces from the CTE name; CTEs separated by a blank line (or at least the closing paren + comma on its own line).

Rule: if a subquery is referenced twice or is more than 5-6 lines, promote it to a CTE. Queries with 3-5 named CTEs read like a paragraph; inline nested subqueries read like a regex.

WHERE clause formatting

Short WHERE: single line. WHERE active = true.

Multi-condition WHERE: each condition on its own line with AND or OR at the start. Makes it easy to comment individual conditions.

WHERE active = true AND created_at > '2026-01-01' AND role IN ('admin', 'member')

Parens for OR logic: always parenthesize mixed AND/OR to make precedence explicit. WHERE (a = 1 OR b = 2) AND c = 3, not WHERE a = 1 OR b = 2 AND c = 3(which means a = 1 OR (b = 2 AND c = 3) — surprise).

Naming — the unwritten part of formatting

snake_case for identifiers. user_id, not userId. SQL is case-insensitive for unquoted identifiers; stick to snake_case for consistency across engines.

Singular or plural tables? Pick one. “ users” or “user” — both work, but don’t mix. Most teams use plural (“users”, “orders ”) because they’re collections.

Aliases are short and meaningful. FROM users u JOIN orders o ON o.user_id = u.id. Avoid single-letter aliases on long queries where context is lost.

Qualify columns when more than one table is involved. u.email, not bare email. Makes queries survive when tables are renamed or columns added.

Auto-formatters

Don’t format by hand. Pick a tool:

sql-formatter (npm): open-source, dialect-aware (Postgres, MySQL, Snowflake, BigQuery, etc.). Configurable line width, keyword case, comma position. Widely used.

SQLFluff: Python-based, opinionated, production-grade. Dialect-aware, configurable rules. Integrates with dbt. Standard in data-engineering shops.

DataGrip / SQL Developer / pgAdmin: IDE-level formatters, varying quality.

Prettier SQL plugin: if your team uses Prettier for other languages, add the SQL plugin for consistency.

dbt users: SQLFluff with dbt-specific rules is the standard. Runs as a pre-commit hook.

Style guides worth reading

GitLab data team style guide: well-known, detailed, opinionated. CTE-first, leading comma, lowercase keywords (yes, they buck the trend).

Mozilla SQL style guide: short, practical.

Matt Mazur’s SQL style guide: written from analytics-engineering perspective; readable.

dbt Labs blog: ongoing discussion of SQL style in the modern data stack.

Common mistakes

Mixing uppercase and lowercase keywords inconsistently. Pick one. Most teams: UPPERCASE for keywords.

SELECT *. Fine in ad-hoc queries, bad in production. Lists of explicit columns survive schema changes and show intent.

Nested subqueries instead of CTEs. Hard to read, hard to test. Unwrap into CTEs; query planner usually treats them equivalently.

No explicit JOIN type. JOINdefaults to INNER JOIN in most engines, but being explicit makes it obvious.

Formatting only the final query. If you ship SQL in a migration, a dbt model, or a report config, it gets read repeatedly. Format once, save the team hours later.

Run the numbers

Paste SQL and get a cleanly formatted version with the SQL formatter. Pair with the JSON formatter when your query returns JSON that needs cleaning up, and the case converter to normalize identifier casing in bulk.

Advertisement

Found this useful?Email