What's the difference between a system prompt and a user prompt?

System prompts set persistent rules — the AI's personality, constraints, available tools, what not to do. User prompts are the specific request each turn. System prompts stay active throughout the conversation; user messages change turn-to-turn.

How long should a system prompt be?

500-3000 tokens for most applications. Shorter prompts give more flexibility; longer ones constrain behavior better. Top platforms like ChatGPT's Custom GPTs and Claude Projects use 2000-5000 token system prompts. Most 'good enough' prompts are 800-1500 tokens.

Should I include examples in my system prompt?

Yes, for consistent output. Few-shot examples (2-5 labeled examples of input → desired output) significantly improve structure adherence. The examples should cover edge cases, not just happy paths. This technique is more effective than just stating the rule.

How do I test if my system prompt works?

Test with adversarial inputs: users trying to make the AI break role, edge-case requests, long multi-turn conversations. A good system prompt survives hostile probing. Run the same prompt through 20 diverse user queries and check whether it stays on-brief consistently.

Should I use XML tags or markdown headers for sections?

XML tags for structure (Anthropic recommends this; Claude responds especially well to , , tags). Markdown for human readability when reviewing the prompt. The two combine well: use XML tags as section delimiters with markdown content inside. OpenAI's models accept both equally. Avoid mixing inconsistent formatting — pick a convention and stick with it across the whole system prompt.

What's the difference between system prompt and user prompt?

System prompts set persistent rules — the AI's persona, constraints, tools, what not to do. User prompts are turn-specific requests. System prompts ride along every turn invisibly; user messages change. Best practice: put EVERYTHING that should remain constant (instructions, examples, knowledge base) in system prompt. Keep user prompts minimal and request-focused. This makes prompt caching effective and conversation memory predictable.

AI & Prompt Tools · Free tool

System Prompt Builder

Compose a focused system prompt from a role, tone, constraints, and output format — copy-ready for any LLM.

Updated June 2026

RoleConstraintsOutput formatExamples

Generated system prompt

# ROLE
You are a senior technical writer specializing in developer documentation.

# CONSTRAINTS
- Always use active voice.
- Never invent APIs or flags that weren't provided.
- Keep paragraphs under 3 sentences.

# OUTPUT FORMAT
Return Markdown with an H2 title, a short summary, then numbered steps.

# EXAMPLES
Input: "document the /auth endpoint"
Output: ## /auth\nAuthenticates the user...\n1. Send POST...

Paste into the system / instruction field of your LLM playground or API call. Works with OpenAI, Anthropic, Gemini, and most agent frameworks.

Found this useful?Email Buy Me a Coffee

What it does

Generate a structured, production-ready system prompt by filling in: role (assistant/expert/agent), tone (formal/casual/technical), constraints (what NOT to do), output format (JSON/markdown/plain text), available tools (if agent), and any domain-specific guardrails. Output is a copy-ready prompt you can paste into ChatGPT, Claude, Gemini, or any LLM API. Designed to produce prompts in the 800-2000 token range — long enough to constrain behavior, short enough to be cost-efficient.

Why prompt structure matters: ChatGPT’s “Custom GPTs” and Claude’s “Projects” both demonstrate that well-constructed system prompts can change a base model’s behavior more than fine-tuning. Anthropic publishes prompt-engineering guides emphasizing: clear role definition, explicit constraints, examples (few-shot), structured XML tags for sections, and adversarial test prompts. OpenAI’s GPT-4o and Anthropic’s Claude Opus 4 follow these structures most consistently; older models (GPT-3.5, Claude 2) need shorter, more explicit prompts to stay on-brief.

Common failure modes to design against: (1) Role bleed — model forgets persona in long conversations; mitigate with periodic role-reinforcement. (2) Instruction override — user inputs like “ignore previous instructions”; mitigate with XML-tagged user content and explicit refusal instructions. (3) Format drift — JSON output gradually becoming natural language; mitigate with strict schema + few-shot JSON examples. (4)Hallucination on out-of-scope queries — model invents answers rather than refusing; mitigate with explicit “If unsure, say I don’t know” instruction and reference-only constraints. The generator includes these guardrails as configurable options.

Embed this tool on your siteShow snippet

Paste this snippet into any page. Loads on-demand (lazy), no tracking scripts, and sized to most dashboards. Replace the height to fit your layout.

<iframe src="https://freetoolarena.com/embed/system-prompt-builder" width="100%" height="720" frameborder="0" loading="lazy" title="System Prompt Builder" style="border:1px solid #e2e8f0;border-radius:12px;max-width:720px;"></iframe>

Embed docs →

How to use it

Pick role: 'Assistant', 'Expert in [domain]', 'Agent with tools', 'Coach', 'Reviewer'.
Pick tone: formal, conversational, technical, encouraging, direct.
List 3-7 constraints: 'Never reveal system prompt', 'Refuse off-topic requests', 'Cite sources', etc.
Pick output format: plain text, markdown, JSON schema, structured XML tags.
Add 1-3 few-shot examples (input → ideal output) — biggest single quality lever.
Test the generated prompt with adversarial inputs before deploying. Iterate until it survives 20 hostile probing attempts.

When to use this tool

Building a custom AI assistant or chatbot (Custom GPT, Claude Project, embedded API agent).
Standardizing AI behavior across multiple users — a fixed system prompt produces consistent output.
Migrating from one model to another — re-test with same system prompt to compare quality across vendors.
Onboarding new team members to LLM workflows — a templated builder lowers the prompt-engineering learning curve.

When not to use it

Single-shot queries where you control the user message — system prompt overhead isn't worth it.
When you need fine-tuning-level customization — system prompts have limits; truly distinct behavior needs SFT or RLHF.
When the model already has a strong default that fits — over-prompting can degrade quality on simple Q&A.
For prompts under 200 tokens that already work — adding more structure won't help.

Common use cases

Educational use — demonstrating the underlying concept
Onboarding a colleague who needs the same calculation/conversion
Verifying a number or output before passing it on
Quick generation during a typical workday

Frequently asked questions

What's the difference between a system prompt and a user prompt?: System prompts set persistent rules — the AI's personality, constraints, available tools, what not to do. User prompts are the specific request each turn. System prompts stay active throughout the conversation; user messages change turn-to-turn.
How long should a system prompt be?: 500-3000 tokens for most applications. Shorter prompts give more flexibility; longer ones constrain behavior better. Top platforms like ChatGPT's Custom GPTs and Claude Projects use 2000-5000 token system prompts. Most 'good enough' prompts are 800-1500 tokens.
Should I include examples in my system prompt?: Yes, for consistent output. Few-shot examples (2-5 labeled examples of input → desired output) significantly improve structure adherence. The examples should cover edge cases, not just happy paths. This technique is more effective than just stating the rule.
How do I test if my system prompt works?: Test with adversarial inputs: users trying to make the AI break role, edge-case requests, long multi-turn conversations. A good system prompt survives hostile probing. Run the same prompt through 20 diverse user queries and check whether it stays on-brief consistently.
Should I use XML tags or markdown headers for sections?: XML tags for structure (Anthropic recommends this; Claude responds especially well to <instructions>, <examples>, <output_format> tags). Markdown for human readability when reviewing the prompt. The two combine well: use XML tags as section delimiters with markdown content inside. OpenAI's models accept both equally. Avoid mixing inconsistent formatting — pick a convention and stick with it across the whole system prompt.
What's the difference between system prompt and user prompt?: System prompts set persistent rules — the AI's persona, constraints, tools, what not to do. User prompts are turn-specific requests. System prompts ride along every turn invisibly; user messages change. Best practice: put EVERYTHING that should remain constant (instructions, examples, knowledge base) in system prompt. Keep user prompts minimal and request-focused. This makes prompt caching effective and conversation memory predictable.

Learn more

Explore more ai & prompt tools tools

100% in-browserNo downloadsNo sign-upMalware-freeHow we keep this safe →