
How to Use SWE-agent

Solve GitHub issues automatically with SWE-agent. This guide covers the agent-computer interface, model configuration, and cost control.

By FreeToolArena Staff · Updated June 2026 · 6 min read

SWE-agent is Princeton’s autonomous software-engineering agent that takes a GitHub issue and a repo, then writes, runs, and tests a patch end-to-end without human hand-holding.

SWE-agent is an open-source framework from the Princeton NLP group, built to solve real software-engineering tasks by driving a language model through a specially designed Agent-Computer Interface (ACI). At release, its authors reported state-of-the-art results on SWE-bench, a benchmark built from real GitHub issues and their merged fixes in popular Python repos. Researchers use it to study agent capabilities, teams use it to triage bug backlogs, and CTF players use the EnIGMA spin-off for capture-the-flag challenges. It’s MIT-licensed and maintained by its original authors.

What it is

The core insight is the ACI: instead of giving a model raw shell access, SWE-agent exposes narrow, high-feedback commands (open, goto, edit, find_file, search_dir, submit) that a model can actually use well. It wraps these in a sandboxed Docker environment, runs the agent loop against providers like Claude, GPT, or any LiteLLM-supported model, and emits a patch plus a full trajectory log. Configuration lives in YAML files so you can swap prompts, tools, and models without touching code.
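To make the "narrow, high-feedback command" idea concrete, here is a minimal sketch of a windowed file viewer in the spirit of open and goto. It is not SWE-agent's actual implementation: the class name WindowedFile, the five-line window, and the message formats are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class WindowedFile:
    """Toy file viewer: shows a fixed-size window of numbered lines."""
    lines: list[str]
    window: int = 5  # lines shown per view (hypothetical default)
    top: int = 0     # zero-based index of the first visible line

    def goto(self, line_number: int) -> str:
        """Center the window on a 1-based line number and return the view."""
        if not 1 <= line_number <= len(self.lines):
            # Explicit, actionable feedback instead of a silent failure.
            return f"Error: line must be between 1 and {len(self.lines)}"
        max_top = max(0, len(self.lines) - self.window)
        self.top = min(max(0, line_number - 1 - self.window // 2), max_top)
        return self.view()

    def view(self) -> str:
        """Render the current window with a header describing its position."""
        end = min(self.top + self.window, len(self.lines))
        header = f"[lines {self.top + 1}-{end} of {len(self.lines)}]"
        body = [f"{i + 1}: {self.lines[i]}" for i in range(self.top, end)]
        return "\n".join([header, *body])


source = [f"line {n}" for n in range(1, 21)]
viewer = WindowedFile(source)
print(viewer.goto(10))  # shows lines 8-12 with their line numbers
print(viewer.goto(99))  # returns an error message the model can act on
```

The point of the design is that every command returns a small, predictable observation: the model always sees where it is in the file and gets a clear message when a request is out of range, which is harder to guarantee with raw shell output.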

Install

git clone https://github.com/SWE-agent/SWE-agent.git
cd SWE-agent
pip install --editable .
# Docker must be installed and running for sandboxed execution

First run

Point the agent at a live GitHub issue and watch it clone the repo, reproduce the bug, edit files, and emit a patch. Set your API key first.

$ export ANTHROPIC_API_KEY=sk-ant-...
$ sweagent run \
  --agent.model.name=claude-sonnet-4 \
  --problem_statement.github_url=https://github.com/pvlib/pvlib-python/issues/1603
[INFO] Cloned repo to /tmp/...
[INFO] Step 1: open pvlib/iotools/psm3.py
[INFO] Step 7: submit
[DONE] Patch written to trajectories/<run-id>/patch.diff

Everyday workflows

Batch SWE-bench: run sweagent run-batch against the dataset to reproduce benchmark numbers locally.
Fix local issues: point --problem_statement.path at a text file describing a bug in your own codebase.
Swap models: edit the YAML to try Claude, GPT-4o, DeepSeek, or a local model through LiteLLM without changing agent logic.
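The local-issue and model-swap workflows combine naturally in a small driver script. The sketch below only assembles and prints the commands. The --agent.model.name and --problem_statement.path flags come from this guide; --env.repo.path is an assumption about how a local checkout is passed, so confirm it against the CLI help for your installed version. The model names, repo path, and issue file are placeholders.

```python
import shlex


def build_run_command(model: str, repo_path: str, issue_file: str) -> list[str]:
    """Assemble a `sweagent run` invocation for a local repo and issue file."""
    return [
        "sweagent", "run",
        f"--agent.model.name={model}",
        f"--env.repo.path={repo_path}",  # assumed flag for a local checkout
        f"--problem_statement.path={issue_file}",
    ]


# Hypothetical model names and paths, for illustration only.
for model in ["claude-sonnet-4", "gpt-4o"]:
    cmd = build_run_command(model, "/path/to/repo", "bug_report.md")
    print(shlex.join(cmd))
    # To execute for real: subprocess.run(cmd, check=True)
```

Building the argument list in one function keeps the model name as the only thing that varies between runs, which makes side-by-side comparisons of the resulting patches easier to trust.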

Gotchas and tips

Cost is real: a single SWE-bench instance can burn 50k–200k tokens on frontier models, and full-dataset runs get expensive fast. Start with ten instances to calibrate, and cache the Docker environments, because rebuilding them for every task dominates wall-clock time on a cold machine. Trajectories are verbose JSON; browse them with the bundled trajectory inspector (sweagent inspector for the web UI or sweagent inspect in the terminal, in recent releases) rather than tailing raw files.
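A quick back-of-the-envelope calculation helps size a pilot run before committing to the full dataset. The sketch below uses the 50k–200k token range quoted above. The price of $3.00 per million tokens is a hypothetical blended rate, not a quote from any provider, so substitute your model's real input and output pricing.

```python
def estimate_cost(instances: int, tokens_per_instance: int,
                  usd_per_million_tokens: float) -> float:
    """Return the estimated spend in USD for a batch of instances."""
    total_tokens = instances * tokens_per_instance
    return total_tokens / 1_000_000 * usd_per_million_tokens


PRICE = 3.00  # hypothetical blended USD price per million tokens

low = estimate_cost(10, 50_000, PRICE)    # optimistic end of the range
high = estimate_cost(10, 200_000, PRICE)  # pessimistic end of the range
print(f"10-instance pilot: ${low:.2f} to ${high:.2f}")
# prints: 10-instance pilot: $1.50 to $6.00
```

A single blended rate ignores the split between input and output tokens, which are usually priced differently, so treat the result as a rough bound. SWE-agent's documentation also describes a per-instance cost limit setting on the model config, which is worth checking for your version as a hard cap.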

The agent is tuned for Python repos and pytest-style test suites. Non-Python languages and custom build systems work but often need a custom YAML with the right install and test commands. Pin the SWE-agent version if you’re publishing results — behavior shifts meaningfully between releases as prompts are refined.

Who it’s for

SWE-agent fits researchers benchmarking agent capabilities and engineering teams curious about autonomous bug-fixing on Python codebases. Read the ACI paper before your first serious run — understanding why the commands are shaped the way they are will save you from fighting the framework.
