This guide covers every serious tool across cloud IDEs, CLI agents, autonomous agents, and self-hosted infrastructure — with a deep focus on the agentic capabilities, protocols, and context systems reshaping how software gets built.
In 2024, AI coding was autocomplete. In 2026, it's orchestration — agents that plan, write, test, run terminal commands, spawn sub-agents, and merge pull requests while you sleep.
The question developers ask has changed. It's no longer "can AI help me write this function?" It's "which tool do I trust to own this feature end-to-end?" That shift has fractured the market into four distinct archetypes that solve different problems and serve different workflows. The tools are no longer interchangeable.
According to GitHub's Octoverse report, 92% of US developers now use AI coding tools. But a RAND study found that 80–90% of products labeled "AI agent" are still chatbot wrappers underneath. The 15 tools in this guide are the real deal — they can genuinely plan, execute, iterate, and in some cases close the entire loop from ticket to merged PR.
Agentic tools are only as powerful as what they can connect to. Three open standards now define that infrastructure — and every serious tool has either adopted them or is racing to.
Before MCP, every AI tool integration was a bespoke one-off. A GitHub connector in Cursor needed completely different code than the same connector in Copilot. MCP solved the N×M problem: build one server, and every MCP-compatible agent can use it. By March 2026 the ecosystem had crossed 97 million monthly SDK downloads and over 10,000 active public servers — faster adoption than any developer protocol since GraphQL.
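To make the N×M collapse concrete, here is a sketch of a project-scoped MCP configuration in the `mcpServers` JSON shape used by Claude Code and other MCP clients. The server name, package, and env-var wiring are illustrative assumptions, not taken from the original:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}" }
    }
  }
}
```

Because the server speaks MCP rather than a tool-specific plugin API, the same block can be registered unchanged in any MCP-compatible agent — one server, N clients, instead of one bespoke connector per client.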
The real skill in 2026 isn't prompt writing — it's context engineering. How you structure the persistent instructions, memories, and rules that shape every agent interaction determines whether your AI behaves like a senior engineer or a confused intern.
Every major tool now has a configuration file system. But the implementations differ substantially. Understanding which file does what — and how they interact — is critical for teams that want consistent, codebase-aware AI behaviour across their entire engineering organisation.
On the Copilot side, repository instruction files are scoped with `applyTo` frontmatter, personal user-level instructions take highest priority, and organisation-level instructions became GA in April 2026. For most teams, the pragmatic default is a CLAUDE.md (for Claude Code users) plus an AGENTS.md (universal fallback). If your team uses multiple tools — or if you run an open-source project where contributors bring their own agents — AGENTS.md gives you broader coverage with one file. For Claude Code power users, CLAUDE.md's path scoping and hook system is worth the extra effort.
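AGENTS.md is free-form markdown, so its structure is up to you. As a sketch — the sections and rules below are illustrative, not prescribed by any tool — a minimal file might look like:

```markdown
# AGENTS.md

## Build & test
- Install dependencies with `pnpm install`
- Run `pnpm test` before every commit

## Conventions
- TypeScript strict mode; avoid `any`
- Conventional Commits for all commit messages

## Boundaries
- Never edit files under `generated/`
- Ask before adding new dependencies
```

Any agent that reads AGENTS.md picks these rules up at session start, which is what makes the single-file fallback workable across mixed-tool teams.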
"The real skill in working with coding agents is no longer prompt design. It's context engineering — how you structure the persistent instructions and specifications that shape every agent interaction across every session."
— Medium, State of AI Coding Agents, March 2026

The five tools most developers reach for every day — each with a distinct philosophy about where AI lives in your workflow.
This is the matrix that actually matters. Beyond autocomplete quality, here's how every tool handles the capabilities that define truly agentic behaviour.
SWE-bench is the gold standard — real GitHub issues, not synthetic puzzles. But scaffolding matters as much as the model. The same model can score 10+ points apart depending on how context is retrieved.
Two open-weight models released in early 2026 that are reshaping what's possible in self-hosted agentic workflows. Different families, different strengths — both worth knowing.
35B total parameters, only 3B active (MoE). Released February 24, 2026. Beats the previous-generation Qwen3-235B-A22B (22B active) — better architecture, not bigger scale. Runs on a MacBook Pro 24GB. General-purpose with exceptional coding: strong SWE-bench, BFCL tool use, and 1M token context. The practical sweet spot for most self-hosted agentic workflows.
Context: 262K native · 1M extended · License: Apache 2.0 · Run via: ollama run qwen3.5:35b-a3b
Released April 2, 2026. Built on Gemini 3 research. Currently #3 open model on Arena AI leaderboard. Native function calling, multimodal (text + vision + audio), Apache 2.0. No SWE-bench score published — Google evaluated against LiveCodeBench, τ2-bench, and GPQA Diamond instead.
Context: 256K tokens · License: Apache 2.0 · Run via: ollama run gemma4:31b
The gap between open-source and proprietary AI coding has closed dramatically. With the right model, self-hosted setups now match proprietary tools on everyday tasks — at zero recurring cost.
"The practical setup: local models for 80% of routine coding, Claude Code for the 20% of hard problems requiring frontier reasoning. The gap is real but narrowing every month."
— InsiderLLM, Best Local Alternatives to Claude Code, February 2026

| Tool | Free tier | Entry paid | Pro / Max | Model access | Notable |
|---|---|---|---|---|---|
| Cursor | Limited trial | $20/mo | $40–200/mo | Claude · GPT · Gemini | SOC 2 Type II · unpredictable credit burn on heavy agent use |
| Windsurf | Free (SWE-1.5 3mo promo) | $20/mo | $40–200/mo | SWE-1.5 · Claude · GPT | Was $15; raised to $20 March 2026. Quota-based. |
| GitHub Copilot | 50 premium req/mo | $10/mo | $19–39/mo | GPT-5 · Claude · Gemini | Cheapest entry. Best for GitHub-native teams. |
| Kiro | Preview (free) | — | Enterprise custom | Claude Sonnet 4.5 + Auto | AWS GovCloud. Post-GA pricing unknown. |
| Antigravity | Generous* | $20/mo | $249.99/mo | Gemini 3.1 · Claude 4.6 · GPT-OSS | ⚠ Opaque credits; hidden weekly cap; 5 unpatched CVEs |
| Claude Code | Free (Sonnet 4.6, limited) | $20/mo | $100–200/mo | Opus 4.6 · Sonnet 4.6 | Token-based. 95% savings possible with caching + batch. |
| Codex CLI | Free (open source) | Bundled with ChatGPT Plus $20 | $200/mo (Pro) | GPT-5.3 Codex | MIT. Best Terminal-Bench scores. 1M devs in month one. |
| Gemini CLI | Free (open source) | — | — | Gemini 2.5 Pro | MIT. Voice input. Best free CLI entry point. |
| Devin 2.2 | — | $20/mo + $2.25/ACU | Enterprise custom | Proprietary | 67% PR merge on defined tasks. Down from $500/mo. ACU = ~15 min active work. |
| Cline | Free (VS Code ext) | API costs only | Teams plan (SSO) | Any via BYOK | $5–200/mo in API costs depending on usage and model. |
| Kilo Code | Free + $20 credits | $20/user/mo (post-Q1) | Enterprise custom | 500+ via BYOK (no markup) | First 10 seats permanently free. |
| OpenHands | Free (self-host) | SaaS plan | On-prem custom | Any via BYOK | $18.8M Series A. Enterprise SDK. Docker sandbox. |
| Aider | Free (Apache 2.0) | API costs only | — | Any via BYOK / local | $0 with Ollama + local model. Best for air-gapped setups. |
| Tabby | Free (self-host) | — | Enterprise custom | Local GPU | Hardware cost only. After initial pull: fully air-gap. |
| Continue.dev | Free (all features) | Hub team plan | — | Any via BYOK | Only OSS with native VS Code + JetBrains. Free tier complete. |
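The $0 Aider-plus-Ollama pairing from the table above can be wired up with a `.aider.conf.yml` at the repo root. This is a sketch under two assumptions — aider's `ollama_chat/` model prefix and the `OLLAMA_API_BASE` environment variable, both per aider's documentation — with the model tag taken from the Qwen3.5 section earlier:

```yaml
# .aider.conf.yml — point aider at a local Ollama model
# Assumes `ollama serve` is running and the model has been pulled
# (e.g. `ollama run qwen3.5:35b-a3b` once, to download it).
model: ollama_chat/qwen3.5:35b-a3b
```

Before launching aider, export the endpoint so it knows where Ollama listens: `export OLLAMA_API_BASE=http://127.0.0.1:11434`. After the initial model pull, nothing in this loop touches the network, which is why this combination keeps coming up for air-gapped setups.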
No single tool wins across all dimensions. The right choice depends on where you sit on three axes: how much autonomy you want to delegate, how much your code can leave your infrastructure, and how much you want to spend.
| Your situation | Best pick | Why |
|---|---|---|
| Polished daily coding, familiar VS Code feel, reliability first | Cursor | SOC 2 certified, fastest completions, 8 parallel agents, zero data-loss record, most mature agentic IDE. |
| Best context memory and model comparison across sessions | Windsurf | Cascade Memories learns your patterns. Arena Mode helps you find which model actually works for your codebase. SWE-1.5 is fastest at 950 tok/sec. |
| Hardest reasoning, largest codebase, maximum MCP integration | Claude Code | 80.9% SWE-bench, 1M token context, deepest MCP, Agent Skills, /loop, hooks. The escalation tool when others fail. |
| Cheapest entry, GitHub-centric team, just getting started | GitHub Copilot | $10/mo, broadest IDE support, GitHub-native. Ceiling is real but for inline work it's the best value per dollar. |
| Auditable, spec-driven workflows (regulated industries, compliance) | Kiro | Only tool with mandatory requirement → design → task checkpoints. Hooks for automation. AWS GovCloud available. |
| Defined backlog tasks: migrations, test writing, documentation | Devin 2.2 | 67% PR merge rate on scoped tasks. Assign via Slack, review the PR. Don't use on ambiguous or exploratory work. |
| Maximum OSS features, want autocomplete + agents in one tool | Kilo Code | Superset of Cline + Roo. 500+ models, Orchestrator mode, Memory Bank, inline autocomplete, JetBrains. Free BYOK. |
| Git precision, surgical multi-file edits, clean commit history | Aider | Every edit is a commit. Every session is a branch. $0 with Ollama + Qwen3.5-35B-A3B. Air-gap capable. |
| Code must never leave your infrastructure | Tabby + Cline | Tabby on GPU: true air-gap, SSO, RBAC, usage analytics. Point Cline at it. 60% lower cost than SaaS at scale. |
| Enterprise autonomous agents, need custom agent deployment | OpenHands | $18.8M backed. SDK for custom agents. Docker sandbox. 50%+ GitHub issue resolution. Deployable on-prem. |
| Multi-agent browser automation, prototyping, Google ecosystem | Antigravity (with caveats) | Manager View + Chrome sub-agent are unique. Accept quota opacity and CVEs as preview tax. Not for production repos. |