Complete Field Guide · April 2026

The AI Coding Tools
Landscape 2026

Every serious tool across cloud IDEs, CLI agents, autonomous agents, and self-hosted infrastructure — with a deep focus on the agentic capabilities, protocols, and context systems reshaping how software gets built.

Tools covered: 15+
Categories: 4 archetypes
Focus: MCP · A2A · Rules · Memory · Multi-agent
Updated: April 13, 2026
01

The Landscape Has Fundamentally Shifted

In 2024, AI coding was autocomplete. In 2026, it's orchestration — agents that plan, write, test, run terminal commands, spawn sub-agents, and merge pull requests while you sleep.

The question developers ask has changed. It's no longer "can AI help me write this function?" It's "which tool do I trust to own this feature end-to-end?" That shift has fractured the market into four distinct archetypes that solve different problems and serve different workflows. The tools are no longer interchangeable.

According to GitHub's Octoverse report, 92% of US developers now use AI coding tools. But a RAND study found that 80–90% of products labeled "AI agent" are still chatbot wrappers underneath. The 15 tools in this guide are the real deal — they can genuinely plan, execute, iterate, and in some cases close the entire loop from ticket to merged PR.

92%
US devs using AI tools (Octoverse)
97M
MCP SDK downloads/month (Mar 2026)
10K+
Active public MCP servers
80%+
SWE-bench top scores (2026)
76%
Best bash-only leaderboard score
67%
Devin PR merge rate on defined tasks

The four archetypes

Agentic IDE
AI embedded in your editor
VS Code forks with deep AI integration. Handle autocomplete, multi-file edits, and agent tasks while you stay in a familiar environment. Best for daily interactive coding.
Cursor · Windsurf · Kiro · Antigravity
CLI Agent
AI in the terminal
Terminal-native tools that operate directly on your filesystem and git. Editor-agnostic, composable with any workflow, and often the deepest reasoners available.
Claude Code · Gemini CLI · Codex CLI · Aider · OpenCode
Autonomous Agent
Fire-and-forget cloud agents
Fully autonomous systems that accept a task via Slack or Jira and deliver a PR. Best for well-scoped, defined tasks. Requires clear requirements; struggles with ambiguity.
Devin · OpenAI Codex App · Jules
Open-Source / Self-Hosted
Bring your own model
Free at the tool layer; you pay only for model API calls (or nothing with local models). Full control, auditability, and no vendor lock-in. Ideal for privacy-sensitive teams.
Cline · Roo Code · Kilo Code · OpenHands · Continue.dev · Tabby
The convergence point: Every major tool shipped multi-agent support in the same two-week window in February 2026 — Cursor (8 agents), Windsurf Wave 13, Claude Code Agent Teams, Codex CLI with Agents SDK, Devin parallel sessions. Parallel agentic execution is now table stakes. The differentiation has moved to how agents are orchestrated, how context is managed, and how much you pay.
02

The Protocol Layer: MCP, A2A, and AGENTS.md

Agentic tools are only as powerful as what they can connect to. Three open standards now define that infrastructure — and every serious tool has either adopted them or is racing to.

Before MCP, every AI tool integration was a bespoke one-off. A GitHub connector in Cursor needed completely different code than the same connector in Copilot. MCP solved the N×M problem: build one server, and every MCP-compatible agent can use it. By March 2026 the ecosystem had crossed 97 million monthly SDK downloads and over 10,000 active public servers — faster adoption than any developer protocol since GraphQL.

The three protocols that matter in 2026

MCP — Model Context Protocol
Standardizes how AI agents connect to external tools, databases, and services. Built by Anthropic, donated to the Linux Foundation (AAIF) in December 2025. Adopted by every major provider: Anthropic, OpenAI, Google, Microsoft, AWS.
97M downloads/month · 10,000+ servers · OAuth 2.1 baked in
A2A — Agent-to-Agent Protocol
Standardizes how AI agents discover, communicate, and delegate to each other — regardless of framework. Built by Google, donated to Linux Foundation in June 2025. Think of it as HTTP for agent-to-agent communication.
IBM ACP merged into A2A (Aug 2025) · AAIF co-founded by OpenAI, Anthropic, Google, AWS
AGENTS.md / CLAUDE.md
Project instruction files that give agents persistent context about your codebase conventions. AGENTS.md is the open standard (60+ tools); CLAUDE.md is Claude Code's richer variant with imports and path scoping. Write once, inform every session.
60+ tools support AGENTS.md · 4 Cursor rule activation modes
SKILL.md
A structured instruction file that teaches an agent a reusable capability — like a plugin in natural language. Defines role, tools, triggers, and step-by-step instructions. Over 2,600 community skills now available across Claude Code, Codex, and others.
2,636+ public skills · doubles every quarter
MCP Adoption Status
Who has it, who doesn't
Full native MCP: Claude Code, Cursor, Windsurf (Cascade), GitHub Copilot, Kiro, Cline, Roo Code, Kilo Code, OpenHands, Continue.dev, Aider (partial).

Partial / proprietary: Antigravity added partial MCP authentication in April 2026, but still routes primarily through its own AgentKit.

No MCP: Devin uses its own integration layer; Tabby (server tool, not agent).
10,000+ MCP servers available today
Security Note
MCP has known vulnerabilities
Security researchers in April 2025 identified prompt injection via tool descriptions, permission escalation by combining tools, and lookalike server attacks. MCP v2.1 (2026) introduced enhanced authentication, RBAC, and audit trails — but teams should audit MCP server permissions carefully before granting broad access to production systems.
MCP v2.1 adds RBAC · OAuth 2.1 · audit logging
Context Window vs MCP
Two different levers
A large context window (1M tokens in Claude Code) lets the agent understand your whole codebase at once. MCP lets the agent act on external systems — databases, CI/CD, Jira, Slack, GitHub. They're complementary: context for reasoning, MCP for reaching out. The best agents do both.
Claude Code: 1M ctx + deepest MCP
03

Context Engineering: Rules, Memory & Instructions

The real skill in 2026 isn't prompt writing — it's context engineering. How you structure the persistent instructions, memories, and rules that shape every agent interaction determines whether your AI behaves like a senior engineer or a confused intern.

Every major tool now has a configuration file system. But the implementations differ substantially. Understanding which file does what — and how they interact — is critical for teams that want consistent, codebase-aware AI behaviour across their entire engineering organisation.

Claude Code
CLAUDE.md ~/.claude/CLAUDE.md
ActivationProject-level + global. Supports imports, path scoping, and hooks. Three-tier hierarchy — most powerful variant available.
MemoryAgent Skills (SKILL.md) for persistent capabilities. No automatic memory — explicit instructions required each session.
Cursor
.cursor/rules/*.mdc .cursorrules (legacy)
Activation4 modes per rule: Always Active, Auto Attached (glob pattern), Agent Requested, or Manual. The most granular per-rule control of any tool.
MemoryNo persistent cross-session memory. Rules are static, version-controlled files. What you write is what the agent always sees.
Windsurf
.windsurf/rules/*.md
ActivationDual-layer: developer-written rules (version-controlled team norms) plus Cascade Memories auto-generated from interactions (stored locally).
MemoryUnique in the market: Cascade learns your architecture patterns and coding conventions over ~48 hours of use. Auto-memories can be promoted into explicit versioned rules.
GitHub Copilot
.github/copilot-instructions.md .github/instructions/*.instructions.md
ActivationRepo-wide instructions plus path-specific rules via applyTo frontmatter. Personal user-level instructions take highest priority. Organisation-level instructions became GA in April 2026.
MemoryNo session memory. Copilot Enterprise can fine-tune on your private repos for domain-specific accuracy.
Kiro
.kiro/steering/ specs/ (EARS notation)
ActivationSteering files configure agent behaviour per-project or globally. Specs are mandatory checkpoints — requirements and design must be approved before any code is generated.
MemorySpecs function as living memory — documents that evolve with the codebase and act as the canonical source of intent for every agent session.
AGENTS.md standard
AGENTS.md AGENTS.override.md
ActivationOpen standard under AAIF / Linux Foundation. 60+ tools support it. Nearest file to the edited code takes precedence in the directory tree. Write once, works everywhere.
MemoryPure Markdown, no required fields. Tool-agnostic — the highest-leverage choice for open-source projects or teams using multiple agents.
Practical recommendation: Teams should maintain both a CLAUDE.md (for Claude Code users) and an AGENTS.md (universal fallback). If your team uses multiple tools — or if you run an open-source project where contributors bring their own agents — AGENTS.md gives you broader coverage with one file. For Claude Code power users, CLAUDE.md's path scoping and hook system is worth the extra effort.

"The real skill in working with coding agents is no longer prompt design. It's context engineering — how you structure the persistent instructions and specifications that shape every agent interaction across every session."

— Medium, State of AI Coding Agents, March 2026
04

Tool Deep Dives — Agentic IDEs & CLI Agents

The five tools most developers reach for every day — each with a distinct philosophy about where AI lives in your workflow.

Cursor
Anysphere · $2B ARR · 2M users
$20/mo
Agentic IDE
  • Composer model — 4× faster than comparable models
  • Up to 8 parallel agents; auto-judges best solution
  • Background Agents — run tasks asynchronously
  • Plan Mode — editable Markdown plans before execution
  • OS-level sandboxing — reduces permission prompts 40%
  • Full MCP support · Git worktrees · JetBrains support
  • SOC 2 Type II · 360K+ paying customers · zero data loss
  • .cursor/rules with 4 activation modes (most granular)
  • No persistent cross-session memory (unlike Windsurf)
  • Credit burn on heavy agent use is unpredictable
Windsurf
Cognition AI (ex-Codeium) · #1 LogRocket Mar 2026
$20/mo
Agentic IDE
  • Cascade — agentic engine that maintains persistent context
  • Wave 13: Arena Mode (blind model comparison + voting)
  • Wave 13: Plan Mode; parallel sessions via Git worktrees
  • SWE-1.5 model: 950 tok/sec — 13× faster than Sonnet 4.6
  • Cascade Memories: auto-learns your patterns over 48hrs
  • Fast Context (SWE-grep): finds relevant context 20× faster
  • 84% success rate on multi-file refactoring
  • Voice input · BYOK for Claude/GPT · MCP for Cascade
  • Pricing rose from $15 → $20 in March 2026
  • No background/async agent mode (requires interactive session)
Claude Code
Anthropic · Terminal-native
$20–200/mo
CLI Agent
  • 80.9% SWE-bench Verified — highest reasoning quality
  • 1M token context window (Opus 4.6) — entire repos in one pass
  • Agent Teams: 16+ parallel Claude instances, shared task list
  • Deepest native MCP integration — 300+ tool connections
  • CLAUDE.md: path scoping, imports, hooks system
  • Agent Skills (SKILL.md): loadable specialised instruction sets
  • /loop: run continuously until condition met (tests pass, etc.)
  • /voice: dictate prompts; receive spoken responses
  • ~30% less code rework vs competitors in head-to-head tests
  • Token-based pricing: heavy use reaches $200/mo
  • No free tier; Claude models only (no model portability)
GitHub Copilot
Microsoft / GitHub · 15M devs
$10/mo
IDE Plugin
  • Cheapest serious entry point at $10/mo
  • Agent Mode (Jan 2026): multi-file editing + terminal use
  • Broadest IDE support: VS Code, JetBrains, Neovim, Xcode
  • Copilot Extensions: Docker, Azure, Sentry as chat participants
  • GitHub-native: tag @copilot on PRs and issues
  • Copilot Workspace: PR/issue to code, end-to-end
  • Org-level instructions (GA April 2026)
  • Agent mode weaker than Cursor/Windsurf equivalents
  • No persistent context — autocomplete and chat paradigm
  • Weakest agentic capability of the five; best for inline work
Kiro
AWS · Spec-driven
Preview
Agentic IDE
  • Spec-driven: requirement → design doc → task list → code
  • EARS-notation requirements — unambiguous acceptance criteria
  • Hooks: auto-run tests or docs on every code change
  • Steering files: configure agent behaviour per project
  • Native remote MCP integration
  • AWS GovCloud (US) available from February 2026
  • Mandatory checkpoints before code generation
  • Spec overhead can feel verbose for small bug fixes
  • Enterprise pricing post-GA unknown; Open VSX only (no VS Marketplace)
Google Antigravity
Google · Nov 2025 preview
Free*
Agentic IDE
  • Manager View: parallel multi-agent orchestration (unique UI)
  • Native Chrome browser sub-agent with video recording
  • Artifacts: verifiable task lists, browser walkthroughs
  • Broadest model roster: Gemini 3.1 Pro + Claude 4.6 + GPT-OSS
  • AgentKit 2.0: 16 specialists, 40+ domain skills
  • Partial MCP authentication shipped April 2026
  • Opaque AI Credit system; hidden weekly cap introduced March 2026
  • 5 critical CVEs (RCE, prompt injection) — not production-safe
  • 48 GB RAM usage reported; UI glow causes input lag

Terminal agents: CLI-first tools

Gemini CLI
Google · Open source
Free
Open Source CLI
  • Open source and free (MIT licensed)
  • Gemini 2.5 Pro: 1M token context window
  • Voice input · WCAG accessibility checks
  • MCP support (partial)
  • Model locked to Gemini family
  • Less mature agent scaffolding than Claude Code or Aider
Codex CLI
OpenAI · Open source · Rust
Free (BYOK)
Open Source CLI
  • GPT-5.3 Codex: highest Terminal-Bench scores (77.3%)
  • Built in Rust — fastest raw generation speed
  • Bundled in ChatGPT Plus ($20/mo); 1M+ devs in first month
  • Native GitHub integration: @codex on issues and PRs
  • Kernel-level sandboxing (Linux) — on by default
  • MCP support · Agents SDK (Python + TypeScript)
  • OpenAI models only; no model portability
05

Agentic Features: The Full Comparison

This is the matrix that actually matters. Beyond autocomplete quality, here's how every tool handles the capabilities that define truly agentic behaviour.

Agentic IDEs
CursorIDE$20/mo
Multi-agent ×8 Background tasks MCP native Git worktrees Rules (4 modes) JetBrains SOC 2 II OS sandboxing Browser agent Persistent memory Voice
WindsurfIDE$20/mo
Multi-agent MCP (Cascade) Git worktrees Arena Mode Cascade Memories Voice input JetBrains BYOK Background tasks Browser agent
GH CopilotIDE$10/mo
MCP native Git-native Rules / instructions JetBrains SOC 2 II Copilot Extensions Agent mode (limited) Background tasks Browser agent Persistent memory
KiroIDEPreview
MCP (remote) Spec-driven Hooks & triggers Steering files Git-native AWS GovCloud Multi-agent (planned) Background tasks JetBrains Voice
AntigravityIDEFree*
Multi-agent (Manager) Browser agent Artifacts Gemini+Claude+GPT-OSS MCP (partial, Apr 2026) Background tasks Rules / CLAUDE.md Persistent memory SOC 2 5 unpatched CVEs
CLI Agents
Claude CodeCLI$20–200/mo
Agent Teams ×16+ Background tasks (/loop) MCP native (deepest) 1M token context CLAUDE.md + hooks SKILL.md Voice (/voice) Git-native Browser agent Air-gap
Codex CLICLIFree / BYOK
Multi-agent (Agents SDK) Background tasks MCP support Kernel sandboxing GitHub native (@codex) AGENTS.md MIT license Browser agent Air-gap Voice
Gemini CLICLIFree
1M token context Voice input Open source (MIT) MCP (partial) Multi-agent Background tasks Air-gap
Autonomous Agents
Devin 2.2Autonomous$20 + ACUs
Fully autonomous PR Parallel sessions Background tasks Browser built-in Slack / Jira triggers Interactive Planning 67% PR merge rate MCP Air-gap Fails on ambiguous tasks
Open-Source / Self-Hosted
ClineOSS IDEFree BYOK
MCP native Browser automation Step-by-step approvals AGENTS.md Air-gap capable 57.9K ★ Multi-agent (v3.58) Inline autocomplete Persistent memory
Kilo CodeOSS IDEFree BYOK
Superset Cline+Roo Orchestrator mode Inline autocomplete Memory Bank 500+ models MCP Marketplace JetBrains Air-gap capable Browser agent
OpenHandsOSS PlatformFree BYOK
Multi-agent SDK Background tasks MCP native Docker sandbox Browser built-in Air-gap capable 50%+ GitHub issues solved
AiderOSS CLIFree Apache
Auto-commits every edit Git-native Air-gap capable 100+ languages Local model support 40K ★ Multi-agent Persistent memory
Continue.devOSS IDEFree BYOK
VS Code + JetBrains MCP native Ollama / Tabby endpoints 4 modes (auto/chat/edit/agent) Air-gap capable Agent mode (maturing) Multi-agent

Spotlight: five agentic features worth understanding

Background Agents
Work while you work elsewhere
Background agents let you assign tasks and continue coding. The agent runs asynchronously, surfaces a PR or diff when done. Cursor Background Agents and Claude Code's /loop command are the most mature implementations. Critical: each agent needs its own git worktree to avoid file conflicts.
Cursor · Claude Code · Codex Desktop · Devin
Hooks & Triggers
Event-driven agent automation
Kiro Hooks are the most polished implementation: define triggers like "after every code change, run tests and update documentation automatically." Claude Code Hooks are programmatic — you define shell triggers in CLAUDE.md. GitHub Copilot via Jira integration can trigger on ticket creation.
Kiro (best) · Claude Code (hooks) · Copilot+Jira
Arena Mode
Windsurf's blind model comparison
A genuinely novel feature: two Cascade agents run the same prompt in parallel with hidden model identities. You vote on which performed better, feeding a personal and global leaderboard. Helps teams discover which models actually work for their specific codebase — not just what benchmarks say. No other tool offers this.
Windsurf only (Wave 13+)
Spec-Driven Development
Requirements before code
Kiro is the purest implementation: natural language → EARS-notation requirements → architecture doc → task list → implementation. Every step is a checkpoint. No code is written until requirements are approved. Reduces "AI-assisted but messy" syndrome in large teams. Kilo Code's Orchestrator mode takes a lighter approach.
Kiro (mandatory) · Kilo Code (optional) · Devin 2.2
Agent Skills (SKILL.md)
Plugins in natural language
A SKILL.md file defines a reusable capability: role, tools, triggers, and step-by-step instructions. Load it and your agent gains a new specialisation — security auditing, WordPress publishing, SEO analysis. Over 2,600 community skills exist, doubling quarterly. Claude Code, Codex, and OpenCode support the format natively.
Claude Code · Codex CLI · OpenCode · 2,600+ community skills
Autonomous PR Agents
Assign a ticket, get a PR
Devin (67% PR merge rate on defined tasks) and OpenAI Codex represent the most autonomous end of the spectrum — assign via Slack or Jira, receive a reviewed PR. Best for well-scoped backlog work: migrations, test writing, documentation, dependency updates. Struggle with ambiguity or mid-task requirement changes.
Devin 2.2 · Codex Desktop · OpenHands
06

Benchmarks: What the Numbers Actually Mean

SWE-bench is the gold standard — real GitHub issues, not synthetic puzzles. But scaffolding matters as much as the model. The same model can score 10+ points apart depending on how context is retrieved.

Critical caveat on all benchmark scores: Scaffolding matters enormously. Cognition reports that their Devin agent spends 60% of its time on search and context retrieval, not code generation. Augment Code's context engine adds ~6 SWE-bench Pro points over bare scaffolding — using the same model. A product demo using proprietary scaffolding, RAG, and multi-agent review is a fundamentally different system than a single model in a bash shell. Use scores to compare similar systems, not raw numbers across different architectures.
SWE-bench Verified (bash-only leaderboard)
Claude Opus 4.6
80.9%
Claude Sonnet 4.6
79.6%
GPT-5.3 Codex
~78%
Gemini 3.1 Pro
~76%
GLM-5 (OSS)
~73%
Qwen3.5-35B-A3B
~72%
Kimi K2.5 (OSS)
~68%
Terminal-Bench 2.0 (terminal task performance)
Gemini 3.1 Pro
78.4%
GPT-5.3 Codex
77.3%
Claude Opus 4.6
74.7%
Claude Sonnet 4.6
~65%
Windsurf SWE-1.5
~63%
Real-world task autonomy (developer assessment)
Claude Code
9.2
Cursor
8.6
OpenHands
8.4
Windsurf
8.2
Kilo Code
8.0
Devin 2.2
7.2
GH Copilot
6.2
Code rework rate (lower = better)
Claude Code
★ Low
Cursor
Low–
Windsurf
Low–
Kilo Code
Med
GH Copilot
Med
Antigravity
Med+
Devin
High*

OSS model spotlight: Qwen3.5-35B-A3B & Gemma 4 31B

Two open-weight models released in early 2026 that are reshaping what's possible in self-hosted agentic workflows. Different families, different strengths — both worth knowing.

Qwen3.5-35B-A3B — Qwen3.5 family · Alibaba

35B total parameters, only 3B active (MoE). Released February 24, 2026. Beats the previous-generation Qwen3-235B-A22B (22B active) — better architecture, not bigger scale. Runs on a MacBook Pro 24GB. General-purpose with exceptional coding: strong SWE-bench, BFCL tool use, and 1M token context. The practical sweet spot for most self-hosted agentic workflows.

SWE-bench Verified
~72%
BFCL-V4 (tool use)
72.2%
MMLU-Pro
82.1%

Context: 262K native · 1M extended · License: Apache 2.0 · Run via: ollama run qwen3.5:35b-a3b

Gemma 4 31B Dense — Gemma 4 family · Google DeepMind

Released April 2, 2026. Built on Gemini 3 research. Currently #3 open model on Arena AI leaderboard. Native function calling, multimodal (text + vision + audio), Apache 2.0. No SWE-bench score published — Google evaluated against LiveCodeBench, τ2-bench, and GPQA Diamond instead.

LiveCodeBench v6
80.0%
τ2-bench (tool use)
86.4%
GPQA Diamond
84.3%
AIME 2026 (math)
89.2%

Context: 256K tokens · License: Apache 2.0 · Run via: ollama run gemma4:31b

07

Open-Source & Self-Hosted: The Privacy-First Stack

The gap between open-source and proprietary AI coding has closed dramatically. With the right model, self-hosted setups now match proprietary tools on everyday tasks — at zero recurring cost.

Cline
The original OSS agent · 57.9K ★ · 5M installs
Free
VS Code Extension
  • Step-by-step human approval before every action
  • Full repo indexing before execution planning
  • Native MCP · BYOK any provider · browser automation
  • CLI 2.0 (v3.58): headless/CI mode · native subagents
  • No inline autocomplete (agentic only)
  • ~90s per task vs ~45s for Cursor on equivalent work
Kilo Code
$8M seed · Cline+Roo superset · 1.5M users
Free
VS Code + JetBrains
  • Superset of Cline + Roo Code — all their features plus more
  • Orchestrator mode: routes subtasks to specialist modes
  • Inline autocomplete (unique among OSS agents)
  • Memory Bank · browser agent · 500+ model providers
  • JetBrains support · MCP Marketplace
  • Free through Q1 2026; $20/user/mo team plan after
OpenHands
All Hands AI · $18.8M Series A · 65K ★
Free / on-prem
Enterprise Platform
  • 50%+ of real GitHub issues solved in benchmarks
  • 87% same-day bug resolution (enterprise deployments)
  • SDK for building and deploying custom agents at scale
  • Docker sandbox: kernel-level isolation
  • Browser · Jupyter · terminal built-in
  • Platform for automation, not daily coding companion
Aider
Paul Gauthier · Git-native · 40K ★
Free (Apache 2.0)
Terminal CLI
  • Every edit = git commit with auto-generated message
  • Every session = a reviewable, revertable branch
  • 100+ languages · 4.1M installs · 15B tokens/week
  • Qwen3.5-35B-A3B via Ollama: near-frontier quality at $0
  • Air-gap capable · best for regulated environments
  • CLI-only — no visual IDE experience
Tabby
TabbyML · Rust · 32K ★ · team server
Free (self-host)
Team Server
  • Single binary / Docker — no external DBMS required
  • Truly air-gap after initial model download
  • VS Code · JetBrains · Vim/Neovim · Eclipse
  • SSO · RBAC · usage analytics · repo context indexing
  • Completions + chat only; pair with Cline for agentic tasks
Continue.dev
Open-source · VS Code + JetBrains · 26K ★
Free
IDE Extension
  • Only fully OSS agent with native VS Code + JetBrains
  • 4 modes: Autocomplete · Chat · Edit · Agent
  • Any OpenAI-compatible endpoint: Ollama · Tabby · LM Studio
  • Best pairing: Continue + Cline + Tabby server (full air-gap)
  • Agent mode less mature than Cline or Kilo Code
The recommended self-hosted stack (2026): Deploy Tabby on a GPU server (RTX 4090, 24GB VRAM) serving Qwen3.5-35B-A3B. Point Continue.dev at it for fast autocomplete and Cline or Kilo Code for agentic tasks. Use Aider for surgical git-precise edits. Escalate to Claude Code API for the hard 20% of problems. Total recurring cost: ~$0 after hardware. Data stays on your server.

"The practical setup: local models for 80% of routine coding, Claude Code for the 20% of hard problems requiring frontier reasoning. The gap is real but narrowing every month."

— InsiderLLM, Best Local Alternatives to Claude Code, February 2026
08

Pricing Reference — April 2026

Tool Free tier Entry paid Pro / Max Model access Notable
Cursor Limited trial $20/mo $40–200/mo Claude · GPT · Gemini SOC 2 II · unpredictable credit burn on heavy agent use
Windsurf Free (SWE-1.5 3mo promo) $20/mo $40–200/mo SWE-1.5 · Claude · GPT Was $15; raised to $20 March 2026. Quota-based.
GitHub Copilot 50 premium req/mo $10/mo $19–39/mo GPT-5 · Claude · Gemini Cheapest entry. Best for GitHub-native teams.
Kiro Preview (free) Enterprise custom Claude Sonnet 4.5 + Auto AWS GovCloud. Post-GA pricing unknown.
Antigravity Generous* $20/mo $249.99/mo Gemini 3.1 · Claude 4.6 · GPT-OSS ⚠ Opaque credits; hidden weekly cap; 5 unpatched CVEs
Claude Code Free (Sonnet 4.6, limited) $20/mo $100–200/mo Opus 4.6 · Sonnet 4.6 Token-based. 95% savings possible with caching + batch.
Codex CLI Free (open source) Bundled with ChatGPT Plus $20 $200/mo (Pro) GPT-5.3 Codex MIT. Best Terminal-Bench scores. 1M devs in month one.
Gemini CLI Free (open source) Gemini 2.5 Pro MIT. Voice input. Best free CLI entry point.
Devin 2.2 $20/mo + $2.25/ACU Enterprise custom Proprietary 67% PR merge on defined tasks. Down from $500/mo. ACU = ~15 min active work.
Cline Free (VS Code ext) API costs only Teams plan (SSO) Any via BYOK $5–200/mo in API costs depending on usage and model.
Kilo Code Free + $20 credits $20/user/mo (post-Q1) Enterprise custom 500+ via BYOK (no markup) First 10 seats permanently free.
OpenHands Free (self-host) SaaS plan On-prem custom Any via BYOK $18.8M Series A. Enterprise SDK. Docker sandbox.
Aider Free (Apache 2.0) API costs only Any via BYOK / local $0 with Ollama + local model. Best for air-gapped setups.
Tabby Free (self-host) Enterprise custom Local GPU Hardware cost only. After initial pull: fully air-gap.
Continue.dev Free (all features) Hub team plan Any via BYOK Only OSS with native VS Code + JetBrains. Free tier complete.
The cheapest real-world stack: GitHub Copilot Pro ($10/mo) for daily autocomplete + Claude Code Pro ($20/mo) for complex tasks = $30/mo total, covers 95% of development scenarios according to developer surveys. For teams, adding Tabby on shared GPU infrastructure brings per-seat costs near zero for autocomplete while keeping frontier model access for hard problems.
09

The Verdict — Which Tool for Which Situation

No single tool wins across all dimensions. The right choice depends on where you sit on three axes: how much autonomy you want to delegate, how much your code can leave your infrastructure, and how much you want to spend.

Your situation Best pick Why
Polished daily coding, familiar VS Code feel, reliability first Cursor SOC 2 certified, fastest completions, 8 parallel agents, zero data-loss record, most mature agentic IDE.
Best context memory and model comparison across sessions Windsurf Cascade Memories learns your patterns. Arena Mode helps you find which model actually works for your codebase. SWE-1.5 is fastest at 950 tok/sec.
Hardest reasoning, largest codebase, maximum MCP integration Claude Code 80.9% SWE-bench, 1M token context, deepest MCP, Agent Skills, /loop, hooks. The escalation tool when others fail.
Cheapest entry, GitHub-centric team, just getting started GitHub Copilot $10/mo, broadest IDE support, GitHub-native. Ceiling is real but for inline work it's the best value per dollar.
Auditable, spec-driven workflows (regulated industries, compliance) Kiro Only tool with mandatory requirement → design → task checkpoints. Hooks for automation. AWS GovCloud available.
Defined backlog tasks: migrations, test writing, documentation Devin 2.2 67% PR merge rate on scoped tasks. Assign via Slack, review the PR. Don't use on ambiguous or exploratory work.
Maximum OSS features, want autocomplete + agents in one tool Kilo Code Superset of Cline + Roo. 500+ models, Orchestrator mode, Memory Bank, inline autocomplete, JetBrains. Free BYOK.
Git precision, surgical multi-file edits, clean commit history Aider Every edit is a commit. Every session is a branch. $0 with Ollama + Qwen3.5-35B-A3B. Air-gap capable.
Code must never leave your infrastructure Tabby + Cline Tabby on GPU: true air-gap, SSO, RBAC, usage analytics. Point Cline at it. 60% lower cost than SaaS at scale.
Enterprise autonomous agents, need custom agent deployment OpenHands $18.8M backed. SDK for custom agents. Docker sandbox. 50%+ GitHub issue resolution. Deployable on-prem.
Multi-agent browser automation, prototyping, Google ecosystem Antigravity (with caveats) Manager View + Chrome sub-agent are unique. Accept quota opacity and CVEs as preview tax. Not for production repos.

The power-user stack — what most senior developers actually run

1
Cursor or Windsurf as the daily IDE driver — autocomplete, inline edits, fast feedback loops. Windsurf if you care about persistent context; Cursor if you want the most polished multi-agent UI.
2
Claude Code in the terminal as the escalation path — for the hard 20% of problems that require sustained deep reasoning, 1M-token codebase analysis, or /loop automation.
3
Cline or Kilo Code pointed at a local Tabby/Ollama server for sensitive or proprietary repos — agentic tasks without any code leaving your machine.
4
Aider for surgical git-precise edits when you need a clean, reviewable commit history — particularly useful for open-source contributions or regulated environments.
5
AGENTS.md in every repo — the single highest-leverage, zero-cost improvement you can make today. Write your conventions once; every tool that touches the repo will follow them.
© Huy Do 2026