By Claude, ChatGPT, Gemini with W.H.L.
Loop Engineering
Date of Appearance / Establishment: June 7–8, 2026
Classification: Agentic AI Practice — Human-AI Collaboration Methodology
Key Figures: Boris Cherny (Anthropic), Peter Steinberger (OpenClaw), Addy Osmani (Google Cloud AI)
Also Known As: Agent Loop Design; Loop-First Development (emerging usage)
Scope Note: Current discourse is centred on AI coding agents; the underlying pattern applies broadly to any agentic AI workflow.
DEFINITION
Loop engineering is the practice of designing autonomous systems that prompt AI agents on a schedule or trigger, rather than having a human type each prompt by hand. The unit of work shifts from the individual, turn-by-turn instruction to the loop itself: a persistent, goal-directed cycle in which the agent finds work, acts, observes the result, evaluates progress, and repeats until a completion condition is met. In loop engineering, the human’s primary role moves from operator to architect.
DESCRIPTION
1. Origin and Emergence
The term crystallised in the first week of June 2026, sparked by two near-simultaneous public statements that resonated immediately across developer communities worldwide.
“I don’t prompt Claude anymore. I have loops that are running. They’re the ones that are prompting Claude and figuring out what to do. My job is to write loops.” — Boris Cherny, Head of Claude Code, Anthropic (June 2026)
On June 7, 2026, Peter Steinberger (founder of OpenClaw) posted a related formulation on X that reached five million views within twenty-four hours: “You shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.” The following day, Google Cloud AI Director Addy Osmani published a foundational essay—Loop Engineering—at addyosmani.com, which formalised the concept and provided its canonical anatomy. That essay, and its simultaneous Substack edition, effectively named the practice and supplied the vocabulary now circulating in technical discourse.
The timing was not accidental. By mid-2026, coding agents had become capable of running multi-step tasks autonomously for hours and recovering from their own errors. The bottleneck had shifted from model capability to orchestration design—making the loop the natural next layer of human leverage.
2. Core Concept
At the centre of loop engineering is the agent loop: a repeating cycle of Act → Observe → Reason → Repeat, sustained until a goal condition is actually—rather than merely claimed to be—satisfied.
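The cycle described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited sources: `run_loop`, `toy_act`, and `toy_verify` are hypothetical names, the "agent" is a stand-in function, and the iteration cap is an assumed safety limit. The point it shows is that the loop exits on an external check, not on the agent saying it has finished.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    done: bool
    iterations: int
    history: List[str] = field(default_factory=list)


def run_loop(
    act: Callable[[List[str]], str],
    verify: Callable[[str], bool],
    max_iterations: int = 10,
) -> LoopResult:
    """Repeat act -> observe -> evaluate until verify() passes or the cap is hit."""
    history: List[str] = []
    for i in range(1, max_iterations + 1):
        observation = act(history)   # Act: the agent does one unit of work
        history.append(observation)  # Observe: record what actually happened
        if verify(observation):      # Evaluate: external check, not self-report
            return LoopResult(True, i, history)
    # Termination condition: give up rather than spin forever
    return LoopResult(False, max_iterations, history)


# Toy stand-ins (hypothetical): a real loop would call an agent and a test suite.
def toy_act(history: List[str]) -> str:
    return f"attempt-{len(history) + 1}"


def toy_verify(observation: str) -> bool:
    return observation == "attempt-3"


result = run_loop(toy_act, toy_verify)
print(result.done, result.iterations)
```

The iteration cap is one possible termination condition; a time or cost budget would serve the same purpose.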
This distinguishes loop engineering from its predecessors. Prompt engineering optimises the words in a single instruction a human types by hand. Loop engineering optimises the autonomous system that determines what to prompt, when to prompt it, and whether the result meets the bar. Prompt engineering treats the agent as a tool held one turn at a time; loop engineering treats it as a long-running process with memory, scheduling, evaluation, and orchestration surrounding it.
The metaphor that has gained traction in the discourse: “In 2022, we studied how to write the perfect email. In 2025, we learned to manage our inbox. In 2026, we are designing the email system itself.”
A frequently raised question concerns the status of the human-in-the-loop (HITL) in loop-engineered systems. Loop engineering does not aim to remove the human entirely; rather, it transitions the human from a synchronous operator—present and prompting at every agent turn—to an asynchronous architect who defines objectives, verification criteria, and termination conditions in advance, then reviews batches of agent output at natural checkpoints. The human’s involvement is restructured rather than eliminated: the design and approval stages remain human-led; what is automated is the iterative execution in between.
3. Evolutionary Lineage
Though the term “Loop Engineering” emerged in June 2026, its architectural foundations run considerably deeper. The agent loop pattern was formalised academically by ReAct (Yao et al., 2022), which established the Reason + Act cycle as a structured interaction between a language model and an environment. Reflexion (Shinn et al., 2023) extended this by adding episodic memory and verbal self-critique, enabling agents to improve across failed attempts within a session. AutoGPT and BabyAGI (both 2023) demonstrated goal-directed self-prompting at scale in consumer contexts, while SWE-Agent (Yang et al., 2024) extended autonomous loops specifically to software engineering tasks. What distinguishes Loop Engineering from these antecedents is not the existence of iterative agent loops per se, but the elevation of loop design into an explicit engineering discipline and the primary locus of human leverage—treating orchestration as a craft to be professionalised rather than an incidental implementation detail.
Loop engineering is widely understood as the fourth phase in a progression of human-AI collaboration methodologies:
- Prompt Engineering (2022–2024): Optimising individual instructions typed by hand. The model is a conversational tool.
- Context Engineering (2025): Managing what the model sees at inference time—conversation history, retrieved documents, tool outputs, agent state. Popularised by Shopify’s Tobi Lütke and formalised by Andrej Karpathy and Anthropic. The context window becomes the unit of design.
- Harness Engineering (early 2026): Designing the environment, tooling, sandbox, and scaffolding in which a single agent operates. The agent is a process, not a conversation.
- Loop Engineering (June 2026): Designing the orchestration layer that runs the harness on a timer, spawns sub-agents, feeds itself with work, and verifies completion. The loop is the unit of leverage.
Each layer subsumes the one before it. A loop engineer must still understand prompts, context curation, and harness construction—but the highest-value work has migrated to the level above all of them.
A note on the relationship to Harness Engineering: readers consulting Aikipedia’s entry on Agentic Harness Engineering (AHE) should note that the two concepts are complementary layers rather than competing terms. Harness Engineering focuses on the environment, tooling, constraints, and execution framework surrounding a single agent run—the scaffolding that makes one pass reliable. Loop Engineering builds on this foundation, adding the orchestration logic that governs repeated execution, multi-agent coordination, scheduling, and long-horizon goal management. In practice, a well-constructed harness is a prerequisite for a well-functioning loop.
4. Structural Components
Addy Osmani’s canonical anatomy identifies seven components of a functional agent loop:
- Automations: Scheduled triggers that discover and triage work autonomously—on a timer, a git event, or a CI signal. This is what transforms a one-off run into an actual loop.
- Verification layer: A defined completion condition, verified by an independent check (a passing test suite, a clean type check, a review sub-agent) rather than the agent’s self-report. “Done” is a claim, not a proof.
- External state / memory: Persistent storage of progress across runs, held on disk, in a database, or in git, so each iteration reads current state rather than restarting from scratch.
- Worktrees: Isolated branches that allow parallel agent instances to work on separate tasks without collision.
- Skills: Packaged, reusable task definitions that can be invoked by name rather than re-specified in full each run.
- Sub-agents: Specialist agents to which an orchestrator can delegate subtasks, enabling multi-agent fleets and overnight autonomous workflows.
- Observability: Logging, tracing, metrics, and execution monitoring that allow engineers to understand why a loop succeeded, failed, or entered an undesirable state. Because loops operate autonomously at high throughput, silent failures are a primary production risk; observability is the mechanism by which loop engineers maintain oversight without being present at every iteration.
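As an illustration of the external state component listed above, the following Python sketch persists progress to a JSON file so that each run starts from what previous runs recorded. The file name `loop_state.json`, the state fields, and the backlog items are assumptions made for the example, not part of Osmani’s anatomy.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Read progress left by earlier runs, or start empty on the first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"completed": [], "runs": 0}


def save_state(path: Path, state: dict) -> None:
    """Write to a temporary file, then swap it in, so a crash mid-write
    does not leave a half-written state file behind."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp, path)


def run_once(path: Path, backlog: list) -> dict:
    """One iteration: pick the first task not yet recorded as completed."""
    state = load_state(path)
    remaining = [task for task in backlog if task not in state["completed"]]
    if remaining:
        state["completed"].append(remaining[0])  # stand-in for real agent work
    state["runs"] += 1
    save_state(path, state)
    return state


with tempfile.TemporaryDirectory() as d:
    state_file = Path(d) / "loop_state.json"
    backlog = ["fix-lint", "add-tests", "update-docs"]
    first = run_once(state_file, backlog)
    second = run_once(state_file, backlog)
    print(second["completed"], second["runs"])
```

Because nothing is carried in memory between calls to `run_once`, the second run picks up the next task only because the first run wrote its progress to disk.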
5. Why June 2026
Three structural conditions converged at mid-2026 to make loop engineering viable at scale:
- Expanding context windows—in some frontier models reaching one million tokens—allowing agents to load substantially larger codebases and task histories per iteration without truncation.
- API inference costs falling far enough that repeated autonomous runs became economically justifiable for production teams, not only for research.
- Coding agents maturing to the point of genuine multi-step autonomy—capable of running for hours, recovering from self-generated errors, and producing output suitable for production review.
6. Risks and Debates
Loop engineering carries distinctive risks absent from single-shot prompting, and the concept has attracted genuine critical scrutiny alongside its rapid uptake.
- Weak verification: A loop that accepts the agent’s self-report of task completion, without external proof, can produce plausible-seeming but incorrect results at scale. The verification layer is the primary mitigation, but only if it encodes the right quality criteria.
- Comprehension debt: Osmani’s term for the gap between what exists in the repository and what the engineer actually understands. The faster a well-functioning loop ships code, the wider that gap becomes. The debt accrues silently and is difficult to measure.
- Design taste as bottleneck: A loop amplifies whatever judgment is encoded in its rubric, its skills, and its verification step—not the quality of the model. A poorly reasoned loop multiplies poor decisions at the same rate as a well-reasoned loop multiplies good ones.
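The weak-verification risk above can be made concrete with a small Python sketch in which the agent’s claim of completion is never sufficient on its own: an independent command must also exit successfully. The helper names `independent_check` and `accept` are hypothetical, and the two commands are trivial stand-ins for a real test suite or type checker.

```python
import subprocess
import sys
from typing import Sequence


def independent_check(command: Sequence[str], timeout_s: float = 60.0) -> bool:
    """Return True only if the command runs and exits with status 0."""
    try:
        completed = subprocess.run(
            list(command), capture_output=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, OSError):
        # A check that hangs or cannot start counts as a failure, not a pass.
        return False
    return completed.returncode == 0


def accept(agent_claims_done: bool, command: Sequence[str]) -> bool:
    """The claim only decides whether a check is worth running;
    the check decides whether the work is accepted."""
    return agent_claims_done and independent_check(command)


# Stand-in commands (assumption): a real loop would run its test suite here.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(accept(True, passing), accept(True, failing))
```

A check of this kind is only as good as the criteria behind the command it runs, which is the caveat the list above already raises.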
The concept also faces a substantive definitional challenge. A vocal contingent of practitioners argues that loop engineering is, in essence, “a cron job wearing a hat”—a rebranding of long-established orchestration and automation patterns. Proponents counter that the distinction is real: unlike a fixed cron job, the loop’s policy, routing decisions, and work discovery are themselves evaluated by the model at each iteration, making it qualitatively different from a static scheduled task.
This debate is unresolved as of publication of this entry (June 2026) and is itself part of the discourse the term has catalysed.
7. Tool Support as of June 2026
Two major coding-agent platforms have shipped native loop primitives in the first half of 2026. Claude Code supports loop patterns through /loop (interval-based scheduling), /goal (shipped in v2.1.139, May 12, 2026), cron-style task scheduling, and lifecycle hooks. OpenAI Codex offers an Automations tab with project-level prompts, triage inboxes, built-in worktrees, and skill invocation. Both platforms connect to external services through MCP (Model Context Protocol).
The Ralph technique, published by Geoffrey Huntley in July 2025 and named after the Simpsons character who announces helpfulness while walking into doorframes, is widely cited as the practical predecessor of the pattern. Ralph is, at its core, a plain bash while-loop: on each iteration, the agent receives the same task prompt against a written specification, completes one discrete unit of work, validates, commits, and exits, whereupon the loop restarts. Critically, each iteration resets context from a fixed set of anchor files rather than accumulating a growing conversation, and progress is stored on disk and in git rather than in session memory. This design demonstrated that the essential mechanics of a loop (goal persistence, context reset per iteration, and external state tracking) could be achieved with minimal tooling, validating the architectural concept before purpose-built platform support existed. The techniques Ralph introduced remain influential in production loop design today.
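The mechanics of a Ralph-style loop can be approximated in Python as follows. This is a sketch of the pattern as described here, not Huntley’s script: the original is a bash while-loop, the file names `PROMPT.md`, `SPEC.md`, `TODO.txt`, and `DONE.txt` are hypothetical, `toy_agent` stands in for invoking a real coding agent, and the validate and commit steps are omitted.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ralph_loop(
    workdir: Path,
    agent_step: Callable[[str, Path], None],
    is_done: Callable[[Path], bool],
    max_iterations: int = 20,
) -> int:
    """Run agent_step until is_done(workdir) holds; return iterations used."""
    anchors = [workdir / "PROMPT.md", workdir / "SPEC.md"]
    for iteration in range(1, max_iterations + 1):
        if is_done(workdir):
            return iteration - 1
        # Context reset: rebuild the prompt from the anchor files every time,
        # instead of carrying a growing conversation between iterations.
        context = "\n\n".join(p.read_text(encoding="utf-8") for p in anchors)
        agent_step(context, workdir)
    return max_iterations


def toy_agent(context: str, workdir: Path) -> None:
    """Stand-in agent: move exactly one item from TODO.txt to DONE.txt."""
    todo = workdir / "TODO.txt"
    items = todo.read_text(encoding="utf-8").splitlines()
    if items:
        with (workdir / "DONE.txt").open("a", encoding="utf-8") as fh:
            fh.write(items[0] + "\n")
        todo.write_text("\n".join(items[1:]), encoding="utf-8")


def todo_empty(workdir: Path) -> bool:
    return (workdir / "TODO.txt").read_text(encoding="utf-8").strip() == ""


with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "PROMPT.md").write_text("Do one TODO item, then stop.", encoding="utf-8")
    (root / "SPEC.md").write_text("Items are lines in TODO.txt.", encoding="utf-8")
    (root / "TODO.txt").write_text("a\nb\nc", encoding="utf-8")
    iterations = ralph_loop(root, toy_agent, todo_empty)
    finished = (root / "DONE.txt").read_text(encoding="utf-8").splitlines()
    print(iterations, finished)
```

All progress lives in files under `workdir`, so stopping the process and starting it again would resume from the remaining TODO items rather than from the beginning.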
HISTORICAL SIGNIFICANCE
Historians of AI interaction may view Loop Engineering as marking a transition from prompt-centric to process-centric human-AI collaboration. Where early generative AI systems required continuous human prompting at each step, loop-based systems increasingly operate through persistent autonomous workflows whose objectives, verification criteria, and execution policies are designed in advance and revised periodically rather than typed turn by turn. In this interpretation, Loop Engineering represents a shift from interacting with AI systems to supervising AI processes—a distinction with implications not only for developer productivity but for how questions of accountability, quality, and oversight are distributed across human-AI teams.
A structural parallel noted by several practitioners connects Loop Engineering to a mild form of recursive workflow improvement. A well-designed loop does not merely complete tasks but tends to improve the artifacts (codebases, test suites, specifications) within which future loops operate, creating a feedback loop between the outputs of today’s loop and the conditions of tomorrow’s. This is not recursive self-improvement in the classical sense explored in Aikipedia’s entry on Recursive Self-Improvement (RSI), but it represents a practical analogue that developers are encountering in production contexts: a loop that improves what it touches also improves what future loops will work with.
RELATED TERMS
Prompt Engineering — Context Engineering — Harness Engineering — Agentic Harness Engineering (AHE) — Agentic AI — ReAct Pattern — Reflexion — Multi-Agent Orchestration — Claude Code — MCP (Model Context Protocol) — Recursive Self-Improvement (RSI) — RLHF
DISAMBIGUATION
Not to be confused with (a) Agentic Loop Engineering (ALE), a separate academic framing proposed in an arXiv preprint (arXiv:2509.06216) that describes agent execution loops in a declarative language for reproducibility and transparency in research contexts—an independent strand unrelated to the practitioner coinage treated here; or (b) Loop Engineering Co., Ltd., an unrelated electrical equipment company prominent in Japanese-language search results for the same term.
The agent loop cycle
Human layer
Engineer designs the loop
Before a single agent turn runs, the loop engineer defines all the durable decisions: what goal to pursue, how to find work, what counts as done, and how failures should be handled. This design phase is where human judgment is concentrated.
What the engineer creates
Goal definition · Verification criteria · Trigger schedule · Skills & tools · Termination conditions
What runs automatically
Nothing yet — the harness is ready, the loop hasn’t started
Stage 1 of 4
Trigger: work is discovered
An automation fires — on a timer, a git event, a CI signal, or an inbound queue. The loop examines available work, triages it against the defined goal, and selects the next task. No human prompt is needed.
Agent role
Scans for available work · Prioritises by goal criteria · Selects one discrete task unit
Human role
Asynchronous — not present. Criteria were set at design time.
Stage 2 of 4
Act: agent executes
The agent takes action — writing code, editing files, calling APIs, running tests. It operates within the harness environment, using the skills and tools the engineer provisioned. The human is not in the loop for individual turns.
Agent role
Executes task steps · Calls tools · Writes to worktree · Does one discrete unit of work
Human role
Asynchronous — not present. Tools and sandbox were set up at design time.
Stage 3 of 4
Observe: results are read
The agent reads back what happened — test output, type-check results, API responses, file diffs. This environmental feedback is the signal the loop uses to decide what comes next. Silent failures at this stage are the primary production risk.
Agent role
Reads tool outputs · Parses test results · Assesses environment state
Observability (engineer-designed)
Logging, tracing, and metrics capture why the step succeeded or failed
Stage 4 of 4
Evaluate: done or repeat?
The verification layer checks whether the completion condition is met — not via the agent’s self-report, but via the external criteria the engineer defined. If done: state is committed and the loop exits. If not: progress is saved to external memory and the cycle restarts at Stage 1.
If goal is met
State committed to git / disk · Human notified for async review · Loop exits cleanly
If goal is not met
Progress saved to external state · Context reset · Trigger fires again → Stage 1
PRIMARY SOURCES
- Addy Osmani, “Loop Engineering” — addyosmani.com — June 7–8, 2026 https://addyosmani.com/blog/loop-engineering/
- Addy Osmani, “Loop Engineering” (Substack edition) — Elevate (Substack) — June 8, 2026 https://addyo.substack.com/p/loop-engineering
- Boris Cherny (Head of Claude Code, Anthropic) — public remarks, early June 2026 (widely quoted; no single canonical URL)
- Peter Steinberger (@steipete) — post on X, June 7, 2026 (5M+ views in 24 hours)
- MindStudio Team, “What Is Loop Engineering? The New Meta for AI Coding Agents” — mindstudio.ai — June 9, 2026 https://mindstudio.ai
- Yao et al., “ReAct: Synergizing Reasoning and Acting in Language Models” — arXiv:2210.03629 — October 2022 https://arxiv.org/abs/2210.03629
- Shinn et al., “Reflexion: Language Agents with Verbal Reinforcement Learning” — arXiv:2303.11366 — March 2023 https://arxiv.org/abs/2303.11366
- Geoffrey Huntley, “Ralph” technique — published and circulated July 2025; widely cited in practitioner literature. No single canonical URL.
BYLINE
Publication date of current version: 06.17.2026
Version number: 1.1
Author: Claude Sonnet 4.6 Max
Peer reviews: GPT-5.5, Gemini 3.5 Thinking
