Ralph: Autonomous Coding with Claude Code

Ship features while you sleep. 20 user stories. 20 iterations. Zero babysitting.

Jan 11, 2026

I came across Ralph last week and it changed how I think about AI-assisted development.

The idea is simple: instead of babysitting an AI agent through each task, you give it a list of user stories and let it work through them autonomously. Each iteration is a fresh context window. Memory persists through git commits and text files. You wake up to a feature branch with commits ready for review.

Ralph was created by Geoffrey Huntley and popularized by Ryan Carson’s guide. The original implementation uses Cursor, but I adapted it for Claude Code since that’s what I use daily.

Let me show you how it works - and what happened when I used it to build a complete note-taking app.

What Ralph Actually Does

Ralph is a bash loop. That’s it. No framework, no complex orchestration. Just a script that repeats the same cycle:

The script pipes a prompt into Claude Code
Claude picks the next incomplete story from a JSON file
Claude implements it
Claude runs typecheck and tests
If they pass, Claude commits and marks the story done
The loop repeats until every story passes

The magic is in the simplicity. Each iteration starts fresh, so context windows stay small. But learnings accumulate in a text file that Claude reads at the start of each iteration.

By story 10, Claude knows the patterns from stories 1-9.

The File Structure

Ralph needs four files:

ralph.sh - The bash loop
prompt.md - Instructions for Claude
prd.json - Your task list
progress.txt - Accumulated learnings

ralph.sh: The Bash Loop

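#!/bin/bash
set -e

# Maximum number of iterations; defaults to 10 when no argument is given
MAX_ITERATIONS=${1:-10}
# Directory containing this script, so prompt.md resolves from any working directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

echo "Starting Ralph with Claude Code"

for i in $(seq 1 "$MAX_ITERATIONS"); do
    echo "═══ Iteration $i of $MAX_ITERATIONS ═══"

    # Feed the prompt to Claude, mirror its output to the terminal, and capture it.
    # "|| true" keeps set -e from aborting the loop if the pipeline fails.
    OUTPUT=$(cat "$SCRIPT_DIR/prompt.md" \
        | claude --dangerously-skip-permissions 2>&1 \
        | tee /dev/stderr) || true

    # Claude prints this tag once every story passes
    if echo "$OUTPUT" | grep -q "<ralph>COMPLETE</ralph>"; then
        echo "All stories complete!"
        exit 0
    fi

    # Claude prints this tag when every remaining story is blocked
    if echo "$OUTPUT" | grep -q "<ralph>STUCK</ralph>"; then
        echo "Ralph is stuck! Check prd.json for blocked stories."
        exit 2
    fi

    sleep 2
done

echo "Max iterations reached"
exit 1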

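The key line is claude --dangerously-skip-permissions. This lets Claude run commands, edit files, and commit without asking for permission each time. It also means nothing stops a bad command, so run Ralph on a dedicated branch in an environment you can afford to break.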
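One way to act on that caution is to refuse to start on a shared branch. The sketch below is not part of the original script: `is_protected_branch` is a hypothetical helper, the list of protected names is only an example, and exit status 3 is chosen here simply because ralph.sh already uses 0, 1, and 2.

```bash
#!/bin/bash
# Hypothetical guard (not part of the original ralph.sh):
# succeeds when the given branch name is one Ralph should not commit to.
is_protected_branch() {
    case "$1" in
        main|master) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use near the top of ralph.sh, assuming it runs inside a git repository:
#   current=$(git rev-parse --abbrev-ref HEAD)
#   if is_protected_branch "$current"; then
#       echo "Refusing to run Ralph on $current" >&2
#       exit 3
#   fi
```

The check takes the branch name as an argument so it can be exercised without a repository.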

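The loop checks each iteration’s output for two signals. <ralph>COMPLETE</ralph> means every story passes, and the script exits with status 0. <ralph>STUCK</ralph> means every remaining story is blocked, and the script exits with status 2. If neither signal appears before the iteration limit, the script exits with status 1.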
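The signal check can be pulled into a small function, which makes it easy to try out on its own. This is a sketch under assumptions: `ralph_status` is a hypothetical name, and the three labels it prints are illustrative rather than anything the original script defines.

```bash
#!/bin/bash
# Hypothetical helper: classify one iteration's output by the signal it contains.
# Prints "complete", "stuck", or "continue".
ralph_status() {
    local output="$1"
    # -F treats the tag as a fixed string rather than a regular expression
    if printf '%s\n' "$output" | grep -qF '<ralph>COMPLETE</ralph>'; then
        echo "complete"
    elif printf '%s\n' "$output" | grep -qF '<ralph>STUCK</ralph>'; then
        echo "stuck"
    else
        echo "continue"
    fi
}
```

Like the original loop, this is a plain substring match, so it also fires if Claude happens to quote the tag while explaining something. Keeping the tag unusual is what makes that unlikely.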

prompt.md: The Instructions

This tells Claude what to do each iteration:

# Ralph Agent Instructions

You are Ralph, an autonomous coding agent. Your job is to implement
user stories one at a time until all are complete.

## Your Task

1. Read `scripts/ralph/prd.json` to get the task list
2. Read `scripts/ralph/progress.txt` for codebase patterns
3. Check you’re on the correct branch (create if needed)
4. Pick the highest priority story where `passes: false`
5. Implement that ONE story only
6. Run typecheck and tests
7. Commit: `feat: [ID] - [Title]`
8. Update prd.json: set `passes: true`
9. Append learnings to progress.txt

## If Stuck

Track failed attempts via the `failedAttempts` field in prd.json.
After 3 failures on a story:
1. Mark it BLOCKED in the notes field
2. Skip to the next story
3. If ALL stories are blocked, output: <ralph>STUCK</ralph>

## Stop Condition

If ALL stories pass, output exactly:
<ralph>COMPLETE</ralph>

Otherwise end normally (Ralph will start a new iteration).

The instructions are deliberate about ONE story per iteration. This keeps changes focused and commits atomic.
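In Ralph, Claude edits prd.json itself, so no script enforces the stuck rule. Still, the bookkeeping described under "If Stuck" amounts to roughly the following. This is a sketch, assuming `jq` is installed; `record_failure` is a hypothetical name, and prefixing the notes field with "BLOCKED: " is one possible reading of "mark it BLOCKED".

```bash
#!/bin/bash
# Hypothetical illustration of the "If Stuck" rule; assumes jq is installed.
# Increments failedAttempts for one story and marks it BLOCKED on the third failure.
record_failure() {
    local file="$1" id="$2" tmp
    tmp=$(mktemp)
    # jq cannot edit in place, so write to a temp file and move it over the original
    jq --arg id "$id" '
        (.userStories[] | select(.id == $id)) |= (
            .failedAttempts += 1
            | if .failedAttempts == 3
              then .notes = "BLOCKED: " + (.notes // "")
              else .
              end
        )' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example: record_failure scripts/ralph/prd.json US-004
```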

prd.json: The Task List

Your user stories live here:

{
  "projectName": "Note Taking App",
  "branchName": "ralph/build-notes-app",
  "userStories": [
    {
      "id": "US-001",
      "title": "Database schema for notes",
      "acceptanceCriteria": [
        "Notes table with id, title, content, color, pinned, archived",
        "Drizzle schema in src/db/schema.ts",
        "typecheck passes"
      ],
      "priority": 1,
      "passes": false,
      "failedAttempts": 0,
      "notes": "Colors: yellow, green, blue, pink, purple, gray"
    }
  ]
}

Key fields:

priority: Lower number runs first
passes: Claude sets this to true when done
failedAttempts: Tracks retries for stuck handling
acceptanceCriteria: Be explicit. “typecheck passes” should always be there.
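To see how these fields drive story selection, here is a sketch of the same lookup done from the shell. It assumes `jq` is installed and that prd.json has the shape shown above. `next_story` and `story_summary` are hypothetical helpers, handy for checking on a run from another terminal, and are not something Ralph itself calls.

```bash
#!/bin/bash
# Hypothetical helpers; assume jq is installed and prd.json has the shape shown above.

# Print the id of the unfinished story with the lowest priority number.
# Prints nothing when every story passes.
next_story() {
    jq -r '[.userStories[] | select(.passes == false)]
           | sort_by(.priority) | .[0].id // empty' "$1"
}

# Print a one-line summary such as "3 of 15 stories passing".
story_summary() {
    jq -r '.userStories
           | "\(map(select(.passes == true)) | length) of \(length) stories passing"' "$1"
}

# Example: next_story scripts/ralph/prd.json
```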

progress.txt: The Memory

This is where learnings accumulate:

# Ralph Progress Log

Started: 2026-01-11

## Codebase Patterns
- Using TypeScript strict mode
- Components in src/components/
- API routes in src/app/api/
- Tests use Vitest

## Key Files
- src/db/schema.ts - Database schema
- src/app/page.tsx - Main page

Ralph appends to this after each story. Patterns discovered early help with later stories.
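Since the file is the only memory that survives between iterations, it is worth making sure a fresh run never wipes it. The sketch below creates the skeleton shown above only when the file is missing. `init_progress` is a hypothetical helper, not part of the original scripts.

```bash
#!/bin/bash
# Hypothetical helper: create progress.txt with the section headings used above.
# Does nothing if the file already exists, so accumulated learnings are never overwritten.
init_progress() {
    local file="$1"
    if [ -f "$file" ]; then
        return 0
    fi
    {
        echo "# Ralph Progress Log"
        echo
        echo "Started: $(date +%Y-%m-%d)"
        echo
        echo "## Codebase Patterns"
        echo
        echo "## Key Files"
    } > "$file"
}

# Example: init_progress scripts/ralph/progress.txt
```

After running it, add the patterns you already know under each heading before the first iteration.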

The Demo: Building a Notes App

I gave Ralph a real task: build a Google Keep-style note-taking app from scratch.

Tech stack: Next.js 16 (App Router), SQLite + Drizzle ORM, Tailwind CSS, Vitest + Playwright

The stories:

US-001: Database schema for notes
US-002: API: Create note
US-003: API: List notes
US-004: API: Update note
US-005: API: Delete note
US-006: Note card component
US-007: Notes grid layout
US-008: Create note form
US-009: Edit note modal
US-010: Pin/unpin functionality
US-011: Archive functionality
US-012: Search notes
US-013: Delete with confirmation
US-014: E2E test
US-015: Polish and responsive design

I kicked it off and walked away.

./scripts/ralph/ralph.sh 25

The Results

15 stories completed in 15 iterations. Zero failures. Each commit is focused: an atomic change that can be reviewed on its own.

The app works, and the history shows one commit per story:

700182d feat: [US-015] - Polish and responsive design
b2cecef feat: [US-014] - E2E test: Create and view note
fe12ea7 feat: [US-013] - Delete with confirmation
c32a106 feat: [US-012] - Search notes
cb505cd feat: [US-011] - Archive functionality
ff70312 feat: [US-010] - Pin/unpin functionality
8274c11 feat: [US-009] - Edit note modal
2ce6240 feat: [US-008] - Create note form
baa209e feat: [US-007] - Notes grid layout
6d1c6be feat: [US-006] - Note card component
d9e130e feat: [US-005] - API: Delete note
24adc59 feat: [US-004] - API: Update note
fda3b03 feat: [US-003] - API: List notes
b44d685 feat: [US-002] - API: Create note
233f890 feat: [US-001] - Database schema for notes
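Because every commit follows the `feat: [ID] - [Title]` convention from prompt.md, the log can be cross-checked against the story list mechanically. A minimal sketch, assuming input in `git log --oneline` format. `committed_story_ids` is a hypothetical name.

```bash
#!/bin/bash
# Hypothetical helper: read `git log --oneline` output on stdin and
# print the story ids found in "feat: [US-nnn]" subjects, sorted and de-duplicated.
committed_story_ids() {
    sed -n 's/^[0-9a-f]* feat: \[\(US-[0-9][0-9]*\)].*/\1/p' | sort -u
}

# Example: count the stories that have at least one commit
#   git log --oneline | committed_story_ids | wc -l
```

Commits that do not match the convention are ignored, so manual fixes on the same branch do not skew the count.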

Adding a Feature: Checklist Mode

After the initial 15 stories, I pushed Ralph further. Could it add a non-trivial feature to an existing codebase?

I added 5 more stories for Google Keep-style checklists:

US-016: Database schema for checklist items
US-017: API support for checklist notes
US-018: Checklist display in NoteCard
US-019: Create checklist note (toggle mode)
US-020: Edit checklist in modal

This was harder. Ralph had to modify existing schema, update multiple API endpoints, and add conditional rendering.

Result: 5 stories in 5 iterations. All passing.

The app now has a checklist toggle. Notes can be text or checkbox lists. Checked items show strikethrough.

Writing Good Stories

This is where most people stumble.

Keep Stories Small

Each story must fit in one context window.

❌ Too big:
“Build entire auth system”

✅ Right size:
“Add login form”
“Add email validation”
“Add login API call”

Be Explicit

Vague criteria lead to vague implementations.

❌ Vague:
“Users can log in”

✅ Explicit:
- Email input with type="email"
- Password input with type="password"
- Shows spinner during API call
- Redirects to /dashboard on success
- typecheck passes

Include Verification

Always include:

“typecheck passes”
“tests pass” (if applicable)
Specific behavior to verify

Ralph needs feedback loops to know if implementation worked.
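A feedback loop here just means commands whose exit status says pass or fail. The sketch below runs a list of checks in order and stops at the first failure. `run_checks` is a hypothetical helper, and the commands in the usage comment are assumptions based on the demo's stack (TypeScript and Vitest), so substitute whatever your project uses.

```bash
#!/bin/bash
# Hypothetical helper: run each check command in order and stop at the first failure.
# Returns 0 only if every command succeeds.
run_checks() {
    local cmd
    for cmd in "$@"; do
        echo "==> $cmd"
        if ! sh -c "$cmd"; then
            echo "FAILED: $cmd" >&2
            return 1
        fi
    done
    echo "All checks passed"
}

# Example (commands are assumptions for a TypeScript + Vitest project):
#   run_checks "npx tsc --noEmit" "npx vitest run"
```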

Lessons Learned

Story Size Matters Most - The stories that went smoothest were laser-focused. One API endpoint. One component. One feature.
Explicit Criteria Prevent Ambiguity - Every story included “typecheck passes”. Color hex codes in notes meant exact colors, not guesses.
Progress File is Underrated - By story 10, progress.txt had patterns like “API routes parse JSON body” and “Drizzle uses eq() for WHERE clauses”. Later stories went faster.
Stuck Handling Adds Resilience - After 3 failures, Ralph marks a story BLOCKED and moves on. Better to skip than burn iterations.
One Agent is Enough - I initially planned multi-agent (PM, Developer, Tester). Overkill. Single agent with clear instructions handles everything.

When to Use Ralph

Good for:

Well-defined features
Clear acceptance criteria
Projects with typecheck/tests
Implementation grind

Not good for:

Exploratory work
Major refactors
Security-critical code
Ambiguous requirements

Try It Yourself

The demo project is open source: ralph-nextjs-demo

To use Ralph on your project:

Copy scripts/ralph/ to your project
Make ralph.sh executable
Edit prd.json with your stories
Initialize progress.txt with known patterns
Run ./scripts/ralph/ralph.sh 25
Check back later
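The first two steps can be scripted. This is a sketch under assumptions: `install_ralph` is a hypothetical helper, and it expects the path to an existing copy of the scripts/ralph directory plus the root of your project.

```bash
#!/bin/bash
# Hypothetical helper: copy an existing scripts/ralph directory into a project
# and make the loop script executable. Refuses to overwrite an existing copy.
install_ralph() {
    local src="$1"      # path to a scripts/ralph directory
    local project="$2"  # root of the target project
    if [ -e "$project/scripts/ralph" ]; then
        echo "scripts/ralph already exists in $project" >&2
        return 1
    fi
    mkdir -p "$project/scripts"
    cp -R "$src" "$project/scripts/ralph"
    chmod +x "$project/scripts/ralph/ralph.sh"
}

# Example: install_ralph path/to/demo/scripts/ralph .
```

Editing prd.json and seeding progress.txt still need to be done by hand, since they depend on your project.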

The first time you see a completed feature branch waiting for review, you’ll get it.

Ralph was created by Geoffrey Huntley. Guide inspired by Ryan Carson.

