Ralph: Autonomous Coding with Claude Code
Ship features while you sleep. 20 user stories. 20 iterations. Zero babysitting.
I came across Ralph last week and it changed how I think about AI-assisted development.
The idea is simple: instead of babysitting an AI agent through each task, you give it a list of user stories and let it work through them autonomously. Each iteration is a fresh context window. Memory persists through git commits and text files. You wake up to a feature branch with commits ready for review.
Ralph was created by Geoffrey Huntley and popularized by Ryan Carson’s guide. The original implementation uses Cursor, but I adapted it for Claude Code since that’s what I use daily.
Let me show you how it works - and what happened when I used it to build a complete note-taking app.
What Ralph Actually Does
Ralph is a bash loop. That’s it. No framework, no complex orchestration. Just a script that:
Pipes a prompt into Claude Code
Claude picks the next incomplete story from a JSON file
Claude implements it
Claude runs typecheck and tests
If passing, Claude commits and marks the story done
The loop repeats until every story passes or the iteration limit is reached
The magic is in the simplicity. Each iteration starts fresh, so context windows stay small. But learnings accumulate in a text file that Claude reads at the start of each iteration.
By story 10, Claude knows the patterns from stories 1-9.
The File Structure
Ralph needs four files:
ralph.sh - The bash loop
prompt.md - Instructions for Claude
prd.json - Your task list
progress.txt - Accumulated learnings
ralph.sh: The Bash Loop
#!/bin/bash
set -e
MAX_ITERATIONS=${1:-10}
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo "Starting Ralph with Claude Code"
for i in $(seq 1 "$MAX_ITERATIONS"); do
  echo "═══ Iteration $i of $MAX_ITERATIONS ═══"
  # Feed the prompt to Claude Code, echo its output live, and capture it.
  OUTPUT=$(cat "$SCRIPT_DIR/prompt.md" \
    | claude --dangerously-skip-permissions 2>&1 \
    | tee /dev/stderr) || true
  if echo "$OUTPUT" | grep -q "<ralph>COMPLETE</ralph>"; then
    echo "All stories complete!"
    exit 0
  fi
  if echo "$OUTPUT" | grep -q "<ralph>STUCK</ralph>"; then
    echo "Ralph is stuck! Check prd.json for blocked stories."
    exit 2
  fi
  sleep 2
done
echo "Max iterations reached"
exit 1
The key line is claude --dangerously-skip-permissions. This lets Claude run commands, edit files, and commit without asking for permission each time. Use it responsibly.
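The loop checks the output for two signals. When Claude finishes all stories, it outputs <ralph>COMPLETE</ralph> and the script exits with code 0. When every remaining story is blocked, it outputs <ralph>STUCK</ralph> and the script exits with code 2. If neither signal appears before the iteration limit, the script exits with code 1.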
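A wrapper script or CI job can branch on those exit codes. Here is a minimal sketch, assuming the exit codes from ralph.sh as shown above. The helper name ralph_status_message is hypothetical and not part of Ralph.

```shell
#!/bin/bash
# Hypothetical helper: translate ralph.sh's exit code into a message.
# Codes mirror the script above: 0 = all stories complete,
# 2 = every remaining story is blocked, 1 = iteration limit reached.
ralph_status_message() {
  case "$1" in
    0) echo "complete: review the feature branch" ;;
    2) echo "stuck: check prd.json for blocked stories" ;;
    1) echo "limit reached: rerun with more iterations" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage (left as a comment, since running it would start the agent):
#   ./scripts/ralph/ralph.sh 25; ralph_status_message "$?"
```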
prompt.md: The Instructions
This tells Claude what to do each iteration:
# Ralph Agent Instructions
You are Ralph, an autonomous coding agent. Your job is to implement
user stories one at a time until all are complete.
## Your Task
1. Read `scripts/ralph/prd.json` to get the task list
2. Read `scripts/ralph/progress.txt` for codebase patterns
3. Check you’re on the correct branch (create if needed)
4. Pick the highest priority story where `passes: false`
5. Implement that ONE story only
6. Run typecheck and tests
7. Commit: `feat: [ID] - [Title]`
8. Update prd.json: set `passes: true`
9. Append learnings to progress.txt
## If Stuck
Track failed attempts via the `failedAttempts` field in prd.json.
After 3 failures on a story:
1. Mark it BLOCKED in the notes field
2. Skip to the next story
3. If ALL stories are blocked, output: <ralph>STUCK</ralph>
## Stop Condition
If ALL stories pass, output exactly:
<ralph>COMPLETE</ralph>
Otherwise end normally (Ralph will start a new iteration).
The instructions are deliberate about ONE story per iteration. This keeps changes focused and commits atomic.
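The stuck-handling rules in the prompt amount to a small piece of bookkeeping on prd.json. In Ralph, Claude makes these edits itself; the sketch below only spells out the logic, assuming jq is installed. The helper names and the "BLOCKED: " prefix in the notes field are assumptions for illustration.

```shell
#!/bin/bash
# Sketch of the "If Stuck" bookkeeping, assuming jq is installed.
# record_failure and all_remaining_blocked are hypothetical helpers.

# Increment failedAttempts for one story; on the third failure,
# prefix its notes field with "BLOCKED: " (an assumed marker format).
record_failure() {
  local prd_file="$1" story_id="$2" tmp
  tmp="$(mktemp)"
  jq --arg id "$story_id" '
    (.userStories[] | select(.id == $id)) |=
      (.failedAttempts += 1
       | if .failedAttempts == 3 then .notes = "BLOCKED: " + .notes else . end)
  ' "$prd_file" > "$tmp" && mv "$tmp" "$prd_file"
}

# Succeeds when at least one story is unfinished and every unfinished
# story is marked BLOCKED, which is when <ralph>STUCK</ralph> applies.
all_remaining_blocked() {
  jq -e '
    [.userStories[] | select(.passes == false)]
    | length > 0 and all(.[]; (.notes // "") | startswith("BLOCKED"))
  ' "$1" > /dev/null
}
```

Checking for exactly the third failure means a story is marked once, even if the helper is called again later.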
prd.json: The Task List
Your user stories live here:
{
  "projectName": "Note Taking App",
  "branchName": "ralph/build-notes-app",
  "userStories": [
    {
      "id": "US-001",
      "title": "Database schema for notes",
      "acceptanceCriteria": [
        "Notes table with id, title, content, color, pinned, archived",
        "Drizzle schema in src/db/schema.ts",
        "typecheck passes"
      ],
      "priority": 1,
      "passes": false,
      "failedAttempts": 0,
      "notes": "Colors: yellow, green, blue, pink, purple, gray"
    }
  ]
}
Key fields:
priority: Lower number runs first
passes: Claude sets this to true when done
failedAttempts: Tracks retries for stuck handling
acceptanceCriteria: Be explicit. “typecheck passes” should always be there.
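These fields are easy to inspect from the command line before a run. Here is a minimal sketch, assuming jq is installed and the prd.json layout shown above. Both helper names are hypothetical, and the selection ignores any BLOCKED marker for simplicity.

```shell
#!/bin/bash
# Sketch, assuming jq is installed; both helper names are hypothetical.

# Print the id of the unfinished story with the lowest priority number.
# Prints nothing when every story already passes.
next_story() {
  jq -r '
    [.userStories[] | select(.passes == false)]
    | sort_by(.priority) | .[0].id // empty
  ' "$1"
}

# Print the ids of stories whose acceptanceCriteria lack "typecheck passes".
stories_missing_typecheck() {
  jq -r '
    .userStories[]
    | select(any(.acceptanceCriteria[]?; . == "typecheck passes") | not)
    | .id
  ' "$1"
}

# Example:
#   next_story scripts/ralph/prd.json
#   stories_missing_typecheck scripts/ralph/prd.json
```

Running the second helper before kicking off a long session is a cheap way to catch stories that give Claude no verification step.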
progress.txt: The Memory
This is where learnings accumulate:
# Ralph Progress Log
Started: 2025-01-11
## Codebase Patterns
- Using TypeScript strict mode
- Components in src/components/
- API routes in src/app/api/
- Tests use Vitest
## Key Files
- src/db/schema.ts - Database schema
- src/app/page.tsx - Main page
Ralph appends to this after each story. Patterns discovered early help with later stories.
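Appending is a plain text operation, so the same file can be seeded or extended by hand. Here is a minimal sketch of one append. The helper name and the dated entry layout are assumptions, not the format Ralph itself writes.

```shell
#!/bin/bash
# Hypothetical helper: append one dated learning to the progress log.
# The "## <date> - <story id>" entry layout is an assumption.
append_learning() {
  local progress_file="$1" story_id="$2" learning="$3"
  {
    printf '\n## %s - %s\n' "$(date +%F)" "$story_id"
    printf -- '- %s\n' "$learning"
  } >> "$progress_file"
}

# Example:
#   append_learning scripts/ralph/progress.txt US-003 "Drizzle uses eq() for WHERE clauses"
```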
The Demo: Building a Notes App
I gave Ralph a real task: build a Google Keep-style note-taking app from scratch.
Tech stack: Next.js 16 (App Router), SQLite + Drizzle ORM, Tailwind CSS, Vitest + Playwright
The stories:
US-001: Database schema for notes
US-002: API: Create note
US-003: API: List notes
US-004: API: Update note
US-005: API: Delete note
US-006: Note card component
US-007: Notes grid layout
US-008: Create note form
US-009: Edit note modal
US-010: Pin/unpin functionality
US-011: Archive functionality
US-012: Search notes
US-013: Delete with confirmation
US-014: E2E test
US-015: Polish and responsive design
I kicked it off and walked away.
./scripts/ralph/ralph.sh 25
The Results
15 stories completed in 15 iterations. Zero failures. Each commit is focused. Atomic changes that could be reviewed individually.
The app works, and the commit log shows one commit per story:
700182d feat: [US-015] - Polish and responsive design
b2cecef feat: [US-014] - E2E test: Create and view note
fe12ea7 feat: [US-013] - Delete with confirmation
c32a106 feat: [US-012] - Search notes
cb505cd feat: [US-011] - Archive functionality
ff70312 feat: [US-010] - Pin/unpin functionality
8274c11 feat: [US-009] - Edit note modal
2ce6240 feat: [US-008] - Create note form
baa209e feat: [US-007] - Notes grid layout
6d1c6be feat: [US-006] - Note card component
d9e130e feat: [US-005] - API: Delete note
24adc59 feat: [US-004] - API: Update note
fda3b03 feat: [US-003] - API: List notes
b44d685 feat: [US-002] - API: Create note
233f890 feat: [US-001] - Database schema for notes
Adding a Feature: Checklist Mode
After the initial 15 stories, I pushed Ralph further. Could it add a non-trivial feature to an existing codebase?
I added 5 more stories for Google Keep-style checklists:
US-016: Database schema for checklist items
US-017: API support for checklist notes
US-018: Checklist display in NoteCard
US-019: Create checklist note (toggle mode)
US-020: Edit checklist in modal
This was harder. Ralph had to modify existing schema, update multiple API endpoints, and add conditional rendering.
Result: 5 stories in 5 iterations. All passing.
The app now has a checklist toggle. Notes can be text or checkbox lists. Checked items show strikethrough.
Writing Good Stories
This is where most people stumble.
Keep Stories Small
Each story must fit in one context window.
❌ Too big:
“Build entire auth system”
✅ Right size:
“Add login form”
“Add email validation”
“Add login API call”
Be Explicit
Vague criteria lead to vague implementations.
❌ Vague:
“Users can log in”
✅ Explicit:
- Email input with type="email"
- Password input with type="password"
- Shows spinner during API call
- Redirects to /dashboard on success
- typecheck passes
Include Verification
Always include:
“typecheck passes”
“tests pass” (if applicable)
Specific behavior to verify
Ralph needs feedback loops to know if implementation worked.
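That feedback loop can be pictured as a gate that runs each check in order and stops at the first failure. Here is a minimal sketch. The helper name run_checks is hypothetical, and the commands in the usage comment assume a TypeScript project with Vitest installed.

```shell
#!/bin/bash
# Sketch of a verification gate: run each check in order and stop at the
# first failure. run_checks is a hypothetical helper, not part of Ralph.
run_checks() {
  local check
  for check in "$@"; do
    echo "running: $check"
    if ! bash -c "$check"; then
      echo "FAILED: $check"
      return 1
    fi
  done
  echo "all checks passed"
}

# Example (commit only when every check passes):
#   run_checks "npx tsc --noEmit" "npx vitest run" \
#     && git add -A && git commit -m "feat: [US-001] - Database schema for notes"
```

A story whose criteria cannot be expressed as a command like these gives the agent nothing to check itself against.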
Lessons Learned
Story Size Matters Most - The stories that went smoothest were laser-focused. One API endpoint. One component. One feature.
Explicit Criteria Prevent Ambiguity - Every story included “typecheck passes”. Color hex codes in notes meant exact colors, not guesses.
Progress File is Underrated - By story 10, progress.txt had patterns like “API routes parse JSON body” and “Drizzle uses eq() for WHERE clauses”. Later stories went faster.
Stuck Handling Adds Resilience - After 3 failures, Ralph marks a story BLOCKED and moves on. Better to skip than burn iterations.
One Agent is Enough - I initially planned multi-agent (PM, Developer, Tester). Overkill. Single agent with clear instructions handles everything.
When to Use Ralph
Good for:
Well-defined features
Clear acceptance criteria
Projects with typecheck/tests
Implementation grind
Not good for:
Exploratory work
Major refactors
Security-critical code
Ambiguous requirements
Try It Yourself
The demo project is open source: ralph-nextjs-demo
To use Ralph on your project:
Copy scripts/ralph/ to your project
Make ralph.sh executable
Edit prd.json with your stories
Initialize progress.txt with known patterns
Run ./scripts/ralph/ralph.sh 25
Check back later
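The first two steps can be scripted. Here is a minimal sketch, assuming you have a local checkout of the demo project that contains scripts/ralph/. The helper name install_ralph and both path arguments are hypothetical.

```shell
#!/bin/bash
# Hypothetical setup helper for the copy and chmod steps above.
# $1: local checkout of the demo project (assumed to contain scripts/ralph/)
# $2: root directory of your own project
install_ralph() {
  local demo_dir="$1" project_dir="$2"
  mkdir -p "$project_dir/scripts"
  cp -R "$demo_dir/scripts/ralph" "$project_dir/scripts/"
  chmod +x "$project_dir/scripts/ralph/ralph.sh"
  echo "Next: edit prd.json and progress.txt in $project_dir/scripts/ralph"
}

# Example (paths are placeholders):
#   install_ralph ~/code/ralph-nextjs-demo ~/code/my-project
```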
The first time you see a completed feature branch waiting for review, you’ll get it.
Ralph was created by Geoffrey Huntley. Guide inspired by Ryan Carson.



Lalit, the iterative loop approach with fresh context windows and git-based persistent memory is clever. I've been doing something similar but at a different scale: instead of one agent cycling through tasks sequentially, I run multiple agents in parallel through Claude Code's new Agent Teams. Each agent maintains its own context while sharing project state through the filesystem. The zero-failure rate on 20 user stories is impressive for single-agent work. With Opus 4.6 and 4 agents running simultaneously, the throughput multiplication is significant while maintaining similar reliability. Documented the multi-agent version of this pattern: https://thoughts.jock.pl/p/opus-4-6-agent-experiment-2026