ralph-starter vs doing it manually
I tracked one full week of development. Half the tasks with ralph-starter, half by hand. Same sprint, same project, same me, same coffee intake (a lot).
I tracked one full week of development. Half the tasks with ralph-starter, half by hand. Same sprint, same project, same me, same coffee intake (a lot).
ralph-starter works with multiple coding agents. I use Claude Code for basically everything, but I wanted to actually test the others on real tasks instead of just assuming. So I ran the same task on Claude Code, Cursor, Codex CLI, and OpenCode over the past few weeks. Some surprises, some not.
I spend more time writing specs than writing code now, and my output went up, not down. That genuinely surprised me.
I run ralph-starter a lot. Multiple tasks per day, 3 to 7 loops each. Last month I checked my Anthropic dashboard and the total was $62, almost scrolled past it, but then I looked at the prompt caching line and did the math. Without caching? $109. I saved $47 by literally doing nothing.
Linear is where my team plans work. ralph-starter is where it gets built. I have been running this combo every single day for weeks now, and I want to show you exactly what my workflow looks like.
ralph-starter is a CLI tool that orchestrates AI coding agents in autonomous loops. You give it a task (or point it at a GitHub issue, a Linear ticket, a Notion page), it runs the agent, checks if tests pass, if lint is clean, if build works. If something fails it feeds the error back to the agent and loops again. When everything passes it commits, pushes, and opens a PR.
A designer handed me a Figma file on Friday afternoon with 12 screens for a dashboard. My immediate thought was "cool, that is next week gone." Instead I pointed ralph-starter at it and had working React components before I left for the weekend.
I wanted to write the post I wish existed when I started: how to go from zero to your first automated PR with ralph-starter and Claude Code. No fluff, just the steps.
I keep saying "I type one command and get a PR" and people want to know what actually happens in between. So let me walk you through a real one.
Do you know when you have a GitHub issue with the full spec, and then you open Claude or ChatGPT, copy the issue, paste there, get code back, paste in your editor, run tests, something breaks, go back to chat, paste the error? I was doing this 20 times a day. Twenty. I counted.