The Tools Are Good Enough to Build Themselves

This week, Boris Cherny, the creator of Claude Code (and our personal hero), said on X that he used Claude Code to build essentially the entire codebase for Claude Cowork. It wasn’t a demo or a proof of concept; it has simply become part of his normal development process.

The ClaudeAI subreddit is the coolest club on the internet and an amazing resource for tips and tricks.

When someone at Anthropic—one of the people who actually builds these tools—trusts the tool enough to use it to build the next version of itself, that says something about where we are.

It suggests the tools have crossed a threshold: they are reliable enough to hand real work to.

What Claude Cowork Actually Is

Cowork takes the kind of agentic workflow developers have been using inside more traditional coding terminals and interfaces and makes it available to everyone else.

Instead of working in a code editor, you give Claude access to a folder on your computer and tell it what you want done. Claude makes a plan, works through the files, and keeps you updated along the way. It asks before taking any meaningful action, so you can step in or redirect at any point.

A preview of the Claude Cowork interface.

That kind of workflow isn't new. But the fact that it's now usable by people who don't think of themselves as technical is.

Three Things That Are Different Now

Having worked with these tools every day for over a year, we've watched the models get dramatically better in ways that are hard to appreciate unless you're in them constantly. Here are the three biggest changes we've noticed, the ones that are actually unlocking new ways of working for us and the people we work with.

1. The hallucination problem has meaningfully improved

It used to be that you couldn't walk away from an AI agent. You had to watch closely, because it might "helpfully" rewrite code you didn't ask it to touch, or make sweeping changes in files you forgot to think about.

That risk hasn't disappeared entirely, but it's dropped enough that you can finally step away and come back to work that's actually done, not work that's been quietly derailed.

2. The testing loop is real now

Agents don't just write code anymore. They test it, validate behavior against criteria you care about, and catch obvious issues before you ever look at the output.

That feedback loop used to be weak and inconsistent. Now it's good enough that it actually saves time instead of creating more work.

We always ask Claude to test and check its work for quality control. It’s the best intern we’ve ever had.

3. Agents inside everyday products have quietly gotten a lot better

Here’s an example: we rarely write Notion tasks by hand anymore. We just ask the agent to create the task and fill in the fields—title, description, priority, due date.

That might sound small, but think about what it changes. The reason people build elaborate Notion tables for project management isn't that they love setting them up. It's that they want to see information clearly once it's organized. The painful part was always gathering and formatting everything to that level of polish. Now you can just ask, and the structure shows up already filled in.

What This Adds Up To

These examples are small on their own, but together they explain why Claude Code can be used to build Claude Cowork, and why Cowork can be useful to someone who's never opened a terminal.

The models are more accurate, more predictable, and better at staying inside the lines. That reliability changes how you work with them.

When the tools are good enough to build themselves, they're usually good enough to take real tasks off your plate too. And that's why this moment is hard to ignore.

Stay curious,

Julia & Russell
