The end of tech-debt?

After Claude Code performed a refactor across our codebase in less than 24 hours, mostly autonomously, I started to wonder: could we end our tech-debt, and put a price on it?

Setting the stage

The codebase I’m referring to is that of our main Next.js application. It’s not a small codebase by any measure: about 200 KLOC across 1,500 files, in TypeScript, Terraform and other languages.

Within this codebase, we use both API route handlers and server functions to fetch or mutate data. Each mechanism fits a specific purpose: an application should make its public API available via route handlers, while it should use server functions for internal mutations only.

In our application, however, we had leftover server functions used only to fetch data — without mutating it. This often resulted in the server function returning stale information, because it was doing something it was not designed for. We needed to migrate them to route handlers.

I had already fixed 4 of them, but I found 18 more that needed addressing.

This is not hard work, but it is not a mindless task either. Each case requires a new API route, a new contract, properly bounded inputs, validated outputs, and updated client-side code for sending requests and handling responses, all while making sure it lints correctly and passes the ~2700 unit tests of our application. In short, a perfect task for an agent.
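To make the shape of one such migration concrete, here is a minimal sketch of a read-only route handler with a typed contract, a bounded input and a validated output. The invoice resource, the `limit` parameter and its bounds, and the `loadInvoices` helper are all hypothetical and not taken from the application described here; only the convention of exporting a `GET(request)` function from a `route.ts` file comes from Next.js.

```typescript
// Hypothetical file: app/api/invoices/route.ts (all names are illustrative).

// The contract: what the client can rely on receiving.
interface InvoiceSummary {
  id: string;
  totalCents: number;
}

interface ListInvoicesResponse {
  items: InvoiceSummary[];
}

const DEFAULT_LIMIT = 20; // assumed default
const MAX_LIMIT = 100; // assumed upper bound

// Bounded input: accept only an integer in [1, MAX_LIMIT]; null means "reject".
function parseLimit(raw: string | null): number | null {
  if (raw === null) return DEFAULT_LIMIT;
  if (!/^\d+$/.test(raw)) return null;
  const value = Number(raw);
  return value >= 1 && value <= MAX_LIMIT ? value : null;
}

// Validated output: check the payload shape before it leaves the server.
function isListInvoicesResponse(data: unknown): data is ListInvoicesResponse {
  if (typeof data !== "object" || data === null) return false;
  const items = (data as { items?: unknown }).items;
  if (!Array.isArray(items)) return false;
  return items.every(
    (item) =>
      typeof item === "object" &&
      item !== null &&
      typeof item.id === "string" &&
      Number.isInteger(item.totalCents),
  );
}

// Stand-in for the real data access layer (hypothetical).
async function loadInvoices(limit: number): Promise<unknown> {
  const all = [
    { id: "inv_1", totalCents: 1250 },
    { id: "inv_2", totalCents: 9900 },
  ];
  return { items: all.slice(0, limit) };
}

function jsonResponse(body: unknown, status = 200): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

// Route handlers export one function per HTTP method.
export async function GET(request: Request): Promise<Response> {
  const limit = parseLimit(new URL(request.url).searchParams.get("limit"));
  if (limit === null) {
    return jsonResponse({ error: `limit must be an integer from 1 to ${MAX_LIMIT}` }, 400);
  }
  const payload = await loadInvoices(limit);
  if (!isListInvoicesResponse(payload)) {
    return jsonResponse({ error: "unexpected response shape" }, 500);
  }
  return jsonResponse(payload);
}
```

On the client side, the call that used to invoke the server function would become a plain `fetch` against this route, with the same response check applied to the parsed JSON.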

Ralph to the rescue!

To make fixing the first 4 easier, I had already refined and tested a 150-line Claude skill. I knew it was time for a Ralph loop. So I set up the skill as the spec file, provided additional instructions around merge request (MR) submission and handling, and got it going in auto mode.
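For readers unfamiliar with the pattern, a Ralph loop is usually described as re-running an agent against the same spec, over and over, until no work is left. The sketch below shows only that control flow and is not the setup used here: the `RunAgent` callback is a hypothetical stand-in for invoking the agent, and the `remainingTasks` signal and the iteration cap are assumptions added to keep the example self-contained.

```typescript
// One pass of the agent over the spec. In a real setup this would invoke the
// agent itself; it is injected here so the loop can be run on its own.
type RunAgent = (spec: string) => { remainingTasks: number };

interface LoopResult {
  iterations: number;
  completed: boolean;
}

function ralphLoop(spec: string, runAgent: RunAgent, maxIterations: number): LoopResult {
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    // Every iteration is handed the same spec, not the previous transcript.
    const { remainingTasks } = runAgent(spec);
    if (remainingTasks === 0) {
      return { iterations: iteration, completed: true };
    }
  }
  // Safety cap reached before the work ran out.
  return { iterations: maxIterations, completed: false };
}

// Fake agent that finishes exactly one of 18 pending migrations per pass.
let pendingMigrations = 18;
const fakeAgent: RunAgent = () => {
  pendingMigrations -= 1;
  return { remainingTasks: pendingMigrations };
};

const loopResult = ralphLoop("migrate one read-only server function", fakeAgent, 50);
// loopResult is { iterations: 18, completed: true }
```

The iteration cap is a design choice rather than part of the pattern: it keeps a confused agent from burning budget indefinitely.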

And it did. After 8 hours of work, it created 18 MRs for me to review, and continued to autonomously address the comments and fix any failures reported by the CI pipeline.

At the point where all 18 MRs were submitted and ready to merge, this task had cost roughly US$180. I estimate that Claude went roughly 5 to 10 times faster than I would have.
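As a back-of-the-envelope pricing, the figures above can be turned into a cost per MR and an equivalent amount of manual work. The one assumption added here is that the 5 to 10 times speed-up applies to the 8 hours the agent spent producing the MRs.

```typescript
const totalCostUsd = 180;
const mergeRequests = 18;
const agentHours = 8;
const speedup = { low: 5, high: 10 };

// 180 / 18 = 10 USD per merge request.
const costPerMrUsd = totalCostUsd / mergeRequests;

// Assumed equivalent manual effort: 8 h x 5 = 40 h up to 8 h x 10 = 80 h.
const manualHours = {
  low: agentHours * speedup.low,
  high: agentHours * speedup.high,
};

// Agent spend per hour of manual work avoided: 180 / 80 = 2.25 up to 180 / 40 = 4.5 USD.
const costPerManualHourUsd = {
  low: totalCostUsd / manualHours.high,
  high: totalCostUsd / manualHours.low,
};
```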

$180 is not a large amount of money for a specific, targeted refactor, and we have dozens of similar refactors we could run to migrate and improve our codebase. Can we end tech-debt in our codebase?

Then reality caught up

After MR 6/18 merged into the main branch, our testing environment went down. It was not hard to notice: even a local test server would not run after rebasing. MR 6/18 contained a route that was not set up properly, and that could only be caught at runtime. It was an easy fix, and it reminded me that I had forgotten a step in my Ralph loop: I quickly instructed it to build comprehensive test cases for us to exhaustively verify the changes.

Testing is going to need improvement, but we knew this; we had already started to experiment with end-to-end tests to tackle this tech-debt. Here again, with great help from Claude, we managed to automate some end-to-end testing. However, in contrast to the current migration, none of the results so far have been satisfactory. This time, it seems our issues have more to do with the design of our application — and the solution isn’t yet clear.

The trouble with the process

But the testing environment going down was not actually the biggest source of friction; it was fixed quickly. The problem was that it took Claude 4 to 5 hours just to merge the remaining 12 MRs after MR 6. That makes 8 hours of productive work, followed by at least 4 hours of watching paint dry.

We practise trunk-based development: short-lived development branches against the origin/main trunk, with a semi-linear history on merges. This means that after each merge, the next merge request needs to be rebased, and the CI pipeline must run again. A rebase + CI cycle takes around 8 to 10 minutes today, and if the pipeline fails or a conflict occurs during rebase, additional CI cycles may need to run.
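The arithmetic behind the wait is worth spelling out. With strictly serial merges, 12 MRs at 8 to 10 minutes per cycle need about 1.5 to 2 hours in the best case, so the observed 4 to 5 hours imply that each MR went through two to three cycles on average. The function names below are illustrative only.

```typescript
// Wall-clock minutes when MRs are rebased, tested and merged one at a time.
function serialMergeMinutes(mrCount: number, cycleMinutes: number, cyclesPerMr = 1): number {
  return mrCount * cycleMinutes * cyclesPerMr;
}

// Average number of rebase + CI cycles per MR implied by an observed duration.
function impliedCyclesPerMr(observedMinutes: number, mrCount: number, cycleMinutes: number): number {
  return observedMinutes / (mrCount * cycleMinutes);
}

const bestCaseMinutes = {
  fastCycle: serialMergeMinutes(12, 8), // 96 minutes
  slowCycle: serialMergeMinutes(12, 10), // 120 minutes
};

const impliedCycles = {
  low: impliedCyclesPerMr(4 * 60, 12, 10), // 2 cycles per MR
  high: impliedCyclesPerMr(5 * 60, 12, 8), // 3.125 cycles per MR
};
```

The extra cycles are what failed pipelines, rebase conflicts and merges landing from other work would produce, although the split between those causes was not measured.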

Every day, we race each other to merge our work. That day, we also raced the 18 MRs the agent had to merge. And our CI pipeline is not going to get faster — if anything it’s going to take more time once we add end-to-end tests.

So, can we end tech-debt?

In short, not all tech-debt scales equally well for agents, especially when the issue may have more to do with application design or the solution is not yet clear — as in our case with end-to-end testing.

And while we can certainly price some migration work, or even execute it concurrently with our feature work, it still needs to clear CI. Agents didn’t eliminate the implementation bottleneck — they moved it to the CI process.

I now find myself thinking about merge trains and other merging strategies to eliminate the CI bottleneck. I’m also wondering whether larger organisations are now experiencing the same pressure on their CI processes, and how this will shape the future of CI and trunk-based development.
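To get a feel for why merge trains are tempting, here is a deliberately simplified model. It assumes that every queued MR gets its pipeline started in parallel against the trunk plus the MRs ahead of it, that a failing MR is dropped and everything behind it restarts, that runners are unlimited, and that each MR's outcome is fixed in advance. Real merge train implementations have limits and behaviours this toy ignores.

```typescript
// passes[i] is true when MR i would pass CI on top of the MRs ahead of it.
// Returns the number of CI cycles of wall-clock time until the queue is empty.
function mergeTrainCycles(passes: boolean[]): number {
  let queue = [...passes];
  let cycles = 0;
  while (queue.length > 0) {
    cycles += 1; // all pipelines in the train run side by side
    const firstFailure = queue.indexOf(false);
    // No failure: the whole train merges. Otherwise everything ahead of the
    // failure merges, the failing MR is dropped, and the rest starts over.
    queue = firstFailure === -1 ? [] : queue.slice(firstFailure + 1);
  }
  return cycles;
}

// Serial baseline: one cycle per MR, whether it passes or not.
function serialCycles(passes: boolean[]): number {
  return passes.length;
}

// 12 queued MRs, with the sixth one failing.
const queueOutcomes = Array.from({ length: 12 }, (_, index) => index !== 5);
const trainCycles = mergeTrainCycles(queueOutcomes); // 2
const baselineCycles = serialCycles(queueOutcomes); // 12
```

At 10 minutes per cycle, that is roughly 20 minutes against 2 hours under these assumptions, paid for with many more pipelines running concurrently.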
