<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Ai on sbgrl.me</title><link>https://sbgrl.me/tags/ai/</link><description>Recent content in Ai on sbgrl.me</description><generator>Hugo -- gohugo.io</generator><language>en</language><managingEditor>sylvain.bougerel@gmail.com (Sylvain Bougerel)</managingEditor><webMaster>sylvain.bougerel@gmail.com (Sylvain Bougerel)</webMaster><copyright>© 2026 Sylvain Bougerel</copyright><lastBuildDate>Fri, 01 May 2026 10:01:00 +0800</lastBuildDate><atom:link href="https://sbgrl.me/tags/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Why I continue to review code</title><link>https://sbgrl.me/posts/why-i-continue-to-review-code/</link><pubDate>Fri, 01 May 2026 10:01:00 +0800</pubDate><author>sylvain.bougerel@gmail.com (Sylvain Bougerel)</author><guid>https://sbgrl.me/posts/why-i-continue-to-review-code/</guid><description>&lt;p&gt;Given the increase in velocity from coding assistants such as Claude Code and Copilot, automated review had become a necessity. Since we rolled out &lt;a href="https://coderabbit.ai" target="_blank" rel="noreferrer"&gt;Coderabbit&lt;/a&gt; on our project, almost everybody stopped reviewing code. It didn&amp;rsquo;t happen immediately, but organically.&lt;/p&gt;
&lt;p&gt;As it turns out, CodeRabbit is a formidable reviewer: with only a bit of context in the commit messages and a well-written &lt;code&gt;AGENTS.md&lt;/code&gt;, CodeRabbit will often be more thorough in reviews than most of my colleagues. And if that wasn&amp;rsquo;t enough, CodeRabbit is almost always available—&lt;a href="https://status.coderabbit.ai/" target="_blank" rel="noreferrer"&gt;98.3% of the time in the last 30 days&lt;/a&gt;, not good for typical SLA targets but much better than colleagues—and at a fraction of their cost.&lt;/p&gt;
&lt;p&gt;Eventually everybody stopped reviewing code&amp;hellip; Save for me.&lt;/p&gt;
&lt;p&gt;For one thing, coding assistants can hallucinate blunders that neither humans—regrettably—nor CodeRabbit will catch:&lt;/p&gt;
&lt;div class="highlight-wrapper"&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-typescript" data-lang="typescript"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;NON_EDITABLE_BOOKING_STATUSES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s1"&gt;&amp;#39;cancelled&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s1"&gt;&amp;#39;canceled&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s1"&gt;&amp;#39;declined&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s1"&gt;&amp;#39;withdrawn&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;]);&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I took that snippet from a merge request submitted yesterday. We do not have any &lt;code&gt;declined&lt;/code&gt; or &lt;code&gt;canceled&lt;/code&gt; statuses anywhere in our domain model—they simply don&amp;rsquo;t exist. I chose this example because it&amp;rsquo;s one of the simplest. The day before, a merge request contained a complete re-implementation of a 200+ line utility we already had. This happens all the time.&lt;/p&gt;
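&lt;p&gt;One way to make that class of hallucination impossible to merge is to derive such sets from the domain model itself, so the compiler rejects statuses that don&amp;rsquo;t exist. A minimal sketch, assuming a hypothetical &lt;code&gt;BookingStatus&lt;/code&gt; union (our actual model differs):&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-typescript"&gt;// Hypothetical domain union; the real model lives elsewhere in the codebase.
type BookingStatus = 'pending' | 'confirmed' | 'cancelled' | 'withdrawn';

// Annotating the Set with the union makes the compiler reject unknown values:
// adding 'declined' or 'canceled' here is a type error, not a review catch.
const NON_EDITABLE_BOOKING_STATUSES: Set&amp;lt;BookingStatus&amp;gt; = new Set([
  'cancelled',
  'withdrawn',
]);

function isEditable(status: BookingStatus): boolean {
  return !NON_EDITABLE_BOOKING_STATUSES.has(status);
}
&lt;/code&gt;&lt;/pre&gt;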
&lt;p&gt;But most importantly, if the phone rings because of an issue on the system, &lt;strong&gt;I&lt;/strong&gt; need to get on the call and fix the issue. When my work can impact my personal life and my weekends with family, there&amp;rsquo;s no AI that I will trust enough. And every alert, whether it fires during work hours or not, still costs the whole team velocity when we scramble for unplanned work.&lt;/p&gt;
&lt;p&gt;Ultimately, our customers don&amp;rsquo;t care who wrote the code: &amp;ldquo;We&amp;rsquo;re sorry, Claude wrote that line and CodeRabbit didn&amp;rsquo;t catch the problem&amp;rdquo; is not a good look. No matter how powerful our tools have become, responsibility is still mine. So I&amp;rsquo;ll keep on reviewing code.&lt;/p&gt;</description></item><item><title>AI exacerbated the divide between engineers</title><link>https://sbgrl.me/posts/ai-exacerbated-the-divide-between-engineers/</link><pubDate>Sat, 25 Apr 2026 22:12:00 +0800</pubDate><author>sylvain.bougerel@gmail.com (Sylvain Bougerel)</author><guid>https://sbgrl.me/posts/ai-exacerbated-the-divide-between-engineers/</guid><description>&lt;p&gt;Any complex product requires deep context about the product&amp;rsquo;s domains, architecture, and implementation. And if you want to make effective use of AI to solve a problem, you need to inject the right subset of that entire context into the prompt, or that smart auto-complete will spit out costly nonsense.&lt;/p&gt;
&lt;p&gt;Naturally, engineers with the best understanding of the context, the capacity to articulate the problem, and the skills to implement a solution are even stronger with AI: they can provide better input to the model and quickly validate its output. The once-derided 10x engineer is becoming a reality for anyone who can use AI to implement a correct solution in a fraction of the time it would otherwise have taken them.&lt;/p&gt;
&lt;p&gt;On the other end of the spectrum, weaker engineers are now compelled to keep up with the pace set by their stronger peers, only they can neither prompt the LLM effectively nor validate what it produces. So they just let the garbage out—knowingly or not.&lt;/p&gt;
&lt;p&gt;If you feel like you&amp;rsquo;ve become a &amp;ldquo;man-in-the-middle&amp;rdquo;—a proxy between somebody else&amp;rsquo;s request and an LLM—you will be phased out. The great engineers of today learned the ropes at a time when AI didn&amp;rsquo;t exist and they could invest in their fundamentals. Give yourself the same gift: carve out time to use less AI, and start learning again so you can close the divide.&lt;/p&gt;</description></item><item><title>Claude Code more than doubled my productivity</title><link>https://sbgrl.me/posts/claude-code-made-me-twice-more-productive/</link><pubDate>Sat, 14 Mar 2026 15:28:00 +0800</pubDate><author>sylvain.bougerel@gmail.com (Sylvain Bougerel)</author><guid>https://sbgrl.me/posts/claude-code-made-me-twice-more-productive/</guid><description>&lt;p&gt;I wrote twice as much code last quarter, and it was better. I use Claude Code daily via &lt;a href="https://github.com/stevemolitor/claude-code.el" target="_blank" rel="noreferrer"&gt;Steve Molitor&amp;rsquo;s claude-code.el and monet.el&lt;/a&gt; integration for &lt;a href="https://emacs.org" target="_blank" rel="noreferrer"&gt;Emacs&lt;/a&gt;. Once a change is ready for review, &lt;a href="https://coderabbit.ai" target="_blank" rel="noreferrer"&gt;CodeRabbit&lt;/a&gt; handles it. My daily AI usage really kicked into high gear back in October 2025, when we received a large grant of Claude Code credits and I could use it as much as I wanted.&lt;/p&gt;
&lt;p&gt;The graph below shows the number of lines of code changed (added and deleted) in my merge requests (MRs), summed monthly, on our main application repository: a Next.js application with 200 KLoC — not something an agent can easily get into. To give a sense of the sample size, I authored 773 merge requests on this project alone, and it runs in production to serve our clients:&lt;/p&gt;

&lt;figure&gt;
 &lt;img class="my-0 rounded-md" src="https://sbgrl.me/ox-hugo/Lines%20changed%20vs.%20month.svg" alt="" /&gt;
 
 
 &lt;/figure&gt;
&lt;p&gt;You can see a clear 2x increase when I started to use Claude Code daily around October 2025. Like most folks, I found Claude Code amazing as a pair programmer: writing code or reviewing my changes in short feedback loops, exploring greenfield projects, and strengthening tests on brownfield projects.&lt;/p&gt;
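&lt;p&gt;For the curious, the aggregation behind the top graph is simple. A minimal sketch with hypothetical field names (the actual extraction tool is linked at the end of this post):&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-typescript"&gt;// Hypothetical shape of a merge request record; real field names may differ.
interface MergeRequest {
  mergedAt: Date;
  additions: number;
  deletions: number;
}

// Sum lines changed (added + deleted) per calendar month, keyed as 'YYYY-MM'.
function linesChangedByMonth(mrs: MergeRequest[]): Map&amp;lt;string, number&amp;gt; {
  const totals = new Map&amp;lt;string, number&amp;gt;();
  for (const mr of mrs) {
    const month = mr.mergedAt.toISOString().slice(0, 7); // e.g. '2025-10'
    totals.set(month, (totals.get(month) ?? 0) + mr.additions + mr.deletions);
  }
  return totals;
}
&lt;/code&gt;&lt;/pre&gt;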
&lt;p&gt;Around that time, I felt the quality of my code improving too. Below is the number of &lt;em&gt;normalized comments&lt;/em&gt; \(N\) for each MR, that is, the number of comments \(C\) divided by the sum of lines of code added \(A\) and deleted \(D\):&lt;/p&gt;
&lt;p&gt;\begin{equation}
N = \frac{C}{A + D}
\end{equation}&lt;/p&gt;
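&lt;p&gt;In code, the per-MR metric could look like the sketch below; MRs with empty diffs are skipped to avoid dividing by zero:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-typescript"&gt;// Normalized comments N = C / (A + D) for a single merge request.
function normalizedComments(comments: number, added: number, deleted: number): number | null {
  const linesChanged = added + deleted;
  return linesChanged &amp;gt; 0 ? comments / linesChanged : null;
}
&lt;/code&gt;&lt;/pre&gt;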

&lt;figure&gt;
 &lt;img class="my-0 rounded-md" src="https://sbgrl.me/ox-hugo/Normalized%20comments%20vs.%20month.svg" alt="" /&gt;
 
 
 &lt;/figure&gt;
&lt;p&gt;Back in April 2025, we rolled out CodeRabbit, which picked up over 70% of the reviewing work for everyone. Most team members do not perform code reviews anymore, except for myself — I&amp;rsquo;ll probably explain why in another post. Thus, from that time onwards, over 95% of the comments on the changes I author come from CodeRabbit.&lt;/p&gt;
&lt;p&gt;Normalized comments is an interesting metric since it reflects the quality of my work with Claude Code as submitted, before others and CodeRabbit weigh in. If the value &lt;em&gt;decreased&lt;/em&gt;, it could be that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Reviewers got lazy (but CodeRabbit is pretty much the only reviewer of my work);&lt;/li&gt;
&lt;li&gt;We output a lot more code or the quality of the code increased;&lt;/li&gt;
&lt;li&gt;The work we do is a lot easier, so there are just fewer issues with it.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;AI certainly had an impact on point 2, given the top graph. You can also see a dip soon after October 2025, when I started to use Claude Code daily, followed by a bump in January 2026. That bump comes from a challenging issue I had to tackle across several MRs. It&amp;rsquo;s quite visible in the histogram of per-MR values:&lt;/p&gt;

&lt;figure&gt;
 &lt;img class="my-0 rounded-md" src="https://sbgrl.me/ox-hugo/Normalized%20comments%20per%20merge%20requests.svg" alt="" /&gt;
 
 
 &lt;/figure&gt;
&lt;p&gt;There are other common proxies for code quality, such as unit tests. Below is the raw number of unit tests on the project&amp;rsquo;s &lt;code&gt;main&lt;/code&gt; branch over time; it went up nearly 5x in the last 4 months:&lt;/p&gt;

&lt;figure&gt;
 &lt;img class="my-0 rounded-md" src="https://sbgrl.me/ox-hugo/Unit%20tests.svg" alt="" /&gt;
 
 
 &lt;/figure&gt;
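&lt;p&gt;The count itself is a crude tally. A sketch of the idea, assuming tests can be counted by grepping for &lt;code&gt;it(&lt;/code&gt; and &lt;code&gt;test(&lt;/code&gt; call sites in test files (the graph above comes from our stats tool, not from this snippet):&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-typescript"&gt;import { execSync } from 'node:child_process';

// Rough tally of unit tests at HEAD: count it(/test( call sites in test files.
// Note: grep exits non-zero when nothing matches, which makes execSync throw.
function countTests(repoPath: string): number {
  const out = execSync(
    `grep -rEoh "\\b(it|test)\\(" --include="*.test.ts" --include="*.spec.ts" .`,
    { cwd: repoPath, encoding: 'utf8' },
  );
  return out.split('\n').filter(Boolean).length;
}

console.log(countTests('.'));
&lt;/code&gt;&lt;/pre&gt;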
&lt;p&gt;Additionally, we have not had production rollbacks or outages in the last 4 months, consistent with the past year (we have only ever had to perform a production rollback twice). So agent usage is not &lt;a href="https://www.msn.com/en-us/news/technology/amazon-tightens-guardrails-after-ai-coding-tools-contributed-to-outages/ar-AA1Yttje" target="_blank" rel="noreferrer"&gt;called into question&lt;/a&gt; in our team.&lt;/p&gt;
&lt;p&gt;The stats in this post were extracted &lt;a href="https://gitlab.com/sylvain.bougerel/code-review-stats" target="_blank" rel="noreferrer"&gt;using a tool written by Claude Code&lt;/a&gt;.&lt;/p&gt;</description></item></channel></rss>