My first week on full AI
I started 2026 by going all-in on AI, since it is the new kid on the block. Also, if I believe what I read on X, it is going to replace me if I do nothing. Since I am doing it and, to be honest, it is extremely fun, I decided to blog about it. Curiously, I wrote about the subject back in 2023. My general opinion has not changed, but it is very interesting to see how AI and my usage of it have developed since then.
My First Week
Sadly for me, my first task in 2026 was to familiarize myself with a new codebase, because I would be making some changes to it in the coming months. It is not sad per se; however, it will not give me the chance to launch one of those agent orchestrators I read about on X and summon an army of bots that will one-shot my request while I enjoy a tasty cappuccino.
My method of getting familiar with a new codebase is simple: do something with it. Implement a feature, fix a bug, or do anything else that improves it. So I picked a bug that had happened in production and tried to solve it, or at least reproduce it. The bug was a segfault in production caused by a data race.
The First Day
Anyway, I downloaded Claude Code (I had been using Codex up to now in my non-AI tasks) and got ready to be blown away by its capabilities. To be honest, I was blown away by what I accomplished, BUT I did not find the bug. Claude, however, was pretty sure that, even though we had not reproduced it, we had a strong case. I did not agree with that, and my guess is that the reader of a bug report without an actual way to reproduce the bug will not be convinced either.
Some of the cool things that Claude did:
- find the possible cause of the problem in the code
- set up a test environment on a remote machine
- re-compile the code with debug symbols and an address sanitizer (this needed my help)
- check that the new binary has the correct instrumentation
- change the tests to make finding the failure more likely
- write a beautiful report of our session and our findings
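For a bug that shows up this rarely, one cheap way to make a failure more likely is simply to run the same test many times and stop at the first crash. The sketch below is only an illustration of that idea, not what Claude actually wrote: the helper names run_until_failure and describe are made up for this post, and it assumes the test can be started as a single command on a POSIX system.

```python
import signal
import subprocess


def run_until_failure(cmd, attempts=1000, timeout=60):
    """Run cmd repeatedly.

    Returns (attempt, returncode) for the first failing run,
    or None if every run exited with code 0.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(cmd, capture_output=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            # A hang can also be a symptom of a race, so report it as a failure.
            return attempt, None
        if result.returncode != 0:
            return attempt, result.returncode
    return None


def describe(returncode):
    """Turn a return code into a short human-readable description."""
    if returncode is None:
        return "timed out"
    if returncode < 0:
        # On POSIX, a negative return code means the process was killed by a signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with code {returncode}"
```

Called with something like run_until_failure(["./my_test_binary"], attempts=5000), where the binary name is a placeholder, it returns the attempt number and return code of the first failing run, and describe reports a segfault as "killed by SIGSEGV". Repetition alone guarantees nothing for a bug this rare, but it is a cheap first step before touching the tests themselves.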
Some of the things it did not do:
- fix the compilation with the address sanitizer: it was throwing a weird error, so Claude decided to continue without it. I did a quick Google search, found someone with a similar problem in a forum along with the solution, and told Claude what to do to fix the compilation issue
- consider the possibility that the error might come from a dependency
Sadly, the result of our session was inconclusive. Even so, each time we failed, Claude insisted that even if we could not reproduce the bug, our evidence was strong enough.
The Second Day
Something I see a lot on X is people saying that you should use different models. So, on the second day, I went after the bug with Codex.
From this task, I can conclude that Codex is as competent as Claude. It did the same things and also reused Claude’s notes to start faster. We wrote even more tests, and I tried to get it to find something new, but nothing turned up.
An interesting difference: Claude is constantly asking to write its findings to an md file. For example, it analyzes the stack trace, tells me what it found, and then writes a report of that activity. Codex, on the other hand, just prints what it did and then continues, so certain information can get lost in the wall of text on the screen. Personally, I liked Claude’s approach, so I asked Codex to do the same now and then.
The second day was sadly as unfruitful as the first one, but I learned to work with Codex.
The Third and Fourth Days
The good thing about the first and second days is that I got a lot done. Doing the same by myself would probably have taken me one or two weeks. The productivity gain would have been great had I solved the bug. However, I did not, so it feels like I did not make any progress.
To be fair to both the AI and myself, this bug has happened only once in 4 months, and even its cause is not clear. I chose this bug because it would force me to look into the details of the code and so understand it better. So, it is probably extremely tricky to reproduce, and doing so requires a deep understanding of what is going on.
So, here we arrive at the problem of day three: the bug was not solved, and I still had very little understanding of the codebase. Indeed, I could say I vibe-understood what happened and what the agents had been doing, BUT I did not understand it deeply enough to do my own reasoning about it.
So, on the third and fourth days, the AI agents became my assistants. I used them mostly as ship pilots: they guided me through a place I was not very familiar with, but I was still the captain. I used them mostly with prompts like "show me the entry point of this functionality". Then I maneuvered by myself and tried to build a mental model of what was happening without their aid.
The Fifth Day
The fifth day was interesting. I needed to review a large piece of a specification. Sadly, this could not be done with an agent, and AI was not very helpful. Once again, to judge the correctness of a piece of spec, I need to compile it into my brain and understand exactly what the author meant. AI might explain a difficult passage, but for the moment I do not think that AI can help me review one of these.
Conclusion
To be honest, this was an exciting experience. I hope that in the coming weeks I will take on different challenges using AI. I hope you enjoyed this. See you in the next entry.
