Article · April 28, 2026 · 4 min read · 97 views

three papers, one book

I work with AI every day at IBM, and for a while that was enough to make me sick of it.

Not dramatically. More like the quiet way you get tired of something you can do without thinking. Some of it was work stress. Most of it was that I had started reaching for it for everything. Slack messages I could have written in thirty seconds. Variable names. Articles I should have just read. Disagreements I was having with people, as if a model could tell me whether I was being unreasonable. Playlists. I had a model name a playlist. I'm still embarrassed.

At some point I stopped being interested. I can't say when. The thing I had spent years finding interesting started to feel like a shortcut I'd taken too many times. I was still using it. I just didn't care.

Then Caio called me up to talk about a project. Agentic AI, something we're starting to study together. I don't need to get into the project here. The point is that chat. You know that kind of conversation you have with a friend who's been reading the same stuff you have, where the part of your brain that used to light up about this lights up again. I got home and started reading. Not work papers. Stuff I wanted to read.

I went back to Goodfellow's Deep Learning book first. I had owned it for a little bit without really sitting with it. What surprised me this time was how much of it I already half-knew without ever having earned it. You can know that an LLM does next-token prediction without ever watching the gradient flow that makes it possible. The book makes you watch it. It is slow. You have to want to be there.

Then Attention Is All You Need. I had read it before in the way everyone in this field has read it, meaning I had skimmed it once and could explain self-attention at a dinner table. Going back was different. There is something specific about reading a paper whose architecture you use every day at work and finally seeing the move it makes. The paper is short and it is clear. A lot of what came after wouldn't exist without it. I finished it and wanted to keep reading other papers, which I hadn't wanted to do in months.

Then On the Dangers of Stochastic Parrots. I want to be honest about this one because it is easy, when you work in this industry, to read something like it and treat it as a critique you're supposed to have heard of, file it, move on. I tried not to. The argument that stuck with me was the one about the illusion of understanding. These systems produce text that looks like it came from a mind, and the better they get at it the easier it gets to forget that there isn't one there. After I finished I caught myself thinking about how often I talk about a model like it is reasoning. Like it decided. Like it understood. I don't think the paper has changed how I will use these systems. It has changed the language I let myself use about them.

The last one was newer. Where Did It All Go Wrong? A Hierarchical Look into Multi-Agent Error Attribution. I read it because of Caio's project. The question it asks is the one we are going to have to figure out: when a system of agents fails, whose fault is it, and how do you even define fault when something is that distributed? The paper hasn't figured it out. Nobody has yet.

Three papers, one book in a few weeks. I wouldn't usually line them up like this in a post but they came in roughly this order and I wanted to put it down before the feeling went away. I'm studying again. I'm thinking about a master's. AI, deep learning, probably leaning toward the agent stuff. I'm not going to commit to anything in a blog post.

Using AI and studying AI had collapsed for me into the same thing, which they aren't. Using a model is what I do at work. Studying one is what is bringing me back to it. One had started to feel like a job. The other feels like what it used to feel like.

I'm going back to the chapters of Goodfellow I skipped. Caio and I are moving on the project. The Parrots paper is wide open on my desk (very much coffee-stained) because I want to revisit it. Things are moving.

1 comment

Caio Johnston · Apr 28, 2026

Now I am flattered. I am glad our conversation was the catalyst for both this episode and this article. This year, I reconnected with something I had not done in a long time: personal projects. Coding, designing cloud architectures, building pipelines, and thinking about service availability had started to feel like "I do this every single day working as a data scientist, why would I want to do it in my free time too?" I thought that keeping up with new things, which model launched last week, what framework was gaining traction, was simply a professional obligation, something I did at work to avoid falling behind. What I had not realized was that, somewhere along the way, I had lost the genuine curiosity that once made me want to truly understand these things, and the openness to fall in love with them all over again. I was wrong. It was precisely using AI to bring so many ideas that lived only in my imagination into reality that made me feel the joy of building personal projects again. Studying to actually learn, rather than just being able to say I know something to sound impressive in a daily meeting, reignited that spark. The problem was never AI itself, but how you use it and whether you confine it to the things it was designed to do. Dreaming is not one of them. That part I still handle myself. Great article, Giovanni! Count me in for whatever new thing is moving next. Now I am going to use Claude to polish this message before sending it, because writing well in English is kinda hard for me.