Every week my LinkedIn feed fills with people bragging about how many tokens their engineers burn, or saying "don't worry about the tokens, this is what productivity looks like now." Last week, the CEO of Nvidia said your $500K engineer should be spending $250K a year on tokens.
My first reaction was honestly: am I missing something? I'm on a Claude Ultra account, coding every single day, shipping at a pace I've never shipped before, and I'm nowhere near those numbers. I almost never hit my quota. So maybe I really am missing something.
Who Remembers Heroku and the Dyno?
~15 years ago, Heroku democratized cloud computing. You could spin up a Ruby on Rails app and deploy it in minutes. It was revolutionary. They had this concept called a Dyno — a unit of compute with fixed constraints. The smallest one gave you 512MB of RAM.
When you hit your first out-of-memory error, some people saw that as a bug.
It was a feature.
Those constraints taught you how to build proper database indexes. How to paginate. How to be intentional about the data you load into a page. Those guardrails forced you to build solutions that scaled and stayed cost-efficient from day one. The skills I learned from working within those constraints shape how I think about architecture to this day.
I think ever-growing context windows are going to become a crutch, one that leads to bad, inefficient engineering.
So here's what I think is missing from the "look how many tokens we burn" conversation: the word "pragmatic."
Listen, I'm all for using AI to build at speed and at scale. This isn't an either/or.
It's a yes, AND:
- Yes, use AI aggressively — AND think about token efficiency.
- Yes, empower your engineers — AND put patterns and constraints in place.
- Yes, move fast — AND be thoughtful about when you release those constraints.
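To make "patterns and constraints" a bit more concrete, here is a minimal sketch of a Dyno-style cap applied to model context: a hard token budget, with the most relevant material packed in first. Everything in it is an assumption for illustration. `Snippet`, `estimate_tokens`, and `pack_context` are hypothetical names, the relevance scores are made up, and the 4-characters-per-token estimate is a rough rule of thumb, not how any real tokenizer counts.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    name: str
    text: str
    relevance: float  # hypothetical score; higher means more useful


def estimate_tokens(text: str) -> int:
    # Rough assumption: about 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def pack_context(snippets: list[Snippet], budget: int) -> list[Snippet]:
    """Greedily keep the most relevant snippets that fit within the budget."""
    chosen: list[Snippet] = []
    used = 0
    for s in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(s.text)
        if used + cost <= budget:  # skip anything that would blow the cap
            chosen.append(s)
            used += cost
    return chosen


if __name__ == "__main__":
    candidates = [
        Snippet("schema.sql", "x" * 400, 0.9),          # roughly 100 tokens
        Snippet("whole_repo_dump", "x" * 40_000, 0.5),  # roughly 10,000 tokens
        Snippet("failing_test.py", "x" * 800, 0.8),     # roughly 200 tokens
    ]
    picked = pack_context(candidates, budget=512)
    print([s.name for s in picked])  # ['schema.sql', 'failing_test.py']
```

The point is the same one the 512MB Dyno made: the budget forces a decision about what actually belongs in the request, and the giant repo dump gets left out instead of quietly paid for.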
Constraints aren't the enemy of productivity. They're the foundation of it.