I’ve been thinking about where tech is headed, and the more I watch what’s happening with LLMs and vibe-coding tools, the more I feel like we’re misnaming the shift. People keep saying “software development is getting faster.” That’s true, but it’s not the main thing. The main thing is that the whole shape of building software is changing. It’s starting to look less like a craft with handoffs and more like a continuously running production system.
For years, the big question was “What stack are we on?” Now the question everyone asks is “Which model is best?” And I don’t think that’s the right question either. The future doesn’t look like one model dominating everything. It looks like a portfolio. A fast model for routine work, a deeper model for harder reasoning, something specialized for codebases, something strict for security, maybe something tuned for a domain. The winners won’t be the ones who picked the fanciest model. They’ll be the ones who built the best system to route, measure, and constrain whatever model they use.
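To make “portfolio” concrete, here’s a minimal sketch in Python. The model names and routing rules are invented, and a real router would be driven by measurement rather than hardcoded rules, but the shape is the point: classify the task, then pick the cheapest model that can carry it.

```python
from dataclasses import dataclass

@dataclass
class Task:
    kind: str   # e.g. "refactor", "security_review", "design"
    risk: str   # "low" | "medium" | "high"

# Hypothetical portfolio: cheap and fast by default, escalate by kind or risk.
PORTFOLIO = {
    "fast":      "small-model-v1",       # routine edits, boilerplate
    "reasoning": "deep-model-v1",        # tricky logic, design questions
    "code":      "code-tuned-model-v1",  # large-codebase awareness
    "security":  "strict-model-v1",      # reviews with tight guardrails
}

def route(task: Task) -> str:
    if task.kind == "security_review":
        return PORTFOLIO["security"]
    if task.risk == "high" or task.kind == "design":
        return PORTFOLIO["reasoning"]
    if task.kind == "refactor":
        return PORTFOLIO["code"]
    return PORTFOLIO["fast"]
```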
Vibe coding is the perfect example of why this shift feels both exciting and dangerous. When you can describe what you want and the code appears, it feels like you’ve removed the bottleneck. But it also creates a new imbalance: generating code becomes cheap, and verifying it becomes the real cost. And verification isn’t a “nice-to-have.” It’s the entire game. Most engineering pain was never typing. It was understanding what you’re building, protecting users, preventing regressions, dealing with edge cases, and making sure the product doesn’t fall apart in production.
That’s why I think the SDLC isn’t just “evolving” – it’s bending into a loop.
The old mental model was a line: requirements, design, build, test, deploy, maintain. The new reality is closer to a cycle: intent, generate, verify, ship safely, observe, learn, regenerate. When change becomes easy, you either build control along with speed, or you end up shipping chaos faster than you can understand it.
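Here’s the loop as a toy Python sketch. Every stage is a stub I made up; in a real pipeline they’d call your model, your test suite, and your deploy tooling. What matters is that verify and observe are first-class steps, and failures feed back in as constraints instead of falling on the floor.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    passed: bool
    failure_reason: str = ""

# Stand-in stages; real versions would call a model, run tests, deploy, etc.
def generate(intent, constraints): return f"change for {intent!r} given {constraints}"
def verify(change):                return Evidence(passed=True)
def ship_safely(change):           print(f"shipping: {change}")
def observe(change):               return ["latency ok", "error rate ok"]
def learn(signals):                return []  # constraints distilled from prod signals

def delivery_loop(intent: str, cycles: int = 3) -> None:
    constraints: list[str] = []
    for _ in range(cycles):
        change = generate(intent, constraints)
        evidence = verify(change)
        if not evidence.passed:
            constraints.append(evidence.failure_reason)
            continue  # regenerate under tighter constraints
        ship_safely(change)
        constraints.extend(learn(observe(change)))  # feed production back in

delivery_loop("add retry logic to the payments client")
```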
There’s a part of this conversation that people avoid saying out loud because it ruins the hype: a lot of “agentic” projects are going to disappoint. Not because the idea is wrong, but because the implementation is backwards. Teams fall in love with output. “Look, it opened a PR.” “Look, it wrote the tests.” “Look, it deployed.” And then a few months later they’re stuck with rising costs, uneven quality, security noise, drifting architecture, and senior engineers spending their lives cleaning up. The problem usually isn’t that the models aren’t smart enough. It’s that the system around them wasn’t built to produce trustworthy results.
The missing layer is what I think of as the trust pipeline. Not a vague sense of “we’ll review it,” but a real set of constraints and evidence that turns AI output into something you can ship without fear. Every meaningful change needs to carry proof: what was intended, what changed, what tests validate it, what risks remain, and how to roll back. The more AI generates, the more “evidence” becomes the real unit of progress. Without evidence, you don’t have speed – you have movement.
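What does carrying proof look like in practice? Something like this sketch: the field names are my own invention, but the gate is the idea. A change that shows up without tests or a rollback plan simply doesn’t ship.

```python
from dataclasses import dataclass

@dataclass
class ChangeEvidence:
    intent: str             # what the change was supposed to do
    diff_summary: str       # what actually changed
    tests: list[str]        # tests that validate it
    known_risks: list[str]  # what could still go wrong
    rollback_plan: str      # how to undo it quickly

def ready_to_ship(ev: ChangeEvidence) -> bool:
    # However good the diff looks: no tests or no rollback plan, no ship.
    return bool(ev.tests) and bool(ev.rollback_plan)

pr = ChangeEvidence(
    intent="cache user profile lookups",
    diff_summary="adds a read-through cache in profile_service",
    tests=["test_cache_hit", "test_cache_expiry"],
    known_risks=["stale profiles for up to 60s"],
    rollback_plan="feature flag: profile_cache_enabled=false",
)
assert ready_to_ship(pr)
```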
Economics is the other layer founders can’t ignore. Once you’re using multiple models and agents, AI isn’t just a productivity story – it’s a spend category. The teams that scale won’t run every task on the “smartest” model by default; they’ll route work the way good systems route compute: cheap-first for routine changes, deeper reasoning only when the risk or complexity demands it. Caching, batching, and reusing context start to matter. And ROI stops being a vague promise and becomes measurable: cycle time reduction, fewer escaped defects, lower incident rates, and less senior-engineer time spent on cleanup. If you can’t connect your AI workflow to those outcomes, you’re not building a factory – you’re just burning tokens.
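Cheap-first routing plus caching can be this small. The models, the per-call costs, and the acceptable() check below are all placeholders; the pattern is what matters: reuse what you’ve already paid for, start cheap, and escalate only when verification says you have to.

```python
import hashlib

CACHE: dict[str, str] = {}
SPEND = {"cheap-model": 0.0, "deep-model": 0.0}
COST = {"cheap-model": 0.001, "deep-model": 0.03}  # assumed dollars per call

def call_model(model: str, prompt: str) -> str:
    SPEND[model] += COST[model]
    return f"{model} answer to: {prompt}"  # stand-in for a real API call

def acceptable(answer: str) -> bool:
    return True  # stand-in for your verification step

def solve(prompt: str) -> str:
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in CACHE:
        return CACHE[key]                          # repeats cost nothing
    answer = call_model("cheap-model", prompt)     # cheap-first by default
    if not acceptable(answer):
        answer = call_model("deep-model", prompt)  # escalate only on failure
    CACHE[key] = answer
    return answer
```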
And if you’re going to let AI touch anything close to production, governance can’t live in slide decks. It has to be real. Permissions need to be explicit. High-risk actions need approvals. There needs to be an audit trail for what happened and why. This isn’t bureaucracy for the sake of it. It’s what allows you to scale output without scaling fragility.
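Even a governance layer this crude beats slide decks: explicit high-risk actions (the names here are hypothetical), a required approver for them, and an audit trail of who did what and why.

```python
import datetime

HIGH_RISK = {"deploy_production", "modify_iam", "delete_data"}  # hypothetical actions
AUDIT_LOG: list[dict] = []

def perform(action: str, agent: str, reason: str, approver: str | None = None) -> bool:
    allowed = action not in HIGH_RISK or approver is not None
    AUDIT_LOG.append({  # every attempt is recorded, allowed or not
        "when": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "agent": agent, "action": action, "reason": reason,
        "approver": approver, "allowed": allowed,
    })
    return allowed  # the caller proceeds only if True

assert not perform("deploy_production", "agent-7", "hotfix")                # blocked
assert perform("deploy_production", "agent-7", "hotfix", approver="alice")  # approved
```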
The irony is that as the machines do more of the typing, engineering becomes more human, not less. The valuable work moves up the stack: deciding what to build, setting constraints, designing boundaries, anticipating failure modes, building safe rollout patterns, and learning from production signals. It’s less about grinding through implementation and more about judgment. What’s acceptable risk? What’s the right tradeoff? What’s the simplest thing that won’t create a mess six months from now?
So when I zoom out, I don’t think the future belongs to the teams that can generate the most code. It belongs to the teams that can generate code and still keep their system coherent. The real winners will be the ones who build a software factory that compounds: generation on one side, trust on the other, and a tight feedback loop from production back into better constraints and better tests. In the AI era, writing code is getting easier every month. Proving it’s the right code – and shipping it safely – is where the business will be won.