Rebuild to scale.
Don't patch the rot.
The fastest way to a product that holds isn't more fixes on a foundation that can't carry weight. It's the real version, built right, once. Here's how I think about it, and how the work runs.
A vibe-coded or no-code MVP is genuinely valuable: it proves the problem and the experience. It tells you people want this and shows what the product should feel like. That's the hard, ambiguous part, and it's done.
What it usually can't do is scale. The architecture was optimised for speed-to-demo, not for many users, real data, tenancy, or safety. So the moment the product succeeds, it starts fighting you: it crashes in production, leaks data, slows with every user, and every change breaks something else.
At that point, patching is a trap: each fix sits on the same failing foundation, and you pay for it again next month. Rebuild it to scale: keep the proven product and experience, replace the architecture underneath with something production-grade. We start right and set a foundation you can iterate on, one that solves the core technical problem for the volumes you actually expect. Done right, the rebuild beats the original, because now you're building with the answers the MVP gave you.
The fix is rarely the page or the function. It's two things underneath.
When a product won't scale, the answer is almost never "build a new screen" or "tweak a function." It's two deeper choices, working together: (1) the right tech stack (the languages, frameworks and databases the product is built from) and (2) the right infrastructure to deliver it on (how and where it actually runs).
The two are related but not the same: the technologies live on the infrastructure. Pick the wrong stack and no infrastructure saves you; pick the wrong infrastructure and the best stack still buckles under load. Getting both right is the difference between building websites and building systems. Below are the levers, and the roadmap that comes with them.
Split architecture
Microservices deployed across separate servers, so the system scales part by part rather than all-or-nothing, and one hot path can't take everything down.
Multi-database systems
The right store for each job (relational, vector, graph, cache) instead of forcing one database to do everything badly.
Global distribution
The system deployed geographically close to your users, so latency stays low wherever they are. Performance is an architecture decision.
Server architecture
How compute, queues, workers and data layers fit together: the part that decides whether the product holds at peak or buckles.
Language + tech choices
Picking the stack for the load and the team, not for fashion. The wrong choice here quietly caps the product for years.
Future-proofing
Built so the next ten changes are cheap, not catastrophic. The rebuild should make tomorrow easier, not lock you into today.
A clean monolith or modest split, one region, sensible defaults. Fast to ship, easy to reason about.
Caching, read replicas, queues for the heavy work. The first seams where the architecture is designed to split.
Services split along the hot paths, multi-database by job, autoscaling. Load isolated so one part can't sink the rest.
Globally distributed, deployed near users, multi-region data. A plan that already exists on day one, not a panic later.
Note the unit: concurrent users, not total. A thousand people hitting the system at the same instant can mean a hundred thousand accounts behind them, depending on how often they show up. A million concurrent is giga-scale. The point isn't any single rung. It's that the architecture is built to see all of them from day one, so each jump is a planned move, not an emergency.
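To make the concurrent-versus-total distinction concrete, here is a small arithmetic sketch. The function name and the "share online at peak" parameter are illustrative assumptions, not figures from a real system; the 1% value is simply the ratio implied by the thousand-to-hundred-thousand example above.

```python
def accounts_behind(concurrent_users: int, share_online_at_peak: float) -> int:
    """Estimate how many accounts sit behind a concurrency figure.

    share_online_at_peak is a hypothetical input: the fraction of all
    accounts that are active at the busiest instant.
    """
    if not 0 < share_online_at_peak <= 1:
        raise ValueError("share_online_at_peak must be in (0, 1]")
    return round(concurrent_users / share_online_at_peak)


# If only 1% of accounts are online at the same instant,
# 1,000 concurrent users implies about 100,000 accounts.
print(accounts_behind(1_000, 0.01))  # → 100000
```

The same concurrency number can describe very different products, which is why the rungs are stated in concurrent users rather than sign-ups.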
Most AI products are token-guzzling slop. Mine aren't.
We live in a discount era. LLMs are cheap right now, priced well below what the tokens actually cost to serve, so most builders reach for the biggest model for every job and never feel the bill. That works until prices normalise. Then a system that wastes tokens becomes expensive as hell to scale, and unprofitable exactly when it starts to win.
I build the opposite way: fast, cheap and scalable. The rule is simple: never use an LLM for what plain executable code solves simply, and never use a more capable model than the problem needs. That isn't being cheap. It's the smart, durable choice, and it's a real differentiator while everyone else ships slop that dies at scale.
Frontier model: rare, high-value work
Crawling, scraping and synthesising large volumes of messy data into something useful. It runs rarely, and the value of the output is high, so paying for the most capable model is the right call here.
Capability earns its cost
Lean model + tools: everyday queries
A routine support question is just "fetch the data and match it." That doesn't need a genius. A lean, cheaper model with the right tool-use answers it perfectly, at a fraction of the cost, every single time.
Right-sized, not under-powered
A cheap model isn't a cheap choice. It's the smart one.
Model routing per task: frontier where the output value is high, lean everywhere else, and plain code wherever an LLM was never needed. That's why "old", lean models still earn their place in my builds. It's deliberate, and it's what keeps a system profitable when you scale it to thousands of concurrent users.
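The routing rule can be sketched in a few lines. This is an illustration under stated assumptions, not the routing used in any actual build: the `Tier` names, the `Task` fields and the example tasks are all hypothetical, and a real router would likely weigh more signals than three booleans.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    CODE = "plain code"            # no LLM call at all
    LEAN = "lean model + tools"    # cheaper model with tool use
    FRONTIER = "frontier model"    # most capable, most expensive


@dataclass(frozen=True)
class Task:
    name: str
    deterministic: bool      # ordinary code solves it (lookup, parsing, arithmetic)
    messy_input: bool        # large, unstructured data that needs synthesis
    high_value_output: bool  # the result justifies a premium per run


def route(task: Task) -> Tier:
    """Pick the cheapest tier that can do the job."""
    if task.deterministic:
        return Tier.CODE
    if task.messy_input and task.high_value_output:
        return Tier.FRONTIER
    return Tier.LEAN


# Hypothetical tasks, one per tier.
tasks = [
    Task("format an invoice total", True, False, False),
    Task("routine support question", False, False, False),
    Task("synthesise crawled market data", False, True, True),
]
for t in tasks:
    print(f"{t.name}: {route(t).value}")
```

The order of the checks carries the design choice: plain code is tried first, the frontier tier has to be earned by both conditions, and everything else falls to the lean model.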
Signals you've hit the wall.
If two or more of these are true, you're past the point where patching pays off.
One senior head beats a team.
To cover all four verticals properly (product, tech, strategy and design) you don't hire one or two people. You hire 7 to 12. And the moment you do, you inherit three structural problems a single operator simply doesn't have.
Diagnosis → Rebuild → Retainer.
Three phases, each with a clear deliverable and a clear decision point. You never commit to the next phase blind.
Architecture review
A focused review of your current product and code. I map where it breaks, why it won't scale, what's worth keeping, and what a rebuild realistically costs in time and scope. You get a written read on the architecture and a concrete plan, useful even if you stop here.
Rebuild to scale
The real version, built right: production-grade architecture, proper auth and data boundaries, multi-tenancy where it's needed, the parts that hold under load. I keep the product and experience you proved and replace the foundation. Weeks, not quarters, because one senior operator spanning product, tech, strategy and design doesn't lose time in handoffs.
Strategic build partner
Optional, ongoing. A sounding board for product and tech decisions, made with someone who knows the codebase from the inside, plus continued building as you grow. This is strategy and development, not ops or on-call. The goal is leverage, not lock-in.
risk
A delivery guarantee, in writing. Designed never to trigger.
Agencies never put a clock on themselves. I do, because after twenty years I know what I can deliver. We agree the spec and the timeline up front, then I stand behind both.
Overrun the agreed timeline by any percentage and you pay that same percentage less. Take twice as long as promised, and the rebuild is on me. The ambition is that the discount never moves off zero.
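One way to read the guarantee as arithmetic is sketched below. It assumes the discount is linear in the overrun and capped at 100%, which matches the two stated points (no overrun, full price; double the time, free) but is an interpretation, not contract wording. The function name, the price and the day counts are hypothetical.

```python
def amount_due(agreed_price: float, agreed_days: int, actual_days: int) -> float:
    """Price owed after applying an overrun discount.

    Assumed rule: discount fraction = overrun / agreed timeline, capped at 1.
    Finishing early or on time leaves the price unchanged.
    """
    if agreed_price < 0 or agreed_days <= 0 or actual_days < 0:
        raise ValueError("price and actual_days must be >= 0, agreed_days > 0")
    overrun = max(0, actual_days - agreed_days) / agreed_days
    discount = min(1.0, overrun)
    return agreed_price * (1.0 - discount)


# Hypothetical figures: a 40-day rebuild quoted at 10,000.
print(amount_due(10_000, 40, 40))  # on time → 10000.0
print(amount_due(10_000, 40, 50))  # 25% over → 7500.0
print(amount_due(10_000, 40, 80))  # twice as long → 0.0
```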
Not sure if it's a patch or a rebuild?
That's exactly what the diagnosis answers. Book a 20-minute call and we'll talk through where your product is and what it would take to make it scale. No pitch.