Approach

Rebuild to scale.
Don't patch the rot.

The fastest way to a product that holds isn't more fixes on a foundation that can't carry weight. It's the real version, built right, once. Here's how I think about it, and how the work runs.

Cleaning the drain won't save a house that's collapsing. You rebuild the whole house, starting with a stable foundation.
A clogged drain in a rotten house isn't a plumbing problem. Clearing it changes nothing for long. The honest move is to rebuild the structure that everything else sits on, and to start with a foundation you can keep iterating on.

A vibe-coded or no-code MVP is genuinely valuable: it proves the problem and the experience. It tells you people want this and shows what the product should feel like. That's the hard, ambiguous part, and it's done.

What it usually can't do is scale. The architecture was optimised for speed-to-demo, not for many users, real data, tenancy, or safety. So the moment the product succeeds, it starts fighting you: it crashes in production, leaks data, slows with every user, and every change breaks something else.

At that point, patching is a trap: each fix sits on the same failing foundation, and you pay for it again next month. Rebuild it to scale: keep the proven product and experience, replace the architecture underneath with something production-grade. We start right and set a foundation you can iterate on, one that solves the core technical problem for the volumes you actually expect. Done right, the rebuild beats the original, because now you're building with the answers the MVP gave you.


What "I work with tech" actually means

The fix is rarely the page or the function. It's two things underneath.

When a product won't scale, the answer is almost never "build a new screen" or "tweak a function." It's two deeper choices, working together: (1) the right tech stack (the languages, frameworks and databases the product is built from) and (2) the right infrastructure to deliver it on, how and where it actually runs.

The two are related but not the same: the technologies live on the infrastructure. Pick the wrong stack and no infrastructure saves you; pick the wrong infrastructure and the best stack still buckles under load. Getting both right is the difference between building websites and building systems. Below are the levers, and the roadmap that comes with them.

Split architecture

Microservices distributed across separate servers, so the system scales part by part instead of all-or-nothing, and one hot path can't take everything down.

Multi-database systems

The right store for each job (relational, vector, graph, cache) instead of forcing one database to do everything badly.

Global distribution

The system deployed geographically close to your users, so latency stays low wherever they are. Performance is an architecture decision.

Server architecture

How compute, queues, workers and data layers fit together: the part that decides whether the product holds at peak or buckles.

Language + tech choices

Picking the stack for the load and the team, not for fashion. The wrong choice here quietly caps the product for years.

Future-proofing

Built so the next ten changes are cheap, not catastrophic. The rebuild should make tomorrow easier, not lock you into today.

A scaling roadmap, ready on day one // what we do at each order of magnitude of concurrent users
1k
concurrent users

A clean monolith or modest split, one region, sensible defaults. Fast to ship, easy to reason about.

10k
concurrent users

Caching, read replicas, queues for the heavy work. The first seams where the architecture is designed to split.

100k
concurrent users

Services split along the hot paths, multi-database by job, autoscaling. Load isolated so one part can't sink the rest.

1M
concurrent users

Globally distributed, deployed near users, multi-region data. A plan that already exists on day one, not a panic later.

Note the unit: concurrent users, not total. A thousand people hitting the system at the same instant can mean a hundred thousand accounts behind them, depending on how often they show up. A million concurrent is giga-scale. The point isn't any single rung. It's that the architecture is built to see all of them from day one, so each jump is a planned move, not an emergency.
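As a rough illustration of reading the roadmap, here is a minimal Python sketch that turns an account count into an estimated concurrent load and looks up the matching rung. The names `Rung`, `ROADMAP`, `estimate_concurrent` and `rung_for` are hypothetical, the 1% peak share is an assumed figure for the example, and treating each rung as an inclusive upper bound is an assumption too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rung:
    up_to: int  # assumed inclusive upper bound on concurrent users
    plan: str   # summary of the roadmap entry for this rung


# Rungs summarised from the roadmap above.
ROADMAP = (
    Rung(1_000, "clean monolith or modest split, one region"),
    Rung(10_000, "caching, read replicas, queues for the heavy work"),
    Rung(100_000, "services split along hot paths, multi-database, autoscaling"),
    Rung(1_000_000, "globally distributed, deployed near users, multi-region data"),
)


def estimate_concurrent(total_accounts: int, share_online_at_peak: float) -> int:
    """Estimate peak concurrent users from accounts and an assumed peak share."""
    if total_accounts < 0 or not 0.0 <= share_online_at_peak <= 1.0:
        raise ValueError("accounts must be >= 0 and share must be within 0..1")
    return round(total_accounts * share_online_at_peak)


def rung_for(concurrent_users: int) -> Rung:
    """Return the first rung whose upper bound covers the given load."""
    if concurrent_users < 0:
        raise ValueError("concurrent_users must be >= 0")
    for rung in ROADMAP:
        if concurrent_users <= rung.up_to:
            return rung
    return ROADMAP[-1]  # beyond the last rung, the top plan still applies


if __name__ == "__main__":
    # 100,000 accounts with an assumed 1% online at peak is about 1,000 concurrent.
    load = estimate_concurrent(100_000, 0.01)
    print(load, "->", rung_for(load).plan)
```

The peak share is the number to measure in the real product; the sketch only shows why the same account base can sit on very different rungs.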


How I build with AI

Most AI products are token-guzzling slop. Mine aren't.

We live in a discount era. LLMs are cheap right now, priced well below what the tokens actually cost to serve, so most builders reach for the biggest model for every job and never feel the bill. That works until prices normalise. Then a system that wastes tokens becomes expensive as hell to scale, and unprofitable exactly when it starts to win.

I build the opposite way: fast, cheap and scalable. The rule is simple: never use an LLM for what plain executable code solves simply, and never use a more capable model than the problem needs. That isn't being cheap. It's the smart, durable choice, and it's a real differentiator while everyone else ships slop that dies at scale.

Right model, right problem

Pick a model trained for the actual job, not the most famous one. The best fit for the task usually isn't the biggest model on the leaderboard.

Never more capable than needed

Most jobs don't need a genius. A leaner model with the right tools beats a frontier model used as a sledgehammer, every time, on cost and often on reliability.

Don't bloat the context

Keep the context lean, and never reach for an LLM where plain executable code solves the problem. Systems designed to waste tokens are the quiet killer of margins at scale.
// When to spend

Frontier model: rare, high-value work

Crawling, scraping and synthesising large volumes of messy data into something useful. It runs rarely, and the value of the output is high, so paying for the most capable model is the right call here.

Capability earns its cost
// When to save

Lean model + tools: everyday queries

A routine support question is just "fetch the data and match it." That doesn't need a genius. A lean, cheaper model with the right tool-use answers it perfectly, at a fraction of the cost, every single time.

Right-sized, not under-powered
// The point

A cheap model isn't a cheap choice. It's the smart one.

Model routing per task: frontier where the output value is high, lean everywhere else, and plain code wherever an LLM was never needed. That's why "old", lean models still earn their place in my builds. It's deliberate, and it's what keeps a system profitable when you scale it to thousands of concurrent users.
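The routing rule above can be sketched in a few lines of Python. This is a minimal sketch, not a real router: `Task`, `Route` and `route` are hypothetical names, the three boolean flags are an assumed simplification of how a task would be classified, and no model API is called.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    PLAIN_CODE = "plain_code"          # no LLM involved at all
    LEAN_MODEL = "lean_model"          # cheaper model plus tools
    FRONTIER_MODEL = "frontier_model"  # most capable, most expensive


@dataclass(frozen=True)
class Task:
    name: str
    deterministic: bool      # assumed flag: plain executable code solves it
    high_value_output: bool  # assumed flag: the output is worth paying for
    runs_rarely: bool        # assumed flag: low call volume


def route(task: Task) -> Route:
    """Pick the cheapest option that fits the task."""
    if task.deterministic:
        return Route.PLAIN_CODE
    if task.high_value_output and task.runs_rarely:
        return Route.FRONTIER_MODEL
    return Route.LEAN_MODEL


if __name__ == "__main__":
    tasks = [
        Task("format an invoice number", True, False, False),
        Task("routine support question", False, False, False),
        Task("synthesise scraped market data", False, True, True),
    ]
    for task in tasks:
        print(f"{task.name}: {route(task).value}")
```

The order of the checks is the design choice: plain code is tried first, the frontier model has to earn its place on both value and rarity, and everything else falls through to the lean model.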


Is this you

Signals you've hit the wall.

If two or more of these are true, you're past the point where patching pays off.

Your MVP won't scale

It proved the market, but every new user makes it slower and more fragile. The architecture was never built for this.

The app crashes in production

It works in the demo and falls over under real load, real data, real edge cases. Patches buy days, not stability.

A vibe-coded app won't scale

AI got you a working prototype fast, then every change broke two other things. The codebase resists its own growth.

You need to move off no-code

Bubble, Lovable or similar got you live, but you've hit the ceiling: cost, control, performance, or data ownership.

It's not production-ready

No real auth, leaky data boundaries, no tenancy, nothing tested. It demos well and is dangerous to ship as-is.

You want a fractional CTO who builds

Not an advisor with slides, but someone who writes the production code, owns the architecture, and makes the product calls too.

Why one operator

One senior head beats a team.

To cover all four verticals properly (product, tech, strategy and design) you don't hire one or two people. You hire 7 to 12. And the moment you do, you inherit three structural problems a single operator simply doesn't have.

The team you'd actually need
~7 to 12 people · sequential · handoffs everywhere
// Design & brand side
PM → Strategist → Brand strategist → Graphic designer → Copywriter
// Tech side
UX researcher → Info architect → Frontend designer → System architect → Backend dev → Infra specialist → Tester
Each → is a handoff: a place where one person has to synthesise their knowledge for the next, and where context leaks. Work runs in series: each role waits on the one before it.
One senior head
parallel · no interfaces
1 point of failure
Product · Tech · Strategy · Design
All four disciplines in one head: no briefs, no handoffs, no translation loss, and the decisions happen together, in parallel.
Knowledge leaks at every handoff

Each person has to synthesise their knowledge so the next can understand it. Something is lost at every step: context, intent, the reason behind a decision. Things get missed.

It can't be parallelised

A team works in series, with dependencies: the designer waits on the strategist, the dev waits on the designer. One head runs all four disciplines in parallel. That's where weeks-not-quarters comes from.

~12 points of failure vs 1

It only takes one specialist not performing at their best to drag the whole result down. Across 7 to 12 people that's up to a dozen points of failure. With one operator, it's one.

The process

Diagnosis → Rebuild → Retainer.

Three phases, each with a clear deliverable and a clear decision point. You never commit to the next phase blind.

// 01 · Diagnosis

Architecture review

A focused review of your current product and code. I map where it breaks, why it won't scale, what's worth keeping, and what a rebuild realistically costs in time and scope. You get a written read on the architecture and a concrete plan, useful even if you stop here.

Fixed price · ~1 week
// 02 · Rebuild

Rebuild to scale

The real version, built right: production-grade architecture, proper auth and data boundaries, multi-tenancy where it's needed, the parts that hold under load. I keep the product and experience you proved and replace the foundation. Weeks, not quarters, because one senior operator spanning product, tech, strategy and design doesn't lose time in handoffs.

Weeks · not quarters
// 03 · Retainer

Strategic build partner

Optional, ongoing. A sounding board for product and tech decisions, made with someone who knows the codebase from the inside, plus continued building as you grow. This is strategy and development, not ops or on-call. The goal is leverage, not lock-in.

Ongoing · optional

0 up-front risk
// The guarantee

A delivery guarantee, in writing. Designed never to trigger.

Agencies rarely put a clock on themselves. I do, because after twenty years I know what I can deliver. We agree the spec and the timeline up front, then I stand behind both.

// 01 · Together

We write the spec

Roughly a week, side by side: the real problem, how it gets solved, and your conditions. It needs you to be agile too. That's the deal.

// 02 · From the spec

Estimate + terms

The spec drives an honest time estimate, which drives clear delivery terms. No vague scope, no moving goalposts.

// 03 · On the clock

Linear time guarantee

Beat the deadline and nothing changes. Miss it and the discount kicks in automatically, no negotiation.
// Overrun → discount · simple and linear
+10% over = 10% off
+25% over = 25% off
+50% over = 50% off
+100% over = free

Go past the agreed timeline by any percentage and the price drops by the same percentage. Take twice as long as promised, and the rebuild is on me. The ambition is that it never moves off zero.
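The arithmetic behind the guarantee fits in a few lines. This is a minimal Python sketch of the linear rule as described here, not contract wording: the function names and the use of days as the unit are assumptions, and the discount is capped at 100% because the table stops at "free".

```python
def discount_fraction(agreed_days: float, actual_days: float) -> float:
    """Discount as a fraction of the price: equal to the overrun, capped at 1.0."""
    if agreed_days <= 0:
        raise ValueError("agreed_days must be positive")
    overrun = (actual_days - agreed_days) / agreed_days
    # On time or early means no discount; twice the agreed time or more means free.
    return min(1.0, max(0.0, overrun))


def amount_due(price: float, agreed_days: float, actual_days: float) -> float:
    """Price left to pay after the overrun discount is applied."""
    return price * (1.0 - discount_fraction(agreed_days, actual_days))


if __name__ == "__main__":
    # Hypothetical project: 10,000 agreed for 40 days of work.
    for actual in (36, 40, 44, 50, 60, 80):
        print(actual, "days ->", amount_due(10_000, 40, actual))
```

With the hypothetical numbers above, delivering in 50 days instead of 40 is a 25% overrun, so 7,500 is due; at 80 days the amount due is zero.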

Not sure if it's a patch or a rebuild?

That's exactly what the diagnosis answers. Book a 20-minute call and we'll talk through where your product is and what it would take to make it scale. No pitch.