AI sales platform · in production

autoply

An AI sales agent for the trades. It answers leads within minutes, runs cost calculations live in chat, books meetings, and carries the whole sales journey from embeddable widget to signed contract, across 20 trade specialisations, as fully multi-tenant SaaS running in production. Built cheap to run, with a clear plan to scale. Built solo in roughly two months.

By request · early stage

autoply runs by request only, with a select set of pilot clients, while testing is completed ahead of scaling. That pace is the point: build slowly, test-driven and by request, rather than ship fast and crash. The platform is genuinely live, with real tenants and real invoices, precisely because it was grown deliberately rather than rushed to a public launch.

~2 mo

From problem to live, multi-tenant product in production.

18

Parallel PM2 workers: chat, AI, SMS, email, scoring, crawl, billing.

20

Trade verticals with curated knowledge and cost models.

8 to 10

Paying pilot tenants in ~10 weeks, no external marketing.

// What it does

The feature set, in production

True multi-tenancy

Every query scoped by tenant_id, separate branch and team tables, role-based access, plan-based quotas (leads, SMS, pipelines, users) and per-tenant token tracking with warning thresholds.
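A minimal sketch of how tenant scoping and plan quotas like these might look. Only `tenant_id` comes from the description above; `scopedSelect`, `checkQuota`, the `Plan` shape and the table names are hypothetical and used for illustration.

```typescript
// Hypothetical sketch: only tenant_id comes from the text; every other name is an assumption.
type Resource = "leads" | "sms" | "pipelines" | "users";
type QuotaStatus = "ok" | "warning" | "exceeded";

interface Plan {
  quotas: Record<Resource, number>; // hard limit per billing period
  warnAtPercent: number;            // e.g. 80 means warn at 80 % of the quota
}

// Tables that may be queried through the helper (fixed allow-list).
type Table = "leads" | "pipelines" | "users";

interface SqlQuery {
  text: string;
  values: unknown[];
}

// Build a parameterised SELECT that is always filtered by tenant_id first.
// Column names in `where` must come from application code, never from user input.
function scopedSelect(table: Table, tenantId: string, where: Record<string, unknown> = {}): SqlQuery {
  const columns = Object.keys(where);
  for (const column of columns) {
    if (!/^[a-z_]+$/.test(column)) throw new Error(`Invalid column name: ${column}`);
  }
  const clauses = ["tenant_id = $1", ...columns.map((c, i) => `${c} = $${i + 2}`)];
  return {
    text: `SELECT * FROM ${table} WHERE ${clauses.join(" AND ")}`,
    values: [tenantId, ...columns.map((c) => where[c])],
  };
}

// Compare usage against the plan limit and its warning threshold.
function checkQuota(plan: Plan, resource: Resource, used: number): QuotaStatus {
  const limit = plan.quotas[resource];
  if (used >= limit) return "exceeded";
  if (used * 100 >= limit * plan.warnAtPercent) return "warning";
  return "ok";
}
```

Routing every read through one helper makes it hard to forget the tenant filter, and the integer percentage keeps the threshold comparison free of floating-point surprises.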

Model routing per task

The right model for each job, not one model for everything: a lean model with tool-calling for everyday inbound / outbound / support, frontier capability reserved for the rare, high-value work like analysing a scraped site. Dual-key failover with 60-second recovery, role-based tool gating, and self-healing JSON repair for broken model output.
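The routing, failover and JSON repair described above could be sketched roughly as follows. The task names, the `KeyFailover` class and the repair heuristics are assumptions made for illustration; in particular, "60-second recovery" is read here as "retry the primary key 60 seconds after it failed", which may not match the production policy.

```typescript
// Hypothetical sketch: task names, tiers and the failover policy are assumptions.
type Task = "inbound" | "outbound" | "support" | "site_analysis";
type Tier = "lean" | "frontier";

// Everyday traffic goes to the lean tier; only site analysis gets the frontier tier.
function routeModel(task: Task): Tier {
  return task === "site_analysis" ? "frontier" : "lean";
}

// Two API keys: fall back to the secondary while the primary is cooling down.
class KeyFailover {
  private primary: string;
  private secondary: string;
  private recoveryMs: number;
  private primaryFailedAt: number | null = null;

  constructor(primary: string, secondary: string, recoveryMs: number = 60_000) {
    this.primary = primary;
    this.secondary = secondary;
    this.recoveryMs = recoveryMs;
  }

  markPrimaryFailed(now: number = Date.now()): void {
    this.primaryFailedAt = now;
  }

  currentKey(now: number = Date.now()): string {
    const failedAt = this.primaryFailedAt;
    if (failedAt !== null && now - failedAt < this.recoveryMs) return this.secondary;
    return this.primary;
  }
}

// Best-effort repair of model output: drop any wrapper text around the object
// and remove trailing commas. A naive heuristic; it can mangle strings that
// themselves contain ",}" or ",]". Returns null when nothing parseable remains.
function repairJson(raw: string): unknown {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  const candidate = raw.slice(start, end + 1).replace(/,\s*([}\]])/g, "$1");
  try {
    return JSON.parse(candidate);
  } catch {
    return null;
  }
}
```

Passing `now` explicitly keeps the failover logic deterministic and easy to test without waiting a real minute.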

In-chat cost calculators

Cost models for 18 of 20 verticals run locally in Node (not in the model), triggered by a tool call and delivered as a structured price card: demo-friendly and conversion-optimised, not loose text. Code where code is enough.

Embeddable widget

Self-contained vanilla-JS script tag: configurable position / colour / icon, public API, booking context, session replay on reconnect, auto-detected multilingual UI.

Auto demo-onboarding

Enter a tradesperson's URL → crawl → a capable model analyses products, services, FAQs, tone and SEO → pipelines materialise and a chatbot config is generated, in 2 to 3 minutes. This is exactly where frontier capability earns its cost. Competitors need a sales team to do this by hand.

Full sales journey

Meeting booking with .ics invites, Google Calendar + CalDAV sync, IMAP/OAuth, eIDAS contract signing, invoice-PDF generation: widget to signed deal.
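As an illustration of the booking step, a minimal .ics invite can be assembled by hand in the iCalendar format (RFC 5545). The `Meeting` shape, the `PRODID` value and the function names are hypothetical; a production invite would also need line folding at 75 octets and probably time-zone handling, both omitted here.

```typescript
// Hypothetical sketch: field names and PRODID are illustrative, not the platform's own.
interface Meeting {
  uid: string;
  start: Date;
  end: Date;
  summary: string;
  organizerEmail: string;
  attendeeEmail: string;
}

// Format a Date as an iCalendar UTC timestamp, e.g. 20250301T090000Z.
function toIcsUtc(d: Date): string {
  return d.toISOString().replace(/[-:]/g, "").replace(/\.\d{3}Z$/, "Z");
}

// Escape backslash, semicolon, comma and newlines in TEXT values.
function escapeIcsText(text: string): string {
  return text
    .replace(/\\/g, "\\\\")
    .replace(/;/g, "\\;")
    .replace(/,/g, "\\,")
    .replace(/\r?\n/g, "\\n");
}

// Build a single-event meeting request. Lines are CRLF-terminated as the format requires.
function buildInvite(m: Meeting, now: Date = new Date()): string {
  const lines = [
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "PRODID:-//example//booking-sketch//EN",
    "METHOD:REQUEST",
    "BEGIN:VEVENT",
    `UID:${m.uid}`,
    `DTSTAMP:${toIcsUtc(now)}`,
    `DTSTART:${toIcsUtc(m.start)}`,
    `DTEND:${toIcsUtc(m.end)}`,
    `SUMMARY:${escapeIcsText(m.summary)}`,
    `ORGANIZER:mailto:${m.organizerEmail}`,
    `ATTENDEE;RSVP=TRUE:mailto:${m.attendeeEmail}`,
    "END:VEVENT",
    "END:VCALENDAR",
  ];
  return lines.join("\r\n") + "\r\n";
}
```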

// How it's built

The architecture

Backend · runtime

Node.js 20+, Express REST + WebSocket, 18 parallel PM2 cluster-mode workers (chat, ai, sms, email, scoring, intel, crawl, outbound, billing and more).
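A rough sketch of how the named workers could be declared for PM2 in cluster mode. The worker names are the ones listed above (the text names nine of the eighteen); the `worker-` prefix, script paths and instance count are assumptions. `name`, `script`, `instances` and `exec_mode` are standard PM2 ecosystem-file fields.

```typescript
// Hypothetical sketch: script paths, name prefix and instance counts are assumptions.
interface Pm2App {
  name: string;
  script: string;
  instances: number;
  exec_mode: "cluster" | "fork";
}

// Worker names as listed in the text; the remaining workers are not named there.
const WORKERS = ["chat", "ai", "sms", "email", "scoring", "intel", "crawl", "outbound", "billing"] as const;

function buildApps(instances: number = 1): Pm2App[] {
  return WORKERS.map((name): Pm2App => ({
    name: `worker-${name}`,
    script: `./dist/workers/${name}.js`,
    instances,
    exec_mode: "cluster",
  }));
}

// In an ecosystem.config.js this would typically be exported as:
//   module.exports = { apps: buildApps() };
```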

Data · storage

PostgreSQL as the primary store, Redis + BullMQ for queueing. SQL-based search, deliberately no vector store for this workload.
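The text says only that search is SQL-based. One common way to do that in PostgreSQL is built-in full-text search, sketched below as an assumption rather than a description of the production query. The `knowledge_entries` table and its `search_vector` column are hypothetical; `websearch_to_tsquery`, `ts_rank` and the `@@` operator are standard PostgreSQL features.

```typescript
// Hypothetical sketch: table and column names are invented for illustration.
interface SearchQuery {
  text: string;
  values: (string | number)[];
}

// Build a tenant-scoped, ranked full-text query with bound parameters only.
function buildSearchQuery(tenantId: string, terms: string, limit: number = 10): SearchQuery {
  return {
    text:
      "SELECT id, title, ts_rank(search_vector, query) AS rank " +
      "FROM knowledge_entries, websearch_to_tsquery('simple', $2) AS query " +
      "WHERE tenant_id = $1 AND search_vector @@ query " +
      "ORDER BY rank DESC LIMIT $3",
    values: [tenantId, terms.trim(), limit],
  };
}
```

For a workload of curated, per-tenant documents, a query like this avoids running and paying for a separate vector store.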

AI layer · intelligence

Model routing per task rather than one model everywhere: lean models for routine queries, a more capable model reserved for crawl-and-synthesis. Dual-key failover, tool gating per agent role, and real-time USD cost tracking per call to a token_usage table so spend is visible, not a surprise.
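Per-call cost tracking of this kind can be sketched as a small pure function plus a row builder. Only the `token_usage` table name comes from the text; the column names, model identifiers and prices below are placeholders, not real rates.

```typescript
// Hypothetical sketch: model names and prices are placeholders, not real rates.
interface Price {
  inputPerMTok: number;  // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

const PRICES: Record<string, Price> = {
  "lean-model": { inputPerMTok: 0.5, outputPerMTok: 1.5 },
  "frontier-model": { inputPerMTok: 5, outputPerMTok: 15 },
};

// Assumed shape of one row in the token_usage table.
interface TokenUsageRow {
  tenant_id: string;
  model: string;
  input_tokens: number;
  output_tokens: number;
  cost_usd: number;
  created_at: string; // ISO timestamp
}

// Compute the USD cost of a single call from its token counts.
function costUsd(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (price === undefined) throw new Error(`No price configured for model: ${model}`);
  return (inputTokens / 1_000_000) * price.inputPerMTok + (outputTokens / 1_000_000) * price.outputPerMTok;
}

// Build the row that would be inserted after each model call.
function toUsageRow(tenantId: string, model: string, inputTokens: number, outputTokens: number, now: Date = new Date()): TokenUsageRow {
  return {
    tenant_id: tenantId,
    model,
    input_tokens: inputTokens,
    output_tokens: outputTokens,
    cost_usd: costUsd(model, inputTokens, outputTokens),
    created_at: now.toISOString(),
  };
}
```

Summing `cost_usd` per tenant over a period then gives the figure to compare against a plan's warning threshold.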

Knowledge · the IP

Four layers: global per-vertical knowledge → tenant-specific documents (PDF/URL/text) → crawl results as JSONB → an AI-consolidated version per pipeline. 6k+ lines of curated, market-rooted cost data: months of domain work.
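One way to picture the four layers is as a precedence chain in which a more specific layer overrides a more general one for the same topic. The sketch below assumes that override rule and invents the `KnowledgeEntry` shape; the text only names the layers and their order.

```typescript
// Hypothetical sketch: the override rule and the entry shape are assumptions.
type Layer = "global" | "tenant_docs" | "crawl" | "consolidated";

interface KnowledgeEntry {
  layer: Layer;
  topic: string;
  text: string;
}

// Layers from most general to most specific, in the order given in the text.
const LAYER_ORDER: Layer[] = ["global", "tenant_docs", "crawl", "consolidated"];

// For each topic, keep the entry from the most specific layer that has one.
function resolveKnowledge(entries: KnowledgeEntry[]): Map<string, KnowledgeEntry> {
  const resolved = new Map<string, KnowledgeEntry>();
  for (const layer of LAYER_ORDER) {
    for (const entry of entries) {
      if (entry.layer === layer) resolved.set(entry.topic, entry); // later layers overwrite earlier ones
    }
  }
  return resolved;
}
```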

Infra · delivery

Runs on AWS (eu-north-1), with Amazon SES for email and GatewayAPI for SMS. Deploys are zero-downtime: atomic symlink switch, PM2 reload, smoke test and automatic rollback. Today it sits on a modest cloud instance because the product is early (≈10 tenants); the architecture is built to distribute, not to stay on one box.
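The deploy sequence (switch, reload, smoke test, roll back on failure) can be sketched as a small orchestration function. The step interface is hypothetical and synchronous for brevity; real steps would shell out or call asynchronous APIs.

```typescript
// Hypothetical sketch: the step names are assumptions; real steps would be async or shell commands.
interface DeploySteps {
  // Atomic switch: on POSIX this is usually done by creating a new symlink under a
  // temporary name and renaming it over the "current" link, since rename is atomic.
  pointCurrentAt(release: string): void;
  reloadWorkers(): void; // e.g. a PM2 reload
  smokeTest(): boolean;  // true when the new release answers correctly
}

type DeployResult = "deployed" | "rolled_back";

// Switch to the new release, reload, verify; on any failure switch back and reload again.
function deploy(next: string, previous: string, steps: DeploySteps): DeployResult {
  try {
    steps.pointCurrentAt(next);
    steps.reloadWorkers();
    if (steps.smokeTest()) return "deployed";
  } catch {
    // Fall through to the rollback below.
  }
  steps.pointCurrentAt(previous);
  steps.reloadWorkers();
  return "rolled_back";
}
```

Because each release lives in its own directory and only the link moves, a rollback is the same cheap operation as a deploy.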

// How it scales

Cheap today. A clear plan to scale.

Distributed by design, not by one big box // pay for what each market needs, when it needs it

Today

≈10 tenants

A single modest AWS cloud instance. Deliberately small and cheap to run while the product is proven with pilot tenants. Proof it can be operated for almost nothing at this stage.

per-market instances

The architecture is distributed: one independent local instance per market, each running on its own, while the core is maintained centrally and pushed as packages to all instances at once.

The split

core vs knowledge

The core lives centrally; the knowledge layer and market-specific details sit separately, per instance. That keeps each market self-contained while a single push updates everyone: the smart design, not "10 tenants on one box".

At scale

cost stays down

You only ever pay for the instances a market actually needs, when it needs them. Scaling is adding instances, not rebuilding, and the bill tracks real usage, not a token-guzzling default.

Why it matters

The knowledge layer is the moat.

Anyone can wrap a chatbot. What separates autoply is the depth of the curated trade knowledge: market-rooted pricing, competitor analysis, seasonality, 6 to 8-step decision journeys, ROT/green-tech deduction calculators and regulation per category. New entrants can copy the architecture; they can't copy months of validated domain expertise.

The cost models run locally in Node, not in the model (code where code is enough), so a solar enquiry feeds roof area, orientation, roof type, floors and panel choice into a deterministic calculator and returns a structured price card in seconds. Repeatable, auditable, and cheaper than asking a model to do arithmetic.
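A deterministic calculator of this kind might look like the sketch below. The five inputs are the ones named above; every constant (panel size, usable roof share, unit prices, yield percentages) is a placeholder, not autoply's curated cost data, and the `PriceCard` shape is an assumption.

```typescript
// Hypothetical sketch: all constants are placeholders, not real market data.
type Orientation = "south" | "east" | "west" | "north";
type RoofType = "tile" | "metal" | "flat";
type Panel = "standard" | "premium";

interface SolarInput {
  roofAreaM2: number;
  orientation: Orientation;
  roofType: RoofType;
  floors: number;
  panel: Panel;
}

interface PriceCard {
  vertical: "solar";
  panelCount: number;
  relativeYieldPercent: number;
  lineItems: { label: string; amount: number }[];
  total: number;
}

const PANEL_AREA_M2 = 2;
const USABLE_ROOF_PERCENT = 70;
const PANEL_PRICE: Record<Panel, number> = { standard: 2000, premium: 3000 };
const INSTALL_PER_PANEL: Record<RoofType, number> = { tile: 500, metal: 400, flat: 600 };
const SCAFFOLD_PER_FLOOR = 4000;
const YIELD_PERCENT: Record<Orientation, number> = { south: 100, east: 85, west: 85, north: 60 };

// Same input always yields the same card: no model call, no randomness.
function solarPriceCard(input: SolarInput): PriceCard {
  if (input.roofAreaM2 <= 0) throw new RangeError("roofAreaM2 must be positive");
  if (!Number.isInteger(input.floors) || input.floors < 1) throw new RangeError("floors must be an integer >= 1");

  const panelCount = Math.floor((input.roofAreaM2 * USABLE_ROOF_PERCENT) / 100 / PANEL_AREA_M2);
  const lineItems = [
    { label: "Panels", amount: panelCount * PANEL_PRICE[input.panel] },
    { label: "Installation", amount: panelCount * INSTALL_PER_PANEL[input.roofType] },
    { label: "Scaffolding", amount: input.floors * SCAFFOLD_PER_FLOOR },
  ];
  const total = lineItems.reduce((sum, item) => sum + item.amount, 0);

  return {
    vertical: "solar",
    panelCount,
    relativeYieldPercent: YIELD_PERCENT[input.orientation],
    lineItems,
    total,
  };
}
```

The model's only job is to collect the inputs and emit the tool call; the arithmetic stays in code that can be unit-tested and audited line by line.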

And it's architected to grow: tenants are isolated at the database level with their own plan and token limits, and the system is designed to distribute one independent instance per market, so growth is adding instances, not a rebuild, and the bill tracks real usage rather than a wasteful default.

Want a build like this?

autoply is what one senior operator ships in two months when product, tech, strategy and design live in one head. If you've got a product that needs the same treatment, let's talk.
