Back to index
Project · 01 / 072024Web platform

StratusOne pane of glass for AWS and Azure.

Year
2024
Role
Lead Engineer
Sector
DevOps / SaaS
Status
In production
- Cover
MULTI-CLOUD MANAGEMENT PLATFORM
Stratus
01 / 07
- Subtitle

Multi-cloud management platform

- Overview

The brief.

A multi-tenant control plane that centralises cloud accounts, exposes resources for budgeting, scheduling, monitoring, and automated container scaling, built end-to-end from Figma to production.

  • Laravel
  • Inertia + React
  • TypeScript
  • Tailwind
  • AWS SDK
  • Azure Graph
- Screens

The surface.

- Notes

The story.

Problem

Mid-size teams running both AWS and Azure were drowning in two consoles, two billing dashboards, and zero unified policy. Cost overruns went unnoticed for weeks; provisioning was tribal knowledge.

Approach

  • High-fidelity Figma system covering 60+ screens before a line of code.
  • Inertia + React for the SPA-feel without the API duplication; Laravel queues coordinate long-running cloud sync jobs.
  • Role-based access control for multi-tenant orgs, with delegated admin and audit trails.
  • Sentry across the stack for real-time error tracking and perf telemetry.

Outcome

Shipped a platform handling thousands of resources across both clouds, with budget alerting and automated scaling reducing wasted spend and surfacing infra hygiene issues before they bite.

- Impact

By the numbers.

Numbers from the first nine months in production. Tracked through the platform itself, sampled monthly, anonymised aggregates across pilot tenants.

18%
reduction in monthly cloud spend
across pilot tenants, after auto-scaling rules went live
60+
screens shipped to production
Figma to React in tight feedback loops
to2 min
from new tenant to first sync
down from a half-day onboarding call
92%
of budget alerts caught issues
before the next billing cycle closed
- Before & After

What changed.

Before, ops leads kept context across two consoles, a billing CSV, and a private Notion page. After, one screen, one budget, one set of rules.

BEFORE
aws.console / azure.portal / billing.csv
i-af40c2us-east-1$420.00OVER BUDGET
i-bf41c2us-east-1$433.00ok
i-cf42c2us-east-1$446.00ok
i-df43c2us-east-1$459.00ok
i-ef44c2us-east-1$472.00OVER BUDGET
i-ff45c2us-east-1$485.00ok
i-10f46c2us-east-1$498.00ok
i-11f47c2us-east-1$511.00ok
i-12f48c2us-east-1$524.00OVER BUDGET
i-13f49c2us-east-1$537.00ok
i-14f410c2us-east-1$550.00ok
i-15f411c2us-east-1$563.00ok
i-16f412c2us-east-1$576.00OVER BUDGET
i-17f413c2us-east-1$589.00ok
Two consoles, billing CSV, tribal knowledge
AFTER
Total cloud spend
$184.2k 18%
AWS
$112.8k
Azure
$71.4k
Auto-scaling rules · 14 active
Stratus, one pane, one source of truth
drag to compare
- Process

How it came together.

  1. 01
    Discovery
    Two weeks shadowing ops leads at the pilot tenant. Mapped where their day actually went, console-hopping, billing reconciliation, scaling decisions made on intuition.
  2. 02
    Figma system
    Built a 60+ screen Figma library before code. Resource list, detail, budget, scaling rules, audit trail, every state, every empty state, every error.
  3. 03
    Spike: SDK abstraction
    Three days proving I could unify AWS SDK and Azure Graph models behind a single resource interface without leaking provider concepts upward.
  4. 04
    MVP build
    Inertia + React for the SPA-feel without API duplication. Laravel queues for long-running cloud syncs. RBAC and audit trails wired in from day one, not retrofitted.
  5. 05
    Pilot rollout
    Two tenants, weekly syncs, a Sentry feed I checked over coffee. Iterated on the scaling-rule editor four times before it felt right.
  6. 06
    Hardening
    Performance pass on the resource explorer, lazy hydration on Inertia partials, and a write-through cache for cloud reads. Budget alerting moved from polling to event-driven.
- Reflections

What I took away.

  • Boring is a feature.
    The most-loved part of the platform is the audit trail. People want to know who pushed the button. Designing for trust beat designing for delight.
  • Spec the seams.
    The provider-agnostic resource interface looked like over-engineering at week two. By month four it was the only reason adding GCP was a sprint, not a quarter.
  • RBAC from day one.
    I have never regretted starting a multi-tenant project with role-based access wired in. I have always regretted bolting it on later.