AyuHealth Infrastructure Case Study | P95 <500ms with 80% Cloud Cost Reduction

The Challenge

AyuHealth's infrastructure needed to do something rare: scale 1000x without breaking, while spending like a startup. From pre-seed with a handful of hospitals to Series C serving 100+, every architectural decision made early would either enable growth or become the bottleneck that killed it.

The constraints:

Pre-seed budget, enterprise-grade requirements: healthcare data demands the same reliability whether you have 1 hospital or 100
Latency matters clinically: slow systems in hospitals mean slower patient care
Zero tolerance for downtime: hospitals operate 24/7, so the platform can't go down for maintenance
Multi-cloud complexity: workloads split across AWS and GCP based on service requirements
Growing team, growing codebase: infrastructure needed to support 10+ developers shipping across multiple verticals without stepping on each other

The Solution

Architected the entire backend infrastructure from day one with an event-driven approach — then continuously optimized as scale demanded.

Event-Driven Architecture from Day One

The single most important decision: no monolith. Every service communicates through events, every operation that can be async is async.

Decoupled services — document processing, calling system, diagnostics, and communication each operate independently
Event sourcing for critical workflows — full audit trail of every state change, essential for healthcare compliance
Prevented monolithic bottlenecks at scale — when document processing volume spiked 10x, it didn't affect appointment booking latency
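The pattern above can be sketched in a few lines. This is a minimal, in-memory illustration, not AyuHealth's implementation: the names Event, EventStore, and appointment_status are hypothetical, and a production system would deliver events asynchronously through a message broker rather than calling handlers inline.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    """An immutable record of something that happened."""
    stream_id: str  # hypothetical key, e.g. one appointment or one document
    type: str       # e.g. "AppointmentBooked"
    data: dict


@dataclass
class EventStore:
    """Append-only log. State is never overwritten, only derived by replay."""
    _log: List[Event] = field(default_factory=list)
    _subscribers: Dict[str, List[Callable[[Event], None]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._subscribers[event_type].append(handler)

    def append(self, event: Event) -> None:
        self._log.append(event)  # every state change lands in the audit trail
        for handler in self._subscribers[event.type]:
            handler(event)  # inline here; a broker would make this async

    def history(self, stream_id: str) -> List[Event]:
        return [e for e in self._log if e.stream_id == stream_id]


def appointment_status(events: List[Event]) -> str:
    """Rebuild the current state by folding over the event history."""
    status = "none"
    for e in events:
        if e.type == "AppointmentBooked":
            status = "booked"
        elif e.type == "AppointmentCancelled":
            status = "cancelled"
    return status


store = EventStore()
notifications: List[str] = []
# The notification side only knows the event type, not the booking service.
store.subscribe("AppointmentBooked", lambda e: notifications.append(e.data["patient"]))

store.append(Event("appt-1", "AppointmentBooked", {"patient": "p-42"}))
store.append(Event("appt-1", "AppointmentCancelled", {"reason": "rescheduled"}))
```

Because the log is append-only, the full history of "appt-1" remains available for audit even after the appointment is cancelled, and subscribers can be added or removed without touching the producer.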

Performance Engineering

P95 latency <500ms achieved across all services — not just averages, but tail latency under control
Apdex = 1 — every service meeting or exceeding user satisfaction thresholds
Database performance increased 10x through query optimization, indexing strategy, and connection pooling
Redisson for distributed scheduling — reliable job execution across clustered services without duplicate processing
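For readers less familiar with the two headline metrics, here is how they are commonly computed from raw latency samples. This is an illustrative sketch: the function names, the sample values, and the 500 ms Apdex threshold are assumptions, and an APM tool would normally compute these figures itself. P95 uses the nearest-rank method; Apdex uses the standard formula (satisfied + tolerating / 2) / total, where "satisfied" means at or below the threshold T and "tolerating" means between T and 4T.

```python
from typing import Sequence


def p95(latencies_ms: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(latencies_ms)
    # ceil(0.95 * n) in integer arithmetic, as a 1-based rank
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def apdex(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Apdex score in [0, 1] for a given satisfaction threshold T."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for x in latencies_ms if x <= threshold_ms)
    tolerating = sum(
        1 for x in latencies_ms if threshold_ms < x <= 4 * threshold_ms
    )
    return (satisfied + tolerating / 2) / len(latencies_ms)


# Hypothetical samples, in milliseconds
samples = [120, 180, 200, 250, 300, 310, 320, 350, 400, 450]
tail = p95(samples)
score = apdex(samples, threshold_ms=500)
```

An Apdex of exactly 1 means every sampled request finished within the threshold, which is a stricter statement than a P95 target because it leaves no room for a slow tail.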

Cloud Cost Optimization

GCP costs reduced 80% — right-sizing instances, leveraging preemptible VMs for batch workloads, eliminating idle resources
AWS costs reduced 40% — reserved instances for steady-state workloads, spot instances for burst capacity, storage tiering
Strategic multi-cloud — workloads placed on the cloud where they're most cost-effective, not where they happened to be first deployed

Zero-Downtime Backend Revamp

Revamped the entire backend in 30 days while maintaining production stability — no outages, no degraded service, no "maintenance window" that blocks hospital operations.

Blue-green deployment strategy for service cutover
Database migrations run as backward-compatible schema changes
Feature flags for gradual rollout of new service versions
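Gradual rollout behind a feature flag is often done with deterministic bucketing, so the same hospital or user always sees the same version while the percentage ramps up. The sketch below shows one common way to do it. It is an assumption about the technique, not a description of the flagging system actually used; the function name in_rollout and the flag name "new-backend" are hypothetical.

```python
import hashlib


def in_rollout(flag: str, unit_id: str, percent: int) -> bool:
    """Return True if unit_id falls inside the first `percent` of buckets.

    Hashing flag and unit together gives each flag its own stable,
    independent assignment of units to buckets 0..99.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(f"{flag}:{unit_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


# Route a request to the new service version only when the flag covers it.
use_new_service = in_rollout("new-backend", "hospital-17", percent=10)
```

Raising the percentage only ever adds units to the rollout, since a unit's bucket never changes. Setting it back to 0 is an immediate rollback that needs no redeploy.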

CI/CD & Observability

Sub-5-minute deploys via DAAS and optimized CI/CD pipelines, down from 10+ minute builds (a 2x improvement over the initial setup)
New Relic for production APM: latency tracking, error rates, and throughput across all services

Geospatial Features

Hospital finder based on patient location via Google Maps API — patients see nearest hospitals ranked by distance and specialty
Sales agent travel tracking with kilometer calculation for reimbursement — automated what was previously a manual spreadsheet process
Live location tracking for managers with cost-optimized API usage — batched location updates to minimize Maps API costs
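Two of the ideas above lend themselves to a short sketch: summing the kilometers of a trip from GPS points, and batching location updates so that fewer API calls are made. The code is illustrative only. It uses the haversine great-circle formula as a stand-in for a distance calculation (the production system is described as using the Google Maps API, and road distance will differ from straight-line distance), and the names haversine_km, trip_km, and LocationBatcher, along with the batch size, are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees
EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def trip_km(points: Sequence[Point]) -> float:
    """Total distance along consecutive points of a trip."""
    return sum(haversine_km(p, q) for p, q in zip(points, points[1:]))


class LocationBatcher:
    """Buffers location updates and sends them in groups, not one by one."""

    def __init__(self, send: Callable[[List[Point]], None], batch_size: int = 10):
        self._send = send
        self._batch_size = batch_size
        self._buffer: List[Point] = []

    def add(self, point: Point) -> None:
        self._buffer.append(point)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, e.g. at the end of a trip."""
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()


sent_batches: List[List[Point]] = []
batcher = LocationBatcher(sent_batches.append, batch_size=10)
for i in range(25):
    batcher.add((0.0, i * 0.01))  # 25 hypothetical GPS fixes
batcher.flush()  # 25 updates leave the device as 3 calls instead of 25
```

The batch size trades freshness for cost: larger batches mean fewer billable calls but a longer delay before a manager sees an updated position.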

Product Thinking

The product team designed a 7-screen doctor appointment flow. Instead of building the full backend, pushed back with a cheaper test: ship the UI only, with no backend, and measure user interest first. The final screen simply showed "no doctors available." Result: 80%+ of users dropped off by screen 2, so the feature was killed before it cost weeks of engineering time.

The Impact

P95 <500ms, Apdex = 1 across all services at production scale
GCP costs reduced 80%, AWS costs reduced 40% through strategic optimization
Database performance increased 10x with query and indexing optimization
Entire backend revamped in 30 days with zero downtime
Deploy time <5 minutes — 2x improvement enabling faster iteration
Event-driven architecture supported growth from pre-seed to Series C without a single architectural rewrite
10+ developers shipping independently across 2 product verticals with clean service boundaries

Testimonial

"I worked with Yatharth for almost 2 years across a range of very complex product implementations. There are 3 areas where Yatharth raises the bar: Implementing complex technology — he helped me build a product for prioritizing and routing, in real-time, incoming calls to our call center. This was a highly complex piece of work delivered over many months with a near flawless, defect-free go-live. Simplifying product launches — for another large feature-set going live, he suggested breaking down the product into 3 logical phases based on a solid understanding of the product, creating no new tech overhead and shortening time to use. Super-responsive — business teams would feel comfortable approaching him for any support. Overall, a very sound tech-leader."

— Gaurav Gadgil, Head of Product, AyuHealth

