Scaling MySQL and MariaDB for High-Performance E-Commerce

Q: At what point should I add read replicas instead of just increasing server RAM?

Add read replicas when your buffer pool is already 70-80% of available RAM and you still see slow query latency. Replicas work best for read-heavy workloads (product browsing, catalog searches). If you're write-limited (inventory updates, order processing), vertical scaling or caching is more effective.

Q: Is Redis caching better than MySQL query cache?

Yes. MySQL's built-in query cache is deprecated in MySQL 8.0+ because it invalidates entirely when any table changes, causing constant thrashing. Redis gives you fine-grained control over what to cache and when to invalidate, making it 10-50x more effective for ecommerce applications.

Q: How do I handle replication lag when customers see stale data?

Route writes to the primary database always. Route reads from replicas, but fall back to the primary for data modified in the current session (use a session flag). For critical data like inventory, always read from the primary. Most ecommerce platforms tolerate 100-500ms lag for non-critical reads.

By Marcus Bauer | Cloud & DevOps Specialist | Vilee LLC

Why MySQL Becomes the E-Commerce Bottleneck

As your e-commerce business grows from hundreds to thousands of concurrent customers, MySQL stops being a transparent detail and becomes a visible constraint. Every product view hits the database. Every checkout creates dozens of queries. Every abandoned cart recovery job locks tables. When your storefront starts returning 500 errors during flash sales, the bottleneck is almost always the database.

Traditional single-server MySQL deployments hit their ceiling fast. A $5,000 server with 128 GB RAM and fast NVMe storage might handle 500 concurrent users comfortably. But scaling to 5,000 concurrent users requires a fundamentally different architecture. The scaling challenge isn’t just data size—it’s read concurrency (thousands of SELECT queries simultaneously), write contention, and connection limits all colliding at once.

The culprit is almost always one of three things: slow queries hitting your database millions of times per day, connection pool exhaustion, or buffer pool misses forcing reads to disk. Each compounds the others. Fix the slow queries and connection count drops. Fix connection pooling and buffer hit ratio improves. This is why scaling MySQL methodically works.

Vertical vs. Horizontal Scaling: Making the Right Choice

Vertical scaling (scaling up) adds CPU cores, RAM, and faster storage to a single server. It’s simple to implement and avoids data consistency complexity. You upgrade from 16 GB to 256 GB RAM, from SATA SSDs to NVMe, and watch latency drop. The limit: no single server exists with unlimited resources, and costs accelerate dramatically at the high end.

According to Chat2DB’s scaling guide, vertical scaling typically maxes out around 10,000 concurrent connections or 100,000+ requests per second on a single machine. Beyond that, you’re paying premium prices for diminishing returns.

Horizontal scaling (scaling out) distributes data and load across multiple servers. It’s more complex—requiring replication, sharding, and application-level awareness—but offers nearly unlimited capacity. Read replicas offload SELECT queries. Sharding partitions data to avoid global bottlenecks. Application servers can pool connections across multiple database nodes.

Vilee LLC combines deep technical expertise in WordPress/WooCommerce development with AI-powered automation to operate 520+ profitable online businesses at scale.

Indexing and Query Optimization: The First Optimization Win

Before adding hardware, fix the queries themselves. MySQL’s official optimization guide emphasizes that the best index strategy isn’t index everything—it’s to identify the 20% of queries causing 80% of database load, then optimize ruthlessly.

Creating effective indexes means:

Analyzing WHERE clauses to find filter columns that eliminate rows quickly
Composite indexes for multi-column queries (e.g., (status, created_at) for order lookups)
Avoiding redundant indexes that duplicate existing key coverage

The cost of index optimization matters. Every index accelerates SELECT queries but slows INSERT, UPDATE, and DELETE operations. The balance: index the columns that appear in WHERE and ORDER BY clauses most frequently, ignore the rest.

For e-commerce sites, common optimization targets include product category indexes, order status indexes, and customer email indexes. Use EXPLAIN ANALYZE to verify which indexes are actually being used.

Read Replicas: Scale Reads Without Scaling Writes

MySQL replication allows you to copy all changes from a primary (writer) database to replica servers (read-only). Your application reads from replicas and writes to the primary. This works because most e-commerce traffic is read-heavy: catalog browsing, product searches, and price checks dramatically outnumber order placements.

A typical architecture: primary database handles 1,000 writes/sec and 500 reads/sec (the bottleneck), while three read replicas absorb 15,000 reads/sec combined. Scaling from one database to four is a 3x capacity increase for the same write throughput.

The trade-off: replication has replication lag. A customer places an order on the primary, then immediately reloads their account page hitting a replica—the order might not appear for 100ms to 1 second. For order status pages, that lag is acceptable. For inventory counts, you may need to read from the primary to avoid overselling.

Connection Pooling: Avoid the Connection Bottleneck

Every application server connection to MySQL consumes memory and a file descriptor on the database server. Run 50 web servers × 20 connections each = 1,000 connection requests—but if the database maxes at 500, half your servers can’t connect.

Connection pooling maintains a small pool of reusable connections that multiple requests share. When a request finishes, the connection returns to the pool instead of closing. This reduces total connection count from thousands to dozens.

Connection pool tuning recommends a pool size of (concurrent requests ÷ average request duration) but never more than 50-100 connections total. AWS RDS Proxy handles this automatically.

Redis and Object Caching: Offload Reads Entirely

The fastest query is the one that never runs. Redis caches query results in memory, eliminating database hits for repetitive reads. A product page loads 20 times per minute; without caching, that’s 20 database queries. With Redis, it’s zero queries (unless the cache expires).

Typical caching patterns:

Lazy load: Application checks Redis first. If missing, query MySQL and populate the cache.
Write-through: When an admin updates a product, refresh both MySQL and Redis immediately.

AWS’s case study showed that adding ElastiCache to RDS MySQL saved 55% in costs and achieved 80x faster read performance. For 30,000 queries per second, RDS with four read replicas cost $1,740/month, while RDS plus a single ElastiCache node cost only $780/month. Response times dropped from 200ms to 1ms.

The limitation: caching works for read-heavy data (products, categories, pricing). Write-heavy data (inventory, orders) requires careful cache invalidation to avoid stale data.

Partitioning and Sharding: Split Data Across Servers

Replication gives you read capacity but doesn’t reduce write load. Sharding partitions data horizontally: regions A–C go to database 1, D–F to database 2, G–Z to database 3. Each database handles a slice of the write load.

Sharding requires application logic to route queries to the correct shard. Sharding is complex and best reserved for companies handling 100,000+ writes per second. Common sharding keys: customer ID, order ID, or geographic region.

Buffer Pool and Configuration Tuning

MySQL’s InnoDB buffer pool is the memory cache holding frequently accessed data. If your dataset is 100 GB but buffer pool is 2 GB, 98% of queries miss the cache and hit disk—orders of magnitude slower.

Sizing the buffer pool: Allocate 70-80% of available RAM. For 256 GB RAM, set innodb_buffer_pool_size = 200G. Monitor cache hit ratios with SHOW ENGINE INNODB STATUS—aim for 99%+ hit ratio.

Additional tuning for e-commerce:

max_connections: Set to expected peak (typically 500-2,000)
innodb_flush_log_at_trx_commit = 1: Critical for payment transaction durability
query_cache_size = 0: Deprecated in MySQL 8.0+, use Redis instead
innodb_file_per_table = 1: Simplifies backups and disk reclamation

WooCommerce-Specific Optimization: HPOS and Postmeta

WordPress and WooCommerce stores often struggle with the postmeta table—a massive unstructured key-value store where order data and product attributes live. Querying postmeta requires expensive full-table scans.

WooCommerce’s High-Performance Order Storage (HPOS) moves orders from posts/postmeta into dedicated indexed tables. Performance gains:

Order creation: 5x faster
Order lookup: 40x faster
Checkout speed: 1.5x faster

Enable HPOS if running WooCommerce 8.2 or later. It’s enabled by default for new installations. For product data, audit postmeta: remove unnecessary autoloaded custom fields that bloat every query.

Managed Database Options: When to Stop Self-Managing

Running MySQL yourself means patches, backups, replication setup, and failover automation. Managed services (AWS RDS, Google Cloud SQL, Azure Database for MySQL) handle this for you.

Benefits:

Automated daily backups with point-in-time restore
Automatic failover if the primary fails
Read replica provisioning with one click
Automatic patching and security updates

For most mid-size e-commerce operations ($100K–$5M annual revenue), managed databases save engineering time and reduce downtime risk.

Quick Scaling Checklist

Optimization	Effort	Performance Gain	Cost
Index queries	Low (1-2w)	2-5x reads	$0
Redis caching	Medium (2-4w)	10-50x reads	$100-500/mo
Read replicas (3)	Medium (2-3w)	3x reads	$200-800/mo
Connection pooling	Low (3-5d)	+3x concurrency	$0-100/mo
WooCommerce HPOS	Low (1-2d)	5-40x orders	$0
Sharding	High (2-3m)	Unlimited	$1,000+/mo

Sequencing Your Scaling Strategy

Month 1: Optimize indexes and enable HPOS. This costs nothing and delivers 2-5x gains with no infrastructure changes.

Month 2: Add Redis caching for products, categories, and pricing. Measure hit rates and adjust TTLs.

Month 3: Deploy 2-3 read replicas and configure application read routes. Monitor replication lag.

Month 4+: Add connection pooling (RDS Proxy). Evaluate buffer pool sizing. If still growing, begin sharding planning.

Don’t implement all strategies at once. Start with indexing (zero cost, immediate wins). Then layer in caching, replicas, and pooling based on your workload.

Real-World Performance Metrics

Shopify’s 2025-2026 inventory scaling research rebuilt reservations on MySQL and achieved high-throughput targets during peak traffic using smart locking (SKIP LOCKED). Key insight: cleanup of the checkout path eliminated 50% of database reads and 33% of transactions.

Kemana improved performance 10x by migrating to MySQL HeatWave, showing that database engine choice matters for high-concurrency workloads.

Common Mistakes to Avoid

Over-indexing: Creating indexes on every column bloats tables and slows writes. Use EXPLAIN to verify index usage.

Ignoring replication lag: Deploying replicas without accounting for 100-1,000ms lag. Your application must handle stale reads on non-critical data.

Caching without invalidation: Setting TTLs too high (24 hours) for frequently updated data. Use event-based invalidation instead of just time-based.

Undersizing the buffer pool: Allocating only 10-20% of RAM to buffer pool wastes memory and forces disk reads. This is one of the highest ROI tuning decisions.

Waiting too long to shard: Continuing single database until collapse, then scrambling to shard under pressure. Plan sharding 6-12 months ahead.

Monitoring and Observability

You can’t optimize what you don’t measure. Monitor:

Query latency: 95th and 99th percentile of slowest queries. Target <100ms.
Replication lag: Seconds of delay on replicas. Alert if lag exceeds 5 seconds.
Connection count: Active database connections. Alert if approaching max/2.
Buffer pool hit ratio: Percentage of hits to cache. Target 99%+.
Cache hit ratio: Redis/Memcached hits. Track separately by data type.

Tools like Percona PMM, Datadog, New Relic, and AWS CloudWatch track these and alert on degradation.

Next Steps

Start with indexing and HPOS—zero cost, massive wins. Measure your metrics (queries/sec, latency, concurrent connections) to find your bottleneck. Then layer in caching, replicas, and pooling based on your workload.

Scaling MySQL for e-commerce isn’t rocket science—it’s identifying bottlenecks, applying proven techniques, and iterating. Retailers handling millions of concurrent customers aren’t using secret tech; they’re applying these fundamentals at scale.

Need expert help? Contact us to discuss your database scaling strategy.

Sources

Frequently Asked Questions

At what point should I add read replicas instead of just increasing server RAM?

Add read replicas when your buffer pool is already 70-80% of available RAM and you still see slow query latency. Replicas work best for read-heavy workloads (product browsing, catalog searches). If you’re write-limited (inventory updates, order processing), vertical scaling or caching is more effective.

Is Redis caching better than MySQL query cache?

Yes. MySQL’s built-in query cache is deprecated in MySQL 8.0+ because it invalidates entirely when any table changes, causing constant thrashing. Redis gives you fine-grained control over what to cache and when to invalidate, making it 10-50x more effective for ecommerce applications.

How do I handle replication lag when customers see stale data?