Modern Hardware Numbers for System Design Interviews (2025)
Don't get called out with outdated knowledge.
Hardware Numbers That Matter in System Design
In system design interviews, outdated hardware assumptions instantly reveal the gap between theory and production experience.
When someone uses memory and disk numbers from 2015 (or even 2020), it shows they've learned from textbooks rather than building modern systems. Nothing wrong with textbooks - they're often the best way to learn fundamentals. But our industry moves fast, and the hardware we build on evolves constantly.
Those system design books are teaching solid patterns, but their numbers can be off by orders of magnitude.
Today's Hardware Reality
Modern numbers may shock you. A standard AWS m6i.32xlarge comes with 512 GiB of memory and 128 vCPUs. For serious power, the x1e.32xlarge offers nearly 4 TB of RAM.
And this isn't just cloud territory - Hetzner's AX162-R dedicated server packs up to 1,152 GB of RAM and 48 cores for a fraction of the cloud cost. At the extreme end, the AWS u-24tb1.metal reaches a whopping 24 TB (!!).
Storage has exploded too. Modern instances pack up to 60 TB of local SSD storage, with options for hundreds of terabytes of HDD storage. Object storage like S3 handles petabyte-scale deployments as standard practice.
Networks haven't been sitting still either. 10 Gbps is standard in datacenters now, with 1-2ms latency within regions and 50-150ms cross-region.
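These network figures translate directly into back-of-envelope math. The sketch below estimates raw transfer time over a datacenter link, using the 10 Gbps figure above; it ignores protocol overhead and congestion, so treat it as a lower bound.

```python
def transfer_seconds(size_gb: float, link_gbps: float = 10.0) -> float:
    """Time to push size_gb gigabytes over a link_gbps link.

    Ignores protocol overhead, congestion, and retransmits, so this
    is an optimistic lower bound, not a benchmark.
    """
    # Convert gigabytes to gigabits (x8), then divide by link speed.
    return (size_gb * 8) / link_gbps


# Moving a 100 GB snapshot over a 10 Gbps datacenter link:
# 100 GB * 8 bits/byte / 10 Gbps = 80 seconds of raw transfer.
print(transfer_seconds(100))  # 80.0
```

Being able to produce a number like "about 80 seconds for 100 GB" on the spot is exactly the kind of grounded estimate interviewers look for.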
What This Means for Key Components
Here's why these numbers matter for the components you'll use:
Caching
Modern caches are beasts. They handle terabyte-scale datasets without breaking a sweat, giving you single-digit millisecond latency. A single Redis instance processes hundreds of thousands of ops per second. This means instead of building complex partial caching schemes, you can often just cache your entire dataset.
Numbers that matter:
Memory: Up to 1TB standard, more for specialized cases
Latency: Single-digit milliseconds within region
Operations: 100k+ requests/second per instance
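The "just cache everything" claim is easy to sanity-check with arithmetic. Here is a minimal sizing sketch using the memory figure above; the 1.5x overhead factor for keys, pointers, and fragmentation is my assumption, not a measured constant.

```python
def fits_in_cache(rows: int, avg_row_bytes: int,
                  cache_gb: int = 512, overhead: float = 1.5) -> bool:
    """Rough check: does the full dataset fit in one cache node?

    overhead is an assumed 1.5x multiplier covering keys, per-entry
    metadata, and memory fragmentation.
    """
    needed_bytes = rows * avg_row_bytes * overhead
    return needed_bytes <= cache_gb * 1024**3


# 200M rows at ~1 KB each is ~300 GB with overhead,
# which fits comfortably in a single 512 GB node.
print(fits_in_cache(200_000_000, 1024))  # True
```

If the answer is True, a whole-dataset cache is a simpler design than any partial-caching scheme with eviction and invalidation logic.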
Databases
Single PostgreSQL or MySQL instances now handle dozens of terabytes while maintaining millisecond-level response times. They'll process tens of thousands of transactions per second on a single primary. You can push single-instance databases much further than conventional wisdom suggests. While the largest tech companies still need sharding, most applications can run on a single well-tuned database. The decision to shard is now usually driven by operational concerns (backup times, maintenance windows) rather than pure performance limits.
Key metrics:
Storage: Up to 64 TiB per instance
Reads: 1-5ms cached, 5-30ms disk
Writes: 10-20k transactions per second
Connections: Up to 20k concurrent
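A quick way to apply these limits in an interview is a shard-or-not check. This sketch encodes the figures above as rough thresholds; real limits depend heavily on schema, hardware, and access patterns, so the cutoffs are illustrative, not authoritative.

```python
def needs_sharding(write_tps: int, data_tib: float,
                   max_tps: int = 15_000, max_tib: float = 64.0) -> bool:
    """Rough single-primary viability check.

    Thresholds are taken from the ballpark figures above (10-20k
    write TPS, 64 TiB storage); they are planning heuristics, not
    hard database limits.
    """
    return write_tps > max_tps or data_tib > max_tib


# A 10 TiB dataset at 5k writes/sec fits a single primary.
print(needs_sharding(write_tps=5_000, data_tib=10))   # False
# 30k writes/sec exceeds the single-primary heuristic.
print(needs_sharding(write_tps=30_000, data_tib=10))  # True
```

Note that even when this returns False, you may still shard for operational reasons like backup windows, as discussed above.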
Message Queues
Queues like Kafka process millions of messages per second with single-digit millisecond latency on modern hardware. Not only is throughput insane, but Kafka's role has expanded far beyond traditional message queues. Data engineers use it as their go-to for real-time data ingestion and streaming pipelines. And it's not just for background jobs anymore - with this kind of performance, it can play a part in synchronous request flows too. You get reliable delivery and service decoupling without sacrificing response times.
Performance numbers:
Throughput: Up to 1M messages/second per broker
Latency: 1-5ms end-to-end within region
Storage: Up to 50TB per broker
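Broker counts follow directly from these numbers. This sketch derates the ~1M messages/second/broker figure to 60% for headroom and accounts for replication consuming broker capacity; both the derating factor and the replication-cost model are my simplifying assumptions.

```python
import math


def brokers_needed(msgs_per_sec: int, replication: int = 3,
                   per_broker: int = 1_000_000,
                   headroom: float = 0.6) -> int:
    """Brokers required for a target throughput.

    Uses the ~1M msgs/sec/broker figure above, derated to 60% for
    headroom, and assumes each replica costs the same as a leader
    write - a deliberate simplification.
    """
    effective_per_broker = per_broker * headroom
    return math.ceil(msgs_per_sec * replication / effective_per_broker)


# 2M msgs/sec with 3x replication on derated brokers:
print(brokers_needed(2_000_000))  # 10
```

Ten brokers for two million messages per second is a small cluster, which is why Kafka now shows up in synchronous request paths and not just background pipelines.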
Application Servers
Modern app servers handle hundreds of thousands of concurrent connections. CPU, not memory or connections, sets the limit now. The old advice about keeping application servers stateless and thin doesn't always hold. Sometimes adding state locally gives better performance than reaching for external services. You have a lot of memory on these boxes - consider using it!
Key capabilities:
Connections: 100k+ concurrent
CPU: 8-64 cores
Memory: 64-512 GB standard, up to 2 TB available
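Fleet sizing from these figures is one division away. This sketch assumes the 100k-connections-per-server figure above and a 50% target utilization for failover headroom; both are assumptions to adjust for your workload.

```python
import math


def servers_for_load(concurrent_conns: int,
                     conns_per_server: int = 100_000,
                     target_util: float = 0.5) -> int:
    """Servers needed for a connection count at a target utilization.

    target_util is an assumed 50%: run each box half-loaded so the
    fleet survives losing servers during deploys or failures.
    """
    capacity_per_server = conns_per_server * target_util
    return math.ceil(concurrent_conns / capacity_per_server)


# 1M concurrent connections at 50% utilization: 20 servers.
print(servers_for_load(1_000_000))  # 20
```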
The Bottom Line
The hardware landscape of 2025 breaks traditional system design assumptions. Sure, the largest tech companies still need massively distributed architectures – if you're building the next TikTok or Twitter, you'll need extensive sharding and horizontal scaling. But that threshold for "large" has shifted significantly.
In your next system design interview, don't jump to overly distributed solutions too quickly. A well-tuned database with proper redundancy and a few replicas for high availability handles more load than entire clusters could just a few years ago.
The strongest candidates aren't the ones reaching for microservices or sharding immediately – they're the ones who know how to balance simplicity with practical operational needs. Start with the basics, then scale complexity only as required.
Request for Community
Over the past year, we’ve seen the community around Hello Interview grow massively - from just a few early users (you know who you are, thank you!) to O(100k) monthly visitors. While the thousands of emails and comments have been a great way to get to know you all, we feel like we’re missing an opportunity to help you in your journey.
This request is short: we’d love to hear your ideas on how we can help connect the Hello Interview community. We’ve heard interest in a Discord, do you have ideas of things you want to see there? Some users have requested the ability to connect with peers in a cohort, what things would you want to talk about? We love when people send us their experience in interviews, how can we best privately and anonymously share those with others?
If you have answers to these questions or ideas on how we should structure the community, we are excited to hear them! Email community@hellointerview.com and we’ll discuss more. And stay tuned for updates from us on this front.
I find this article more harmful than useful. Developers read such articles and start to think that optimizations are not needed. They can write very straightforward code and the hardware will do the magic.
In reality, we have Jira, a single page of which may download 27(!) megabytes of JS. Is it because the developers thought everyone already has a 10 Gbps network?
We have the Facebook iOS app that takes 1.2 GB of the phone's storage (350 MB of the app + 850 MB of caches). Is it because storage is cheap and fast?
I think that a developer should always care about performance and should try to consume fewer resources. Because in the end these terabytes of memory, hundreds of CPU cores, and gigabit throughputs cost money. If developers think about the software but not only hardware, we will live in a different world where services and apps are fast and small and do not require 48 CPU cores to do their job.
It is all true but how does it map to the costs? Operational costs for such solutions might be rather high for many businesses, imo.