StackOverflow Architechture is a widely used question and answer website for professional and enthusiast programmers. It is the flagship site of the Stack Exchange Network. The scale of this site can be understood by looking at its user base stats

With a hopping user base of 15million , Stack Overflow tops the chart. As per today overall Stack Exchange network gets 1.3 billion page views per month, that means 55 TB data / month. Designing the architecture of such a site network is a humongous task in itself.

What is going to blow your mind ,Stack Exchange employs a two tier architecture, no sharding no Big Data. It’s fascinating that they run one of the most trafficked pages (that also uses long-polling “real-time” messaging) on just 25 servers, most of them on 10% load all the time.

StackOverflow’s Nick Craver, Joel, Cecconi are a big advocate of Scaling Up. At a time when world is moving to Scaling Out by adding more machines, StackOverflow still lives in an old world of Scaling Up since they are getting best performance out of it with just 5% average CPU load on each of their servers and achieving 15ms page load time. Lets talk how do they achieve this level of performance while serving 1.3 page views per month:

Caching is done at multiple levels.

  • One is at Browser/Client level by caching images, javascripts,etc until their TTL(Time to live).
  • At Load balancer level, load balancer serves cached requests instead of making another query to the webserver behind it. This caching results in improved response times for those requests and less load on the webserver.
  • Via Redis, Redis Server caches data in-memory rather than frequent database hits. This caching is made highly consistent such that all data is in-memory & thus reducing database footprint. In-memory caching saves 10x time compared to databse hits .
  • At database level, Sql Server also provides a level of internal caching to avoid frequent disk calls
  • Using Elastic Search, which is an index server that powers quick searching. 90% of the queries are read only. This is the Read only server of Stackoverflow that help render pageviews real quick.

StackOverflow ensures High Availability on all its services by maintaining layers of redundancy. There has been centuries old debate regarding whether to scale up or scale out.

Scale out ,being the newest counterpart,is also the base of BigData & is proven to be performant for very large systems. Thats why everybody is rushing towards it . Scale out is Horizontally scaling, i.e. distributing the heavy jobs processing into multiple small nodes/machines.

Scale Up (Vertical Scaling) is increasing the machine’s processing power(Memory,CPU) rather than adding more machines.

Why StackOverflow choose to Scale Up?

At StackOverflow, they believe ‘Performance is a feature’. To get the best performance they maintain their own Servers and scale it up when needed.

When a new requirement comes up for more resources, they decide on these parameters :

  • How much Redundancy do we need?
  • Do we actually need to Scale Out?
  • How much disk storage will be used?
  • SSDs or Disk Drives
  • How much Memory ?
  • Memory intensive application?
  • Parallel usage?
  • What CPU?
  • Cores based on parallelism
  • Network, whats the transaction rate?
  • How much Redundancy do we need?

Based on these questions they majorly drill down to one single machine with large RAM & great processing. Most of their servers are running at an average 5% CPU load. Thats a blow to all the Horizontal Scaling followers.

Another factor of StackOverflow’s performance is its Shared Nothing Architechture.

According to the definition by Wikipedia, “A shared-nothing architecture (SN) is a distributed computing architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system. More specifically, none of the nodes share memory or disk storage.”

StackOverflow distributes work evenly among servers, an Elastic Search Server for faster reads, Writable DB server for Update Operations, Caching at each layer.

Dedicated server per task help get the best performance out of each server.


Leading all phases of diverse Technology Products with hands-on technical edge ,an aspiring Triathlete, illustrator who loves giving back to society