Designing Scalable Software

This is the first article in a three-part series inspired by something I read in Scalability Rules: 50 Principles for Scaling Web Sites by Martin Abbott and Michael Fisher:

Design for 20x capacity. Implement for 3x capacity. Deploy for ~1.5x capacity.

Fortunately, designing for scalability doesn’t necessarily require us to change the fundamentals of how we develop software. But it does mean we need to expand our thinking beyond the design characteristics we’d expect to see in the individual software components themselves: a team of the best developers can follow all of the best practices they’ve spent years mastering and still develop an application that doesn’t scale. Even if you have an experienced team that designs a flexible, robust architecture, there are still no guarantees with respect to how the application will scale.

Everyone plays a role

The responsibility doesn’t lie solely within the engineering team. Whether you’re a developer, a product manager or a member of the executive team, we all would rather be part of a successful software project than an unsuccessful one. Being responsible for a project that attracted 1,000,000 users instead of 50,000 is a problem all startups dream of. It’s also a problem that quickly turns into a nightmare if each and every member of your organization isn’t committed to unlimited growth.

It is absolutely critical to have all of the stakeholders agree on the priority of scalability:

♦ An application that was not designed to scale will only scale by accident. This kind of fortunate accident is extremely rare.
♦ Scalability issues are never a priority until they become a customer issue in a production environment.
♦ Scalability issues almost always occur when the resources most qualified to deal with them have limited bandwidth.
♦ Solving a scalability issue after the design phase comes at a significantly higher cost, usually at the expense of another high-priority project.


Use design patterns appropriately

Unlike technology, design patterns are not concepts that are constantly evolving. There is and will always be a limited number of logical ways in which components can interact with each other. The operative word here is logical. You must be able to manage the physical relationships and dependencies outside the logical boundaries of the major components you use to construct your application. This is the best means of ensuring that we do not inadvertently introduce a tight coupling where none should exist.

Apply best engineering practices rigorously and consistently

Not only would it be impractical to provide a comprehensive list, but not all best practice idioms apply to all software development projects. I’ve included the mantras I most often apply to my own designs, but it’s up to you to determine the core set of design principles that best support your goals.

♦ Algorithms should be as complex as they need to be but as simple as they can be.
♦ A useful software component does only one thing and does it really well.
♦ Collaborating software components are loosely coupled.
♦ Third-party software components are used sparingly and appropriately. These components must be even more loosely coupled than the components we develop ourselves.
♦ A software component must expose a well-defined contract and well-documented unambiguous behavior.
♦ A software component increases its refactoring potential by keeping its algorithms private.
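
Several of these mantras can be shown in a few lines. What follows is only a minimal sketch in Python; the names ResultStore, InMemoryStore and Averager are hypothetical and stand in for whatever contract, third-party client and component your own design calls for.

```python
from typing import Protocol


class ResultStore(Protocol):
    """Contract: persist one named result. Nothing else is promised."""

    def save(self, name: str, value: float) -> None: ...


class InMemoryStore:
    """Hypothetical stand-in for a third-party client, hidden behind the contract."""

    def __init__(self) -> None:
        self._rows: dict[str, float] = {}

    def save(self, name: str, value: float) -> None:
        self._rows[name] = value

    def get(self, name: str) -> float:
        return self._rows[name]


class Averager:
    """Does one thing: averages numbers and hands the result to a store."""

    def __init__(self, store: ResultStore) -> None:
        self._store = store  # depends on the contract, not on a vendor class

    def run(self, name: str, values: list[float]) -> float:
        result = self._mean(values)
        self._store.save(name, result)
        return result

    @staticmethod
    def _mean(values: list[float]) -> float:
        # Private: can be swapped for another algorithm without touching callers.
        if not values:
            raise ValueError("values must not be empty")
        return sum(values) / len(values)


store = InMemoryStore()
print(Averager(store).run("latency_ms", [10.0, 20.0, 30.0]))  # 20.0
```

Because Averager only knows about the save contract, replacing InMemoryStore with a real database client should not require changes to Averager itself.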


Apply the eight principles of Service Oriented Architecture

Service Oriented Architecture (SOA) addresses many of the fundamental principles that are necessary to developing a scalable application, but scalability itself is not SOA’s primary focus. In fact, SOA is often achieved at the expense of performance and scalability and may introduce an even tighter coupling in the release process for dependent shared services. The key is to apply these principles intelligently, taking extra precautions to avoid unwarranted and unnecessary side effects that might impede scalability.

Quantify for maximum success and multiply your best guesses by a factor of x

One of the questions developers sometimes ask me is how much load we should expect. My answer is usually something along the lines of: show me your design and tell me why you think we need to worry about it. The fact is that it can be difficult to predict growth for a new, innovative application. Amazon’s retail business is a perfect example.

I had the pleasure of listening to Jon Jenkins relay Amazon’s early history at an AWS Summit a few months ago. Few people are aware of the fact that while Amazon was emerging as a successful online book store, they were running the site on three servers sitting in a closet. Shortly after what JJ referred to as a water event in 1988, they moved to a more reliable data center where they continued to grow both functionality and customers at a rapid pace.

Fast forward to 2005 when the Amazon engineers realized that their architecture would not support continued growth. If they neglected to address scalability, there would be no business. They spent the next six years migrating to Amazon Web Services, turning off the last physical web server in November, 2010.
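
One way to turn a best guess into working numbers is to apply the 20x / 3x / 1.5x rule quoted at the start of this article. The sketch below assumes a single best-guess peak figure; the function name capacity_targets and the 200 requests-per-second input are illustrative assumptions, not figures from the book or from Amazon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityTargets:
    design: float
    implement: float
    deploy: float


def capacity_targets(expected_peak: float) -> CapacityTargets:
    """Apply the 20x / 3x / 1.5x multipliers to one best-guess number."""
    if expected_peak <= 0:
        raise ValueError("expected_peak must be positive")
    return CapacityTargets(
        design=expected_peak * 20,
        implement=expected_peak * 3,
        deploy=expected_peak * 1.5,
    )


# Hypothetical best guess: 200 requests per second at peak.
targets = capacity_targets(200)
print(targets)
```

For that guess, the design should hold up on paper at 4,000 requests per second, the implementation at 600, and the initial deployment at 300.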

Design for the cloud

Of all the propositions discussed here, cloud computing is probably the biggest enabling factor. The great thing about cloud computing is that it really doesn’t ask us to discard any of the good design habits we’ve already acquired. It simply encourages us to expand the scope of how we apply those principles.

Let’s take a simple example: I need to develop an application that will read data from a database, apply a series of analytic operations and write the results to a database where they will subsequently be consumed by a reporting application.

Identify the technical requirements

Defining the requirements for the project extends well beyond the scope of your customer’s business needs. You also need to consider all of the factors that will impact your ability to deploy and support the application, as well as sustain growth without impacting performance.

♦ What are the limitations of the database I’m using? Are there built-in bottlenecks I need to worry about such as page or table level locks?
♦ How fast is the input data being generated and how quickly do I need to consume it?
♦ Is my database subject to high transaction volume?
♦ Are there other applications competing for significant I/O in the same database?
♦ Are there user-facing applications consuming the database in ways that may be impacted by a new data consumer?
♦ How much data am I going to read, process and write in one complete iteration? 10 MB? Or 10 GB?
♦ What factors determine the data volume and do I have control over any of those factors?
♦ What are the events that might impact transaction volume, I/O and data volume?
♦ Is it possible I’ll encounter unpredictable spikes for I/O and transaction throughput?
♦ Are the analytic operations CPU intensive?
♦ Are the operations time intensive?
♦ Do the operations have external dependencies that introduce latencies I can’t control?
♦ Are there events that result in unpredictable spikes in the number of operations I need to perform?
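
The data volume question above is easier to reason about once it is expressed as a sustained rate. This is a rough sketch only: the row size (1 KB) and the ten-minute processing window are assumptions chosen to reproduce the 10 MB and 10 GB cases, and required_throughput_mb_s is a hypothetical helper.

```python
def required_throughput_mb_s(rows: int, bytes_per_row: int, window_s: float) -> float:
    """Sustained MB/s needed to move rows * bytes_per_row within window_s."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    total_mb = rows * bytes_per_row / 1_000_000
    return total_mb / window_s


# Hypothetical figures: 1 KB rows, a 10-minute processing window.
for rows in (10_000, 10_000_000):
    rate = required_throughput_mb_s(rows, bytes_per_row=1_000, window_s=600)
    print(f"{rows:>10,} rows -> {rate:.3f} MB/s")
```

The same code path that idles along at 10 MB per iteration has to sustain roughly a thousand times the rate at 10 GB, which is why the volume drivers need to be identified before the design is fixed.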


Not only do I have to quantify the requirements for my new application, I need to have a complete understanding of the reporting application that’s going to consume my data:

♦ What triggers the report generation? Are reports dynamically generated and displayed in a UI when a user clicks a button? Or are they generated in the background where they can be downloaded at some future point in time?
♦ How often are these triggers fired?
♦ Are there events that result in unpredictable spikes in the report frequency or size?
♦ How much data is consumed by the typical report?
♦ How much data is consumed by the worst-case report?


Identify proximity requirements

Once I have a complete understanding of how I might expect my application to be used, I can begin to make intelligent decisions about the proximity of my dependencies. If my new application expects to read a million rows, I probably want my application to live in close proximity to the database. If my reports are dynamically generated and presented to the user via a web interface, I can take advantage of the fact that users can visually consume a limited amount of data at one time — I can tolerate a considerable distance between the web application and reporting data without sacrificing the perception of good performance.

Mapping proximity requirements will generally be an iterative process during which natural system boundaries will start to emerge. These boundaries will highlight the areas where we can take advantage of distributed deployment opportunities as well as the areas where we need to minimize if not eliminate proximity requirements altogether.
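
The million-row case can be put into numbers with a back-of-the-envelope estimate. The sketch below counts only network round trips and assumes one round trip per batch; the batch size and both latencies are hypothetical values, and fetch_time_s is an illustrative helper rather than part of any database driver.

```python
import math


def fetch_time_s(rows: int, batch_size: int, round_trip_ms: float) -> float:
    """Time spent purely on network round trips, one round trip per batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    round_trips = math.ceil(rows / batch_size)
    return round_trips * round_trip_ms / 1000


# Hypothetical latencies: 0.5 ms inside one data center, 40 ms across regions.
for label, rtt_ms in (("same data center", 0.5), ("remote region", 40.0)):
    t = fetch_time_s(rows=1_000_000, batch_size=1_000, round_trip_ms=rtt_ms)
    print(f"{label}: {t:.1f} s in round trips")
```

Under these assumptions the bulk reader pays 0.5 seconds of latency when it sits next to the database and 40 seconds when it does not, while a web page that fetches a single batch pays for one round trip either way.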

Use the AKF Scale Cube to measure the scalability of your design

A well-designed application will be able to scale equally well in any of three dimensions:

AKF Scale Cube (see References)

The X Axis represents our ability to scale horizontally, otherwise known as scaling out. Simply put, this is our ability to split our load across multiple instances. The ability to efficiently deploy an unlimited number of instances of my application means that I can always accommodate unexpected growth or load. In fact, the optimal solution would allow me to do any of the following with the least amount of time and effort:

♦ Deploy multiple instances to the production data center.
♦ Deploy multiple instances to different administrative domains, i.e., one or two permanent instances running in the production data center with additional temporary instances deployed to an EC2 server via Amazon Web Services.
♦ Dynamically acquire and deploy to AWS EC2 spot instances when the application detects unusual spikes in load and transaction volume.
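
Splitting load across instances can be as simple as taking turns. The following is a minimal sketch of round-robin assignment, assuming the instances are interchangeable and hold no per-request state; the instance names are hypothetical, and a real deployment would normally delegate this to a load balancer.

```python
from collections import Counter
from itertools import cycle


def round_robin(requests: list[str], instances: list[str]) -> dict[str, str]:
    """Assign each request to the next instance in turn."""
    if not instances:
        raise ValueError("at least one instance is required")
    targets = cycle(instances)
    return {request: next(targets) for request in requests}


requests = [f"req-{i}" for i in range(9)]
# Hypothetical instance names: two permanent, one temporary cloud instance.
assignment = round_robin(requests, ["dc-1", "dc-2", "cloud-1"])
print(Counter(assignment.values()))
```

Adding capacity then means appending another name to the instance list, which only works if nothing in the application cares which instance served the previous request.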


The Y Axis represents the degree to which we’ve applied best practices and SOA. A system of smaller, more focused services almost always exhibits the characteristics usually associated with successful applications:

♦ Distributed functionality is usually well-encapsulated, easier to understand and easier to maintain.
♦ Services that collaborate via well-established contracts are generally more robust.
♦ Elastic architectures allow individual services to be improved independently.


The Z Axis indicates how well we will be able to support big data. Even the simplest application can be subjected to huge volumes of data generated by external events outside our control. The degree to which we can distribute data and transactions will determine whether our design imposes undesirable limits on growth.
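
A common way to distribute data and transactions is to route each record to a shard based on a key. The sketch below assumes hash-based sharding with a fixed shard count; shard_for and the customer IDs are hypothetical, and this is one possible scheme rather than the one the AKF Scale Cube prescribes.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a hash that is stable across runs."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical customer IDs spread over four shards.
for customer_id in ("customer-17", "customer-18", "customer-19"):
    print(customer_id, "-> shard", shard_for(customer_id, 4))
```

A cryptographic hash is used here instead of Python’s built-in hash() because the built-in is randomized per process for strings. Note that plain modulo sharding moves most keys when the shard count changes, so a design that expects to add shards often may prefer consistent hashing.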

♦ ♦ ♦

Next time, we’ll take a deeper dive into how we apply these design principles to the implementation of scalable software.

About the author

Brenda Bell has been actively involved in software design and implementation for nearly thirty years. She is currently employed as a Software Architect at Awareness, Inc. and lives in Henniker, NH.


References

SOA Principles
Amazon’s Journey to the Cloud
AKF Partners: Splitting Applications or Services for Scale
Scalability Rules: 50 Principles for Scaling Web Sites

America has lost its ambition? Really?

We have lost our ambition, our imagination, and our willingness to do the things that built the Golden Gate Bridge.

— President Obama at a fundraiser in San Francisco on Tuesday, October 25, 2011.

I’m sorry Mr. President, but I haven’t lost anything. And I have a problem with any politician who gets on a stage and tries to convince me I’m a big fat zero.

Here’s a thought: What if we spent all of what we borrow on interests here at home instead of giving a significant portion of it to another country in the form of foreign aid? Don’t get me wrong — I’m all for us helping our allies in times of need, but borrowing from Peter to pay Paul is not fiscal responsibility in a weak economy.