Performance is a measure of the speed your clients experience. When a user makes a request, how quickly does the system respond?
What is an acceptable level of performance? It depends. A UI response that takes longer than about 0.25 seconds has been shown to interrupt the user’s train of thought. Web sites typically can’t respond that fast because of network delays, so our users already have to put up with some slowness.
After that very stringent threshold, it’s an expectations game. How complex does the user think the task is? If they think it is simple, they will be less tolerant of delays. If they think it is complex, more tolerant.
Around the 30-second mark, web applications start to run into the timeout values of the various components between the browser and the servers. If an operation will take longer than that, the UI should not keep the user waiting for the browser to refresh; it should hand control back right away and deliver the result when it is ready.
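One way to release the user from a long wait is to accept the request, return a job identifier immediately, and let the client check back for the result. The sketch below is only an illustration of that idea using Python's standard library; the names `submit_job`, `job_status`, and `slow_report` are hypothetical, and a real web application would keep job state somewhere more durable than an in-memory dictionary.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)
jobs = {}  # job id -> Future (in-memory only; an assumption for this sketch)

def submit_job(task, *args):
    """Start the task in the background and return a job id right away."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(task, *args)
    return job_id

def job_status(job_id):
    """Report the job's state without blocking the caller."""
    future = jobs[job_id]
    if not future.done():
        return {"state": "running"}
    return {"state": "done", "result": future.result()}

def slow_report(n):
    """Hypothetical long-running operation."""
    time.sleep(0.2)  # stands in for work that could take minutes
    return sum(range(n))

# The request handler returns as soon as the job is queued.
job_id = submit_job(slow_report, 1000)

# The client polls at its own pace instead of holding a connection open.
while job_status(job_id)["state"] != "done":
    time.sleep(0.05)
final_status = job_status(job_id)

executor.shutdown()
```

The point of the design is that no single request lasts long enough to hit a timeout: each call to `submit_job` or `job_status` returns quickly, however long the underlying work takes.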
Scalability is a metric of how many clients you can service. The economics of the business are strongly influenced by scalability.
The scalability of a system is constrained by its bottlenecks. Every component of the system has a maximum throughput, a rate it cannot exceed. In a typical system, one component is operating at its maximum throughput and all the others have a bit of slack.
If one increases the capacity of the bottleneck component, it may no longer be the bottleneck; the bottleneck then shifts to another component.
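The shifting bottleneck can be illustrated with a small calculation. This is a minimal sketch that assumes every request passes through every component, so the system's throughput is the lowest of the component rates; the component names and capacities are made up for the example.

```python
def bottleneck(capacities):
    """Return (name, rate) of the component with the lowest throughput."""
    name = min(capacities, key=capacities.get)
    return name, capacities[name]

# Hypothetical capacities, in requests per second.
capacities = {"web": 900, "app": 400, "database": 650}

# The app tier is the slowest component, so it limits the whole system.
before = bottleneck(capacities)

# Double the capacity of the current bottleneck.
upgraded = dict(capacities, app=800)

# The app tier now has slack; the database has become the limit.
after = bottleneck(upgraded)
```

In this example, doubling the app tier raises the system limit from 400 to 650 requests per second rather than to 800, because the database takes over as the constraint.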

Availability is best defined by visualizing its absence. When your system isn’t available, your business is stalled.
There are two ways to increase the availability of a service: (1) reduce the downtime of each component, and (2) adopt a design with redundant components, so that an individual component may fail while the overall system continues to function.
Dollar for dollar, systems are more robust with the second strategy.
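The effect of redundancy can be sketched with basic probability. The functions below assume that components fail independently of one another, which real systems only approximate, and the 99% figure is an illustrative number rather than a measurement. The sketch says nothing about cost; it only shows how quickly redundancy reduces expected downtime.

```python
HOURS_PER_YEAR = 8760

def serial_availability(availabilities):
    """Every component must be up, so the availabilities multiply."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def redundant_availability(a, copies):
    """The service is up if at least one of `copies` replicas is up.

    Assumes replicas fail independently of each other.
    """
    return 1.0 - (1.0 - a) ** copies

def downtime_hours(availability):
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR

single = 0.99                              # one component, 99% available
pair = redundant_availability(single, 2)   # two redundant copies
chain = serial_availability([0.99, 0.99])  # two components that must both be up

single_downtime = downtime_hours(single)   # roughly 87.6 hours per year
pair_downtime = downtime_hours(pair)       # roughly 0.9 hours per year
```

Under these assumptions, a second redundant copy cuts expected downtime by about a factor of one hundred, while chaining two such components without redundancy makes the system less available than either one alone.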
Copyright © 2008, J Singh