Beyond the scalability myth: How to improve application response time

Component scaling is unlikely to help when application response time analysis shows no clear bottleneck.

Application performance has always been a productivity issue, because a long response time delay stalls worker progress. The problem is becoming more acute as interest in mobility and point-of-activity empowerment grows. For machine-to-machine communications, application response time and performance can be critical.


Many software professionals believe modern applications can simply scale out to improve performance, but in most cases that isn't possible without special design and integration steps. To design real performance elasticity, architects need to:

  • Figure out how much scalability can help performance
  • Determine what mechanisms are needed to create scalable applications
  • Look to exploit scalability optimally in a real deployment, even in the cloud

Break bottlenecks

Performance is improved by making the components that contribute to a delayed application response time more efficient. One approach is to improve the performance of the hardware or software platforms that support the lagging components. Another is to spawn additional copies of the components that have become performance bottlenecks.


The object of a performance enhancement review is to determine whether a small number of components are creating a bottleneck in the workflow or if the delay is evenly divided. Any review should include:

  • Examining application response time
  • Measuring end-to-end response time
  • Dividing the delay among workflow components
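The review steps above can be sketched in code. This is a minimal illustration, not a production profiler: the three stage functions and their names are hypothetical stand-ins for real workflow components, and the point is simply to capture per-stage timings so the delay can be divided among them.

```python
import time

def timed(stage_name, fn, timings, *args, **kwargs):
    """Run one workflow stage and record its elapsed time."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    timings[stage_name] = time.perf_counter() - start
    return result

# Hypothetical three-stage workflow; the stage bodies are stand-ins.
def parse(req): return req.upper()
def process(data): return data[::-1]
def render(data): return "<html>%s</html>" % data

timings = {}
out = timed("parse", parse, timings, "request")
out = timed("process", process, timings, out)
out = timed("render", render, timings, out)

# Divide the end-to-end delay among the components, worst first.
total = sum(timings.values())
for stage, t in sorted(timings.items(), key=lambda kv: -kv[1]):
    print("%-8s %6.1f%% of %.6fs total" % (stage, 100 * t / total, total))
```

If one stage dominates the breakdown, that component is the scaling candidate; if the delay is evenly divided, scaling a single component will not help.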

When the analysis shows no clear bottleneck, component scaling is unlikely to help. Experience shows that if all components have to scale, integrating them and managing multiple parallel workflows becomes too complex. One option is to improve the performance of the hardware or software platform by digging into platform tools.

Another method is to create multiple copies of the entire application, with work distributed by a Web front end, for example. This requires ensuring the front-end process can divide the work and, if multiple databases are used, that multi-phase commit keeps them synchronized.

The largest source of application performance problems for virtualization and the cloud is inefficient handling of data flows to virtual machines. This impacts every element of the workflow equally, and scalability isn't a solution to these data-path delays. The only reliable way to gauge how much delay cloud or virtualization software introduces is to run the application without resource sharing, run it again with sharing supported, and measure the difference.
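That differencing measurement might look like the sketch below. It is a simplified harness under stated assumptions: the two transaction functions are hypothetical stand-ins for the same workload run once on dedicated hardware and once on the shared platform, and the median is used to damp timing noise.

```python
import statistics
import time

def measure(run_transaction, samples=50):
    """Median end-to-end response time over several samples."""
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        run_transaction()
        times.append(time.perf_counter() - start)
    return statistics.median(times)

# Stand-in transactions; in practice these are the identical workload,
# run first without resource sharing and then with it.
def bare_metal_txn(): sum(range(1000))
def shared_txn(): sum(range(1200))

baseline = measure(bare_metal_txn)
shared = measure(shared_txn)
print("resource-sharing overhead: %.6fs per transaction" % (shared - baseline))
```

A consistently large difference points at the data path of the platform itself, which no amount of component scaling will fix.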

Sometimes such problems can be inferred by checking whether performance issues correlate directly with inter-component traffic rather than with the complexity of processing. Some cloud and virtualization platforms include data-path acceleration. Platform problems should be fixed before scalability is even considered.

Use load balancing

Scalability is potentially valuable when the processing delay component is a major source of application response time problems. Scaling addresses this by adding component copies to process work in parallel, increasing throughput and reducing delay. Component scaling requires that work (input from users) be divided among copies of a component, which is usually called load balancing. A scaling strategy starts by considering how work will be divided and looking for signs that the technique won't work.

In situations where bottleneck components do database access, multiple copies of the component won't help unless the database is replicated. If a database is read-only, then replication is a strong solution and it's generally easy to divide workflows at the data access level.
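The read-only replication case can be sketched as a simple query router. This is an illustrative assumption, not a real database driver: the host names are hypothetical, and the read/write detection is a naive prefix check that a production system would replace with proper query parsing.

```python
import itertools

class ReadWriteRouter:
    """Direct reads to read-only replicas and writes to the single primary."""
    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)  # reads scale across replicas
        return self.primary              # writes stay on the primary

router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM orders"))   # served by a replica
print(router.route("UPDATE orders SET ..."))  # served by the primary
```

Because only reads are divided, the replicas never diverge and no commit synchronization is needed, which is why read-only replication is the easy case.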

The dividing side is load balancing. One mechanism for division is a form of round-robin component addressing at the DNS level, so successive requests for a URL are resolved to different components in a scaled array of copies. Another method is to put a load-balancing switch (Layer 4 or application layer) in front of the servers hosting the copies. A third mechanism is to build component load balancing into the service bus used to distribute work.


Architects will face state management issues regardless of which mechanism is used for load balancing. Any application dialog establishing a persistent relationship with a user over a sequence of messages could be stateful in that it may require that the application remember where the user is in the sequence.

If multiple copies of a component are used, state can be lost if later messages are directed to a different copy. Some security processes also require a user be validated to transact with the application. Such a validation may be a state that is lost if the user is switched to another component.

Look for terms like stateful load balancing if stateful behavior in scaled components is needed. When multiple copies of a component exist, the workflow out of each copy has to converge. It may be necessary to send responses back to the component that originated the request.

The key point to remember about any scalability strategy to improve performance is that it will inevitably add complexity in handling delays. As components are added, measure application performance carefully and chart a trend line to ensure there is enough gain to overcome issues.

Also watch queue depths in the load balancing process; additional copies won't make a difference if work isn't waiting. Careful analysis of performance is critical to ensure the greatest gains are obtained from scalability changes, and that is something that can't be taken for granted.
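A queue-depth check like the one described can be sketched as a thin wrapper around the work queue. This is an illustrative, single-threaded stand-in; a real deployment would read depth metrics from the load balancer or message broker instead.

```python
from collections import deque

class MonitoredQueue:
    """Work queue that tracks current and peak depth, so scaling
    decisions are driven by whether work is actually waiting."""
    def __init__(self):
        self._items = deque()
        self.max_depth = 0

    def put(self, item):
        self._items.append(item)
        self.max_depth = max(self.max_depth, len(self._items))

    def get(self):
        return self._items.popleft()

    @property
    def depth(self):
        return len(self._items)

q = MonitoredQueue()
for i in range(5):
    q.put(i)
q.get()
q.get()
# If max_depth stays near zero, work never waits and more copies won't help.
print("current depth:", q.depth, "peak depth:", q.max_depth)
```

Charting peak depth alongside the response-time trend line shows whether added copies are draining real backlog or sitting idle.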

About the author:
Tom Nolle is president of CIMI Corporation, a strategic consulting firm specializing in telecommunications and data communications since 1982.


This was first published in April 2014
