Friday, February 12, 2010

Performance Planning with Modeling & Simulation

SOA environments are characterized by an eclectic mix of components, service flows, processes and infrastructure systems (servers, local area networks, routers, gateways, etc.). This complexity makes it difficult to predict the capacity of the needed infrastructure.  In addition, trying to evaluate the impact of changes in how services are routed or used is more an exercise in the art of divination than in the scientific method.  Understanding the dynamics of an SOA system usually takes place over a period of time. Having to wait months to optimize a system is not usually a good option.
An alternative approach is to create a model of the system in order to simulate its current and future performance. Depending upon the complexity of the model, you will be able to simulate the actual system latency and the predicted response times of a variety of service flows. 
Simulation can help identify potential bottlenecks and streamline processing times by pinpointing areas where resources can be best optimized.  Imagine knowing the answers to these questions in more concrete terms:
· What are the transaction response times?
· How many servers, databases or links do I actually need?

Without the ability to simulate, system designers and administrators are left with the choice of deploying what they believe to be the best system, praying, and then taking a reactive approach based on the ongoing measurement of actual performance data via monitoring tools. By then, it might be too late or too expensive to fix the system.
In general, simulations fall within one of the following levels of details:
Rapid Model (also known as “NapkinSim”). You’ve probably been simulating in this manner for quite some time. If a clerk takes 10 minutes to serve a customer, and on average two customers arrive every 20 minutes, what is the average wait time? “Simple,” you might say, “the answer is zero.” The answer is, of course, never simple, and it is not zero. The answer depends upon how the customers arrive. If two customers arrive at the same time, one of them will have to wait at least 10 minutes. When running a simulation tool, you will soon come to realize the importance that inter-arrival distributions have on simulation results.
Aside from the simplest SOA problems, you cannot predict the desired resource-requester relationship by resorting to simple napkin arithmetic. 
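To make the clerk example concrete, here is a minimal Python sketch (my own illustration, not part of any particular tool). It assumes a single clerk, first-come-first-served order and a fixed 10-minute service time; the function name average_wait and the three arrival patterns are hypothetical choices made for the example.

```python
import random

def average_wait(gaps, service_time=10.0):
    """Average time spent waiting in line at a single first-come-first-served clerk.

    gaps[i] is the time between the arrival of customer i+1 and customer i+2.
    The first customer is assumed to find the clerk idle.
    """
    wait = 0.0          # waiting time of the customer currently being considered
    total_wait = 0.0
    customers = 1       # the first customer, who waits 0 minutes
    for gap in gaps:
        # The next customer waits for whatever backlog is left after 'gap' minutes.
        wait = max(0.0, wait + service_time - gap)
        total_wait += wait
        customers += 1
    return total_wait / customers

# Same average arrival rate (two customers every 20 minutes), three arrival patterns.
evenly_spaced = [10.0] * 999                       # one customer every 10 minutes
in_pairs = [0.0, 20.0] * 499 + [0.0]               # two customers together every 20 minutes
rng = random.Random(42)                            # fixed seed so the run is repeatable
randomly = [rng.expovariate(1 / 10.0) for _ in range(999)]  # random gaps, 10-minute mean

print(average_wait(evenly_spaced))   # 0.0
print(average_wait(in_pairs))        # 5.0
print(average_wait(randomly))        # greater than zero; depends on the seed
```

Only the evenly spaced case gives the "napkin" answer of zero. Note also that in this example the clerk is busy 100% of the time on average, so with random arrivals the average wait tends to keep growing the longer the simulation runs.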
Mathematical Analysis. Mathematics is not entirely helpful either. Significant work has been done to analyze the so-called M/M/1 problem (a single-server queue with Poisson arrivals and exponentially distributed service times). However, most mathematical approaches cannot satisfactorily cope with dynamic or transient effects, and they quickly become too complex for multi-server environments. In real life, most queuing problems cannot be solved easily by resorting to linear equations. Indeed, the norm is for complexity to quickly drive the problem area to behave like a non-linear system. This in turn requires the assistance of complex mathematics for a reliable solution. What then is the alternative?
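For reference, the M/M/1 case is one of the few that does have simple closed-form steady-state results. The sketch below encodes the standard textbook formulas; the function name mm1_metrics and the sample rates are assumptions chosen for illustration.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state M/M/1 results. Both rates are per unit of time."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        # At or above 100% utilization the queue never settles into a steady state.
        raise ValueError("unstable queue: arrival rate must be below service rate")
    rho = arrival_rate / service_rate                     # server utilization
    time_in_system = 1.0 / (service_rate - arrival_rate)  # waiting plus service
    return {
        "utilization": rho,
        "mean_in_system": rho / (1.0 - rho),
        "mean_time_in_system": time_in_system,
        "mean_wait_in_queue": rho * time_in_system,
    }

# Hypothetical numbers: a clerk who serves one customer per 10 minutes (0.1 per minute)
# and customers arriving at 0.08 per minute.
for name, value in mm1_metrics(0.08, 0.1).items():
    print(name, round(value, 2))
```

These formulas only describe long-run averages under the M/M/1 assumptions. They say nothing about transient behavior, and they break down entirely at 100% utilization, which is exactly where the clerk example above sits.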
Queuing Simulation. Regardless of the level of abstraction chosen for the system under simulation, you will want to have the most precise and reliable information for the expected behavior of the system. In this case, the type of simulation known as queuing simulation can be the most helpful.
Queuing simulation is particularly suited to SOA because you can simulate almost any process in which a “client” requests a service and a “resource” provides that service. No doubt about it, queuing simulation is the most viable and obvious way to model and predict how an SOA system will behave. 
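As a rough sketch of the client/resource idea, the following Python fragment simulates requests arriving at a pool of identical servers and reports the mean response time. It is not taken from any product; the function name mean_response_time, the exponential arrival and service distributions, and the sample numbers are all assumptions made for illustration.

```python
import heapq
import random

def mean_response_time(n_requests, n_servers, mean_interarrival, mean_service, seed=1):
    """Mean response time (waiting plus service) for a first-come-first-served server pool."""
    rng = random.Random(seed)
    free_at = [0.0] * n_servers      # time at which each server next becomes free
    heapq.heapify(free_at)
    clock = 0.0                      # arrival time of the current request
    total_response = 0.0
    for _ in range(n_requests):
        clock += rng.expovariate(1.0 / mean_interarrival)   # next client arrives
        earliest = heapq.heappop(free_at)                   # first server to free up
        start = max(clock, earliest)                        # wait if every server is busy
        finish = start + rng.expovariate(1.0 / mean_service)
        heapq.heappush(free_at, finish)
        total_response += finish - clock
    return total_response / n_requests

# Hypothetical workload: a request every 10 ms on average, 8 ms mean service time.
for servers in (1, 2, 3):
    print(servers, "server(s):", round(mean_response_time(20000, servers, 10.0, 8.0), 1), "ms")
```

Running the same workload against one, two and three servers is a small-scale version of the "how many servers do I actually need?" question. A real model would replace the assumed distributions with measured ones.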

To be clear, the simulation approach is not a panacea. First of all, you have to learn about the simulation tools. Secondly, detailed modeling can be time-consuming. Modeling should not be viewed as a quick way to get answers to questions. You should also keep in mind that simulations yield only approximate answers which, in many cases, are difficult to validate. In the end, simulation is merely a more precise way to venture a guess. You should not accept simulation results as gospel. It is easy to forget that the simulation is an abstraction of reality, not reality itself. A thorough validation of the results must be made, especially prior to publication of the results. Simulations should be supported by careful experiment design, an understanding of the assumptions, and reliable input data for the model. Despite these caveats, you will find that simulation can be an invaluable tool in your day-to-day business activities.

While you could develop a simulation by writing a program yourself, you could also use one of the many simulation tools on the market. Today’s simulation tools are not as expensive as in the past, but they do demand the discipline to capture and create the base model and to keep the simulation model current for future simulation runs. A modern simulation tool for SOA should provide visual, interactive modeling and simulation of queuing systems, with the following attributes:
General purpose. You can simulate almost anything that involves a request, a queue and a service, whether this includes a complex computer network or the service times at a fast-food counter. This capability will give you the option of simulating the SOA system at various levels of granularity, from the underlying packet-level communications layers to the upper service flows.
Real-time. Unlike other costlier programs, you can view how the resources in your system behave as the simulation progresses.
Interactive. You can dynamically modify some essential parameters to adjust the behavior of the simulated components even as the simulation runs!
Visually oriented. The tool lets you enter the necessary information via a simple and intuitive user interface, removing the need to know a programming language. In addition to running the simulation, it also provides you with important information to help you fine-tune it.
Discrete oriented. Discrete-event systems change state only at discrete points in time, as opposed to continuous systems, which change continuously over time.
Flexible. You can see the dynamic effects of the simulated system, or the accumulated averages representing the overall mean behavior of the system.
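
To illustrate the "discrete" and "flexible" attributes, here is a minimal event-list sketch in Python. It assumes a single server with exponential arrival and service times; the function name run_event_loop and its parameters are hypothetical. The clock jumps from one event to the next, and the function returns both the dynamic trace of the system and the accumulated time-average.

```python
import heapq
import random

def run_event_loop(end_time, mean_interarrival=10.0, mean_service=8.0, seed=7):
    """Single-server discrete-event simulation driven by a time-ordered event list."""
    rng = random.Random(seed)
    events = [(rng.expovariate(1.0 / mean_interarrival), "arrival")]
    clock = 0.0
    in_system = 0      # requests waiting or being served
    area = 0.0         # running integral of in_system over time
    trace = []         # (time, in_system) recorded after every event

    while events:
        time, kind = heapq.heappop(events)
        if time > end_time:
            break
        area += in_system * (time - clock)   # state was constant since the last event
        clock = time                         # jump straight to the next event
        if kind == "arrival":
            in_system += 1
            heapq.heappush(events, (clock + rng.expovariate(1.0 / mean_interarrival), "arrival"))
            if in_system == 1:               # server was idle, so service starts now
                heapq.heappush(events, (clock + rng.expovariate(1.0 / mean_service), "departure"))
        else:
            in_system -= 1
            if in_system > 0:                # next request in line starts service
                heapq.heappush(events, (clock + rng.expovariate(1.0 / mean_service), "departure"))
        trace.append((clock, in_system))

    area += in_system * (end_time - clock)   # account for the tail up to end_time
    return trace, area / end_time

trace, mean_in_system = run_event_loop(10_000.0)
print("events processed:", len(trace))
print("time-averaged number in system:", round(mean_in_system, 2))
```

The trace is the kind of data a visual tool would animate as the run progresses, while the time-average is the kind of summary figure it would report at the end.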

As a Valentine’s Day gift to the readers of this article, I am making Prophesy—A Complete Workflow Simulation System available for free!
Prophesy is a simulation product that I developed and marketed back in the roaring 90’s (when in retrospect I should have been putting my efforts into developing something for the exploding World Wide Web—but that’s another story). Prophesy meets the requirements listed above, but unfortunately, the product is aged. It’s no longer supported, and it will not run under Windows 7 (“thank” Microsoft for their lack of backward compatibility).
You can visit to download it for free and hopefully to use as a learning tool.