Friday, February 26, 2010

The Role of Engineering with SOA: The Foundation

Let’s face it, developing a new system can be such a “sexy” undertaking that it’s only natural to want to place most of the focus on the cool stuff: leading-edge technologies (wireless, social media), the design and development of algorithms, flashy user interfaces, and the implementation of complex system features. This focus often comes at the expense of the more “pedestrian” aspects of the actual implementation. It’s not much fun dealing with nuanced matters such as ensuring that backup processes are in place, that the system actually includes fallback and recoverability capabilities, that the system is truly secure, and that it is stable.
It’s true that most of the actual engineering processes tend to come from pre-defined, out-of-the-box vendor products (clustering, default configurations, etc.), but the target operational metrics should come from the enterprise’s needs, not from the vendor defaults. From the outset, your own engineering planning should focus on ensuring these targets are met as early as possible.
From a governance perspective, you will need to ensure you have a dedicated engineering team, able to tackle all detailed implementation and operational questions and to interact with the architecture team on a continuous and equal basis. The engineering team should be able to push back on some architecture elements in order to validate that the solutions are sufficiently practical and implementable. In this sense, the engineer is not unlike the building contractor who interprets the architect’s blueprints and guides the building construction via the selection of actual materials, enforcement of building codes, and performance of the necessary detailed adjustments to the design. Architecture may be an art, but engineering is a science.
Still, in the same vein as development, engineering needs to be an iterative process. Engineering must initially deal with high-level designs and approaches. However, as additional “construction” data is gathered, the engineering process should adapt to the various fine-tuning variables: capacity metrics, configuration parameters, availability and performance strategies, and others.
In the end, the final acceptance test must cover the engineering aspects as well as the software development. That is, the final testing should take a holistic approach, covering the system’s operation as well as its functionality. A system that provides nice applications that do not scale cannot be considered a successful outcome. That’s why the engineering objectives are paramount. These healthy engineering objectives are known, tongue-in-cheek, as the “-ities” of the system: Availability, Security, Serviceability, Reliability, etc.  I will next cover three of the key engineering areas targeting these “-ities”:
·         System Availability and Reliability
·         Security & Continuance
·         Systems Management

Sunday, February 21, 2010

Interlude Three: On Technology

This being my 50th blog post, it represents a good vantage point to take stock of the road traversed and of why we are on this journey in the first place. I started my blog by describing the promise of technology and the importance of technology transformation in fulfilling that promise. I moved on to a discussion of how to make the business case to get the technology transformation ball rolling. I then proceeded to cover more technical matters, such as the characteristics of Service Oriented Architecture and the various classifications of services, delving ever deeper into the detailed considerations for SOA design and management.

In fact, we have gone so deep that I am reminded of this “gedanken” (thought experiment):

Assume there is a tunnel so deep that it reaches the center of the earth. In fact, imagine digging this tunnel until it reaches the surface on the opposite side of the Earth (the antipode). Now, let’s have a brave athlete jump into the tunnel. What would happen?

Setting aside other physical considerations such as air resistance, temperature, and pressure: as the athlete approaches the center of the Earth, she should begin to feel less and less gravity. At the center of the Earth she should be completely weightless; the force of gravity is zero down there. Gravity is caused by the Earth’s mass, and at the center the pull of the surrounding mass cancels out in every direction.

So far, so good, but now the athlete has inertia and will continue to “fall upwards” towards the surface on the other side of the Earth! As the athlete falls upward, the gravitational pull will increase (more and more mass from the Earth will be behind her), slowing her down until the “upward fall” is halted just as she reaches the surface at the antipode. At this point, our athlete will begin to fall once again toward the center of the planet until she returns to the entrance of our tunnel . . . only to fall again. In this hypothetical, frictionless environment, our athlete would act like a perpetual Yo-Yo, repeatedly falling and re-falling back to the surface.

So, imagine that this SOA blog is a bit like this athlete. It feels like we have reached the center and that it’s time to now “fall” upwards towards the surface. Next I will be covering detailed engineering considerations (remember, we are still near the SOA core!), followed by less technical discussions. These items will be related to program execution governance, project management, and organizational and people matters. That is, we will return from the detailed to the general.

Still, given that we are knee deep in the details, it is good to remind ourselves why we are on this journey. In the end, this is not about SOA or even technology, but about what we can do with SOA and with technology. Yes, there is the technologist’s viewpoint regarding the power of SOA. While you can certainly run a non-SOA system in a Cloud Computing environment, without SOA it is almost impossible to truly leverage the power of Cloud Computing on behalf of an enterprise-wide system. Then again, the labor involved in creating SOA systems has an objective beyond Cloud Computing or Software-as-a-Service. The most exciting goals are all about shaping the future of technology; that is, our ability to make technology so flexible that it eventually becomes hidden.

Arthur C. Clarke’s famed third law states that any sufficiently advanced technology is indistinguishable from magic. I would add a fourth law: The best indication that a technology has matured is that it has become invisible.

Think of electricity, the water supply, or even the internal workings of an automobile. In all these cases, we operate these technologies almost obliviously in a Switch On/Switch Off basis.

For the most part, technologies follow a well-defined life-cycle that takes them from inception in a lab all the way to invisibility. The time spent within a cycle is technology-dependent, but the average time to maturity can span decades.

Many futurists believe that one of the main evolutionary steps for computing is for it, too, to become invisible: embedded in the fabric of the thing we call “reality”. Instead of screens, keyboards, and mice, users will interface with computers in a seamless manner.

The ultimate interface achievement will be to hide the fact that a user is accessing, or even programming, a computer. This latter attribute is often confused with the famed Turing Test of Artificial Intelligence (AI). However, the Turing Test establishes that Artificial Intelligence will only be achieved when a computer is able to hide the fact that it is a computer while communicating with a human across a broad domain. AI has been long in coming; many believe it to be still a century away, others that it is just around the corner. But AI requires common-sense and pattern-recognition capabilities if it is to work, and progress has been fairly slow on these fronts. I tend to agree that AI as originally envisioned will take a long time to achieve. However, once it happens, AI will not appear as an overnight invention; instead, we will continue to see improvements in computer systems that gradually make them appear smarter and smarter.

Think about your car’s navigation system, which already appears quite smart, or the novel capabilities of your digital camera, such as face recognition. Pseudo-AI behavior in narrow knowledge domains is arriving thanks to the growing computing power made possible by Moore’s law. Consider that in the beginning it was assumed that a chess program capable of beating a chess grandmaster would require a full-fledged AI system. However, this feat was achieved thanks to the brute force of massively parallel processors and the ingenuity of sophisticated heuristics, not by the invention of a human-mind emulator. In May 1997 an IBM computer nicknamed Deep Blue beat World Chess Champion Garry Kasparov, much to the chagrin of the Grandmaster, who found it difficult to accept that he had been beaten by a computer! For all intents and purposes, playing against a chess computer does convey the eerie feeling of competing against an “intelligent” device. The machine behaves like AI, but it is actually confined to the narrow domain of chess playing, making the computer an “idiot savant” of sorts.

As discussed earlier, most transformative technologies are the result of synergistic combinations of various evolutionary advances. To the degree that we see continued advances in user interface paradigms as represented by gestures à la the iPhone or by voice recognition, combined with improved algorithms and the availability of ultra-fast communication bandwidth, we will see a wealth of interesting applications, many of them with true transformative effects. For example, enhanced user interfaces in the future, combined with more advanced artificial-intelligence heuristics and the emerging social networking paradigms, can deliver a suite of Virtual Sidekick capabilities:

· Attain complete knowledge of your preferences; in fact, complete knowledge of you as a person.

· Exercise controlled empowerment to take independent action.

· Have immediate access to all sources of information available electronically, and alert you to the specific developments that interest you, such as breaking news or TV specials.

· Adopt different service personalities based on context.

· Monitor actions performed on your behalf in a non-obtrusive manner. Certain events will automatically initiate pre-approved actions; for example, a change to a calendar event will automatically trigger your Virtual Sidekick to initiate a flight change.

This type of automated avatar will spawn new industries, just as the Internet spawned the multi-billion-dollar Google. The Virtual Sidekick is but one example of the kind of thinking that should be propelling your R&D efforts. There are others. For example, it’s logical to imagine a future in which web access devices will have become so small and non-intrusive that they can be implanted into our bodies. In a world permeated with wireless access to the Web (the “Infosphere” I discussed earlier), imagine a scenario where you can search and access the Internet by simply thinking about it; where you can “Skype” your wife and talk to her using your own embedded phone. You won’t even need to speak to communicate: a microprocessor embedded in your brain will convert your brain waves into speech. Think of this scenario as technology-enabled telepathy! These and other interesting possibilities can be extrapolated from the intriguing technology forecasts by author Ray Kurzweil in his book “The Singularity is Near: When Humans Transcend Biology”.

There can be no doubt that the transformative effects of such future inventions will generate heated debates about the ethics and dangers associated with their use, but that’s a subject matter for a future blog.

Friday, February 12, 2010

Performance Planning with Modeling & Simulation

SOA environments are characterized by an eclectic mix of components, service flows, processes, and infrastructure systems (servers, local area networks, routers, gateways, etc.). This complexity makes it difficult to predict the capacity of the needed infrastructure. In addition, trying to evaluate the impact of changes in how services are routed or used is more an exercise in the art of divination than in the scientific method. Understanding the dynamics of an SOA system usually takes place over a period of time, and having to wait months to optimize a system is not usually a good option.
An alternative approach is to create a model of the system in order to simulate its current and future performance. Depending upon the complexity of the model, you will be able to simulate the actual system latency and the predicted response times of a variety of service flows. 
Simulation can help identify potential bottlenecks and streamline processing times by pinpointing areas where resources can be best optimized.  Imagine knowing the answers to these questions in more concrete terms:
·         What are the transaction response times?
·         How many servers, databases or links do I actually need?

Without the ability to simulate, system designers and administrators are left with the choice of deploying what they believe to be the best system, praying, and then taking a reactive approach based on the on-going measurement of actual performance data via monitoring tools. By then, it might be too late or too expensive to fix the system.
In general, simulations fall within one of the following levels of detail:
Rapid Model (also known as “NapkinSim”). You’ve probably been simulating in this manner for quite some time. If a clerk takes 10 minutes to serve a customer, and on average two customers arrive every 20 minutes, what is the average wait time? “Simple,” you might say, “the answer is zero.” The answer is, of course, never simple, and it is not zero. The answer depends upon how the customers arrive. If two customers arrive at the same time, one of them will have to wait at least 10 minutes. When running a simulation tool you will soon come to realize the importance that inter-arrival distributions have on simulation results.
For all but the simplest SOA problems, you cannot predict the desired resource-requester relationship by resorting to simple napkin arithmetic.
Mathematical Analysis. Mathematics is not entirely helpful either. Significant work has been done to analyze the so-called M/M/1 problem (a single queue with exponential arrivals and services). However, most mathematical approaches cannot satisfactorily cope with dynamic or transient effects, and they quickly become too complex for multi-server environments. In real life, most queuing problems cannot be solved easily by resorting to linear equations; the norm is for complexity to quickly drive the problem to behave like a non-linear system, which in turn requires complex mathematics for a reliable solution. What then is the alternative?
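To be fair to the math, the M/M/1 case does have tidy closed-form answers. Here is a minimal sketch (the function name and the example rates are mine, not from any particular tool) of the standard steady-state formulas, applied to a clerk who can serve 6 customers per hour while 5 arrive per hour on average:

```python
def mm1_metrics(lam, mu):
    """Closed-form steady-state metrics for an M/M/1 queue.

    lam: mean arrival rate; mu: mean service rate (same time unit).
    Only valid when lam < mu, i.e., utilization below 100%.
    """
    if lam >= mu:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    rho = lam / mu                  # server utilization
    l_q = rho ** 2 / (1 - rho)      # mean number of customers waiting
    w_q = l_q / lam                 # mean wait in queue (Little's law)
    return {"utilization": rho,
            "avg_queue_len": l_q,
            "avg_wait": w_q,
            "avg_time_in_system": w_q + 1 / mu}

# 5 arrivals/hour against a 6/hour service rate:
print(mm1_metrics(5, 6))
```

Note what happens at the napkin example’s own numbers: two customers per 20 minutes against a 10-minute service time is an arrival rate exactly equal to the service rate, and the M/M/1 queue is then unstable; the average wait grows without bound. The napkin answer of “zero” could hardly be further off.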
Queuing Simulation. Regardless of the level of abstraction chosen for the system under simulation, you will want the most precise and reliable information about the expected behavior of the system. Here, a technique known as queuing simulation can be the most helpful.
Queuing simulation is particularly suited to SOA because you can simulate almost any process in which a “client” requests a service and a “resource” provides that service. No doubt about it, queuing simulation is the most viable and obvious way to model and predict how an SOA system will behave. 
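To make the idea concrete, a single-server FIFO queue can be simulated in a few lines using the classic Lindley recursion (each customer’s wait is the previous customer’s wait plus service time, minus the gap before the next arrival, floored at zero). This is my own illustrative sketch, not a substitute for a real tool; it revisits the clerk example with exponential arrivals and services, slightly de-saturated to a 12-minute average inter-arrival time:

```python
import random

def simulate_queue(mean_interarrival, mean_service, n_customers, seed=42):
    """Average wait in a single-server FIFO queue (Lindley recursion).

    Times are in minutes; arrivals and services are exponential.
    """
    rng = random.Random(seed)
    wait = 0.0        # wait of the current customer
    total_wait = 0.0
    for _ in range(n_customers):
        total_wait += wait
        service = rng.expovariate(1 / mean_service)
        gap = rng.expovariate(1 / mean_interarrival)
        wait = max(0.0, wait + service - gap)   # next customer's wait
    return total_wait / n_customers

# 10-minute average service, 12-minute average inter-arrival:
print(round(simulate_queue(12, 10, 200_000), 1), "minutes average wait")
```

With these rates the closed-form M/M/1 answer is a 50-minute average wait, and a long enough run lands in that neighborhood; swap the exponential arrivals for a fixed 12-minute gap and the wait collapses, which is exactly the sensitivity to inter-arrival distributions mentioned under NapkinSim.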

To be clear, the simulation approach is not a panacea. First of all, you have to learn about the simulation tools. Secondly, detailed modeling can be time consuming. Modeling should not be viewed as a quick way to get answers to questions. You should also keep in mind that simulations yield only approximate answers which—in many cases—are difficult to validate. In the end, simulation is merely a more precise way to venture a guess. You should not accept simulation results as gospel. It is easy to forget that the simulation is an abstraction of reality; not reality itself.  A thorough validation of the results must be made, especially prior to publication of the results. Simulations should be supported by careful experiment design, an understanding of the assumptions, and reliability of the input data used in the model. Despite these caveats, you will find that simulation can be an invaluable tool in your day to day business activities.

While you could develop a simulation by writing a program yourself, you could also use one of the many simulation tools on the market. Today’s simulation tools are not as expensive as in the past, but they do demand the discipline to capture and create the base model and to keep the simulation model current for future simulation runs. A modern simulation tool for SOA should provide a visual interactive modeling and simulation tool for queuing systems that has the following attributes:
General purpose.  You can simulate almost anything that involves a request, a queue and a service, whether this includes a complex computer network or the service times at a fast food counter. This capability will give you the option of simulating the SOA system at various levels of granularity; from the underlying packet-level communications layers to the upper service flows.
Real-time. Unlike other costlier programs, you can view how the resources in your system behave as the simulation progresses.
Interactive. You can dynamically modify some essential parameters to adjust the behavior of the simulated components even as the simulation runs!
Visual Oriented. Allows you to enter the necessary information via a simple, intuitive user interface, removing the need to know a computer language. In addition to running the simulation, it also provides you with important information to help you fine-tune it.
Discrete oriented.  Discrete-event systems change state at discrete points in time, as opposed to continuous systems, which change continuously over time.
Flexible. You can see the dynamic effects of the simulated system, or the accumulated averages representing the overall mean behavior of the system.

As a Valentine’s Day gift to the readers of this article, I am making Prophesy—A Complete Workflow Simulation System available for free!
Prophesy is a simulation product that I developed and marketed back in the roaring 90’s (when in retrospect I should have been putting my efforts into developing something for the exploding World Wide Web—but that’s another story). Prophesy meets the requirements listed above, but unfortunately, the product is aged. It’s no longer supported, and it will not run under Windows 7 (“thank” Microsoft for their lack of backward compatibility).
You can download it for free and hopefully use it as a learning tool.

Friday, February 5, 2010

Best Performance Practices

As mentioned earlier, using “thin” services that require multiple trips to the server to obtain a complete response is one of the most common performance mistakes made with SOA. Most other SOA performance problems occur due to basic engineering errors such as misconfigurations (low memory pools, bad routings, etc.), which can be fixed with relative ease once identified. Performance problems caused by inappropriate initial design are much harder to correct:
·         Inefficient implementation. The advent of high-level and object-oriented languages does not eliminate the need to tighten algorithms. Many performance problems are the result of badly written algorithms or incorrect assumptions about the way high-level languages handle memory and other resources.

·         Inappropriate resource locks and serialization. Just as it is not a good idea to design a four-lane highway that suddenly becomes a one-lane bridge, best-practice design avoids synchronous resource locking as much as possible. It’s best to implement service queues whenever possible to take advantage of the multitasking and load-balancing capabilities provided by modern operating systems. Still, avoid using asynchronous modes for Query/Reply exchanges.

·         Unbalanced workloads. This is a scenario more likely to occur when services must run from a particular server due to the need to keep state or because the services are not configured correctly. The more you can avoid relying on state, the more capable you will be in avoiding unbalanced workloads.

·         Placing the logic in inappropriate places. Don’t let grandma drive that Lamborghini. Early web site implementations grew organically, with business logic placed in the front-end portals. So-called Content Management Systems were developed to provide flexible frameworks for these web portals. Unfortunately, this architecture pattern leads to monolithic, non-scalable designs. Despite the assumed performance overhead implied by modular designs, it is best to put the business logic in back-end engines that can be accessed via services through the front-end portals.
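The service-queue advice in the resource-locking bullet above can be sketched in a few lines. This is a minimal illustration of the pattern only (the queue, the worker count, and the trivial “double the number” service are all hypothetical): callers enqueue requests and move on, while a small worker pool drains the queue, so callers never block on a shared lock.

```python
import queue
import threading

work_queue: "queue.Queue" = queue.Queue()
results = []
results_lock = threading.Lock()   # only the workers touch the shared resource

def worker():
    while True:
        item = work_queue.get()
        if item is None:          # sentinel: shut this worker down
            work_queue.task_done()
            break
        with results_lock:        # the "service": here, just doubling
            results.append(item * 2)
        work_queue.task_done()

workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()

for i in range(10):               # callers simply enqueue and continue
    work_queue.put(i)

work_queue.join()                 # wait until every request is serviced
for _ in workers:
    work_queue.put(None)
for t in workers:
    t.join()

print(sorted(results))
```

The same shape applies whether the “service” is a database write or a downstream call: contention moves out of the request path and into a queue the operating system can schedule and balance.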
Designers aware of SOA’s inherent inefficiencies tend to architect the system in a traditionally monolithic manner. However, it is a mistake to shy away from the use of services during the design phase just to “preemptively” alleviate performance concerns. You risk reducing flexibility in the design, and this defeats one of the main reasons for using SOA.
There are many other, better ways to remedy the performance concerns of SOA:
·         Applying best practices in service design. Watch for service granularity, service flows and the use of superfluous execution paths. For example, avoid “in-band” logging of messages (control messages mixed with the application data-carrying messages). That is, quickly copy the messages to be logged and handle them asynchronously to the main execution path. Make the logging process a lower priority than application work (alerts must be the highest priority!).

·         In SOA, caching is essential. Caching is to SOA what oil is to a car’s engine. Without caching, there is no real opportunity to make SOA efficient and thereby effective. However, provided that the necessary enablers are in place (i.e. ability to use caching heavily), performance is an optimization issue to be resolved during system implementation (remember the dictum: Architecture is about flexibility; engineering about performance.)

·         Finally, with SOA there is a need to proactively measure and project the capacity of the system and the projected workloads. Modeling and Simulation must be a part of the SOA performance management toolkit.

More on each of these next . . .