Thursday, August 20, 2009

The SOA Distributed Processing Pattern

It’s said that one of the keys to human intelligence is the capacity for abstract thought and an instinctive reliance on patterns. By expediently matching new situations to a “library” of pre-existing patterns normally referred to as “experience”, humans have been able to react more quickly in the face of new challenges. The sky is covered with dark clouds? No matter the shape of the clouds, their darkness and conglomeration indicate a storm is on its way. A large animal growls and salivates as it menacingly stares at you? I doubt you will stop to investigate what this is all about. If you did, your chances of reproducing would be as low as those of an ascetic monk. There’s no question that pattern recognition has been a key to our survival as a species.

Patterns have hierarchies, and the highest level of the hierarchy deals with the overall system structure. I will say more about patterns specific to SOA later, but first I want to discuss the broader Distributed Processing Pattern, because the introduction of SOA has forced a rethink of how this pattern is defined. Just as a typical DMV office has the frowning employee at the window, the sullen clerk riffling and stamping papers in the back, and the rack with files along the back wall, most traditional distributed system models have converged on a pattern consisting of these three tiers:

1. A Presentation tier which displays the program’s output and allows the user’s input.

2. A Business-Process tier that deals with the “heart” of the application. The actual business rules and processes are performed here.

3. A Data tier that stores and retrieves the application’s data. Applications request user data via the presentation tier, process the request within the business-process tier, and interact with the data tier as appropriate.

This three-component pattern has traditionally been referred to as 3-Tier architecture. Furthermore, traditional proponents of distributed processing use the term 3-Tier architecture to physically map each of the parts to actual distributed components. In this very literal interpretation of the model, the desktop devices perform presentation functions, and an intermediate server computer does some processing and then accesses data, usually via SQL or stored procedures. This fixed distributed model is typical of what was originally promoted by database vendors as part of their preferred architectural model (e.g. Oracle Forms, using PL/SQL). The problem with this view of distributed processing is that it is so tied to the physical layout of the system that it soon becomes static and inflexible, failing to accommodate new technology capabilities.

Because PCs emerged outside the realm of the mainframe priesthood, the sad reality is that, just as with a very intelligent blonde (not an oxymoron, all joking aside!) desperately trying to get a date, PCs had to sneak into the corporate world by pretending to be dumb terminals, a good fit within the static boundaries of a traditional presentation device. Also, while old intermediate systems were mostly used as communication switches or as specialized gateways, the physical view of the 3-Tier model tended to treat today’s servers as mere database front-ends. Things have changed significantly. “Access” devices such as today’s personal computers and wireless devices like your phone have tremendous power. The traditional 3-Tier view can’t accommodate their broader use.

Whereas the traditional distributed processing pattern separated processing into three physical tiers (presentation, business processing, and data), in reality, data rarely resides in a single source, and business processes cannot always be executed from a single server. Also, in real life, computation can take place anywhere, and even though organizations tend to be hierarchical, the actual business flows look more like a network than a strict hierarchy.

If SOA is to mirror this meshed topology then we must shift the paradigm somewhat. A proper SOA design should support true distributed environments; not just three tiers, but rather an n-Tier meshed topology with an intrinsic 3-Layer logical pattern.

The fundamental distributed pattern with SOA is that there are three layers and multiple tiers, something I describe as the n-Tier/3-Layer SOA Distributed Processing pattern. The shift from Tiers to Layers has important implications: the layers in SOA are logical and are not meant to directly represent the underlying physical systems.

A typical SOA scenario is shown below:

This n-Tier/3-Layer pattern exists independently of the actual number of computers or entities. For example, imagine that the service pattern above depicts airport kiosks displaying flight information. The user inputs the desired airline via the touch-screen terminal P1. This entry originates a service request to business process B1. Business process B1 logs the request by calling an authentication and log service that front-ends the database D1. Once the request has been authenticated, B1 requests the assistance of business process B2 (either one of the two B2’s shown). Process B2 may call on B3 for as many services as needed. It then extracts the flight information for the selected airline by calling the service front-ending database D2. Finally, B2 returns the information to B1, which then passes the result on to P1 to output the requested flight information.
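The kiosk flow just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the in-memory stand-ins for D1 and D2, the sample flight data, and the role given to B3 (tidying up the airline code) are all hypothetical, since the scenario does not say what B3 actually does.

```python
from dataclasses import dataclass, field


@dataclass
class AuthLogService:
    """Data layer: service front-ending D1 (an in-memory list stands in for D1)."""
    log: list = field(default_factory=list)

    def authenticate_and_log(self, kiosk_id: str, airline: str) -> bool:
        self.log.append((kiosk_id, airline))
        return True  # assumption: every kiosk request is accepted


@dataclass
class FlightDataService:
    """Data layer: service front-ending D2 (an in-memory dict stands in for D2)."""
    flights: dict

    def flights_for(self, airline_code: str) -> list:
        return self.flights.get(airline_code, [])


class B3:
    """Business layer helper. Hypothetical role: normalize the airline code."""

    def normalize(self, airline: str) -> str:
        return airline.strip().upper()


class B2:
    """Business layer: asks B3 for help, then reads flights via the D2 service."""

    def __init__(self, b3: B3, d2: FlightDataService):
        self.b3, self.d2 = b3, d2

    def flight_info(self, airline: str) -> list:
        return self.d2.flights_for(self.b3.normalize(airline))


class B1:
    """Business layer entry point: authenticates and logs, then delegates to B2."""

    def __init__(self, auth: AuthLogService, b2: B2):
        self.auth, self.b2 = auth, b2

    def handle(self, kiosk_id: str, airline: str) -> list:
        if not self.auth.authenticate_and_log(kiosk_id, airline):
            return []
        return self.b2.flight_info(airline)


def p1_display(b1: B1, airline: str) -> str:
    """Presentation layer: kiosk P1 turns the user's entry into a service request."""
    return "\n".join(b1.handle("P1", airline))


if __name__ == "__main__":
    d1 = AuthLogService()
    d2 = FlightDataService({"ACME": ["AC101 on time", "AC202 delayed"]})
    print(p1_display(B1(d1, B2(B3(), d2)), " acme "))
```

Note that nothing in the sketch says where each object runs. Each class talks to its collaborators only through method calls, which is the logical layering the pattern is about.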

When dealing with this level of system design, little is assumed about the physical nature of the environment. It might well be that, initially, all business processes depicted (B1, B2, B3) execute on the same machine on which databases D1 and D2 reside. A second deployment could have B2 running on a separate server, and so on.

The system can be scaled up by deploying multiple service instances on different systems. Multiple instances also happen to improve the system’s robustness. The presentation services P1 and P2 may or may not reside on separate computers (remember the transparency tenets discussed earlier). Furthermore, assume there is an increase in the number of transactions going to the computer handling the business processes, and so we now wish to move business processes B2 and B3 to another machine. No problem. A key attribute of the n-Tier/3-Layer service-oriented pattern is that there is no need to change applications when deploying services on separate computers.
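One way to picture this location transparency is a small registry that maps logical service names to whatever endpoint currently serves them. The sketch below is a simplified assumption, not a description of any particular SOA product: `ServiceRegistry`, `make_b2`, and the machine names are hypothetical, and plain Python callables stand in for real network endpoints.

```python
from typing import Callable, Dict


class ServiceRegistry:
    """Maps a logical service name to the endpoint that currently serves it."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, Callable[..., str]] = {}

    def bind(self, name: str, endpoint: Callable[..., str]) -> None:
        # Re-binding a name is how a service "moves" to another machine.
        self._endpoints[name] = endpoint

    def call(self, name: str, *args: str) -> str:
        return self._endpoints[name](*args)


def make_b2(host: str) -> Callable[[str], str]:
    """Builds a stand-in for B2 as deployed on the given (hypothetical) host."""
    def b2(airline: str) -> str:
        return f"flights for {airline} (served by {host})"
    return b2


def b1(registry: ServiceRegistry, airline: str) -> str:
    # B1 knows only the logical name "B2", never the machine behind it.
    return registry.call("B2", airline)


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.bind("B2", make_b2("machine-1"))
    print(b1(registry, "ACME"))

    # Operations moves B2 to another machine; the code of b1 is untouched.
    registry.bind("B2", make_b2("machine-2"))
    print(b1(registry, "ACME"))
```

The design choice worth noticing is that the move is a change to the binding, not to the caller. The same indirection would let the registry hand out one of several instances of B2 when the system is scaled up.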

Say we find a vendor is offering a cheaper and faster way to do things than our own B3 service. No problem. B3 can then run from the external vendor’s system. As a final note, you may have noticed that not once have I mentioned whether these computers run on Microsoft or Linux software, or are a mainframe or PC. Why not? Because the technology transparency tenet is that all software should be able to run on any given platform.

The concept of Cloud Computing, based on computer infrastructure available as a virtualized computing service via Internet-like mechanisms, is emerging as one of IT’s future directions. The idea is that the higher penetration of standards and the convergence of technologies are driving commoditization to the point that we don’t much care about what kind of technology provides the service we receive. Having an n-Tier/3-Layer pattern is a necessary (but not sufficient) condition for your solution to eventually garner the benefits of cloud computing.

Having said this, while it’s easy to appreciate the flexibility that this type of architecture provides, keep in mind that it does have its drawbacks! For starters, there could be processing overhead and message delivery latency. This type of architecture is designed for flexibility, not for performance. On the plus side, a smart service-oriented design can optimize the way services are called and how data is passed between components via judicious use of caching techniques. Secondly, n-Tier/3-Layer can be complex, especially when deployed in a distributed fashion. SOA demands an extra focus on management and control. Thirdly, you’ll need to tighten your deployment guidelines or you might end up with a zoo of redundant services, just like when you see a traffic cop directing traffic even though the traffic lights are working just fine. Lastly, we began with patterns and end with a reaffirmation of the need for them.

A meshed system like the one shown has an exponential number of combinations, and it would not make sense to try and architect specific SOA arrangements over and over. Instead, the industry has now defined a series of SOA patterns that system architects can apply. Managing and taming the complexity of an SOA solution demands a disciplined use of patterns.

The story of how to make SOA work in the face of these challenges will be my next topic.