Friday, April 2, 2010

Dashboards and Centralized Logging

“Doctor,” the patient says, “When I press on my calf with my finger, it hurts…”
“I see,” the doctor replies.
“And then when I press my finger on my thighs, it hurts too!”
“Really?” The doctor mutters skeptically.
“I tell you doctor, I must have something serious because it also hurts whenever I poke my arms, my chest and my neck! It hurts all over!”
“Have you considered,” the doctor asks, “that it might be your finger that’s broken?”
Establishing a control layer gives you access to the sync points you need to control the parameters affecting the dynamics of the system. The SOA Fabric (which includes the human element!) will have to react to the conditions you’ve set in these message headers. In the end, while the management infrastructure is there to provide you with real-time snapshots of the system, it will be up to you and your staff to properly diagnose the problems that do arise.
Having an SOA Fabric, a Control Layer, and a Centralized Logger gives you the opportunity to view and manage the SOA system in the same grandiose manner as Captain Kirk on the bridge of the Starship Enterprise. The point is that the early transformation plans should include the design and development of a centralized dashboard capability for managing the SOA system. Attempting to run an SOA system without a dashboard that offers you a 360-degree, on-demand view of all the elements of your system is not unlike Slade, the blind man in “Scent of a Woman,” driving a Ferrari at high speed through the streets of the city. (Now, as the doctor in my story shows, you’ll need to make sure your dashboard system is not broken and that it properly detects and diagnoses what’s wrong with the system and not what’s wrong with itself!)
The Dashboards would not be possible without a Central Logging Server and the placement of event triggers that can drive the dashboard displays. This server must be part of your initial system design and should be treated the same as you would any other database used for business analytics. To the extent that you capture logged data over time and obtain detailed analysis of your system dynamics (resource utilization versus performance and failures), you will also develop the capability to proactively plan the future evolution of your system and to better understand the thresholds that might trigger failures under stress conditions.
Needless to say, logging and monitoring should interfere with the actual operation of the system as little as possible. Logging should always be conducted out-of-band. This means that all logging events should be sent to the Central Logging Server asynchronously. Do you want to log an entire message? Copy it and log the copy, but do not make the production messages flow through the logging logic. In other words, you should duplicate the message containing the logging information and send it to the log server without increasing the latency of the main message flow.
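To make the "copy it and log the copy" idea concrete, here is a minimal sketch in Python using the standard library's QueueHandler and QueueListener. It is only an illustration under my own assumptions: the logger name "soa.audit", the handle_message function, and the in-memory ListHandler standing in for the Central Logging Server are all hypothetical, and a real deployment would ship the records over the network instead.

```python
import copy
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Hypothetical stand-in for the Central Logging Server: keeps records in memory."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record.getMessage())


def build_out_of_band_logger(sink):
    """Return a logger that only enqueues records; a listener thread delivers them."""
    log_queue = queue.Queue()
    logger = logging.getLogger("soa.audit")  # assumed name, not from any real system
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    # The production thread pays only for putting the record on the queue.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # The listener drains the queue on its own thread and feeds the sink.
    listener = logging.handlers.QueueListener(log_queue, sink)
    listener.start()
    return logger, listener


def handle_message(message, logger):
    """Process a message; a copy goes to the log, the original stays on its path."""
    logger.info("message copy: %r", copy.deepcopy(message))
    return {"status": "processed", "id": message["id"]}


if __name__ == "__main__":
    sink = ListHandler()
    logger, listener = build_out_of_band_logger(sink)
    print(handle_message({"id": 7, "body": "hello"}, logger))
    listener.stop()  # flushes whatever is still queued before returning
    print(sink.records)
```

The design choice worth noticing is that the slow part (delivering the record to the log server) happens on the listener's thread, so a sluggish or unreachable logger does not hold up the main message flow.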
You should be able to run reports against the various logs in the database to proactively identify deficiencies and trends requiring attention. Ideally, you will extract and back up the appropriate summaries and all the logs that you are mandated to preserve for business or legal compliance reasons.
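As a rough illustration of the kind of report I have in mind, the sketch below computes an error rate per service from a log table. Everything about the schema is an assumption on my part: the log_events table, its ts, service, and level columns, and the sample rows are made up for the example, and an in-memory SQLite database stands in for whatever database backs your Central Logging Server.

```python
import sqlite3


def error_rates(conn):
    """Return {service: fraction of its logged events with level 'ERROR'}."""
    rows = conn.execute(
        """
        SELECT service,
               COUNT(*) AS total,
               SUM(CASE WHEN level = 'ERROR' THEN 1 ELSE 0 END) AS errors
        FROM log_events
        GROUP BY service
        ORDER BY service
        """
    ).fetchall()
    return {service: errors / total for service, total, errors in rows}


def demo_connection():
    """Build an in-memory database with a hypothetical log_events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log_events (ts TEXT, service TEXT, level TEXT)")
    conn.executemany(
        "INSERT INTO log_events VALUES (?, ?, ?)",
        [
            ("2010-04-02T10:00:00", "billing", "INFO"),
            ("2010-04-02T10:00:05", "billing", "ERROR"),
            ("2010-04-02T10:00:07", "orders", "INFO"),
            ("2010-04-02T10:00:09", "orders", "INFO"),
        ],
    )
    return conn


if __name__ == "__main__":
    for service, rate in error_rates(demo_connection()).items():
        print(f"{service}: {rate:.0%} of logged events are errors")
```

Run on a schedule and kept over time, a summary like this is the sort of thing you would both feed to the dashboard and archive alongside the raw logs.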
I, for one, don’t believe you need to make this Central Logging Server a fault-tolerant element in the system, but you should certainly give it enough attention to ensure it is highly reliable.
Clearly, the entire area of system management for SOA is very complex and I have just scratched its surface. The key message to keep in mind is that managing SOA is not the same as managing traditional legacy environments. You will need new tools, new methodologies, new processes and prayers to new gods to make it all work!