Friday, November 19, 2010

Information Distillers, Aggregators & Your Electronic “Mini-mes”

As the years ahead move us toward the enabling of understanding and wisdom, we should expect an increase in commercially available services leveraging these new automation models. For example, consulting is already an embedded part of the services provided by professionals, but in the future it will evolve into a set of online services delivered via moderated access to human experts or via access to software-based expert systems. Whether they are made of flesh or metal, these bona fide Information Distillers will always be ready to quench your thirst for information at the push of a button and the opening of your PayPal wallet.

Emerging Information Distillers will successfully locate and turn the required information into “understandable bits” which can be digested by customers under several revenue models. While in principle these services are not fundamentally different from those provided by traditional consulting entities such as the Gartner Group or your corner H&R Block, the difference is that they will be democratized and available to all—individuals and companies alike. For example, in travel, distillers will not only publish travel magazines (electronically or via hard-copy), but will also package tours and offer special negotiated travel deals. ( can be seen as a first generation distiller leveraging the power of social networking.) However, information distillers in the future will be able to provide personalized advice either from paid human experts or from next generation expert mining tools.

As electronic commerce becomes more pervasive, and the speed and specialization of business increases, proxies or electronic avatars will become more prevalent. Functionally, such an avatar will not be much different from today’s travel agency role when booking travel for a client. However, whereas today’s agencies do not truly represent the interest of the traveler (agencies, in principle, represent the interest of the supplier), future avatars will act as your proxies—your electronic “mini-mes”—working automatically under business and engagement rules that you’ll define in order to be presented with the best deal.

As artificial intelligence becomes mainstream, and as technical standards facilitating electronic brokering are implemented, these avatars will become virtual software entities capable of representing you, the consumer. Eventually, avatars will completely broker and execute the best possible arrangements for you.
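To make the idea of an avatar working under rules that you define more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Offer` fields, the supplier names, and the three rules are illustrative assumptions, not part of any real brokering standard.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Offer:
    """A hypothetical travel offer as an avatar might receive it."""
    supplier: str
    price: float
    stops: int
    refundable: bool


# A rule is any yes/no test over an offer; the avatar's owner defines them.
Rule = Callable[[Offer], bool]


def choose_best_offer(offers: Iterable[Offer], rules: List[Rule]) -> Optional[Offer]:
    """Return the cheapest offer that satisfies every rule, or None if none does."""
    acceptable = [o for o in offers if all(rule(o) for rule in rules)]
    return min(acceptable, key=lambda o: o.price, default=None)


# Assumed engagement rules: a budget cap, at most one stop, refundable only.
my_rules: List[Rule] = [
    lambda o: o.price <= 500.0,
    lambda o: o.stops <= 1,
    lambda o: o.refundable,
]

offers = [
    Offer("AirA", 420.0, 2, True),
    Offer("AirB", 480.0, 1, True),
    Offer("AirC", 390.0, 0, False),
    Offer("AirD", 495.0, 0, True),
]

best = choose_best_offer(offers, my_rules)
print(best)
```

The point of the design is that the rules belong to the consumer, not the supplier: the avatar filters first on your terms and only then optimizes on price.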

This type of avatar is already a reality in the hectic world of electronic trading, where complex software algorithms make nanosecond-level decisions on whether to buy or sell stock. From this world, we should be forewarned that, as proxies become more commonplace, we must be prepared to face the consequences of relying too heavily on software avatars endowed with automated decision-making authority. On September 7, 2008, in an already volatile and jittery financial setting, a Florida newspaper accidentally republished an old web article detailing United Airlines’ 2002 bankruptcy. Google obligingly indexed the article and distributed it to e-mail subscribers who had requested alerts on any news regarding this airline. This is where automated software proxies took over. Stock trading software scanned the article, found the keywords “bankruptcy” and “United Airlines”, and automatically ordered the sale of UAL stock. Other software robots, responsible for monitoring unusually large trade volumes in the stock market, quickly took notice of the sudden sale of UAL stock and proceeded to sell their own holdings. The outcome was a selling frenzy that resulted in a loss of more than one billion dollars to UAL stockholders. The Securities and Exchange Commission began an investigation to determine responsibility. After all, who is at fault? The Florida newspaper? Google? The developers of the software? The companies that transact stock in such a perilous manner?
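The failure mode in this story can be sketched in a few lines of Python. This is not the logic any real trading system used; the `Article` type, the trigger words, and the two-day freshness threshold are all assumptions made for illustration. The sketch contrasts a naive keyword trigger with a guarded one that refuses to act on stale or undated news.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

TRIGGER_WORDS = ("bankruptcy",)      # assumed keyword list
MAX_STORY_AGE = timedelta(days=2)    # assumed freshness threshold


@dataclass(frozen=True)
class Article:
    headline: str
    body: str
    published: Optional[date]  # None when the page carries no reliable date


def naive_signal(article: Article, company: str) -> str:
    """Sell whenever the company name and a trigger word appear together."""
    text = f"{article.headline} {article.body}".lower()
    if company.lower() in text and any(word in text for word in TRIGGER_WORDS):
        return "SELL"
    return "HOLD"


def guarded_signal(article: Article, company: str, today: date) -> str:
    """Same trigger, but stale or undated stories are escalated to a human."""
    if naive_signal(article, company) != "SELL":
        return "HOLD"
    if article.published is None or today - article.published > MAX_STORY_AGE:
        return "REVIEW"
    return "SELL"


# Illustrative input: a 2002 story resurfacing in September 2008.
old_story = Article("United Airlines files for bankruptcy", "...", date(2002, 12, 9))
today = date(2008, 9, 7)

print(naive_signal(old_story, "United Airlines"))
print(guarded_signal(old_story, "United Airlines", today))
```

A date check is only one of many possible safeguards; the broader point is that an automated proxy needs some path that hands doubtful decisions back to a person.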

Clearly, we are entering a brave new world that requires added protocols to safeguard against software agents going rogue and to answer the myriad concerns related to protection of privacy; not to mention the expected security issues related to fraud and software impersonators, with the logical progression to identity theft. In the meantime, if you are on the supplier’s side, you can start designing your systems to enable this future “Electronic Mini-Me” concept. Define and be prepared to have the appropriate services and architecture layers that can support the deployment of automated selling brokers.

As you define this architecture, you will have to rely heavily on the implementation of publish/subscribe systems and asynchronous response patterns. You will also need to focus on implementing a sophisticated combination of Business Rules and Business Process Management based systems that allow your business team to easily configure and define how automated broker services will be made available to your customers. For example, these brokers could be configured to make distressed inventory available electronically and to price online offers dynamically against available inventory, using revenue management rules as applicable. Think of how an electronic auction process works, but on steroids.
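As a rough sketch of how these pieces might fit together, the Python example below combines a tiny in-memory publish/subscribe hub, an asynchronous selling broker, and a configurable pricing rule for distressed inventory. All names (`OfferBus`, `dynamic_price`, the topic string, the discount tiers) are hypothetical, and a production system would use a real message broker and a rules engine rather than this toy.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InventoryItem:
    sku: str
    list_price: float
    days_to_expiry: int


def dynamic_price(item: InventoryItem) -> float:
    """Assumed revenue-management rule: deeper discount as expiry approaches."""
    if item.days_to_expiry <= 1:
        discount = 0.50
    elif item.days_to_expiry <= 7:
        discount = 0.25
    else:
        discount = 0.0
    return round(item.list_price * (1 - discount), 2)


class OfferBus:
    """Tiny in-memory publish/subscribe hub keyed by topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[asyncio.Queue]] = {}

    def subscribe(self, topic: str) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers.setdefault(topic, []).append(queue)
        return queue

    async def publish(self, topic: str, message: dict) -> None:
        for queue in self._subscribers.get(topic, []):
            await queue.put(message)


async def selling_broker(bus: OfferBus, items: List[InventoryItem]) -> None:
    """Supplier side: price each item by rule and publish it as an offer."""
    for item in items:
        offer = {"sku": item.sku, "price": dynamic_price(item)}
        await bus.publish("distressed-inventory", offer)
    await bus.publish("distressed-inventory", {"done": True})


async def buying_avatar(inbox: asyncio.Queue, budget: float) -> List[dict]:
    """Consumer side: accept asynchronously arriving offers within budget."""
    accepted = []
    while True:
        offer = await inbox.get()
        if offer.get("done"):
            return accepted
        if offer["price"] <= budget:
            accepted.append(offer)


async def main() -> List[dict]:
    bus = OfferBus()
    inbox = bus.subscribe("distressed-inventory")
    items = [
        InventoryItem("ROOM-101", 200.0, 1),
        InventoryItem("ROOM-202", 200.0, 5),
        InventoryItem("ROOM-303", 200.0, 30),
    ]
    buyer = asyncio.create_task(buying_avatar(inbox, budget=160.0))
    await selling_broker(bus, items)
    return await buyer


accepted_offers = asyncio.run(main())
print(accepted_offers)
```

Keeping the pricing rule in one small function is the part the business team would own; the bus and the broker do not need to change when the discount tiers do.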

Just as the electronic avatars discussed here are a practical instantiation of the move towards cyber-understanding, future systems applying basic rules-of-wisdom will emerge. True, Wisdom will always be a purview of humans and not computers. However, following the precepts of “Wisdom of the Masses”, we are now experiencing the benefits of the wisdom provided by virtual communities; areas where we find reviews in a broad range of areas, “How-To” tips, and better deals. A case can be made that this wisdom is an emergent property, resulting from the aggregation of large catalogues and information, and the associated tie-in of user areas and access to content. These areas are best represented by “Virtual Malls” such as, and, but are also expected to rapidly merge with social networking sites in the so-called Web 2.0 world.

There is already a linkage between merchandiser sites and places such as,, and This integration will ultimately occur via business partnerships or mergers, but it will be initially accelerated by automation known as Collective Intelligence, the process that combines the behavior, preferences or ideas of a group of people or sites to gain new insights[1].
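One of the simplest forms of Collective Intelligence is the "customers who bought this also bought" calculation. The Python sketch below is an illustration only: the shoppers and baskets are made-up data, and real systems use far richer similarity measures than a raw co-occurrence count.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical purchase histories, one basket per shopper.
purchases: Dict[str, Set[str]] = {
    "ana": {"tent", "stove", "boots"},
    "ben": {"tent", "boots"},
    "cal": {"tent", "stove"},
    "dee": {"boots", "map"},
}


def also_bought(item: str, baskets: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Count how often other items share a basket with `item`."""
    counts: Counter = Counter()
    for basket in baskets.values():
        if item in basket:
            counts.update(basket - {item})
    # Most frequent first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


print(also_bought("tent", purchases))
```

No single shopper supplied the insight; it emerges only from combining the behavior of the group.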

Analogously, it is to be expected that, as this vertical industry matures, we will continue to see the emergence of portals specialized according to industry. That is, we will see “electronic virtual malls” integrating offerings on the one hand, and acting as “aggregators” dealing with the specific industry groupings. The aggregators will be able to convert the volumes of data found on the Internet into useful information. This information will be presented in a form which will be customized for information seekers as a consolidated package of knowledge. The automated assembly of related knowledge designed to fulfill the "seeker's" goals can be related to the area of specialization of the site. This trend will be evident first in consumer-facing verticals such as travel sites,, and various other special-domain sites such as and The question you’ll need to answer is how to make your company part of this new world?

[1] Programming Collective Intelligence—Toby Segaran

Friday, November 5, 2010

Data, Taxonomies, and the Road to Wisdom Revisited

While early computing was referred to as “Data Processing”, the term “Information Systems” became prevalent with the increased sophistication of functionality. This makes sense. After all, there has always been a platonic goal to have computers process information just as humans do, except much, much faster. As originally framed, this goal was known as AI (Artificial Intelligence) and, despite some early successes with heuristic algorithms and neural networks, AI research eventually reached major roadblocks. Ultimately, AI’s most touted commercial achievement was the codification of narrow domains of expertise under the guise of “Expert Systems”. Expert Systems went through a hyped-up phase back in the Eighties, only to fade away with the realization that the logic needed to replicate how humans process and organize knowledge depends on contextual, subjective, and often inexpressible decision-making rules. In other words, we humans process knowledge in a manner that is often inaccurate, biased or even intuitive. Still, the subjectivity of our knowledge has served us well along our evolutionary path and is more than enough to help us deal with quotidian existential needs, even if this knowledge is not always precise. (Who cares if a tiger wasn’t actually hidden in the brush? Your ancestor taking off upon the rustling of leaves was only being sensible!)
Recently, implementing “fuzzy” and flexible computer logic has yielded more effective AI applications, particularly in systems applying Bayes’ theorem, which relies on prior and conditional probabilities. Modern Machine Learning algorithms applying this and other algorithmic variations usually return reliable results for problems dealing with pattern and language recognition. However, given the probabilistic nature of the base algorithms, results are sometimes wrong. To err is not only human. Today’s computers can also err.
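A small worked example shows why a Bayesian result is a degree of belief rather than a certainty. The Python sketch below applies Bayes' theorem to a binary hypothesis; the spam-filter framing and the three input probabilities are made-up numbers chosen purely for illustration.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """P(H | E) for a binary hypothesis H given evidence E.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Assumed numbers: 20% of mail is spam, a given word appears in 90% of spam
# and in 5% of legitimate mail.
p_spam = posterior(prior=0.2, likelihood=0.9, false_positive_rate=0.05)
print(round(p_spam, 3))  # 0.818
```

Even with strong evidence, roughly one message in five flagged this way would be legitimate, which is exactly the kind of error the paragraph above describes.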
Given that Bayes’ rule and other algorithms provide results that are not always correct, we may well conclude that there is a universal law stating that intelligence implies fallibility. If we are ever going to rely on these systems for life-and-death scenarios, we will need to incorporate some form of control feedback in the way they reach their results. Perhaps in humans, “Wisdom” is that control.
But how then do we attain Wisdom?
By now, you may have noticed that I am using the term “Information” in its most generic sense. Information can often be “misinformation”. Yet, misinformation and even lack of information are also forms of information (a dog that failed to bark was the clue that helped Sherlock Holmes solve a crime). When implementing information systems it does help to classify the type of information we are dealing with. There is raw information, and there is wisdom-based information. This progression to wisdom involves a series of steps that must be methodically climbed: Data, Content, Knowledge, Understanding and ultimately Wisdom.
  1. Data is primarily raw figures and “facts”; by nature it is voluminous and difficult to deal with and so is best stored and communicated in a mechanical way. Data can be wrong. The old GIGO adage (Garbage In/Garbage Out) captures what ought to be the highest priority in the automation of data: ensuring that the data inputted into the system is correct. Do not become confused by the term “Big Data”, by the way. Big Data actually refers to the Knowledge step on the ladder as it deals with the acquisition of knowledge via so-called Data Science analytics.
  2. Content is data that has been collated, ordered and classified. That is, Content is Data plus its Taxonomy. Taxonomy is the categorization or classification of entities within a domain (the actual structure of the domain is defined by its Ontology). Consider the following taxonomies used to describe the animal kingdom.
    Linnaean Taxonomy:
    • Kingdom: Animals, Plants, Single Cells, etc.
    • Phylum: For Animals: Chordata, Nematoda (worms), etc.
    • Class: mammals, amphibians, aves. . .
    • . . . et cetera
    In “The Analytical Language of John Wilkins,” Jorge Luis Borges, the famed Argentinean writer who belongs to the ontological set of writers who deserved to win the Nobel Prize but didn’t, describes “a certain Chinese Encyclopedia,” the Celestial Emporium of Benevolent Knowledge, which offers this unique taxonomy for animals, classifying them as follows:
    • those that belong to the Emperor
    • embalmed ones
    • those that are trained
    • suckling pigs
    • mermaids
    • fabulous ones
    • stray dogs
    • those included in the present classification
    • those that tremble as if they were mad
    • innumerable ones
    • those drawn with a very fine camelhair brush
    • those that have just broken a flower vase
    • those that from a long way off look like flies
    • others
  3. Knowledge is what is produced when the information is placed in context and the resultant significance of relationships within the data is realized. The addition of contextual information requires some element of human input; so the progression to this stage will most likely not be possible through the use of computers alone.
    To see the difference between Content and Knowledge, I suggest you try this exercise: Go to and enter “IBM Apple”. You will get content listing all the sites in which IBM and Apple are discussed. Now, go to and enter, “IBM Apple”. You will get a digested and structured response comparing these two companies. The former is content; the latter is beginning to look a lot like knowledge.
    Production and discovery of knowledge is at the core of many start-ups’ business plans today. The emerging field of Data Science is leveraging big data sets to mine data in ways that produce knowledge.
    Organizations, such as Gallup or Nate Silver’s FiveThirtyEight, exist to mine data and content and produce knowledge on a variety of topics. Voting trends, consumer preferences, etc. are examples of mined knowledge. Business Intelligence, associated Data Mining technologies, and the more recent Internet-driven “Collective Intelligence” applications are examples of the more recent trends in the automation of knowledge acquisition. We are in the midst of moving from the Age of Content to the Age of Knowledge.
  4. Understanding is interpreting the significance of relationships between two or more sets of knowledge and deriving prime causes and effects from these relationships. While Gallup may unearth the knowledge that 33% of voters are likely to vote for a particular candidate, understanding why they lean that way is something that information systems can only hint at. Understanding remains an endeavor only humans are adept at. No matter what you may hear from the “hypesters” (not to be confused with the hipsters!), understanding cannot yet be performed by computers. As much as it might appear to be the case, the Siri and Google Talk systems lack an understanding of your commands.
    “Understanding” is how consultants and advisors make a living. Companies such as Gartner or writers of popular science and “How To” books are in the business of providing distilled understanding. Of course, if you happen to watch regular Sunday morning political discussion programs showcasing pundits and politicians in topical debates, you know that the “understanding” you get from these guests often can be biased and even wrong. Enter wisdom . . .
  5. Wisdom is the ability to choose between correct and faulty understanding. This is the famous feedback loop I referred to at the beginning of this article. The fact is, understanding can be the result of wrongly extracted knowledge, which may come from bad source data (outright misinformation), or content improperly formed with inappropriate taxonomies. For example, the taxonomy that classifies human beings according to race or some other categorization of “otherness” often leads to xenophobia, homophobia or racism.
    Wisdom represents the highest level of value in the information progression. Wisdom is not always objective or static. It can be subjective, and it is certainly dependent on the cultural environment or transitory circumstances. This is why it is unlikely that we will ever be able to codify “hard-coded” wisdom within computers, and why the belief that these future computers may act as judges in the affairs of men is dubious at best.
    Wisdom can be applied toward either material or spiritual benefits. Yes, Wisdom can be applied for profit and business advantage. However, the fact that something is applied with wisdom does not determine whether it is right or wrong. Beyond wisdom we enter into the realm of morality and philosophy. Even this last point is open to debate. Some have an “understanding” that moral relativism is wrong, but some of us don’t think so.
    But I digress. . .
    Whether future software will be capable of Understanding (much less Wisdom) is open to debate. There is much we still do not know about how we humans think and about the nature of our cognitive processes. Humans mastered flying only after they stopped trying to replicate the way birds take to the air. Aviation accomplishes flight even better than birds do by leveraging the same underlying laws of nature that birds rely on, only differently. This is why I believe that multi-million dollar projects such as the European “Human Brain” project[1] and the American-sponsored Brain Activity Map Project (BRAIN), which try to map the neurons in the human brain not unlike the way the human genome was successfully sequenced, have the markings of fools’ errands. Recycling an old saying: “It’s the software, stupid”. If the much predicted Singularity is to happen, it will probably require computer systems that “think” very differently from the way we humans do. And that “thinking” will be software based. Even then, I cannot conceive of truly automated wisdom (aka “Strong AI”) without first solving the question of what “consciousness” is. We are a couple of Einsteins away from figuring that one out.
    But I digress again . . . Whether strong AI is feasible or whether the Singularity will occur are problems best left for the next generation. As you stand securely atop the Content stage, remember that nothing is stopping you from moving up the next step on the road to wisdom: the Knowledge stage. Time to dive more into that Data Science stuff!
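The first rung of the ladder described above, "Content is Data plus its Taxonomy", can be sketched in a few lines of Python. The animal names and both taxonomies are toy assumptions; the second one simply borrows two of Borges's categories to show that the same data yields different content under a different classification.

```python
from collections import defaultdict
from typing import Dict, List

# Data: raw, unclassified facts.
raw_data = ["lion", "frog", "eagle", "wolf", "toad"]

# Two assumed taxonomies over the same data.
by_class = {
    "lion": "mammal", "wolf": "mammal",
    "frog": "amphibian", "toad": "amphibian",
    "eagle": "bird",
}
by_borges = {"lion": "fabulous ones", "wolf": "stray dogs"}


def to_content(data: List[str], taxonomy: Dict[str, str]) -> Dict[str, List[str]]:
    """Content = Data + Taxonomy: group each record under its category."""
    content: Dict[str, List[str]] = defaultdict(list)
    for record in data:
        # Records the taxonomy does not cover land in "others".
        content[taxonomy.get(record, "others")].append(record)
    return dict(content)


print(to_content(raw_data, by_class))
print(to_content(raw_data, by_borges))
```

The data never changes between the two calls; only the taxonomy does, which is why a poorly chosen taxonomy can distort everything built on top of it.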

[1] See this link for a status on this project: