Monday, October 6, 2014

The Advent of Social Sentience

Gutenberg had no idea that his printing press would result in the publishing of more than 120 million book titles. Watt’s steam engine led to an industrial revolution that has given us practically everything we define as civilization today, climate-changing carbon footprint included. Add to this the advent of computers, and the more recent Internet explosion, and it’s not hard to conclude that every major human advance of the last centuries has been driven by a confluence of diverse factors combining in unexpected ways to create brand new technological and cultural spaces. Version 1.0 of the Digital Age is so twentieth century when compared with how key developments for Platform, Access, and Information are now converging. Enter the emerging field of Social Sentience.

Platform is the result of the evolution of computer systems toward the concept of infrastructure-as-a-commodity. The system engineering techniques of virtualization, clustering, and abstraction of layered services now allow the sharing of gigantic Platform-as-a-Service (PaaS) computer system farms known as “the cloud”.
When it comes to access, the increased pervasiveness of mobility, network access, and continued miniaturization and cost reductions as predicted by Moore’s Law are coalescing into the more recently hyped “Internet of Things”. We should not be surprised to see more and more personal biometric devices and appliances seamlessly connecting people with computers. We are entering an era of pervasive access to information; an Infosphere, where ultimately we will remain connected at all times, and the devices used to connect will no longer need to be personal, but will instead be embedded in our environment.
The recent emergence of social media, along with the explosive online participation of over a billion people across the globe, has resulted in the creation of huge amounts of data. It is estimated that 2.5 Exabytes of data are generated every day (you would need to buy one billion 1 GB thumb drives at Costco to get an Exabyte). The potential value of mining this data has provided the economic justification to trigger advances in information and knowledge technologies:

  • Data Science deals primarily with leveraging data for more specific predictive analytics purposes that can be applied to recommendation engines, logistic regression, spam filters, fraud detection, and forecasting. In many ways, Data Science can be viewed as a more comprehensive form of traditional data analytics based on SQL-based OLAP (Online Analytical Processing) data warehouses.
  • On the other hand, Cognitive Computing deals with algorithms that learn and interact naturally with people. Also, while Data Science deals with explicit data, Cognitive Computing creates implicit information by applying heuristics for inference reasoning that exploit hidden patterns and correlations found in big data sets. The results might appear magical. Perhaps the most well-known example is the IBM Watson technology that, a few years back, won a Jeopardy competition against top human players.
  • Collective Intelligence, also known as “Wisdom of the Crowds” (WOC), deals with the mechanisms that gather the combined input of millions of people in social media communities. Analysis of social networks is emerging as an ancillary element to help identify social influencers and relationships. WOC can be leveraged by mining the millions of reviews and comments entered daily on sites such as Yelp and TripAdvisor, among others, and also by evaluating and capturing explicit advice and comments posted on blogs and social communication sites such as Facebook and Twitter.

What is Social Sentience?  Well, the dictionary defines sentience as “responsive to or conscious of sense impressions” or as “finely sensitive in perception or feeling”. The convergence of all the previously mentioned advances will lead to a surrounding social environment that can become sentient in the truest form of these definitions. This Infosphere will automatically leverage all the above technologies to give you access to information and resources that represent the best summarization of the combined societal knowledge, but with a highly personalized interaction. Social Sentience will act as your very own personal advisor. 
So, what would a Social Sentience engagement look like?  Today, when you get a recommendation from Netflix or Amazon, chances are that their algorithm bases its recommendation on your history or the history of people who have used similar products in the past. This is but a glimmer of things to come.
With Social Sentience, the “engine” will continuously monitor the environment around you—things like your location, the time of day, the current and forecasted weather, and your exercise level.  The engine will continuously match this environment to your known and inferred preferences and, depending on your choices, it will proactively provide you with advice and recommendations.
Imagine this scenario: you are walking downtown and come near a seafood restaurant that, as the Social Sentience engine discovers, was visited by one of your Facebook friends (you respect this friend since you have “liked” most of her comments), who has posted a positive Yelp comment regarding the food. Since it is lunchtime (in the near future, bio-sensors could track whether you are due for a meal), the engine will trigger a message pointing out the nearby restaurant.
Before sending that message, the engine will have checked your food preferences (after all, if you don’t like fish, your friend’s recommendation will not matter) and also validated your appointments and credit card balance to ensure you have the time and the pockets to eat there. The resulting experience, the quality of the recommendations and advice, as well as the timeliness and relevance, will be a quantum leap over what you are experiencing today.
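The restaurant scenario can be read as a chain of simple checks. The Python sketch below only illustrates that chain and does not describe any real engine: the Venue, Profile, and Context types, their field names, and the fixed lunchtime window are all hypothetical, and a real system would infer most of these values rather than receive them as plain fields.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class Venue:
    name: str
    cuisine: str
    typical_cost: float          # expected cost of a meal, in dollars
    visit_minutes: int           # rough time needed to eat there
    endorsed_by: set = field(default_factory=set)  # friends with positive reviews


@dataclass
class Profile:
    liked_cuisines: set          # known or inferred food preferences
    trusted_friends: set         # e.g. friends whose comments you usually "like"
    available_balance: float     # what the user can afford right now
    free_minutes: int            # time until the next appointment


@dataclass
class Context:
    now: time                    # current local time
    nearby: list                 # venues close to the user's location


# Hypothetical lunchtime window; bio-sensors could replace this rule later.
LUNCH_START, LUNCH_END = time(11, 30), time(14, 0)


def recommend(profile: Profile, ctx: Context) -> list:
    """Return messages for nearby venues that pass every check."""
    if not (LUNCH_START <= ctx.now <= LUNCH_END):
        return []                # not mealtime: stay quiet
    messages = []
    for venue in ctx.nearby:
        if venue.cuisine not in profile.liked_cuisines:
            continue             # a friend's tip is moot if you dislike the food
        friends = venue.endorsed_by & profile.trusted_friends
        if not friends:
            continue             # no endorsement from someone you respect
        if venue.visit_minutes > profile.free_minutes:
            continue             # the calendar says there is no time
        if venue.typical_cost > profile.available_balance:
            continue             # the pockets say no
        who = ", ".join(sorted(friends))
        messages.append(f"{venue.name} is nearby and was recommended by {who}.")
    return messages
```

The order of the checks mirrors the story: preferences first, then the social signal, then time and money. Each check is an independent filter, so adding weather or exercise level would mean adding one more condition rather than redesigning the loop.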
In other words, Social Sentience will give you an array of offers, opportunities, and choices, which will be highly curated around your specific tastes and preferences (many of these inferred via machine learning algorithms), as well as taking into consideration the tastes and preferences of your social circle. This view will be contextual (are you on business or on vacation?) as well as appropriate to the time and circumstances.  
As you can see, having a Social Sentience system that delivers these capabilities involves an integrated use of the three fundamental tracks I’ve mentioned.  I am not talking about more esoteric “artificial intelligence” topics, although a Social Sentience engine like the one described here may appear to be “intelligent”.

What lies on the horizon is not outside the realm of current technology. The actual challenges come from valid socio-political and cultural concerns. Things could get Orwellian very quickly. How can we ensure that this type of service does not transgress upon well-established privacy expectations? And then there is certainly the question of safeguarding what’s being tracked, as well as the wealth of information mined from your behaviors and consumer patterns. Clearly, protections against security breaches by hackers and others will also be a major issue. It must be said that most of these issues are already surfacing in light of how Google, Facebook, and other social media sites are gathering tons of data about you even as we speak. My own view is that the ultimate solution will have to come with actual legislation that takes into account the current world we live in by delivering us a much-needed Digital Bill of Rights. Perhaps as we finally elect more and more technologically savvy representatives, this will come to pass.

Tuesday, August 12, 2014

How to boldly go about creating a mobile strategy

For all the advanced sci-fi paraphernalia in the classic Star Trek series, it would be fair to say that the communicator device Captain Kirk used when on an alien planet was probably surpassed by the Motorola Razr several years back. With millions of apps, an untamed cornucopia of emerging mobile technologies, and a future that everyone agrees will be all about mobility and more mobility, the need to create a mobile strategy seems almost as daunting as the Star Trek deep space explorations.
Creating a mobile strategy is such a bold effort simply because there are so many options and alternatives out there. What functionality are you to provide? What are the target devices to support? (Tablet, Phablet, Smartphone, Kindle, Laptop, Google Glass?) What are the operating systems to use? (Android, iOS, Windows 8) How should you approach the implementation? These are but a few sample questions that appear unanswerable at first.
Thing is, a “mobile strategy” is not very meaningful as a standalone exercise. A mobile strategy should and must be something that naturally emerges from your enterprise’s specific business objectives and priorities. Attempting to articulate a mobile strategy without first understanding the business drivers will make your journey more like a “Harold & Kumar Go to White Castle” adventure. Mr. Spock would not approve.
For example, if the business objective is something as straightforward as having smartphones and tablets show the same content as that on the web site and nothing more, your approach would be to normalize the presentation technology to flexibly cover as many devices as possible with a single implementation. HTML 5 would be the way to go in this case.
However, forming a more detailed strategy does quickly become a more nuanced challenge once you try to address specific mobile functionality requirements. Do you need to introduce revenue generating functions across the mobile spectrum? Is the main purpose to provide service and support via mobile devices? Is this primarily an informational and branding exercise? Is the mobile strategy core to future operations?[1]
Only after you have gotten a clearer sense of the answers to these questions will the traditional evaluation of scope of functionality vs time to market vs budget apply (see my prior blog Defining the Project Scope and Requirements for more on this). Compounding the analysis is the fact that the variety of choices regarding target channels, client devices, and technologies is hardly homogeneous. Mobility today is a vendor-driven field, and it will continue to be so until the day it becomes satisfactorily covered by industry standards.
Once the business provides direction of the target market for the mobile applications, you can use this information to decide which devices you should target first. There is a wealth of statistical information on the web to help you[2]. For example, in finding the right choice for the largest demographic, it helps to know that Android devices represent 61% of the market versus 36% for iOS.
Obviously, you will want to get more granular information about device usage based on these demographics. If the target market is West Coast millennials, chances are they are heavier Apple users. If the target market is a mobile sales force with tablets or the corporate staff equipped by internal policy with Windows 8 Nokia Smartphones, knowing what to prioritize becomes self-evident.
In other words, the first steps of a Mobile Strategy are to define the WHAT, the TO WHOM, and even the WHEN things should be accomplished in order to best support the business objectives. Perhaps the decision that’s most open to the IT area is selecting HOW this strategy will be implemented. But even then, the technology decisions need to be consistent with the business purposes of the strategy. For example, if the functionality being provided is deemed to be mission-critical and generates revenue (say, you are asked to permit purchases of your products from a mobile device), it is sensible to develop a specialized App for the target devices. Furthermore, if the functionality is deemed to be something that provides a competitive advantage or happens to be THE product (e.g. the item is a Game App that you are marketing), then it also makes sense to create the internal core competencies to develop it and maintain it yourself and not try to outsource its development.
If the target market is focused on a specific platform (say Apple’s iOS), and it is expected this will be the case for the foreseeable future, it would make sense to develop the application directly with the Apple development framework. But if performance or long-term maintainability is not as important as breadth of support and time to market, then it is best to explore the use of cross-platform development tools. However, heed this warning about these tools: they should be viewed mostly as tactical solutions only. They will enable you to cover more devices, more quickly, but you will be restricted to lowest common denominator functions (unless you choose to customize implementations for a particular platform, thus defeating the purpose of having a single solution across the space)[3].
To summarize, a Mobile Strategy should follow something similar to the Table of Contents shown below. Still, you should always keep an eye on the future. You may have an additional section or an appendix with recommendations as to how best to understand and position the strategy vis-à-vis emerging mobility trends. The sooner you know what you will do with wearable devices such as Smart-watches or specialized devices such as Fitbit or biometric monitors, the better. It’s never too soon to prototype how mobile functionality might play with the so-called “Internet of Things”. Such research efforts should also be part of the strategy.
You can’t expect to get your initial strategy 100% right. As with any strategy, you should assume your mobile strategy is a living thing. Keep it updated and fresh in light of new developments. In fact, there is a good chance that the business objectives regarding mobility will quickly change as well. Remember, just as with the USS Enterprise, your mission is one that includes a great deal of exploration!
  1. The business objectives driving the strategy. This is a recap of the objectives the strategy is seeking to meet. These objectives should restate the business purpose as well as timeframe and funding expectations and constraints.
  2. What is the specific business strategy regarding mobility: functionality, priorities, target demographics, etc. Remember that a mobile strategy should also specify the functionality available when the device is offline.
  3. What is the end-to-end architecture supporting the mobile strategy. In the end, all mobility strategies are about making information accessible to the user wherever the user happens to be! The end-to-end architecture must fully describe the component partitioning of user interface, business logic and backend data in accordance with service oriented constructs.
  4. What are the preferred technologies used to support the mobile strategy. This is where you describe the operating systems, languages and development frameworks to be utilized, along with best practice tenets that must be followed. If the mobile strategy deals with employee mobility requirements, you should decide if your company will adopt a BYOD (Bring Your Own Device) approach. Either way, you will need to specify the mobile devices that will be purchased or leased for employees at the brand and model level and what should be supported.
  5. Integration with third-party content, applications, and social media. If the strategy calls for integration with third-party services such as Facebook, Instagram, Google Maps, etc., you will describe the approach to this integration from a presentation and API perspective.
  6. Implementation strategy. Will you purchase third party applications or toolkits, will you outsource proprietary development to third parties, will you recruit contractors or internal talent, will you use a hybrid approach and if so, for what particular components.
  7. How will you provide support and maintenance. This section should also describe support processes as well as the expected service levels. Emergence of 5G will further dilute the difference between accessing online services via WiFi or carrier networks, but you should try to adjust service levels accordingly (e.g. if the user is about to make a large data transfer via carrier network, warn them about potential cost implications).
  8. Security and Privacy. Here you describe the approach to authentication and encryption, if any.
  9. What is the implementation roadmap? Clearly, cost and funding levels will heavily influence the pace of implementation.
  10. Testing and User Acceptance criteria. Obviously, this section should be fully negotiated with the business so that clear success criteria can be established.

[1] Take the example of Hilton’s recent announcement that it plans to invest $550M to integrate guests’ smartphones with its property operations. Even though most of this capital will probably be used to upgrade hundreds of thousands of room locks rather than on actual mobile technology, this is a good example of a holistic investment.
[2] The Mobithinking site has a great deal of research on this. Check it out.
[3] Cross platform tools can force an unwanted dependency on the tool vendor (these vendors do not always last long), and they will likely present some performance issues caused by the sub-optimal use of specific platforms.

Thursday, June 5, 2014

Your organization as a sports team

The World Cup in Brazil is about to start and after reading the leading articles about the various teams’ line-ups and alignments, I was reminded that there is nothing like the world of sports to serve as an analogy about the importance of structuring a team in accordance with skills. Take for instance Manchester United (MU), one of the world’s top football franchises. MU comfortably won the 2012–13 English Premier League championship under the direction of Sir Alex Ferguson. When Sir Alex retired, his successor took over essentially the same team, but they ended up with an embarrassing seventh place the following season. What gives?
It turns out more often than not that actual results are more dependent not on who is on your team, but on how available skills are used.
I once took over a team that had the reputation of failing to deliver on time. When I analyzed the issue, I discovered that the group had been working under a single development pipeline and, as a result, strategic deliverables were being impacted by the team’s need to do short-term firefighting and to address urgent tactical requirements. I also discovered that, although most of the developers were extremely professional and were working very hard, each resource was not necessarily being assigned to the task most aligned to her skills.
It often comes as a surprise to technology dilettantes that not all computer folks are the same. A former CFO initially turned down my request to hire a DBA, given that I already had plenty of “IT guys” doing the PC support stuff (he did approve it once I explained the difference!). Behind the presumably geeky look of a software developer may lie someone whose core strength is the ability to methodically reverse engineer other people’s code, or someone more attuned to creating highly optimized algorithms; others might be more comfortable dealing with data or with user interfaces.
Our role as leaders is to motivate the team, but we also have a role as managers to make sure each resource is given a role that ensures the highest potential. During my diligence, I found that some team members wanted to serve as team leaders rather than being hands-on developers; whereas others who just wanted to code were being asked to manage others.  Understanding the actual strengths and ambitions of every member is essential. This is an area where you can partner with your HR department to gain a meaningful perspective of all technical assets at your disposal.
As in sports, some of your best players are those who can serve in more than one role, but most typically, you will need to diagnose the strengths of each individual contributor. Just as in American football you would not position a small player as a defensive lineman or a 340-pound player as quarterback, you would not place someone whose expertise is XML or JavaScript as a database administrator.
In the end, I solved the problem by splitting the team into two groups: one responsible for all tactical, periodic development deliverables (this included software maintenance), and the other responsible for the larger strategic initiatives. Managing two separate delivery pipelines proved very successful. After this change, target dates for larger projects were met, and several tactical deliverables were deployed as well. Obviously, the situation was not static (another good reason not to use robots as managers!). The ongoing process must involve assessing each individual’s transfer and reassignment from one function to another, identifying training needs, and mentoring team members so that they can progress in line with (realistic) career objectives while best serving the interest of the organization. In other words, doing the business of managing.
When it comes to IT transformation, part of the process is identifying skill gaps as they pertain to your strategy. This is why it is so important to have a realistic strategy along with a well-defined high-level architecture. Calling for a system based on LAMP when the majority of your team has a .NET provenance is asking for trouble. In any case, you will no doubt need to plan for needed re-training, particularly if you are introducing emergent technologies. You may want to check a previous blog of mine on training. In any event, part of the strategy is to identify how you will close these gaps. Sometimes you will hire the needed resource, not unlike a coach during the drafting season. At other times, you will consider bringing in external consultants or sub-contractors, or outsourcing outright, especially if the need is only temporary. The diagram below was depicted on an earlier blog of mine dealing with the sub-contracting and outsourcing topics. Only you can make your core team shine!

Tuesday, April 29, 2014

Paving the yellow brick road to Big Data

One of my favorite treats as a young child in Mexico City was a candy called “Suertes” (“Luckies”). It consisted of a cardboard roll containing little round candies known as “chochitos” and a small plastic toy (the toy was the lucky surprise: usually a cheap top, a miniature car, or a soldier figurine). It was a cheap treat; think of a third-world version of Kinder Eggs. Less third-world was the way these “Suertes” were packaged. I now know that each roll was formed with a recycled IBM punch card further wrapped in rice paper to prevent the diminutive round chochitos from falling through the used card’s EBCDIC-encoded perforations[1].
Since the cards were essentially eighty column data encoders, I came to this conclusion: Data is fungible; it can even be used to wrap candies!
While the world’s population has more than doubled since the punch card days, data storage capability has grown exponentially during the same period. In fact, storage capacity is poised to outstrip the maximum information content humanity is able to generate. According to a research study published in the Science Express Journal[2], 2002 was the year when digital storage capacity exceeded analogue capacity. By 2007, 94% of all data stored was digital. While it is estimated that the world had reached 2.75 Zettabytes of total data storage in 2012[3], we are expected to hit the 40 Zettabytes mark by 2020, which comes to about 5.2 Terabytes of data for every human being alive.
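The per-person figure is easy to sanity-check. The short Python sketch below redoes the division using decimal (SI) units; the world population of roughly 7.7 billion for 2020 is my assumption, since the text quotes only the storage total and the result.

```python
ZETTABYTE = 10**21   # bytes, decimal (SI) definition
TERABYTE = 10**12    # bytes

projected_storage_zb = 40      # figure quoted in the text for 2020
assumed_population = 7.7e9     # assumption: roughly 7.7 billion people in 2020


def terabytes_per_person(total_zb: float, population: float) -> float:
    """Spread a total measured in zettabytes evenly across a population."""
    return total_zb * ZETTABYTE / population / TERABYTE


per_person = terabytes_per_person(projected_storage_zb, assumed_population)
print(f"{per_person:.1f} TB per person")   # prints "5.2 TB per person"
```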
Not only has digital storage become a dirt-cheap commodity, but advances in compression and search algorithms have turned storage into a dynamically accessible asset—a true source of information. The emergence of the Cloud also allows further storage optimization. (I would be surprised to learn Amazon is storing a copy of your online books in your cloud space versus simply maintaining an index pointing to a single master copy of each book in their catalogue.)
The ability to store huge amounts of data in a digital form speaks to the phenomenon of “Datification”. True, most of what we are now placing in digitized form are pictures and videos, and studies show that less than 1% of all this data has been analyzed. But even as more than half a billion pictures are being posted to social media sites every day, new machine learning techniques to help us analyze this type of graphic content are being developed. There is no doubt that we are truly in the midst of the Digital Era. Or rather, the Era of Big Data...

Big Data has been defined as having the following attributes: Volume (obviously!), Velocity (dealing with the need to get data via an on-demand, even streaming basis), Variety (encompassing non-structured data), and Veracity (making sure the data is trusted). The field of Data Science is being formed around the exploitation of big data, particularly in ways that take advantage of the emergent properties derived from the four-V attributes. The emergent phenomenon reveals that Data is now viewed as a product in its own right.
One of the most exciting ways in which the Data Science/Big-Data phenomenon has delivered value is with the unexpected ways data correlations can appear and be exploited for surprising business purposes. You are probably familiar with how Google is able to track flu epidemics based on search patterns, and how companies are finding ways to market to various demographics based on ancillary consumption data (Wal-Mart noticed that, prior to a hurricane, sales of Pop-Tarts increased along with sales of flashlights).
But while all this is fine from a theoretical and anecdotal perspective, as a CIO, CTO, or IT executive for a medium size or small company you would do well to ask: What does all this hype have to do with my company’s bottom line?
In my last article I recommended evaluating potential big data-applications for your business. Even if you do not know precisely how all this big data transformation will impact you, there are steps you can proactively take now.  Just as in the story of the Wizard of Oz, this is a case where the journey is part of the destination. You should pave the yellow brick road that will take you there:
  1. Revisit the state of Data Governance in your organization. Obviously you should maintain the traditional SQL related roles, but transforming towards big data requires a fresh look at storage engineering, data integrity, data security, and the need to train for and secure needed emerging skills such as those of data scientists.
  2. Establish a “Datification” strategy for your business. Have you ever seen those reality shows about Hoarders? That’s it. You must become a fanatical data hoarder. This is not the time to dismiss any of the data you capture as too insignificant or expensive to store. Part of the strategy is the creation and documentation of a taxonomy of data to better organize and understand potential data interrelations.
  3. Re-focus on data quality and integrity. Review your data cleansing and deduplication processes. Adapt them to meet the higher volumes presented by Datification.  The ideal time to ensure the data you capture is as clean as possible is at the point of data acquisition. The old adage of Garbage-In/Garbage-Out still applies with big data, except now the motto is Big Garbage In/ Big Garbage Out.
  4. Normalize the data. Just because the data is in digital form does not mean you can use it. Big-data practitioners estimate that about 80% of their work goes into preparing the data in a manner that can be exploited.
  5. Review and adapt the data security strategy. Design your data security strategy from the get go.  I recommend you visit two of my previous blogs discussing the subject of security:  “The Systems Management Stack”, and “Security & Continuance”.  Bottom line, your security strategy should be part of the core data strategy.
  6. Move to the Cloud, even if the cloud is internal. Too much time is being spent deciding whether or not to “Move to the Cloud”.  Most businesses I have come across are wary of placing strategic data assets in a public cloud.  You should separate the debate as to whether or not to make the move to a public cloud from the need to ensure the data can be in a “cloud” form. You cannot have Datification without Cloudification. This means that you should be using virtualized access and storage of your data to the nth degree.  You should ensure decoupling all access of the data from its physical location via appropriate service-level interfaces.   The decision as to whether or not to use a local private cloud, network private cloud, or public cloud or any other variation (Platform as a Service, Infrastructure as a Service, etc.)  is the topic for another blog article. Be aware that if you try to create and manage your own cloud you will need to secure the appropriate internal engineering resources. This is not an inexpensive proposition. Also, you and your cloud consultant will need to define a Storage Area Network strategy that allows placement of heterogeneous data with large scalable capabilities. Following this route will also require you to define non-SQL data replication, data sharding, and backup strategies. The time to start this process is now.
  7. Conduct a census of useful externally available data. A key premise of big data is the view of data as a product in its own right. Not only are you positioning your company’s data as a capitalizable asset that could potentially be made available to others as a revenue generating option, but you will also be in a position to access and exploit data assets made available by others. At a minimum, you should conduct a census of potential data set sources openly available from public entities and governments and define a strategy of how you can better exploit these assets.
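Steps 3 and 4 above, cleaning at the point of acquisition and normalizing before use, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a recommended design: the record fields (name, email, phone), the choice of the email address as the deduplication key, and the CleanStore class are all hypothetical.

```python
import re


def normalize(record: dict) -> dict:
    """Bring one raw customer record into a consistent shape."""
    name = " ".join(record.get("name", "").split()).title()   # collapse whitespace, fix casing
    email = record.get("email", "").strip().lower()            # treat emails as case-insensitive
    phone = re.sub(r"\D", "", record.get("phone", ""))         # keep digits only
    return {"name": name, "email": email, "phone": phone}


class CleanStore:
    """Accepts records only after normalization and rejects duplicates."""

    def __init__(self):
        self._by_key = {}

    def ingest(self, raw: dict) -> bool:
        """Return True if the record was stored, False if it was rejected."""
        record = normalize(raw)
        if not record["email"]:
            return False          # garbage in: refuse records without a usable key
        if record["email"] in self._by_key:
            return False          # duplicate of a record already captured
        self._by_key[record["email"]] = record
        return True

    def records(self) -> list:
        return list(self._by_key.values())
```

Because normalization runs before the duplicate check, two differently formatted copies of the same customer collapse into one record at capture time instead of being reconciled later at much higher volume.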
Obviously you will have to face the task of justifying the needed investment to your CEO and financial controllers.  Projects related to data virtualization intrinsically improve availability, and other projects dealing with security (PCI or otherwise) should all be justifiable purely on best-practice, business continuance basis.  You will need to tap into traditional operational budgets to better fund them.  Also, this is one of those cases where you will need to find obvious functional features that you can jointly sponsor with your business partners (these are the proverbial “low hanging” fruits).  If there is not enough money (when is there?), you don’t have to do everything at once. You can begin with data elements your taxonomy has identified as most essential.
Furthermore, there is an increasing realization that big data can actually be accounted as a company asset. After all, the company valuations of Facebook and Twitter are primarily based on the strength of their data sets. For example, it is currently estimated that the value of each member to Facebook is about $100. Customer acquisition costs in the social media space are usually estimated to be in the range of $5 to $15; so properly structuring consumable data sets can be used as part of your financial justification.
That’s it. This endeavor should keep you busy for a while.  At the end of the road you will have proven you had courage and a heart all along; plus you’ll get a Big Data diploma too!  

[1] Of course as a child, I did not know the punch cards were being repurposed to hold the candy and so I always wondered why someone would “design” perforated cards to hold the chochitos!
[2]The World’s Technological Capacity to Store, Communicate, and Compute Information” by Martin Hilbert and Priscila Lopez.
[3] Optimally compressed. One Zettabyte equals one thousand Exabytes. One Exabyte equals one billion Gigabytes or one million Terabytes. The actual digitized speech of all words ever spoken by human beings could be stored in 42 Zettabytes (16 kHz, 16-bit audio). What follows after Zettabytes, in case you are wondering, is: Yottabyte, followed by the unofficial Xenottabyte, Shilentnobyte, and Domegemegrottebyte, which, in addition to having 18 Scrabble-busting letters in its name, represents 10^33 bytes.

Monday, March 10, 2014

The next big transformation

If you revisit my earlier blog entitled “Prognosticating the Future”[1], you will see that a key technique in prognostication is the ability to identify the various technology trends whose trajectories combine in novel and synergistic ways.

So what are the most significant technology trends today?  What will be the next big transformation effort?

Recently, Gartner identified their top ten trends for 2014[2].  Unsurprisingly, mobile technologies and the Cloud are mentioned, but a brand new entry on their list is the advent of the so-called Smart Machines.

Gartner defines Smart Machines as contextually aware, intelligent personal assistants, and smart advisors (such as IBM’s Watson). I feel this definition is somewhat narrow. After all, self-driving cars and recent advancements in robotics are also representative of the advent of these Smart Machines. But whether one envisions a Knight Rider Pontiac, Siri on steroids, or HAL from “2001: A Space Odyssey”, Smart Machines are ultimately the outcome of the convergence of several emerging technologies. This is where the action is taking place. The focus must be on the underlying technologies needed to accelerate the next transformation effort:

Machine Learning (ML): Bill Gates recently listed “Getting ahead in Machine Learning” as one of the three things he wishes he had started doing earlier[3]. Indeed, as of late, interest in ML has been nothing short of explosive. A Google search for “Machine Learning” returns 175 million results, and Stanford University Professor Andrew Ng’s free course on Machine Learning attracted over half a million visitors for its first lecture and retained a healthy 20,000+ viewers for its last.  ML is making strides thanks to the application of both unsupervised and supervised learning algorithms, including traditional approaches based on Bayes’ Theorem and more recent ones such as Sparse Distributed Memory. Indeed, trying to predict the possible applications for ML would be like trying to predict the applications for the Internet. ML is a set of core technologies sure to revolutionize many of the IT processes in your company today.
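
To make the supervised learning idea a little more concrete, here is a minimal sketch of a naive Bayes text classifier, which applies Bayes’ Theorem to word counts. The training sentences, the labels, and the function names are all made up for illustration; a real project would more likely rely on an established ML library than on hand-rolled code like this.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of documents
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest posterior, computed in log space."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Prior: log P(label)
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Likelihood: log P(word | label) with add-one smoothing
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set (assumption, not real data).
EXAMPLES = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status and agenda", "ham"),
]

model = train(EXAMPLES)
print(classify("free prize money", *model))
print(classify("monday project meeting", *model))
```

The “naive” part is the assumption that words are independent of each other given the label, which is rarely true but often works well enough in practice.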

Big Data:  If you are in a major financial institution, an insurance conglomerate, a large travel reservation system or any other large company with access to millions of loyalty or credit records, you are in the happy position of being able to gather and exploit massive amounts of data. If not, don’t fret. Even normal-size companies would be lost in a data desert were it not for the emergence of the Internet. Efforts on the Semantic Web front, with ontology languages such as OWL (Web Ontology Language, which should be abbreviated WOL if you ask me) and open semantic databases such as Freebase[4], containing information on millions of structured topics contributed by a broad community, make it possible even for small companies to benefit from “Big Data” aggregations.  Additionally, the Web has a wealth of raw information that can be mined and re-purposed via the use of collective intelligence gathering tools.

Big Data practitioners highlight that 80% of the effort goes into preparing the data and 20% into analyzing it. There are several vendor-supported ETL (Extract-Transform-Load) tools you could use for this purpose (IBM’s DataStage, Microsoft’s SSIS, Oracle’s Data Integrator, etc.). However, they apply mostly to mappings from one structured data format to another. When mining the Web, you will need natural language parsers. To assist in this effort, I suggest you check the University of Washington’s Open Information Extraction project, code-named ReVerb. This project has mined the Web to automatically build Subject-Predicate-Object relationships from English sentences, yielding data bases with millions of entries[5].
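
To give a feel for what a Subject-Predicate-Object relationship looks like once extracted, here is a deliberately naive sketch that pulls triples out of simple sentences with a regular expression. This is not how ReVerb works; the relation phrases, the sample sentences, and the function name are assumptions chosen purely for illustration, and real natural language parsing is far more involved.

```python
import re

# Hypothetical relation phrases; a real extractor would not use a fixed list.
RELATIONS = ["was born in", "is the capital of", "invented"]

PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?P<predicate>"
    + "|".join(re.escape(r) for r in RELATIONS)
    + r")\s+(?P<object>.+?)\.?$"
)


def extract_triple(sentence):
    """Return (subject, predicate, object), or None if nothing matches."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return (match["subject"], match["predicate"], match["object"])


# Made-up sample sentences standing in for text mined from the Web.
sentences = [
    "Paris is the capital of France.",
    "Einstein was born in Ulm",
    "Hello world",
]

triples = [t for t in map(extract_triple, sentences) if t is not None]
for subject, predicate, obj in triples:
    print(subject, "|", predicate, "|", obj)
```

Once sentences are reduced to triples like these, they can be stored and queried as structured data, which is what makes the approach attractive for data preparation.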

Toolkits & Languages:  Apache’s Mahout[6] is an open source toolkit consisting of classification, clustering, and other scalable machine learning algorithms.  Also openly available are a number of non-SQL data storage products, such as Hadoop, MongoDB, and Cassandra, that can be leveraged to help you build and exploit your own Teradata-sized data extractions. On the Natural Language Processing front, there are a variety of freely available lexical databases (WordNet), corpora, and toolkits (NLTK).  Obviously, the 800-pound gorillas (Google, Microsoft, IBM) are making significant investments to offer products and services in this area. IBM has recently opened its Watson Cognitive Computer API to external developers[7]; so you are sure to see a continued emergence of start-up companies with products targeting this area.

The Environment: Clearly the other identified trends such as mobility and the Cloud will also help make pervasive access to Smart Machines a reality. Mobility will evolve further via wearable computers (think Google Glass or smart-watches), and the much touted “Internet of Things” will fuse with mobile computing to better allow universal access to the services provided by these smart systems.  

The Business Transformation Impact.

An old adage states that if you are a big company you want to appear to be a Mom-and-Pop shop, and if you are a small company you want to look like a global conglomerate. Achieving these goals for either size company is made easier through the use of Data Science. The emerging field of Data Science is a direct result of the convergence of Big Data, Cloud, and Machine Learning. However, while today’s Data Science is geared primarily toward predictive analytics applications, it would be a good idea if, together with your business team, you began to evaluate these and other potential applications:

  •  Collective Intelligence Analytics. Explore what data is available on the Web that could be analyzed to the advantage of your business. This includes sentiment analysis[8] for your company and its products from social web sites, as well as competitive pricing, and future demand analysis.
  • Recommendation Engines. Like Amazon or Netflix recommendation engines, your company could more effectively execute cross-selling, up-selling, or targeted promotions based on specific machine learning customer associations. This involves linking your current CRM system with available industry data, yielding behavior patterns based on demographics and explicit and implicit preferences.
  • Information filters/hunters. This includes the more traditional spam filters, but will also lead to modern automatic information topic detectors, and automated news and report summarizers.
  • Correlation Analytics. Not all applications need to be customer-facing. Data mining based on pattern matching and other ML regression techniques can be applied to examine operational failure logs against prior symptoms. This type of approach can also aid in quality control in manufacturing.
  • Shopping/Search Avatars. You will be able to unleash your electronic avatar to continuously search and shop for the best deals or information items. The avatar will know enough about your preferences and cost constraints to alert you to opportunities and even act on your behalf.
  • Security. Recent security breach incidents have raised the need for additional levels of security. Facial and voice recognition systems are bound to benefit greatly from ML advances.
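
As one concrete case from the list above, the recommendation engine idea can be sketched with nothing more than co-occurrence counts: items that are frequently bought together are suggested to customers who own only one of them. The customer names, baskets, and function names below are hypothetical, and production recommenders use considerably richer models.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets keyed by customer (assumption, not real data).
BASKETS = {
    "ana": {"tent", "stove", "lantern"},
    "ben": {"tent", "stove"},
    "carla": {"tent", "lantern", "boots"},
    "dev": {"boots", "socks"},
}


def co_occurrences(baskets):
    """Count how often each ordered pair of items shares a basket."""
    counts = Counter()
    for items in baskets.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(owned, counts, top_n=2):
    """Suggest items that most often co-occur with what is already owned."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a in owned and b not in owned:
            scores[b] += n
    return [item for item, _ in scores.most_common(top_n)]


counts = co_occurrences(BASKETS)
print(recommend({"tent"}, counts))
print(recommend({"socks"}, counts))
```

Linking this kind of association data with demographic and preference data is where the cross-selling and up-selling opportunities come from.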

Does this mean that Artificial Intelligence has finally arrived?

Ever since the dawn of the computer age it was foreseen that machines would one day match and even exceed the intelligence of humans. This strong view of Artificial Intelligence (i.e. “Strong AI”) was envisioned to occur “within decades” and, as a result, it became a strong field of study in leading universities worldwide.

Inventors had tried for years to replicate a bird’s flight by devising machines consisting of flapping wings and artificial feathers. In the end, flight was achieved only when the Wright Brothers invented a machine that resembled a bicycle more than a bird’s body.  Analogously, building a chess program able to defeat a chess Grandmaster was achieved at the end of the twentieth century when IBM’s Deep Blue beat Garry Kasparov, but it used techniques that do not resemble how we humans actually think. So, impressive as that feat was, no one tried to claim Deep Blue was proof AI had finally arrived.

Still, recent strides with Machine Learning have once again opened the old Artificial Intelligence debate. . .

Yes, it is conceivable that we will soon have a machine able to pass the Turing test with flying colors by “fooling” a human into believing the machine is human. Even then, once we look under the covers, the ML advances will look more like the type of ‘brute force’ approaches used in chess playing software than the structured thought answers originally contemplated as “Strong Artificial Intelligence”. ML is like a video-camera; intelligence is still the eye. Yes, we might one day develop something as complex and seemingly sentient as Samantha from the movie “Her”, but the essential question as to whether such an “entity” is truly intelligent in a human sense will boil down to asking whether the software truly has consciousness.  And debating the issue of consciousness is something that can quickly devolve (or evolve) into a more philosophical and religious debate with profound existential implications. Are we humans the mere biological machines neuroscientists say we are,  endowed with the illusion of ‘consciousness’ that is a by-product of our brain, or is there something more transcendent going on?

I myself used to believe that anything in our brains could be reproduced computationally, but after reading about the Penrose-Hameroff model of consciousness[9], which proposes that consciousness is a quantum-physics-related phenomenon, I am not so sure anymore.

As we sort out this doozy of a dilemma, I prefer to use the term “Simulated Intelligence” to refer to the new algorithms and ‘smart machines’ of the future. After all, there is no question that a form of ‘weak’ AI is actually occurring now with ML and that computers may soon appear to have true intelligence traits, especially in narrow-domain applications.

Even as philosophers debate the true meaning of life and the possibility that artificial consciousness might exist, there is no reason why you and your company cannot begin to draw up a point-by-point transformation plan based on how these exciting developments can be made to work to the benefit of your business.

Tuesday, February 18, 2014

Operating Legacy Software and the Portrait of Dorian Gray

Believe it or not, software is subject to the laws of entropy.  Software also ages. Software that is maintained carefully ages more gracefully and might even get to perform Mick Jagger-like antics, but in the end, all software must be refactored, converted, migrated or disposed of, according to your needs.

The idea that it is possible to reliably operate old software ad infinitum simply by maintaining high-level operational standards is somewhat deceptive.  If software is there to support business processes (and it truly is), then software cannot be frozen in time any more than business can be frozen. Software must be maintained continually and adjusted so that it can keep up with the perennial business demands.

Even if your software was originally written under best industry practices, best industry practices tend to change. Each time your programmers apply changes to the software to accommodate a new business function or new features, there is risk that the software will depart from its original 100% tested purity. Even if your team is mindful of every change, and follows rigorous regression testing, the fact is that the environment in which your software operates is also changing. Vendors introduce new library versions, new compilers, new operating software releases, each of which is bound to affect the way your software works. Even the way modern systems are engineered (increased virtualization, data sharing, and network bandwidths) can impact the operation of formerly “stable” software.  For example, portions of software that have worked well previously may now fail when it is suddenly revealed that the legacy data handler had been expecting full ownership of the data, or the latency of an old subroutine may now exhibit concurrency problems due to faster data access.  And software with improperly designed components might even break when presented with a simple change of data schemas.

Granted, if you were able to keep your software up to date by applying the type of practices the aviation industry follows to keep airplanes forever new, you could then argue that your legacy software could continue to be operated with the highest reliability and availability levels required by your business.

Let’s be honest.  If you are like 99% of companies out there, chances are your legacy software more closely resembles a 300,000-mile reworked Chevy truck than a refurbished Boeing 727. And even if it doesn’t, your software will eventually follow the path of Dorian Gray’s portrait, staying young only until the day the mirror reveals its true age when compared to the latest and greatest standards available today.

What to do then?

There is a tendency to want to try to keep the legacy software running longer by reactively over-investing in additional infrastructure and by tightening the change control processes.  Nothing wrong with tight change control but, if the plan calls for continued changes to the existing software, then you will only be postponing the inevitable end.  You can only react to problems for so long before the law of diminishing returns comes to bite you.  Doubling down on your legacy software may work only until the next outage happens. And beware, during the next crisis you might no longer have the people who had knowledge of the technology used in the original software (how many Pascal or COBOL programmers are still out there?).  The loss of ‘institutional knowledge’ of your legacy system is often irreparable.

Now, don’t get me wrong.  Your business lives and dies by the performance of your legacy software, and I am not suggesting you simply toss it away as though it were one of Elizabeth Taylor’s ex-husbands. No. What I am suggesting is that you need to put a proactive change strategy in place, and that you will need to execute on a well thought out transformation plan.

You will need to answer questions such as the following:
  • What is the predicted life expectancy of the current legacy software? This is an excellent time to revalidate the current and predicted business requirements. Build a roadmap.
  • What is the target software environment that will be able to carry the business requirements for the next five to ten years? You will need a solid and practical system architecture framework.
  • How will you maintain the current legacy software to keep it sufficiently functional while managing costs?  You will need a tactical operability plan.
  • While migrating to the new system, what will be your level of investment for maintaining the legacy software? You will need to decide on the appropriate tenets to introduce new functionality.
  • How and when will you execute the migration of customers/users from the current software to the new environment? What are the data migration challenges and tools needed? What are the customer re-training needs?  A comprehensive migration plan is in order.

Throughout my IT Transformation blog, I have written extensively about the many areas of consideration when dealing with legacy environments and on the techniques and practices needed to assure success. Below is a list of some of my previous blogs that are most relevant to this discussion:

Wednesday, January 8, 2014

Flexible systems as the core of bespoke solutions

I remember reading the influential book “Future Shock” by Alvin Toffler as I listened to a vinyl record of Donna Summer’s “Could It Be Magic” (since I am not a vinyl-loving hipster, feel free to carbon-date me based on that clue).  While I have forgotten much of the book, one prediction in particular struck me the most:  future technology would enable highly customizable products. One must credit Toffler’s foresight for having predicted, smack in the middle of the ‘everyone-wears-stamped-polyester-shirts’ decade, that there would be a day when, instead of buying mass-manufactured clothing, we would be able to ‘tailor’ our own clothes to very precise specifications and fashion sense.
From my hospitality background I’ve come to use the word ‘bespoke’ to refer to very individualized customization capabilities.  While Mr. Toffler’s other predictions have yet to happen, the so-called “nth degree customization” is now a present possibility. Think of how you do searches in Google. Instead of forcing you, the user, to learn Boolean query syntax (e.g. using wildcards plus AND, OR, NOT keywords to create a specific query, the way prior search engines such as AltaVista did), Google uses sophisticated algorithms, applying knowledge of your prior searches and click-through activity, to cast a wide search net with results uniquely matched to you.
Bespoke options are now becoming more common in manufacturing and service industries. The much touted 3D printer technology and the low cost availability of ‘design-your-own’ products (t-shirts, mugs, cards), are but forerunners of what is yet to come.
Why has this level of individualization not existed until recently? True, AltaVista did not have the millions of computers now being used in Google farms, but the most fundamental attribute needed to provide complete customizability is the ability to design and implement truly flexible systems on a cost-effective basis.
Without flexibility there cannot be bespoke offerings. When I speak of flexibility, I am referring to the underlying infrastructure, hardware, and systems needed to provide highly bespoke outcomes. The fact is that the implementation of flexible approaches requires the use of higher-level abstractions, which are invariably very expensive in terms of resources. We’ve had to wait for Moore’s Law to finally make the cost of computing low enough to allow ‘wasteful’, flexible designs to take hold.
Long gone are the days when programmers had to squeeze complicated algorithms into a tiny amount of RAM, or when the high cost of storage required packing fields tightly, as with the two-digit year encodings that resulted in the Y2K brouhaha.  Today, you should be able to implement more flexible systems by choosing the most appropriate programming languages, data bases, and systems.
For instance, on the programming language front, there is a trend away from statically typed languages that force programmers to perform code acrobatics, such as overloading and casting around the restrictive types imposed by the language, and toward more elastic, dynamically typed languages/scripts such as PHP, Ruby or Python.
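As a small illustration of that elasticity, here is a sketch in Python where a single function works on any object that knows how to report its length, with no overloads or casts. The function and the Playlist class are hypothetical names invented for this example.

```python
def total_length(items):
    """Add up the sizes of anything that supports len()."""
    return sum(len(item) for item in items)


class Playlist:
    """Hypothetical user-defined type; it only needs __len__ to take part."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __len__(self):
        return len(self.songs)


# A string, a list, a dict, and a custom object, all handled the same way.
mixed = ["abc", [1, 2], {"key": "value"}, Playlist(["Could It Be Magic"])]
result = total_length(mixed)
print(result)
```

The trade-off is that a mistake, such as passing in an object without a length, only surfaces at run time, which is one reason discipline and testing matter more in this style.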
Likewise, the advent of non-relational data bases[1] also reflects a shift away from the very constricting, highly structured schema definitions required by traditional data base designs.  After all, expecting a designer to know all the data elements, and their relationships, far in advance of implementation would be just as effective as a Soviet-era five-year plan.  Whether by using document-oriented MongoDB or the RDF-based triplestore graphs favored by Semantic Web practitioners, the flexibility with which non-relational entries get constructed, indexed, and then stored, actually accelerates the decentralized view of processing to better facilitate cloud based implementations.
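To show what schema flexibility means in practice, here is a sketch that uses plain Python dictionaries as a stand-in for a document collection. It does not use any real MongoDB API; the guest records, field names, and the find helper are assumptions made up for illustration.

```python
# In-memory stand-in for a document collection (not a real database API).
guests = [
    {"name": "Ana", "email": "ana@example.com"},
    {"name": "Ben", "loyalty_tier": "gold", "preferences": {"pillow": "firm"}},
]


def find(collection, **criteria):
    """Return documents whose top-level fields match all given criteria."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# A document with a brand new field can be added without a schema migration.
guests.append({"name": "Carla", "loyalty_tier": "gold", "twitter": "@carla"})

gold_names = [doc["name"] for doc in find(guests, loyalty_tier="gold")]
print(gold_names)
```

Documents that lack a field are simply skipped by the query instead of causing an error, which is the kind of leniency a rigid, predefined schema would not allow.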
This under-the-hood flexibility ultimately allows deferral of design decisions toward the user. I am speaking not only of the ability to configure the software to the user’s states, but to tailor interaction experiences according to the user’s dynamic preferences. Programmers can now dynamically accommodate different user choices and interaction flows without resorting to tedious design refactoring exercises.
The diagram below depicts the range of possibilities.  If you strive for high performance, low resource implementations, you will most likely choose technologies in the center-left of the diagram. If you seek high customization then you will need to work with technologies on the right.

All in all, we are living in very exciting times whereby ultimate system designs can now accommodate the most innovative approaches. With cheaper processing power and storage, system developers can afford to be a bit freer in how software is designed. While all this flexibility can actually improve the time to market of your transformation initiatives, it does force you to discipline your approaches to prevent chaotic outcomes. As Spider-Man’s[2] uncle wisely noted, ‘with great power comes great responsibility’; so this power to be inefficient should be used wisely: to increase flexibility at all levels, rather than to lazily waste resources. A nice mix of common sense and best practices ought to do the trick. For example, if using Python, it becomes even more essential that your team abides by the PEP 8 style guide for Python, peer reviews, and other agile development practices. Also, standardization of development tools (IDE) and libraries, plus strict version control and change management, combined with well-defined QA standards, should be given the focus they deserve. In the end, you will have to rely on high team qualifications and a development culture that incentivizes doing things well.
These days I listen to music from my Spotify playlists that contain only the songs I enjoy.  I listen to these virtual LPs in a bespoke fashion while reading “The Singularity is Near” by Ray Kurzweil. He predicts that in the near future (2045) computers will exceed humanity’s comprehension abilities.   Time to listen to “Could It Be Magic” once more.  Let’s see how that works out![3]

[1] Although they are also referred to as NoSQL systems, SQL APIs are in fact available for these.
[2] OK. Let’s credit Stan Lee for his now-classic Spider-Man story in Amazing Fantasy #15 (August 1962).
[3] Yes, read this as a skeptical remark.  A future blog may expand on why I think ‘strong AI’ (Artificial Intelligence reaching human-levels of consciousness) is not really feasible.