
BigData Trend

 

Obviously the BigData trend is growing rapidly. Just a look at job growth in that area will validate that.

 

 
Before its potential has even been fully recognized, people have already started talking about a bubble!

http://www.infoworld.com/t/business-intelligence/here-comes-the-big-data-bubble-180852

 

To me BigData is an enabler of KnowledgeModeling, which will drive the technological innovations for the years to come. I don’t expect to see any slowdown in that trend for a decade or two.

 

During MIT's Brains, Minds, and Machines symposium, which started on May 3rd, some of the founders of artificial intelligence and cognitive science tried to explain the lack of progress in AI over the last few decades.

AI pioneer Marvin Minsky started by asking: "Why aren't there any robots that you can send in to fix the Japanese reactors? The answer is that there was a lot of progress in robotics in the 1960s and 70s and then something went wrong.  Today you'll find students excited over robots that play basketball or soccer or dance or make funny faces at you. But they're not making them smarter."

Ultimately, the general consensus is that the stagnation is due to a decline in funding after the Cold War, to early attempts to commercialize AI, and to research focusing on ever-narrower specialties rather than pursuing the bigger underlying questions.

 

Read more at:

http://web.mit.edu/newsoffice/2011/mit150-brain-ai-symposium.html

Decision Paralysis

 
Until not long ago, our challenge was to find the relevant information for decision making. These days our challenge is to cope with information overload in decision making!

Not only has access to information opened up, but our universe of available "options" and "selection parameters" has also expanded in all possible dimensions.

To provide context, let's first look at the volume of data we are talking about:

  • In the entire history of humanity up to 2003, 5 exabytes of information were created (one exabyte equals 10^18 bytes).

  • During 2006, approximately 161 exabytes of information were created.

  • In 2010, over 900 exabytes of information were created, or more than 2 exabytes every day!
    (That includes over 200 billion emails and more than 28,000 hours of YouTube video uploads per day.)
  • And in 2020 we expect to create 35,000 exabytes of data, or 35 zettabytes.
This exponential growth of information will have a profound effect on our lives.
On one hand it will result in game-changing advancements in AI and machine learning. On the other hand it will have a negative impact on our decision-making abilities.
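
As a quick sanity check of the figures quoted in the list above, here is a small Python sketch. The variable names are mine and purely illustrative; the numbers are the ones from the list.

```python
# Figures quoted in the list above, in exabytes (1 EB = 10**18 bytes).
EB_PER_ZB = 1_000  # 1 zettabyte = 1,000 exabytes

created_2006_eb = 161
created_2010_eb = 900
expected_2020_eb = 35_000

# Daily rate for 2010, the 2020 figure in zettabytes, and 2006 -> 2010 growth.
per_day_2010_eb = created_2010_eb / 365
expected_2020_zb = expected_2020_eb / EB_PER_ZB
growth_2006_to_2010 = created_2010_eb / created_2006_eb

print(f"2010: about {per_day_2010_eb:.2f} EB per day")
print(f"2020: {expected_2020_zb:.0f} ZB expected")
print(f"2006 -> 2010: roughly {growth_2006_to_2010:.1f}x growth")
```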
 

According to recent research at the Center for Neural Decision Making at Temple University, the quality of our decisions decreases with information overload. People start making stupid mistakes and bad choices.

Along the way, subject matter experts are adjusting and learning how to filter out noise. But for everybody else decision making is getting harder (a.k.a. decision paralysis syndrome!).

This to me is another reason why knowledge modeling is going to become more and more important. Knowledge Models can focus on what matters and learn to filter noise, as experts do.

 

  

Watson winning Jeopardy ..

 

These days we hear a lot about Watson, the IBM AI supercomputer that defeated two of the greatest Jeopardy players on February 16th, 2011.

Watson is about the size of 8 refrigerators: a cluster of ninety IBM Power 750 servers with a total of about 3,000 processor cores and about 16 terabytes of memory. (For comparison, the US Library of Congress contains about 10 terabytes of data.)

The large memory is used to hold an ontological object graph of concepts, entities, properties and their relationships.

 

Putting it in a Knowledge Modeling context, Watson works using a chain of Knowledge Models as follows:

  Diagnostic Model -> Explorative Model -> Analytic Model -> Selective Model

 

- The Diagnostic Model utilizes Natural Language Processing (NLP) to decipher the Jeopardy question and identify the "problem" (what is being asked for)

 

- The Explorative Model utilizes the internal memory to present all possible answers, each represented as a Hypothesis.

 

- The Analytic Model is in charge of evaluating and calculating the confidence level of each Hypothesis on multiple dimensions.

 

- And finally, the Selective Model ranks the options and chooses the best possible answer based on predefined thresholds and dimensional weightings.
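
To make the chain concrete, here is a minimal Python sketch of the four stages. It is only an illustration under simplifying assumptions and not how Watson is actually implemented: the `TOY_MEMORY` data, the score dimensions, the weights and the threshold are all made up, and the "NLP" step is a naive keyword lookup.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    answer: str
    scores: dict  # dimension name -> evidence score in [0, 1] (made up)
    confidence: float = 0.0


# Toy stand-in for the in-memory knowledge graph: keyword -> candidate answers.
TOY_MEMORY = {
    "capital": [
        Hypothesis("Paris", {"keyword_match": 0.9, "type_match": 1.0}),
        Hypothesis("Lyon", {"keyword_match": 0.4, "type_match": 1.0}),
        Hypothesis("France", {"keyword_match": 0.8, "type_match": 0.0}),
    ],
}


def diagnostic_model(clue: str) -> str:
    """Identify what is being asked for (naive keyword lookup, not real NLP)."""
    for keyword in TOY_MEMORY:
        if keyword in clue.lower():
            return keyword
    return ""


def explorative_model(problem: str) -> list:
    """Return every candidate answer as a hypothesis."""
    return list(TOY_MEMORY.get(problem, []))


def analytic_model(hypotheses: list, weights: dict) -> list:
    """Compute a weighted confidence for each hypothesis across all dimensions."""
    total = sum(weights.values())
    for h in hypotheses:
        h.confidence = sum(w * h.scores.get(d, 0.0) for d, w in weights.items()) / total
    return hypotheses


def selective_model(hypotheses: list, threshold: float):
    """Pick the top-ranked hypothesis, but only answer if it clears the threshold."""
    if not hypotheses:
        return None
    best = max(hypotheses, key=lambda h: h.confidence)
    return best.answer if best.confidence >= threshold else None


def answer(clue: str, weights: dict, threshold: float = 0.5):
    problem = diagnostic_model(clue)
    hypotheses = analytic_model(explorative_model(problem), weights)
    return selective_model(hypotheses, threshold)


WEIGHTS = {"keyword_match": 0.6, "type_match": 0.4}
print(answer("This city is the capital of France", WEIGHTS))
```

Because each stage only depends on the output of the previous one, any of the four functions could be swapped for a smarter implementation without touching the others.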

 

According to the IBM research team, Watson uses dynamic weight settings that enable it to learn during the game (for example, adjusting its confidence level according to the Jeopardy category).

 

To me, one of the most impressive parts of Watson is the massive parallelism that makes a timely response possible.

 

Apparently the overwhelming victory in Jeopardy resulted in a partnership between IBM and Nuance to apply Watson technology in the healthcare industry, with the first commercial offering planned within 24 months!

 

 

Would you allow a computer to pick movies for you to watch?

 

Going through the Netflix movie list, I was thinking about the idea of delegating my selection process to a Knowledge Model.

This is actually similar to the Netflix initiative in 2006 to create a new recommendation engine, but in a more personalized way.
(Read more about the Netflix competition, with its $1 million prize, and the winner here: http://www.netflixprize.com//community/viewtopic.php?id=1537)

 

So I started by monitoring my own behavior during the selection process.

In the first iteration I captured all the key attributes that I used when selecting a movie, such as genre, main actors/actresses, year the movie was made, director and user ratings.
 

Then I started capturing situation-based parameters such as the day of the week, my mood and so on.

For example I would normally pick a comedy when I am tired (e.g. during the week) and an action movie during holidays. 
 

So far so good … I just need to pass those additional parameters to the model.

But then I noticed that my selection changes dramatically when I have company. Not only my own preferences, but also my perception of their personality would impact the selection. For example, I wouldn't pick a mystery movie with someone who can't sit quietly!
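
A rough Python sketch of such a selection model might look like the following. Everything in it is an assumption for illustration: the movie titles, the ratings, the bonus and penalty values, and the three context flags are made up, and a real model would need many more attributes.

```python
from dataclasses import dataclass


@dataclass
class Movie:
    title: str
    genre: str
    user_rating: float  # made-up rating on a 0-5 scale


@dataclass
class Context:
    tired: bool
    holiday: bool
    restless_company: bool  # watching with someone who can't sit quietly


def score(movie: Movie, ctx: Context) -> float:
    """Start from the user rating and adjust it for the current situation."""
    s = movie.user_rating
    if ctx.tired and movie.genre == "comedy":
        s += 2.0  # comedies when tired
    if ctx.holiday and movie.genre == "action":
        s += 2.0  # action movies during holidays
    if ctx.restless_company and movie.genre == "mystery":
        s -= 5.0  # no mystery movies with restless company
    return s


def pick(movies: list, ctx: Context) -> Movie:
    """Return the highest-scoring movie for the given context."""
    return max(movies, key=lambda m: score(m, ctx))


MOVIES = [
    Movie("Comedy A", "comedy", 3.5),
    Movie("Action B", "action", 4.0),
    Movie("Mystery C", "mystery", 4.8),
]

weeknight = Context(tired=True, holiday=False, restless_company=False)
print(pick(MOVIES, weeknight).title)
```

Each new situational variable becomes one more field on `Context` and one more rule in `score`, which is exactly how the model keeps growing.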

 

Gradually a few more variables popped up and the model was getting more complicated, though still manageable.

But ultimately I decided to consider the selection process as part of the entertainment and dropped automating it.

Nevertheless that was a fun exercise …

What is Enterprise Knowledge Bus?

 
First let's see what knowledge is good for.

Well, except when you want to show off, there is really no benefit to knowledge unless it is put into action!
 

Looking back, that is what software has been used for: taking action.
But so far we have focused on optimizing the execution part of action.

As the software industry advanced, we started a paradigm shift by introducing ESB (Enterprise Service Bus) and SOA (Service Oriented Architecture) to further optimize execution and facilitate interoperability.

But by definition, action also implies a need for knowledge.

Entering the 21st century, armed with cheaper hardware, increasing processing power and larger storage capacities, we now need to turn our focus to the knowledge part of actions.

 

- What if I could apply the same strategic planning knowledge that I use in a chess game to product selection or go-to-market strategy in my business?

- What if there was a way to store all pieces of knowledge available in my organization in a centralized repository, where people and systems could pick, pull and chain them to make actions more effective and efficient?

 

This might remind you of the Knowledge Management movement of the 90s.

However, to me the current KM systems are mostly an "information management system" plus a "collection of best practices for processing information". While that is an excellent way of managing artifacts, it does not address the "direct" need for knowledge.

 
Think of the following scenario:

You have a dispute with your neighbor.

- I provide you with a searchable Federal and State law repository and consider your problem as resolved!

- Or I can ask you a few questions and provide you with 3 options, ranked based on your preferred time and financial commitment to resolving the issue, along with the likelihood of outcomes. Upon picking your option, I will provide you with a formal letter to be mailed to your local authorities.

I would refer to the difference between these two approaches as "enablement" vs. "empowerment".

 

With the vast array of information produced and published these days, accessibility alone will not go a long way.

 

The Enterprise Knowledge Bus (EKB) is a concept that I have been playing with in my mind for years. The idea was born when I realized that 1) the amount of generated knowledge is growing much faster than one can absorb and 2) we are increasingly outsourcing our knowledge needs to "trusted" providers.

In a nutshell, EKB is a system that not only facilitates access to knowledge-related artifacts, but is also able to apply them to specific situations and contexts to produce actionable outcomes.
In practice it facilitates sharing, reusing and combining knowledge models to be used by systems and rational agents.
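
As a very rough sketch of the idea, and not the design from my paper, the following Python snippet shows a hypothetical `KnowledgeBus` registry where knowledge models are registered by name and chained on demand. The class, the two example models and the neighbor-dispute options with their costs and durations are all invented for illustration.

```python
from typing import Callable, Dict, List

# A knowledge model is treated here as a function from context to context.
KnowledgeModel = Callable[[dict], dict]


class KnowledgeBus:
    """Hypothetical registry that lets callers look up and chain knowledge models."""

    def __init__(self) -> None:
        self._models: Dict[str, KnowledgeModel] = {}

    def register(self, name: str, model: KnowledgeModel) -> None:
        self._models[name] = model

    def run_chain(self, names: List[str], context: dict) -> dict:
        """Feed the context through each named model in order."""
        for name in names:
            context = self._models[name](context)
        return context


def explore_options(ctx: dict) -> dict:
    """Explorative step: list possible ways to resolve the dispute (made-up data)."""
    options = [
        {"name": "talk to neighbor", "weeks": 1, "cost": 0},
        {"name": "mediation", "weeks": 4, "cost": 300},
        {"name": "small claims court", "weeks": 12, "cost": 1500},
    ]
    return {**ctx, "options": options}


def rank_options(ctx: dict) -> dict:
    """Selective step: keep options within the user's limits, cheapest first."""
    feasible = [
        o for o in ctx["options"]
        if o["cost"] <= ctx["max_cost"] and o["weeks"] <= ctx["max_weeks"]
    ]
    ranked = sorted(feasible, key=lambda o: (o["cost"], o["weeks"]))
    return {**ctx, "ranked": [o["name"] for o in ranked]}


bus = KnowledgeBus()
bus.register("explore", explore_options)
bus.register("rank", rank_options)

result = bus.run_chain(["explore", "rank"], {"max_cost": 500, "max_weeks": 6})
print(result["ranked"])
```

The point of the sketch is that the caller never sees the law repository itself; it passes in its preferences and gets back a ranked, actionable list.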

 

For more details on Enterprise Knowledge Bus check my related paper.

 

Most of today's applications have, in one way or another, some representation of knowledge incorporated. However, that knowledge is completely baked into the application code. There is no explicit notion of knowledge and as such no incentive to reuse or enhance (learn) knowledge.

In the coming wave of Knowledge Powered Applications (KPA), however, there is a conscious effort to externalize, reuse and continuously enhance knowledge.

The differentiation between traditional and Knowledge Powered Applications starts at the design level. The architect starts with a notion of modular knowledge components in mind and reflects that throughout the entire design.

A good example is the portfolio management and optimization application that we worked on a few years ago. That application was composed of three chained Knowledge Models:

 

        Explorative Model -> Analytic Model -> Selective Model

 

  • The Explorative Model was designed to produce all possible reallocation options for the user's set of assets and liabilities, while taking the regulatory constraints and institutional guidelines into account.
    Monte Carlo-like logic along with a rule engine was used in this Knowledge Model.
  • The Analytic Model was designed to objectively analyze the options and generate a set of pros and cons for each. In this specific case our Knowledge Model is also able to take real-time market data into consideration.
  • The Selective Model was designed to create a multidimensional decision model on the fly. It then calculates and assigns a weight to each dimension of the decision space according to the user's profile and preferences, and finally picks the best option using an optimization algorithm.
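
The following Python sketch shows the shape of such a three-model chain. It is a toy under stated assumptions, not the application we built: the asset classes, the return and risk figures, the single "maximum stock share" rule and the one-parameter utility function are all invented, and the weighted-sum risk is a deliberate simplification.

```python
import random

ASSETS = ["stocks", "bonds", "cash"]

# Made-up expected return and risk per asset class, for illustration only.
EXPECTED_RETURN = {"stocks": 0.07, "bonds": 0.03, "cash": 0.01}
RISK = {"stocks": 0.20, "bonds": 0.05, "cash": 0.0}


def explorative_model(n_samples: int, max_stock_share: float, rng: random.Random) -> list:
    """Randomly sample allocations and keep those that satisfy the rule."""
    options = []
    for _ in range(n_samples):
        raw = [rng.random() for _ in ASSETS]
        total = sum(raw)
        allocation = {a: r / total for a, r in zip(ASSETS, raw)}
        # Stand-in for regulatory constraints and institutional guidelines.
        if allocation["stocks"] <= max_stock_share:
            options.append(allocation)
    return options


def analytic_model(options: list) -> list:
    """Attach an expected return and a (simplified) risk figure to each option."""
    return [
        {
            "allocation": o,
            "return": sum(o[a] * EXPECTED_RETURN[a] for a in ASSETS),
            "risk": sum(o[a] * RISK[a] for a in ASSETS),
        }
        for o in options
    ]


def selective_model(analyzed: list, risk_aversion: float) -> dict:
    """Pick the option with the best return after penalizing risk."""
    return max(analyzed, key=lambda x: x["return"] - risk_aversion * x["risk"])


rng = random.Random(42)  # fixed seed so the run is repeatable
options = explorative_model(1000, max_stock_share=0.6, rng=rng)
best = selective_model(analytic_model(options), risk_aversion=0.5)
print({a: round(w, 2) for a, w in best["allocation"].items()})
```

Changing `risk_aversion` plays the role of the user profile: the exploration and analysis stay the same, and only the selection shifts.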

 

On the implementation side, a shared EAV taxonomy was utilized to exchange facts as [Entity, Attribute, Value]. For inference, a rule engine, a neural net and a proprietary dynamic tree were used to meet the aggressive performance requirements.
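
To illustrate what exchanging facts as [Entity, Attribute, Value] can look like, here is a small Python sketch. The `Fact` type, the entity names, the attribute names and the 3% threshold are hypothetical and are not taken from the actual taxonomy.

```python
from typing import List, NamedTuple, Union


class Fact(NamedTuple):
    entity: str
    attribute: str
    value: Union[str, float]


def facts_about(facts: List[Fact], entity: str) -> dict:
    """Collect all attribute/value pairs known about one entity."""
    return {f.attribute: f.value for f in facts if f.entity == entity}


def analytic_step(facts: List[Fact]) -> List[Fact]:
    """Hypothetical model: reads existing facts and emits a new derived fact."""
    out = list(facts)
    for f in facts:
        if f.attribute == "expected_return":
            label = "acceptable" if f.value >= 0.03 else "too low"
            out.append(Fact(f.entity, "assessment", label))
    return out


facts = [
    Fact("portfolio-1", "asset_class", "bonds"),
    Fact("portfolio-1", "market_value", 250000.0),
    Fact("option-7", "expected_return", 0.04),
]

enriched = analytic_step(facts)
print(facts_about(enriched, "option-7"))
```

Because every model reads and writes the same triple format, a model can be replaced without the others needing to know about its internals.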

These Knowledge Models were linked to feed each other and collectively created the most sophisticated and powerful solution in their respective market.

 

As you might have noted each model is independent and could be reused, replaced and improved without impacting any other area of the application. This is how a true KPA should be designed.

Knowledge vs. Information

What is Information and what is Knowledge? There are many different views on that.

Let me first start with an example:
 

Think of your reaction if you were just told that:

- Your WBC is 20,000

- My DTI is 0

- Or the girl you just met online is 1.3 fathoms

 

Well, these facts by themselves will not mean much if you can't put them in context.

 

You wouldn't be able to make any sense of it, even if I were to tell you that, for example, WBC stands for "White Blood Cell count".
But (hopefully) your physician would react to this piece of "information" and send you immediately for further tests.

 

Now, the situation would have changed if you "knew" that the "average" WBC is 7000.

Although this could be considered another piece of information, it will give you enough context to decide on a reaction.

 

In the second scenario it should be enough to know that DTI stands for "Debt To Income". In this case you would be able to infer a ratio and conclude that my financials seem to be in good shape.

 

And in the last example, in addition to knowing that a fathom is a unit of length, you will need the conversion ratio to a familiar unit to conclude anything.
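
The three examples can be put into context with a few lines of Python. This is only a sketch: the function names are mine, the average WBC is the figure quoted above, and the income value in the test of `debt_to_income` would be whatever applies to the person in question.

```python
FEET_PER_FATHOM = 6        # one fathom is six feet
METERS_PER_FOOT = 0.3048   # exact, by definition of the international foot

AVERAGE_WBC = 7000         # the "average" quoted in the text


def fathoms_to_feet(fathoms: float) -> float:
    return fathoms * FEET_PER_FATHOM


def fathoms_to_meters(fathoms: float) -> float:
    return fathoms_to_feet(fathoms) * METERS_PER_FOOT


def wbc_ratio(wbc: float) -> float:
    """How many times the quoted average a given WBC value is."""
    return wbc / AVERAGE_WBC


def debt_to_income(debt: float, income: float) -> float:
    """DTI as a plain ratio of debt to income."""
    return debt / income


print(f"1.3 fathoms = {fathoms_to_feet(1.3):.1f} ft = {fathoms_to_meters(1.3):.2f} m")
print(f"A WBC of 20,000 is {wbc_ratio(20000):.1f}x the quoted average")
```

The raw values did not change; only the added context (a conversion ratio, a reference average) turned them into something you can react to.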

 

Now, what makes any of these Information vs. Knowledge?

 

Here is my answer to that:

 

Knowledge is the ability to make information actionable. Another way to put it: Knowledge is a collection of Information along with a process (cognition) that allows one to make decisions, answer questions or solve problems. The key is that Knowledge is tied to information processing. Accordingly:
 

"3 x 3 = 9" is information

while the ability to do multiplication is knowledge.

Now the question is: can I externalize or outsource that ability?

The answer is clearly yes. That is what you do on a daily basis when using financial advisors, physicians or lawyers. The same can be done at the system level by applying knowledge models for specific use cases.

Think of a calculator and its embedded Knowledge Model …

 

In most cases when designing a model, knowledge is not readily available in a consumable format. As such a Knowledge Acquisition (KA) process has to be performed.

This should be considered the most important step in a Knowledge Modeling project. Models that are not properly constructed and generalized will either lack reusability, due to hidden underlying assumptions, or lack accuracy, due to excessive bias toward certain cases (a phenomenon referred to as over-fitting).

 

Knowledge Acquisition is performed in two ways:

 

  • Direct method:

    By engaging subject matter experts, a practice that is also referred to as elicitation. In this case a knowledge engineer works with the expert(s) to capture and properly structure the knowledge. This happens through interviews, questionnaires, role play, observation of tasks being performed, or teach-back.

    The direct Knowledge Acquisition approach has two major challenges:

    - Experts don't always know all that they know and use
    - Tacit knowledge is typically very hard to articulate and describe

 

  • Indirect method:

    This method relies on data analysis, case studies, simulation and the resulting inferences, and machine learning techniques.

 

 

Sometimes a combination of both is also used to cover corner cases.
Either way, generalization should happen before the model is codified.

 

 

Knowledge Models can be categorized into the following seven groups:

 

  • Diagnostic models
    Use case: I have these symptoms. What is the problem? 
     
  • Explorative models
    Use case: Ok, I know the problem. What are my options?
     
  • Selective models
    Use case: Now I know my options. Which one is the best?
     
  • Analytic models
    Use case: How suitable is this option for my objective?
     
  • Instructive models
    Use case: How can I achieve that?
     
  • Constructive models
    Use case: I need a <…> with these specifications <…>.
     
  • Hybrid models
    Use case: Diagnostic -> Explorative -> Selective -> Analytic -> Constructive

 

For more details see my Introduction to Knowledge Modeling paper.