Tuesday, February 15, 2011

The state of artificial intelligence: Where Watson falls down on Jeopardy

The Jeopardy games this week pitting top Jeopardy competitors against IBM's Watson supercomputer are fascinating. They offer glimpses into how well the field of artificial intelligence is developing. Here is my analysis, after day one.

Watson combines many technologies. Firstly, it has to process natural language. It has to do this twice: once when 'reading' vast volumes of information (at this stage it can take its time), and again when it has to process a Jeopardy 'answer' in order to respond with the correct question. The subtle nuances of language, such as puns, ambiguities and shades of meaning, pose huge challenges both for Watson and for the human competitors. I think the engineers building Watson have done a terrific job in this area; Watson didn't seem to stumble much in this task.

Secondly, Watson has to take the 'surface meaning' it obtains from processing language and build sophisticated knowledge representations in its memory. Watson excels at extracting and representing straightforward facts. But it seems to have trouble dealing with what I call 'meta-knowledge' (or knowledge about knowledge) and with the deep meaning of the plots of stories. In one question it showed that it did not understand who the antagonist was in certain conflicts in the Harry Potter series, failing to identify Voldemort. I think this represents the state of the art in Artificial Intelligence (AI): We will not have a truly sentient machine until all the subtle interactions in both real-world and fictional event sequences can be deeply modelled by the machine, with correct analysis of such elements as motivation, causality and subterfuge.

Thirdly, Watson applies many different algorithms to reason with various hypotheses about possible correct responses. Again, the engineers have done a superb job here.
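To make the idea of reasoning over several hypotheses a little more concrete, here is a toy sketch in Python. It is not Watson's actual design: the function names, the weights, the threshold and the scores are all made up for illustration. It assumes each candidate response has already been given a score between 0 and 1 by several independent algorithms; the scores are combined by a weighted average, and the program only 'buzzes in' if its best candidate is confident enough.

```python
from typing import Dict, List, Optional, Tuple


def rank_candidates(
    evidence: Dict[str, List[float]], weights: List[float]
) -> List[Tuple[str, float]]:
    """Combine per-algorithm scores into one confidence per candidate.

    `evidence` maps each candidate response to one score per algorithm
    (hypothetical values in [0, 1]); `weights` says how much each
    algorithm is trusted. Returns candidates sorted best-first.
    """
    total = sum(weights)
    ranked = [
        (candidate, sum(w * s for w, s in zip(weights, scores)) / total)
        for candidate, scores in evidence.items()
    ]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


def choose_response(
    evidence: Dict[str, List[float]],
    weights: List[float],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the best candidate, or None if confidence is below threshold."""
    ranked = rank_candidates(evidence, weights)
    if not ranked or ranked[0][1] < threshold:
        return None  # not confident enough to buzz in
    return ranked[0][0]


# Illustrative, invented scores from two hypothetical algorithms.
example_evidence = {
    "Voldemort": [0.9, 0.6],
    "Harry Potter": [0.4, 0.7],
}
example_weights = [2.0, 1.0]
best = choose_response(example_evidence, example_weights)
```

Raising the threshold makes the program more cautious: with the same invented scores and a threshold of 0.9, `choose_response` returns `None` and the program would stay silent.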

A lot has been made of Watson's accidental repetition of a wrong response, 'the nineteen-twenties', that a human competitor had just given. I think it is an unfortunate oversight that a speech-recognition feature was not added to Watson to deal with just this situation. However, I believe that improving Watson so that it can correctly listen to what other competitors are saying would not be very difficult. The error Watson made, I believe, relates to a combination of slightly faulty knowledge representation (issue two above) and reasoning algorithms (issue three above).
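As a rough illustration of the kind of check that could catch such a repeat, here is a small Python sketch. It assumes the rejected response is available to the program as text, and it only handles decades; the function names and the rule that a bare two-digit decade means the 1900s are my own assumptions, not anything from Watson. The point is that two responses have to be reduced to a common form before they can be compared, otherwise 'the 20's' and 'the 1920s' look like different answers.

```python
import re
from typing import Iterable, Optional

# Small hypothetical lookup tables; a real system would need far more.
_CENTURY_WORDS = {"seventeen": 17, "eighteen": 18, "nineteen": 19}
_DECADE_WORDS = {
    "twenties": 20, "thirties": 30, "forties": 40, "fifties": 50,
    "sixties": 60, "seventies": 70, "eighties": 80, "nineties": 90,
}
_DEFAULT_CENTURY = 19  # assumption: "the 20's" means the 1920s


def normalize_decade(text: str) -> Optional[int]:
    """Return the first year of the decade (e.g. 1920), or None."""
    t = text.lower().strip().replace("'", "").replace("\u2019", "")
    t = re.sub(r"^the\s+", "", t).rstrip("?.! ")
    # Numeric forms: "20s" or "1920s".
    match = re.fullmatch(r"(\d{2})?(\d0)s", t)
    if match:
        century = int(match.group(1)) if match.group(1) else _DEFAULT_CENTURY
        return century * 100 + int(match.group(2))
    # Spelled-out forms: "twenties" or "nineteen-twenties".
    words = t.replace("-", " ").split()
    if words and words[-1] in _DECADE_WORDS:
        century = _DEFAULT_CENTURY
        if len(words) >= 2 and words[-2] in _CENTURY_WORDS:
            century = _CENTURY_WORDS[words[-2]]
        return century * 100 + _DECADE_WORDS[words[-1]]
    return None


def canonical(text: str) -> str:
    """Reduce a response to a comparable form."""
    decade = normalize_decade(text)
    if decade is not None:
        return f"decade:{decade}"
    return " ".join(text.lower().split())


def is_repeat(candidate: str, rejected: Iterable[str]) -> bool:
    """True if the candidate means the same as an already-rejected response."""
    return canonical(candidate) in {canonical(r) for r in rejected}
```

With this in place, `is_repeat("the 1920s", ["the 20's"])` is true, so the program could fall back to its next-best hypothesis instead of repeating the mistake.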

Watson is not 'strong AI' in that it doesn't reason with full mathematical rigour. It uses a lot of probabilistic matching technology. Some people have criticised this, claiming that it therefore isn't AI. However, I disagree. Frankly, I think that we humans mostly use this 'scruffy' kind of AI in our reasoning too.

Where do I think AI is going? I think that we are finally seeing the dawn of truly useful AI. We are starting to be able to use it to do sophisticated analysis. I think sentient machines are still some decades away, but will appear in the lifetime of people alive today. That said, forecasters of the future of AI have consistently been wrong.

There have been many blog posts and articles about how Watson works and how it is doing. I recommend this post by John Rennie. A sample of the show is online at YouTube.

And by the way, a plug for the University of Ottawa, my employer: Alex Trebek, Jeopardy host, is an alumnus of our University, so these episodes have special significance for the Computer Scientists and Software Engineers here.


  1. Hi Tim - Nice summary! Just a point of correction: The Feb 9th NOVA show on Watson described how Watson is explicitly programmed not to repeat a human's wrong answer, motivated by it doing exactly that in one of the practice rounds (repeating the human's wrong answer "mosquito"). However, I think what happened in the Feb 14th show - if I remember right - was the human answered "the 20's" and Watson then answered "the 1920s", i.e., technically it was offering a "different" answer, although we all know it's the same thing. That in itself is a reflection on what it does and does not understand. Best wishes! Peter Clark (from the Ottawa days!)

  2. I have posted an update regarding the second day at