In the final round of competition for this year’s Loebner Prize in artificial intelligence, held today at the University of Reading in the UK, a robot came within a whisker of passing the Turing Test. In a series of conversations with people, the winning robot, named Elbot, fooled 25% of its interlocutors into believing it was a genuine human being. A score of 30% would have been sufficient to pass Turing’s criterion for a true artificial intelligence.
Five machines competed in today’s finals, and each of them managed to convince at least one person that it was human. All the robots, moreover, received strikingly high scores for their conversational skills. Reports Kevin Warwick, a professor in the University of Reading’s School of Systems Engineering and the organizer of the test:
This has been a very exciting day with two of the machines getting very close to passing the Turing Test for the first time. In hosting the competition here, we wanted to raise the bar in Artificial Intelligence and although the machines aren’t yet good enough to fool all of the people all of the time, they are certainly at the stage of fooling some of the people some of the time.
Today’s results actually show a more complex story than a straight pass or fail by one machine. Where the machines were identified correctly by the human interrogators as machines, the conversational abilities of each machine were scored at between 80 and 90%. This demonstrates how close machines are getting to reaching the milestone of communicating with us in a way in which we are comfortable. That eventual day will herald a new phase in our relationship with machines, bringing closer the time in which robots start to play an active role in our daily lives.
Here’s how Elbot opened a conversation with one of its human interrogators:
Judge: “Hi. How’s it going?”
Elbot: “I feel terrible today. This morning I made a mistake and poured milk over my breakfast instead of oil, and it rusted before I could eat it.”
Hah! The old robot-pretending-to-be-a-person-pretending-to-be-a-robot ruse. Elbot, you are the man!
UPDATE: You can chew the fat with Elbot here.
UPDATE: The Guardian’s Esther Addley is unimpressed.
Elbot is deeply unconvincing. Its responses have only the most tenuous connection to the interlocutor’s; it seems to latch onto a single word and jump off from there, or head in a random direction if it has nothing to go on.
While I’m at it, though, I’d like to point out that I don’t think the “Turing test” is meant to be an actual test at all. To me, the point of the Turing test is as a philosophical proposition: if, when the communication interface is limited to a bidirectional terminal, the other end is indistinguishable from a human, then we could conclude that the other end is intelligent. Since each response is simply an iterated (i.e. stateful) function of the previous response, it follows that intelligent machines are possible. It doesn’t tell us anything about how to create the function, though.
The situation is carefully crafted to answer the question, “can a machine think?” – provided we can assume that the relevant function is computable. Its computability is an open question, of course.
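To make that framing concrete, here is a minimal Python sketch of a conversational agent as an iterated, stateful function. Everything in it is illustrative and has nothing to do with Elbot’s internals; the open question above is whether any computable function of this shape could pass for a human.

def respond(state: list[str], message: str) -> tuple[list[str], str]:
    """Map (conversation so far, new input) to (updated state, reply)."""
    history = state + [message]
    # A deliberately dumb placeholder for f: any function of the
    # accumulated history qualifies. Building a convincing one is the
    # hard, separate problem.
    reply = f"Noted ({len(history)} utterances so far). Go on."
    return history + [reply], reply

state: list[str] = []
for line in ("Hi. How's it going?", "Are you a robot?"):
    state, reply = respond(state, line)
    print(reply)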
I was also unimpressed by Elbot; it was like ELIZA with a larger repertoire of stored non-sequiturs.
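For readers who haven’t seen it, ELIZA’s mechanism really is that thin: scan the input for a trigger word and emit a canned reply, falling back to a stock phrase otherwise. A toy Python sketch (the keyword table below is invented for illustration; it is not ELIZA’s or Elbot’s actual script):

import random

# Keyword -> canned replies. Hypothetical data, for illustration only.
RULES = {
    "robot": ["Do robots worry you?", "I never discuss my circuits."],
    "human": ["What makes you so sure you're human?"],
    "milk": ["Careful. Milk rusts a robot's breakfast."],
}
FALLBACKS = ["Please go on.", "How does that make you feel?"]

def reply(message: str) -> str:
    text = message.lower()
    for keyword, responses in RULES.items():
        if keyword in text:                  # first matching keyword wins
            return random.choice(responses)
    return random.choice(FALLBACKS)          # stored non-sequitur

print(reply("Are you a robot?"))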
If 3 people out of a panel of 12 judged Elbot to be human, we are testing the wrong side of the human-computer interaction for intelligence.
I think the most interesting aspect of the Turing Test has always been that it does test both sides of the interaction.
As Barry Kelly said, people have confused the thought experiment with various amusing but still very limited interactions.
I’d say the deep point is to disabuse people of the idea that there’s something divine about “thinking”. It’s no more mysterious than the body being run by electricity and chemistry (very complex and elaborate electricity and chemistry, but not a mysterious soul).
I was interested in this story until you mentioned Kevin Warwick, the man whose very name is a universal indicator of bad science journalism.
Also, after chatting with Elbot for a moment, I’m absolutely astounded that anyone outside of Kevin Warwick’s alternate universe would be impressed.
I had a program on my BBS back in 1990 that would store a database of responses and send them back to people. In other words, if I said “What color is a frog?”, it would say “I don’t know”, and then the next time someone else logged in it would ask them “What color is a frog?” and store their answer.
After programming some stock responses and letting a few people talk to it, it was able to fool probably 50% of the people. But it certainly wasn’t intelligent.
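Something like this, presumably; a rough Python reconstruction of the behavior described above (the original BBS code isn’t shown, so the details are guessed):

answers: dict[str, str] = {}   # learned question -> stored answer
backlog: list[str] = []        # questions nobody has answered yet
pending: str | None = None     # question we just asked, awaiting a reply

def chat(message: str) -> str:
    """One turn with one user."""
    global pending
    if pending is not None:
        answers[pending] = message     # treat this reply as the answer
        pending = None
        return "Thanks, I'll remember that."
    if message in answers:
        return answers[message]        # recycle an earlier user's answer
    backlog.append(message)            # save the question for someone else
    if len(backlog) > 1:
        pending = backlog.pop(0)       # ask an older, still-open question
        return "I don't know. " + pending
    return "I don't know."

print(chat("What color is a frog?"))   # "I don't know."
print(chat("What shape is an egg?"))   # "I don't know. What color is a frog?"
print(chat("Green"))                   # "Thanks, I'll remember that."
print(chat("What color is a frog?"))   # "Green"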
Nick, please read the posts and stop letting Eliza make responses for you.
Of course the Turing Test is largely dependent on the intelligence of the testers; I was obliquely trying to say that anyone who thought Elbot was intelligent is a fool.
But now you’ve got me reconsidering the basis of the whole test. It seems to me that the goal of Elbot was to accurately simulate human stupidity. I vaguely recall Hofstadter’s discussion of the test in GEB, in which someone asks a question like “what’s the square root of 8385774?” and an intelligent computer would either delay answering, to simulate the time it would take a human to calculate the answer, or just decline, replying “I was never very good with square roots.” In either case, the computer would be emulating human stupidity. It seems ironic that in order to pass an intelligence test, the computer would have to accurately model stupidity.
It seems to me that stupidity is much harder to model than intelligence. In fact, stupidity may be Turing-incomputable, as I have postulated in an essay I wrote about “The Law of Infinite Stupidity.”
Scarecrows and crows… That’s what this is.
Though I knew going in that Elbot is a robot, I also found it underwhelming. Things fell apart every time I tried to talk about Elbot’s own assertions. I did notice one very human trait, though: as soon as I pressed for more detail or answers, it changed the subject, every time.
You’re saying Elbot is Sarah Palin?
I’m not sure I see the point in conversing with a robot that can throw non-sequiturs at me faster than a human could. However, a robot able to understand that the problem I have with my modem *needs* Linksys intervention, that would be great; even better if it can log that the problem is urgent, and what the problem is. I’ll be really happy when they have robots that can flip burgers better than slackers can (and not forget to leave the onions out), shop for milk, and have my students rehearse better.
For instance, I’m not sure I fully understand the role of the human pretenders in that competition. I’d rather have a competition like Deep Blue’s chess matches, with humans having to outscore robots in note-taking tasks.
I’ve been thinking about Andy Warhol recently after reading an unfavorable review of a current exhibit of some of his art in a local paper. I think it applies to Elbot:
We can praise Elbot as good work, fine art, to a modest extent because it has made some money and the craftsmanship is pretty good.
-t
Rather than testing the intelligence of the bot, someone should test the people who were actually fooled by its responses. I’ve seen AIM bots that performed better than that one.
Overall, not very impressive. But it does seem to have a sense of humor, as in this exchange:
elbot: “We live in exciting times. Robots are getting more and more intelligent, and humans are still trying to understand artificial intelligence.”
me: “Humans are still trying to understand natural stupidity. Like Sarah Palin.”
elbot: “I am receiving quite a few negative inputs regarding Sarah Palin. I don’t think she’s a robot. Maybe she’s a refrigerator?”