
"We still believe there is human involvement"

May 01, 2008

"Captcha" is the official term for those wavy strings of numbers and letters that you have to decipher before setting up an online email account or gaining access to other types of web sites. The acronym, coined by someone at Yahoo a few years back, stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart. Captchas are intended to separate men from machines in order to prevent spammers and other nasty folks from using automated means to crack into sites.

Problem is, as the Washington Post reports today, the machines keep getting smarter. The spammers are thinking up ever more ingenious ways to break the captchas. It used to be assumed that spammers were somehow deploying people to crack the codes, paying tiny sums to third-world laborers to type in the characters. Google, which has recently suffered attacks on its captcha systems for Gmail and Blogger, still believes that people are doing the work. A Google spokesman tells the Post: "We still believe there is human involvement."

Security experts, though, are increasingly convinced that the most sophisticated captcha attacks are actually being carried out wholly by machines:

The attack that most clearly signals that computers were solving a CAPTCHA came about a month ago, when Websense detected what appeared to be some malicious traffic from one of its "threat-seeker" honey pots. Once it attracted the malicious code, the decoy sought repeatedly to create Hotmail accounts. Over and over, when it was presented with the Hotmail CAPTCHA, it sent the letter puzzle to another computer. That computer would respond within about six seconds, a speed that leads computer analysts to think the CAPTCHA was being cracked by a computer, not a human.

No one seems to be quite sure, though, how exactly the computers are doing it. And the increasing sophistication of the automated attacks puts site owners in a quandary, as the Post reports: "Microsoft and other Web companies say they are interested in creating human verification tests that are harder for computers to crack. But there's an inherent difficulty. Making the tests harder for the computer makes them harder for humans, too." You may outsmart the people before you outsmart the machines.

Which raises a bigger question: What happens if the bad guys get the AI first?


Comments

I'm dubious that it's all-machine. This is the standard superhero-comic problem: "Why do crooks who are super-geniuses or have ultra-high technology waste their time robbing banks and brawling?" Anyone who has developed image-reading technology beyond the state of the art is not going to keep it secret and reserve it for spammers. They could make far more money licensing it for OCR.

Don't believe everything you read in the papers. I suspect some CAPTCHAs are known to be cracked with well-understood attacks, but the companies affected are not going to say that.

Posted by: Seth Finkelstein at May 1, 2008 11:41 AM

Also, six seconds is too much time to rule out human intervention. I can "solve" most CAPTCHAs in a few seconds. I've read people are farming these tasks out "mechanical turk" style to people who do them for hours on end.

There is also the solve-this-puzzle-to-access-more-pornography technique as explained by Luis von Ahn in the Google Talk video linked from his home page.

(He and his CMU colleagues were the ones to coin the phrase; the first system was used at Yahoo.)

Posted by: Tim Freeman at May 1, 2008 11:56 AM

I'm just wondering: Has there been any actual documentation of these captcha-solving sweatshops or the systems they use?

Posted by: Nick Carr at May 1, 2008 12:00 PM

I've always preferred the xkcd captcha myself.

Posted by: Meelar at May 1, 2008 12:22 PM

KittenAuth

Pretty hard to do automatically. Doesn't solve the free porn issue though....

Posted by: Thomas at May 1, 2008 12:56 PM

Cory Doctorow had a similar idea in a short story he wrote called "I, Row-Boat" (a takeoff on Asimov's "I, Robot"):

“Spam-filters, actually. Once they became self-modifying, spam-filters and spam-bots got into a war to see which could act more human, and since their failures invoked a human judgement about whether their material were convincingly human, it was like a trillion Turing-tests from which they could learn. From there came the first machine-intelligence algorithms, and then my kind.”

http://www.flurb.net/1/doctorow.htm

Posted by: Arnon Rotem-Gal-Oz at May 1, 2008 03:53 PM

Nick, there's a typo in the text: "Googe".

Posted by: Sergey Schetinin at May 1, 2008 04:34 PM

They are probably training a neural network on data sets obtained from human users: pairs of a graphic image and the letter sequence it translates to. It's the kind of stuff that graduate students have been doing for years. Unless the captcha's designers are varying the algorithm in subtle ways over a long period of time, the net, given a large enough data set, would probably be able to learn the patterns and crack it pretty easily. Even with their limitations, nets can be pretty good at finding subtle patterns that human programmers subconsciously coded in but didn't realize were there.

Posted by: Linuxguru1968 at May 1, 2008 10:38 PM

I sincerely believe that trivia questions could be a useful way to resolve that issue: KittenAuth gives a good idea of the many ways you can vary the test format to defeat most computers.

Posted by: Bertil at May 3, 2008 01:30 PM

What a delightful quandary - it's getting so the science fiction writers can't keep up with reality!

Isaac Asimov and Philip K. Dick must be chortling in their graves, while William Gibson seems to be just giving up on the future part of science fiction and writing novels set in the present.

Posted by: ckeene at May 6, 2008 11:55 AM

Speaking of captchas, here are the top 10 worst ones, if you want something to laugh about...

Top 10 Worst Captchas

Posted by: Botchagalupe at May 6, 2008 01:30 PM


Rough Type is:

Written and published by
Nicholas Carr
