TL;DR Nope, I didn't find a major breach, just an interesting detail in reCAPTCHA's design.
CAPTCHA ain't no good for CSRF
I was told on Twitter that CAPTCHA mitigates CSRF (and, wow, it is "officially" in the OWASP Challenge-Response section). Here is my opinion: it does not, it will not, it should not. Feel free to wipe out that section from the document.
CAPTCHA in a nutshell
http://en.wikipedia.org/wiki/CAPTCHA
"challenge" is literally a key to the solution (not necessarily random: the solution can be encrypted inside it, which keeps things stateless).
"response" is an attempt to recognize the given image.
Client is a website that hates automated actions, e.g. registrations.
Provider generates new challenges and verifies attempts.
User is a human being or a bot promoting viagra online no prescription las vegas. Nobody knows exactly which until User solves the CAPTCHA.
Ideal flow
1. User comes to Client.
2. Client securely gets a new challenge from Provider.
3. Client sends back to User the challenge value with corresponding image.
4. User submits the form along with challenge and response. Client verifies them by sending them over to Provider.
5. If Provider responds successfully, User is likely a pink human, otherwise:
  if amount_of_failed_attempts >= 3
    BAN FOR A WHILE
  else
    GOTO 2
  end
As you can see, User talks directly to Client and has no idea about Provider.
Only 1 challenge per form attempt is allowed. 3 fails in a row = BAN. It does not matter how hard the challenges are, User must try to solve the ones he is given.
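Here is a minimal Python sketch of the ideal flow, a toy model and not anyone's real implementation. The Provider and Client classes, the in-memory dictionaries and the MAX_FAILED_ATTEMPTS name are assumptions made for illustration. The only things taken from the flow above are: Client fetches every challenge itself, one challenge per form attempt, and a ban after 3 fails in a row.

```python
import secrets

MAX_FAILED_ATTEMPTS = 3  # "3 fails in a row = BAN"


class Provider:
    """Toy Provider: generates challenges and verifies attempts."""

    def __init__(self):
        self._solutions = {}  # challenge -> expected response

    def new_challenge(self):
        challenge = secrets.token_hex(8)
        self._solutions[challenge] = secrets.token_hex(3)  # stand-in for the image text
        return challenge

    def verify(self, challenge, response):
        # pop() makes every challenge single-use.
        expected = self._solutions.pop(challenge, None)
        return expected is not None and response == expected


class Client:
    """Toy Client: the only party User ever talks to."""

    def __init__(self, provider):
        self.provider = provider
        self.failed = {}   # user -> failed attempts in a row
        self.pending = {}  # user -> the single challenge issued for this form

    def banned(self, user):
        return self.failed.get(user, 0) >= MAX_FAILED_ATTEMPTS

    def render_form(self, user):
        # Steps 2-3: Client fetches the challenge itself, so it sees every one.
        if self.banned(user):
            return None
        self.pending[user] = self.provider.new_challenge()
        return self.pending[user]

    def submit(self, user, challenge, response):
        # Steps 4-5: verify, count failures, ban after too many.
        if self.banned(user):
            return "banned"
        issued = self.pending.pop(user, None)
        if (issued is not None and issued == challenge
                and self.provider.verify(challenge, response)):
            self.failed[user] = 0
            return "ok"
        self.failed[user] = self.failed.get(user, 0) + 1
        return "banned" if self.banned(user) else "retry"
```

The point of the sketch: since every challenge passes through render_form, Client can count exactly how many challenges a User has seen.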
reCAPTCHA is easy to install, free, secure and very popular.
The reCAPTCHA flow
1. Client obtains a public key and a private key at https://developers.google.com/recaptcha/intro
2. Client adds some JS containing its public key to the HTML form.
3. User opens the page; User's browser makes a request to Provider and gets a new challenge.
4. User solves it and sends the challenge with the response to Client.
5. Client verifies it by calling Provider's API (CSRF Tool template). After a failed attempt, User has to reload Provider's iframe to get a new challenge: GOTO 3.
The reCAPTCHA problem
Client knows how many wrong attempts you made (because verification is server-side) but doesn't know how many challenges you actually received (because User gets challenges via JS and Client isn't involved). Getting a challenge and verifying a challenge are loosely coupled events.
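The loose coupling can be shown with a small Python simulation. This is a toy model, not the real reCAPTCHA API: the Provider and Client classes, their method names, the counters and the "peeking" solver are all hypothetical. What it illustrates is that a script can pull any number of challenges while Client only ever sees the verifications.

```python
class Provider:
    """Toy reCAPTCHA-like Provider; not the real API."""

    def __init__(self):
        self.challenges_issued = 0  # only Provider can see this number
        self._solutions = {}

    def new_challenge(self, public_key):
        # Anyone holding the public key can call this; Client is not involved.
        self.challenges_issued += 1
        challenge = f"challenge-{self.challenges_issued}"
        self._solutions[challenge] = f"solution-{self.challenges_issued}"
        return challenge

    def verify(self, private_key, challenge, response):
        expected = self._solutions.pop(challenge, None)
        return expected is not None and expected == response


class Client:
    def __init__(self, provider):
        self.provider = provider
        self.verifications = 0    # everything Client can observe
        self.failed_attempts = 0

    def submit(self, challenge, response):
        self.verifications += 1
        ok = self.provider.verify("PRIVATE_KEY", challenge, response)
        if not ok:
            self.failed_attempts += 1
        return ok


def run_demo(challenges_to_fetch=1000):
    provider = Provider()
    client = Client(provider)
    # The script pulls challenges straight from Provider...
    fetched = [provider.new_challenge("PUBLIC_KEY") for _ in range(challenges_to_fetch)]
    # ...and pretends it could crack only the last one (peeking stands in for OCR).
    solved = fetched[-1]
    accepted = client.submit(solved, provider._solutions[solved])
    return accepted, provider.challenges_issued, client.verifications, client.failed_attempts


print(run_demo())  # (True, 1000, 1, 0)
```

From Client's side this looks like a User who solved the CAPTCHA on the first try, even though 1000 challenges were burned to get there.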
Let's assume I have a script which recognizes 1 of 1000 reCAPTCHA images. That's quite a shitty script, right?
Wait, I have another script which loads http://www.google.com/recaptcha/api/noscript?k=VICTIM_PUBLIC_KEY (demo link) and parses src="image?c=CHALLENGE_HERE"></center
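The parsing part could look roughly like this in Python. It is only a sketch: the extract_challenge helper and the sample HTML string are made up for illustration (built around the src="image?c=..." fragment quoted above), and nothing is fetched from the network.

```python
import re

# Matches the challenge value inside src="image?c=..."
CHALLENGE_RE = re.compile(r'src="image\?c=([^"]+)"')


def extract_challenge(html):
    """Return the challenge value found in the noscript page, or None."""
    match = CHALLENGE_RE.search(html)
    return match.group(1) if match else None


# Hypothetical sample of the markup, not a captured response.
sample = '<center><img alt="" src="image?c=CHALLENGE_HERE"></center>'
print(extract_challenge(sample))  # CHALLENGE_HERE
```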
For 100 000 images the script solves (more or less reliably) 100 of them and crafts valid requests to Client by putting the solved challenge/response pairs in them.
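The arithmetic behind those numbers, as a quick Python check. It assumes every image is an independent try with the same 1 in 1000 success rate, which is a simplification; the challenges_needed helper is just an illustration.

```python
import math

success_rate = 1 / 1000  # the script recognizes 1 of 1000 images
images = 100_000         # challenges pulled straight from Provider

# Expected number of solved challenges: n * p
expected_solved = images * success_rate


def challenges_needed(success_rate, confidence):
    """Challenges to fetch so that at least one is solved with the given probability."""
    # P(at least one solve in n tries) = 1 - (1 - p)^n >= confidence
    return math.ceil(math.log(1 - confidence) / math.log(1 - success_rate))


print(expected_solved)
print(challenges_needed(success_rate, 0.99))
```

So even a script this bad needs only a few thousand free challenges to be almost certain of getting one valid challenge/response pair.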
Analogy: User = Student, Client = Exam, Provider = Table with questions.
To pass, Student has to solve at least 1 problem, and he has only 3 attempts. In the reCAPTCHA world, Student goes to the table and looks over all the questions on it, trying to find the easiest one.
You don't need to solve reCAPTCHAs as soon as you receive them anymore. You don't need to hit Client at all to get challenges. You talk directly to Provider, get some reCAPTCHAs with Client's PUBLIC_KEY, solve the easiest and have fun with different vectors.
There are blackhat APIs like antigate_com which are quite good at solving CAPTCHAs (private scripts and Chinese kids, I guess).
With such a trick they can create a special API for reCAPTCHA: you send the victim's PUBLIC_KEY and get back N solved CAPTCHAs which you can use in malicious requests.
Mitigation
I cannot say whether it should be fixed, but website owners must be aware that challenges are out of their control. To fix this, reCAPTCHA could return the number of challenges issued and the number of failed attempts along with the verification response.
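A sketch of how Client could use such a response, in Python. Everything here is hypothetical: reCAPTCHA does not return these fields, the VerifyResult and should_accept names and the thresholds are made up, and how Provider would attribute challenges to a requester is left open.

```python
from dataclasses import dataclass


@dataclass
class VerifyResult:
    valid: bool             # what Client already gets: solved or not
    challenges_issued: int  # hypothetical: challenges this requester pulled
    failed_attempts: int    # hypothetical: wrong answers from this requester


def should_accept(result, max_challenges=3, max_failed=3):
    """Accept only a correct answer that did not come from challenge shopping."""
    if not result.valid:
        return False
    return (result.challenges_issued <= max_challenges
            and result.failed_attempts < max_failed)


print(should_accept(VerifyResult(True, 1, 0)))     # True
print(should_accept(VerifyResult(True, 1000, 0)))  # False
```

With those two numbers Client gets back the control it has in the ideal flow: a correct answer picked out of 1000 challenges no longer looks like a first-try success.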
Questions? I realized this an hour ago, so I may be mistaken somewhere, or I may not be the first to discover it. Point it out please.