Think like a developer.
Say you need to implement /login?next_url=/messages functionality. Some action must verify that the next URL is either relative, or absolute but located on the same domain.
What will you do? Let’s assume you start with the easiest option: quick regexps and first-character checks.
1. URL starts with /
2. URL starts with / but can’t start with //
3. At this point you realize your efforts were lame and unprofessional. You decide to use a URL parsing library that follows all the RFCs and such: \ is not an allowed character in a URL, so every library will reject it. Much RFC, very standard.
uri = URI.parse params[:next]
redirect params[:next] unless uri.host or uri.scheme
Absence of host and scheme clearly says it is a relative URL, doesn’t it?
Bypass for Ruby, Python, Node.js, PHP, Perl: ///host.com
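The same check can be sketched in Python to see why ///host.com slips through. This is a minimal illustration, not the post's original code: `is_relative_naive` is a hypothetical helper that mirrors the Ruby condition above using the standard library's `urllib.parse.urlparse`.

```python
from urllib.parse import urlparse

def is_relative_naive(next_url: str) -> bool:
    """Hypothetical helper: treat a URL as relative when the parser
    reports neither a scheme nor a host (same test as the Ruby snippet)."""
    parts = urlparse(next_url)
    return not parts.scheme and not parts.netloc

# //host.com is recognised as pointing at another host...
print(is_relative_naive("//host.com"))    # False
# ...but ///host.com is parsed as a plain path and accepted.
print(is_relative_naive("///host.com"))   # True
print(is_relative_naive("/messages"))     # True
```

The check itself is not buggy by the parser's rules; the problem is that the parser and the browser disagree about what three slashes mean.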
One slash is for a path, two slashes are for a host, so what are three slashes for?
>A scheme-relative URL is "//", optionally followed by userinfo and "@", followed by a host, optionally followed by ":" and a port, optionally followed by either an absolute-path-relative URL or a "?" and a query.
>A path-relative URL is zero or more path segments separated from each other by a "/", optionally followed by a "?" and a query.
>A path segment is zero or more URL units, excluding "/" and "?".
Given the base location https://y.com, where will the following URLs redirect?
/host.com is a path-relative URL and will obviously load https://y.com/host.com
//host.com is a scheme-relative URL and will use the same scheme, https, hence load https://host.com
The question is: where will ///host.com (also ////host.com, etc.) redirect?
Without question, it is a path-relative URL too. The third character is /, so it can’t be a scheme-relative URL (which is only //, followed by a host, which doesn’t contain slashes).
It has 2 path segments which are empty strings, joined with /, and is supposed to load https://y.com///host.com
The thing is, both Chrome and Firefox parse it as a scheme-relative URL and load https://host.com. Safari parses it as a path. Opera loads http://///x.com (?!).
where #secret can be access_token or auth code
Use a Library, Luke
Functionality like /login?to=/notifications is very common and can be found on almost any website. Now the question is: how do Good Programmers validate it?
As proved in the beginning of the post, best practice would be to use a URL parser.
Let’s see how major platforms deal with ?next=///host.com
Perl (parses as a path)
Python (parses as a path)
ParseResult(scheme='', netloc='ya.ru', path='', params='', query='', fragment='')
ParseResult(scheme='', netloc='', path='/ya.ru', params='', query='', fragment='')
ParseResult(scheme='', netloc='', path='////ya.ru', params='', query='', fragment='')
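The input lines for the output above did not survive, so the following is a hedged reconstruction for Python 3 rather than the original session. The three ya.ru inputs are assumptions chosen to match the printed results, and `urllib.parse.urlparse` stands in for the Python 2 `urlparse` module.

```python
from urllib.parse import urlparse

# Assumed inputs: two, three, and six leading slashes.
candidates = ["//ya.ru", "///ya.ru", "//////ya.ru"]

for candidate in candidates:
    # Only the first two slashes are consumed as the authority marker;
    # everything after an empty authority is kept as the path.
    print(repr(candidate), "->", urlparse(candidate))
```

Only the two-slash form yields a non-empty `netloc`; with three or more slashes the authority is empty and the remainder is reported as a path.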
Ruby (parses as a path)
1.9.3-p194 :004 > URI.parse('///google.com').path
1.9.3-p194 :005 > URI.parse('///google.com').host
Node.js (parses as a path)
PHP (mad behavior, quite expected)
This doesn’t work (but should). You might be happy, but wait: while all the other languages refuse to parse /\host.com because it is not valid, PHP gladly parses it as a path.
Thus PHP is vulnerable too.
Basically, with /\host.com and ///host.com we can get an open redirect on almost any website. Yeah. Whether you have a “home made” parser or a reliable server-side library, it is most likely vulnerable.
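To make the “home made” case concrete, here is a minimal Python sketch of the two prefix checks from the start of the post. `passes_prefix_checks` is a hypothetical name; the point is only that /\host.com satisfies both rules, while the post reports that it still ends up as a redirect to host.com.

```python
def passes_prefix_checks(next_url: str) -> bool:
    """Hypothetical home-made validator: the URL must start with "/"
    (rule 1) but must not start with "//" (rule 2)."""
    return next_url.startswith("/") and not next_url.startswith("//")

print(passes_prefix_checks("/messages"))     # True, the intended case
print(passes_prefix_checks("//host.com"))    # False, rule 2 catches it
print(passes_prefix_checks("/\\host.com"))   # True, the backslash variant gets through
```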
The only good protection is to respond with the full path, host included:
Location: http://myhost/ + params[:next]
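A rough Python equivalent of that header, assuming a fixed origin of http://myhost as in the line above. `build_location` is a hypothetical helper, not part of any framework. Because the user-supplied value is appended after the host and a slash, ///host.com can only ever become part of the path.

```python
from urllib.parse import urlparse

ORIGIN = "http://myhost"  # assumption: the site's own scheme and host

def build_location(next_url: str) -> str:
    """Hypothetical helper: prefix the user-supplied value with our own
    origin plus "/", so the host in the Location header is always ours."""
    return ORIGIN + "/" + next_url

location = build_location("///host.com")
print(location)                     # http://myhost////host.com
print(urlparse(location).netloc)    # myhost
```

A normal value such as /messages comes out with a doubled slash (http://myhost//messages), exactly as the concatenation in the line above would produce.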
Besides phishing, redirects can exploit many Single Sign On and OAuth solutions: a 302 redirect leaks the #access_token fragment, and can even lead to total account takeover on websites with Facebook Connect (details soon).
Or you could, y'know, normalize the path if it is a local path.
Seems silly to just check if it is a local path without actually making sure the path is remotely sane. `require('path').normalize('///x.com')` in node.js gives me `/x.com`, which is probably what we want, no?
Node just told you /x.com is a path. This is a bug because this is *not* a path, and if you respond with Location: ///host.com it will load host.com
Link to the issues in both browsers' bug trackers?
yes, here they are
PHP documentation for parse_url states that the function is not meant to validate urls.
The core filter extension, however, seems to behave correctly:
var_dump( filter_var ("///host.com", FILTER_VALIDATE_URL) ); // bool(false)
var_dump( filter_var ("/\host.com", FILTER_VALIDATE_URL) ); // bool(false)
never saw filter_var used for this purpose. Overall, PHP users don't validate anything.
If what you respond with in a Location: header is anything but a full absolute URI (protocol and all), you're doing it wrong. (see RFC 2616)
If your framework doesn't allow you to know inside the implementation of /login?next_url=/messages what your full URI is, your framework is doing it wrong.
Yes, browsers accept all sorts of things in Location: headers. That doesn't mean it isn't sloppy (and, as you show, vulnerable) practice to use anything but an absolute URI.
in a perfect world everybody follows the RFC, but not here)
Fair points. Also, internal/relative URLs can turn out to be external URLs. One specific example: take \%email@example.com. If the filters (most likely ones like Facebook's linkshim) are trying to detect // or \/, this should bypass them. Worked for me in multiple cases.
Some libraries aren't that secure. I wrote an article about how the Python module UrlParse fails to reconstruct URLs, leading to an open redirect.