Last month, I went hunting for security bugs in GitHub, a popular platform for sharing and collaborating on code. After spending many hours mapping out GitHub’s infrastructure and testing for weaknesses without any significant results or leads, I shifted my focus to its service providers. This is a write-up about two of the issues I found, both of which have since been addressed.
Trawling Amazon S3 buckets
There aren’t many organisations that don’t use Amazon S3 for object storage in some way, and there aren’t many organisations that have correctly configured all of their S3 buckets either. That’s because it’s far too easy to screw up a bucket’s access control list (ACL), granting anonymous or any authenticated AWS users read and/or write access. Assuming GitHub would be no exception, I started compiling a list of their S3 buckets, based on information from public sources. (Un)fortunately, this resulted in only a few buckets containing static assets that were configured properly anyway.
But then I realised many S3 buckets have very predictable names. They almost always consist of two or three parts: an organisation name, a name describing the bucket’s contents, and an optional suffix (usually the environment, e.g. “production”), separated by periods, hyphens, or underscores. I wrote a very straightforward script to enumerate the bucket names using a word list and a suffix list. Conveniently, you don’t need to interact with any of the AWS APIs to check whether a bucket exists. You can simply query a DNS server, because the DNS response for an existing bucket is distinguishable from the response for a non-existent one.
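The naming pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the script I used: the function names, the choice to reuse the same separator between all parts, and the caller-supplied `exists` check are all assumptions. The DNS lookup itself is deliberately left out, since what distinguishes the responses is not spelled out here.

```python
from itertools import product

# Separators mentioned above: periods, hyphens, or underscores.
SEPARATORS = (".", "-", "_")


def candidate_bucket_names(org, words, suffixes):
    """Yield 'org<sep>word' and 'org<sep>word<sep>suffix' candidates.

    Assumption: one separator is used consistently within a name.
    """
    seen = set()
    for word, sep in product(words, SEPARATORS):
        base = f"{org}{sep}{word}"
        names = [base] + [f"{base}{sep}{suffix}" for suffix in suffixes]
        for name in names:
            # S3 bucket names are between 3 and 63 characters long.
            if 3 <= len(name) <= 63 and name not in seen:
                seen.add(name)
                yield name


def find_existing(candidates, exists):
    """Keep the candidates for which the supplied existence check passes.

    `exists` is a hypothetical callable (for example, one wrapping a DNS
    query) that takes a bucket name and returns True or False.
    """
    return [name for name in candidates if exists(name)]
```

With `org="acme"`, a word list of `["assets"]`, and a suffix list of `["production"]`, this yields six candidates, from `acme.assets` through `acme_assets_production`.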
With a list of about a dozen names of buckets possibly belonging to GitHub, I ran another script to determine which buckets were accessible by anonymous or authenticated AWS users. I found one S3 bucket that wasn’t configured properly, which allowed anyone to list the bucket’s contents and download objects. I later learned that among the objects in this bucket were internal GitHub infrastructure graphs, which is the kind of information an attacker would be looking for.
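For the anonymous case, a check along these lines is enough to tell a listable bucket from a locked-down one. It is a rough sketch rather than the script I ran: the function names and labels are made up for illustration, and it relies on the assumption that the bucket is reachable through the path-style URL `https://s3.amazonaws.com/<bucket>/`. A publicly listable bucket answers an unsigned GET with a `ListBucketResult` XML document, while a private one returns 403. Testing access for "any authenticated AWS user" needs signed requests and is not covered here.

```python
import urllib.error
import urllib.request


def classify_listing_response(status, body):
    """Map an unsigned GET on the bucket root to a coarse label."""
    if status == 200 and "<ListBucketResult" in body:
        return "listable"  # anyone can enumerate the objects
    if status == 403:
        return "denied"    # bucket exists, anonymous listing is refused
    if status == 404:
        return "missing"   # no such bucket
    return "unknown"       # redirects, other errors, unexpected bodies


def check_anonymous_listing(bucket, timeout=10.0):
    """Send one unsigned request and classify the response.

    Assumption: path-style addressing is used so that bucket names
    containing periods do not trip over the TLS wildcard certificate.
    """
    url = f"https://s3.amazonaws.com/{bucket}/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read(4096).decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        status = err.code
        body = err.read(4096).decode("utf-8", "replace")
    return classify_listing_response(status, body)
```

Only run this against buckets you are authorised to test.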
GitHub addressed the issue by updating the bucket’s configuration. A useful auditing tool for AWS services is Scout2 by iSEC, which is capable of highlighting misconfigured S3 buckets.
Taking over domain names
The other, far more interesting, vulnerability I found was in the systems of one of GitHub’s CDN providers. It would have allowed an attacker to take over arbitrary subdomains of both github.io and githubusercontent.com, including subdomains that were already in use, e.g. raw.githubusercontent.com and existing GitHub Pages domains.
This issue was caused by a bug in the procedure that checks whether a particular domain name can be claimed and assigned to an account with this CDN provider. The procedure uses Mozilla’s Public Suffix List, a curated list of all levels (e.g. co.uk) at which a domain name may be registered for a particular top-level domain, to check whether a domain name can be registered. The problem was that if someone assigned the domain name foo.example.org to their account, and the parent suffix (i.e. example.org) was on that list, the procedure did not verify whether example.org was already in use by someone else. Since both github.io and githubusercontent.com are on that list, anyone could claim any of their subdomains.
The GitHub domain entries on the Public Suffix List:
// Submitted by Ben Toews <firstname.lastname@example.org> 2014-02-06
github.io
githubusercontent.com
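To make the flaw concrete, here is a toy model of such a claim check in Python. It is purely illustrative: I have no knowledge of the provider's actual code, and the function names, the `claims` mapping, and the suffix sets are all invented. The only thing it models is the shortcut described above, where a name directly under a public suffix is treated as freshly registrable and the in-use check is skipped.

```python
def parent_of(domain):
    """Return the domain with its leftmost label removed."""
    _, _, parent = domain.partition(".")
    return parent


def ancestors_and_self(domain):
    """'a.b.c' -> ['a.b.c', 'b.c', 'c']"""
    labels = domain.split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def can_claim(domain, account, claims, public_suffixes):
    """Toy claim check reproducing the flawed shortcut.

    `claims` maps already-assigned domain names to account names.
    """
    if parent_of(domain) in public_suffixes:
        # Flaw: treated like a newly registrable name, so nobody
        # checks whether it (or its parent) is already in use.
        return True
    # Otherwise the name and all of its ancestors must be unclaimed
    # or already belong to the requesting account.
    return all(
        claims.get(name, account) == account
        for name in ancestors_and_self(domain)
    )


# Hypothetical data: "victim" already uses example.org and a subdomain.
claims = {"example.org": "victim", "raw.example.org": "victim"}

# With example.org listed as a public suffix, the takeover is allowed.
vulnerable = can_claim("raw.example.org", "attacker", claims, {"org", "example.org"})

# With that entry removed from the list, the same claim is refused.
patched = can_claim("raw.example.org", "attacker", claims, {"org"})
```

In this model `vulnerable` is `True` and `patched` is `False`. Dropping the entry from the suffix set is enough to close the hole for that domain.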
Once an attacker had assigned any of the aforementioned GitHub domain names to their account, they could point these domain names to literally any back-end, and thus fully control the domain. What made this bug particularly interesting was that an attacker could even use GitHub’s TLS certificate for services on their account. That is great from an exploitability perspective, because an attacker would be able to transparently man-in-the-middle any (existing) subdomain of both github.io and githubusercontent.com and, for example, serve malicious content for (specific) requests.
Some GitHub services that were vulnerable:
- github.io: Serves GitHub Pages.
- raw.githubusercontent.com: Serves raw file content from GitHub repositories.
- gist.githubusercontent.com: Serves raw file content from GitHub gists.
GitHub’s CDN provider addressed the issue by removing the GitHub domain names from their version of the PSL. They also reviewed all other entries on the list to make sure there was no further impact.
Both vulnerabilities were awarded a monetary bounty. I asked GitHub to donate the bounty for the first vulnerability to The Tor Project, because they do important work, and need more (independent) funding. I would like to thank GitHub for matching the donation.
GitHub also put up a hunter profile on their bug bounty website, along with descriptions of both vulnerabilities.