Welcome to the fourth OnionScan Report. The aim of these reports is to provide an accurate and up-to-date analysis of how anonymity networks are being used in the real world.
In this report we will analyze how https is being used (and misused) by onion services.
HTTPS is still very rare on the Dark Web. Only a handful of sites have an active https endpoint, and only a tiny fraction of those present valid certificates.
We show that certificates can be used as another Identity Correlation vector - and site operators should audit their configurations and certificates to ensure that they are not leaking private relationships.
A Note on Numbers
The lifespan of an onion service can range from minutes to years. When providing generalized numbers, especially percentages, we report approximate figures based on multiple scans over a period of time, rather than a single snapshot.
Trends in HTTPS Onions
Of the ~6000 sites from which we received a successful response, 276 (4%) were active on port 443.
It is worth noting that many of the certificates returned are not valid. The majority of sites present certificates for other, usually clearnet, domains. The rest are signed by unknown certificate authorities and/or are expired.
The only valid certificates were provided by the following organizations:
- AJAr Foundation
- Facebook (4 domains)
- ProPublica (4 domains)
- Privacy International
- FirstLook (SecureDrop instance)
All valid certificates that we found were signed by the DigiCert certificate authority.
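The failure modes described above (a certificate issued for some other domain, an unknown signing authority, an expired certificate) can be expressed as a small classification routine. The following Python sketch is illustrative only and is not how OnionScan performs the check. The function names, the problem labels and the pre-parsed inputs (`san_names`, `not_after`, `chain_trusted`) are assumptions made for the example, and the wildcard handling is deliberately simplified.

```python
from datetime import datetime, timezone


def name_matches(hostname, pattern):
    """Exact match, or a single left-most wildcard label (e.g. *.example.com)."""
    hostname, pattern = hostname.lower(), pattern.lower()
    if pattern.startswith("*."):
        # The wildcard stands in for exactly one label.
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return hostname == pattern


def classify_certificate(hostname, san_names, not_after, chain_trusted, now=None):
    """Return a list of reasons why a certificate is not valid for hostname.

    hostname      -- the onion address that was scanned
    san_names     -- DNS names the certificate claims to be valid for
    not_after     -- timezone-aware expiry time of the certificate
    chain_trusted -- True if the chain ends at an authority the scanner trusts
    """
    now = now or datetime.now(timezone.utc)
    problems = []
    if not any(name_matches(hostname, name) for name in san_names):
        problems.append("wrong-domain")
    if not chain_trusted:
        problems.append("unknown-authority")
    if not_after < now:
        problems.append("expired")
    return problems
```

An empty list means that none of the three problems was detected. A certificate can fall into several categories at once, which is why a list is returned rather than a single label.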
Fun Fact: The sites in the largest cluster in the image above (top right-hand corner) are associated through the Let's Encrypt certificate authority.
By using certificates and their chains we were able to discover a number of identity correlations which were not detected using other methods. These correlations are summarized below:
- Certificates are valid for one or more other onion domains - Anyone can sign a certificate for another domain (whether it is valid or not is another question!), so linking through this method carries some risk, but in practice it is a good indicator of a relationship.
- Certificates are signed by the same authority - This is a very weak association. However, due to the small number of sites and the variety of certificate authorities, if two sites are associated with the same niche authority then there may be a relationship.
- Shared self-signed artifacts, e.g. the certificates are signed by the same custom Certificate Authority or are associated with the same local IP addresses and/or hostnames - This is a fairly strong correlation.
There are also very strong indicators, e.g. certificates that share and/or are valid for the same clearnet domains. This is an obvious identity correlation, though where present it is unlikely to expose information that is not already publicly known.
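The correlations listed above all amount to finding certificate attributes that more than one site has in common. A minimal Python sketch of that idea follows. It is not OnionScan's implementation: the `correlate_by_certificate` function, the `names` and `issuer` fields, and the assumption that site keys are lowercase onion addresses are all hypothetical choices made for the example.

```python
from collections import defaultdict


def correlate_by_certificate(certs):
    """Group sites that share a certificate name or a signing authority.

    certs maps a lowercase onion address to a dict with two keys:
      "names"  -- DNS names the presented certificate is valid for
      "issuer" -- a string identifying the signing authority
    Returns {(kind, value): set_of_sites} for every attribute shared by
    at least two sites.
    """
    index = defaultdict(set)
    for site, cert in certs.items():
        for name in cert["names"]:
            key = ("name", name.lower())
            index[key].add(site)
            # If the certificate names another scanned site, link that site too.
            if name.lower() in certs:
                index[key].add(name.lower())
        index[("issuer", cert["issuer"])].add(site)
    # An attribute seen on only one site tells us nothing about relationships.
    return {key: sites for key, sites in index.items() if len(sites) > 1}
```

The result does not rank the matches. As noted above, a shared niche or custom authority and a shared domain name carry very different weight, so any real use would need to score the `"issuer"` and `"name"` matches separately.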
Besides the obvious case that most of the certificates presented are not valid for the given onion address, we noticed another interesting pattern.
Many onion web sites serve different content on port 80 (http) than they do on port 443 (https). From the examples we have seen, we believe this is due to co-hosting, where http is being forwarded correctly but https is not.
We found 66 instances of sites exposing different content via http and https. Many of these were minor variations (e.g. a default directory index listing which included the port number in the page). However, there were many examples of sites exposing completely different content, for example:
- A site exposing a blog over http and a CalDAV server over https.
- A site redirecting to two completely different clearnet domains depending on whether the request was http or https.
- A site exposing a regular page over http, but returning a default "This page does not exist." message over https.
- A site exposing a regular page over http but redirecting to a clearnet site over https.
- A site exposing a default Apache install over http and exposing a Cacti install over https.
It should be clear that while it is not common, there are many instances of server configurations centered around https that may lead to site discovery or even deanonymization.
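The distinction drawn above between minor variations and completely different content can be approximated by comparing the two responses a site returns. The Python sketch below is one possible approach, not the method used to produce the figures in this report. The response dictionaries (`location`, `body`), the labels and the 0.9 similarity threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def compare_endpoints(http_resp, https_resp, minor_threshold=0.9):
    """Classify how the http and https responses of one site differ.

    Each response is a dict with:
      "location" -- redirect target, or None if the response is not a redirect
      "body"     -- response body as a string
    """
    # Different redirect targets (or a redirect on only one port).
    if http_resp.get("location") != https_resp.get("location"):
        return "different-redirect"
    if http_resp["body"] == https_resp["body"]:
        return "identical"
    # Similarity ratio in [0, 1]; 1.0 means the bodies are the same.
    ratio = SequenceMatcher(None, http_resp["body"], https_resp["body"]).ratio()
    return "minor-variation" if ratio >= minor_threshold else "different-content"
```

Under these assumptions, two directory listings that differ only in the port number shown in the footer would be reported as `minor-variation`, while a blog on one port and an unrelated service on the other would be reported as `different-content`.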
Other OnionScan News
- OnionScan is now a Docker Container thanks to Michael Patton! - There is lots more in this space we would like to do and see happen e.g. bundling OnionScan with a Dark Web Starter Container to help folks deploy secure sites.
- Justin Seitz wrote a post describing how to run OnionScan on larger batches, and also released a Python script which has some nice features (like automating new domain discovery) - we look forward to integrating some of these directly into OnionScan in the near future.
- Sarah wrote about what's next for the OnionScan tool and provided a brief tour of the new features OnionScan has acquired in recent months.
If you would like to help please read Sarah's post OnionScan: What's New and What's Next for some great starting off points. You can also email Sarah (see her profile for contact information).
Goals for the OnionScan Project
- Increase the number of scanned onion services - We have so far only successfully scanned ~6500 (out of ~12,000 domains scanned).
- Increase the number of protocols scanned. OnionScan currently supports light analysis for HTTP(S), SSH, FTP & SMTP and detection for Bitcoin, IRC, XMPP and a few other protocols - we want to grow this list, as well as provide deeper analysis on all protocols.
- Develop a standard for classifying onion services that can be used for crime analysis as well as an expanded analysis of usage from political activism to instant messaging.