Tuesday, April 3, 2018

Scanning HTTPS for Mixed Content

Back in 2014, Google raised awareness of HTTPS ("Secure HTTP") by making its use a ranking signal in its search algorithms. HTTPS encrypts the connection between the visitor's browser and the web server. Google further raised the stakes by announcing that, beginning in July 2018 with the release of Chrome 68, the Chrome browser will mark all HTTP websites as "Not secure". The likely consequence of not converting to HTTPS is that the warning will persuade visitors to leave your website.
Even before that deadline, Chrome and other popular web browsers such as Firefox and Edge have been showing an informational message to visitors of sites served over HTTP.


Many web administrators have taken heed and converted their websites to HTTPS, often taking advantage of the free SSL certificates issued by Let's Encrypt. However, even if you have successfully converted to HTTPS, your work may not be done. You still need to verify that browsers recognize your website as secure: the padlock icon should be displayed next to the web page's URL in the browser window.

To many administrators' surprise, even a properly converted HTTPS website may still be marked as insecure. The most likely cause is mixed content. For a web page to be deemed secure, everything loaded by that page must also be loaded over HTTPS. A web page with mixed content is served over HTTPS but loads some of its resources, such as images, videos, stylesheets and scripts, over unencrypted HTTP.
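
As a rough illustration of what counts as mixed content, the shell function below lists http:// URLs found in the src or href attributes of tags that load resources. The name find_insecure_refs is hypothetical and not part of any tool. It is a plain text match on HTML read from standard input and assumes each tag fits on one line, so treat its output as candidates rather than a verdict.

```shell
# Hypothetical helper: print http:// URLs loaded by resource tags in HTML on stdin.
find_insecure_refs() {
  # Match a resource-loading tag up to the end of its first http:// src/href value.
  grep -Eo "<(img|script|iframe|link|video|audio|source)[^>]*[[:space:]](src|href)=[\"']http://[^\"']+" |
    # Strip everything before the URL itself.
    sed -E "s/.*(src|href)=[\"']//" |
    # Report each URL once.
    sort -u
}
```

For example, curl -s https://example.com/ | find_insecure_refs would print one candidate URL per line for that page (example.com is a placeholder). Ordinary <a href> links are deliberately not matched, since following a link is not mixed content.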

While it is possible to manually spot mixed web content on a web page, checking a non-trivial website requires automation. Mixed Content Scan is a command-line web crawler which scans for mixed content. The rest of this post explains how to install and use the tool.

Installation

Mixed Content Scan is a command-line PHP application. To install it, use Composer, the PHP dependency manager. For the latest instructions on installing Composer, please refer to the official Composer documentation. Note that the procedure installs Composer as composer.phar in the current directory. Optionally, move the executable to a globally accessible directory using the following command.
$ sudo mv composer.phar /usr/local/bin/composer
To install Mixed Content Scan:
$ composer global require bramus/mixed-content-scan:~2.8
The Mixed Content Scan executable is placed in ~/.config/composer/vendor/bramus/mixed-content-scan/bin.
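
If you would rather not change into the package's bin directory each time, one option is to put Composer's global bin directory on your PATH. This sketch assumes that Composer's global home is ~/.config/composer, as in the path above, and that Composer links the package's executable into its vendor/bin directory. Both can differ between systems, so check the directory on yours first.

```shell
# Assumption: Composer's global home is ~/.config/composer (on some systems it is ~/.composer).
# Prepend its vendor/bin directory so globally installed tools are found by name.
export PATH="$HOME/.config/composer/vendor/bin:$PATH"
```

With that line added to your shell profile, mixed-content-scan should be callable from any directory.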

Scanning for mixed content

To scan a website for mixed content, simply provide its URL as an argument to Mixed Content Scan:
$ cd ~/.config/composer/vendor/bramus/mixed-content-scan/bin
$ ./mixed-content-scan https://shadowofyourwings.com/
By default, the tool writes the scan report to the terminal (standard output). Alternatively, you can specify an output file using the --output parameter as follows, replacing <some/file/path> with a path of your choice:
$ cd ~/.config/composer/vendor/bramus/mixed-content-scan/bin
$ ./mixed-content-scan --output <some/file/path> https://shadowofyourwings.com/
You can also use the --ignore parameter to specify a file containing URL patterns that the tool should skip. The example site used here is a WordPress website, and the tool comes with a sample ignore file for WordPress, located at ~/.config/composer/vendor/bramus/mixed-content-scan/bin/ignorepatterns/wordpress.txt.


$ cd ~/.config/composer/vendor/bramus/mixed-content-scan/bin
$ ./mixed-content-scan --ignore=$HOME/.config/composer/vendor/bramus/mixed-content-scan/bin/ignorepatterns/wordpress.txt https://shadowofyourwings.com/
[2018-02-16 16:53:18] MCS.NOTICE: Scanning https://shadowofyourwings.com/
[2018-02-16 16:53:18] MCS.ERROR: 00000 - https://shadowofyourwings.com/
[2018-02-16 16:53:18] MCS.WARNING: http://gmpg.org/xfn/11
[2018-02-16 16:53:19] MCS.ERROR: 00001 - https://shadowofyourwings.com/about
[2018-02-16 16:53:19] MCS.WARNING: http://shadowofyourwings.com/wp-content/uploads/2017/05/peterLeung.jpg
[2018-02-16 16:53:19] MCS.WARNING: http://gmpg.org/xfn/11

[2018-02-16 16:53:20] MCS.ERROR: 00002 - https://shadowofyourwings.com/contact
[2018-02-16 16:53:20] MCS.WARNING: http://gmpg.org/xfn/11
... <output snipped> ...
[2018-02-16 16:53:38] MCS.NOTICE: Scanned 26 pages for Mixed Content

Mixed Content Scan numbers each page scanned, starting from 00000. In the above example, the About page (00001) has been flagged as having mixed content. That page has two sources of mixed content:
  1. Insecure image file
    The peterLeung.jpg file is loaded over an insecure HTTP connection. The fix is simple: in the WordPress administration pages, edit the About page and change the image URL from http:// to https://.
  2. Theme header profile
    The header of the default twentyseventeen WordPress theme contains a reference to http://gmpg.org/xfn/11. The code is in <document root>/wp-content/themes/twentyseventeen/header.php.

    Although the scanner reports this reference as a violation, browsers generally do not flag it as mixed content, so this warning can be safely ignored.
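
On a larger site the same insecure URL tends to show up on many pages, so it can help to condense the report before fixing anything. The sketch below counts how often each MCS.WARNING URL appears. The function name summarize_report is hypothetical, and the sketch assumes that a report saved with --output uses the same line format as the terminal output shown above.

```shell
# Hypothetical helper: summarize a Mixed Content Scan report read from stdin.
summarize_report() {
  # Report lines look like: [date time] MCS.WARNING: http://example.com/x
  # so field 3 is the log level and field 4 is the insecure URL.
  awk '$3 == "MCS.WARNING:" { count[$4]++ }
       END { for (url in count) print count[url], url }' |
    # Most frequently reported URLs first, ties ordered by URL.
    sort -k1,1nr -k2,2
}
```

Run it as summarize_report < report.log, where report.log is whatever file you passed to --output. Fed only the lines shown in the excerpt above, it would list http://gmpg.org/xfn/11 with a count of 3 and the peterLeung.jpg URL with a count of 1, which makes it easy to see that one theme-level reference accounts for most of the warnings.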