Wednesday, May 29, 2019

Beware of this find command gotcha



find is a basic, useful command that Linux users run all the time. The command searches a file system from a given starting location, and returns all matches based on the filters that you provide as arguments.

The Gotcha


The gotcha is when you try to narrow the search by pruning a sub-directory from the search (including the directory itself and everything under it). For instance, suppose you want to find all files under the directory /data that are owned by root, excluding the sub-directory /data/keepit and all files underneath.



My first attempt at a solution was the following find command.

find /data -path /data/keepit -prune -o -user 0


The -o argument specifies the logical 'or' operator. The expression on the left, '-path /data/keepit -prune', indicates where to prune the search. The idea is that when the search reaches /data/keepit, the -prune argument causes the search to not descend further into the sub-directory. Furthermore, -prune always returns true. Hence, for /data/keepit the whole expression returns 'true' without evaluating the expression on the right of -o.

The expression right of -o tests for root ownership (root is user 0).

I was befuddled to learn that running the above command returns /data/keepit (but not its descendants). If the search is snipped at /data/keepit, why is the sub-directory itself included in the output? Besides, /data/keepit is not owned by root.

Being unaware of this behavior could lead to some unintended and very bad consequences as files named in the find output are often piped to the xargs command for further processing.

The Explanation


Before I present my solution, let's discuss why the point of pruning, i.e., the sub-directory named in -path, is actually included in the output.

The primary purpose of find is to search for file matches. Yet, it can have side effects through actions you specify on the command line. In addition to -print/-print0, there is also the -exec action. Unless you explicitly specify an action, the find command assumes the default action is -print.

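The above example has no explicit -print or -exec action. Therefore, find applies the default -print to the expression as a whole, not just to the part right of -o. For /data/keepit, the left side of -o ('-path /data/keepit -prune') evaluates to true, so the whole expression is true and the directory is printed. Its descendants, on the other hand, are excluded because pruning stops find from ever visiting them.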
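
A small experiment makes the implicit -print visible. The sketch below builds a throwaway directory tree and, as an assumption made so that it runs without root access, uses -type f in place of -user 0. It then compares the bare command with one where the implied grouping, \( ... \) -print, is written out by hand. The directory and file names are made up for the demo.

```shell
# Build a throwaway tree: data/keepit is the directory to prune.
demo=$(mktemp -d)
mkdir -p "$demo/data/keepit" "$demo/data/other"
touch "$demo/data/a.txt" "$demo/data/keepit/secret.txt" "$demo/data/other/b.txt"

# No explicit action: find prints every path for which the whole expression is true.
# (-type f stands in for -user 0 so the demo does not need root.)
implicit=$(find "$demo/data" -path "$demo/data/keepit" -prune -o -type f | sort)

# The same expression with the implied grouping and -print written out.
explicit=$(find "$demo/data" \( -path "$demo/data/keepit" -prune -o -type f \) -print | sort)

# keepit is listed even though it is a directory, not a regular file.
printf '%s\n' "$implicit"

rm -rf "$demo"
```

Both variables should hold the same list, and that list includes the keepit directory but nothing underneath it.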

The Solution


My solution is to specify -print explicitly on the command line.

find /data -path /data/keepit -prune -o -user root -print


Lo and behold. When you run the above command, /data/keepit is no longer part of the output.

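By specifying the -print action explicitly, the find command no longer adds a default -print to the expression as a whole. Instead, -print is tied to the -user test on the right of -o, because the implicit 'and' that joins them binds more tightly than -o. The pruned directory matches only the left side, which has no print action, so it stays out of the output.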
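
The fix can be checked in a scratch directory before it goes anywhere near real data. This is a minimal sketch: the tree is made up, and -type f again stands in for -user root as an assumption so that it runs as an ordinary user. The last command shows a NUL-delimited hand-off to xargs (-print0 and -0 are supported by GNU and BSD tools), which is safer for file names that contain spaces.

```shell
# Same throwaway layout as a real /data tree with a keepit sub-directory.
demo=$(mktemp -d)
mkdir -p "$demo/data/keepit" "$demo/data/other"
touch "$demo/data/a.txt" "$demo/data/keepit/secret.txt" "$demo/data/other/b.txt"

# Explicit -print binds to -type f only, so the pruned directory is not printed.
fixed=$(find "$demo/data" -path "$demo/data/keepit" -prune -o -type f -print | sort)
printf '%s\n' "$fixed"

# Hand the names to xargs NUL-delimited; echo is a harmless placeholder command.
count=$(find "$demo/data" -path "$demo/data/keepit" -prune -o -type f -print0 \
    | xargs -0 -n 1 echo | wc -l | tr -d ' ')

rm -rf "$demo"
```

Only a.txt and other/b.txt should be listed, and xargs should receive the same two names.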

Summary & Conclusion

The pruning logic of the find command is quite confusing. Reading its man page offers some help, but may generate more questions than answers. I hope that this article is of help. But, I recommend that before you use the -prune feature on your production data, test it on some dummy data first.

You have been forewarned.

Monday, March 11, 2019

ts: epitome of the Unix philosophy

Do one thing and do it well - the Unix philosophy

In this new age of Linux bloatware (hello, systemd), it is exhilarating to discover small gems like ts, a command-line tool that prepends a timestamp to each output line.

How is this useful?


I run scripts all the time (bash scripts, Ansible playbooks, etc.) to automate system administration tasks. Many of the longer-running scripts print statements in real time to report what they are doing. For example, running an Ansible playbook automatically outputs the name of each task as it is executed. By default, however, no timestamps are displayed for the tasks.


$ ansible-playbook -b -i hostsfile myPlaybook.yml
PLAY [localhost] ****************************************************************

TASK [Gathering Facts] **********************************************************
ok: [localhost]

TASK [Disable Caps Lock] ********************************************************
ok: [localhost]

TASK [Install X apps] ***********************************************************
ok: [localhost] => (item=autokey-gtk)
ok: [localhost] => (item=gnucash)

TASK [Install 64-bit texamker - Debian] *****************************************
changed: [localhost]
...snipped...
PLAY RECAP **********************************************************************
localhost : ok=8 changed=2 unreachable=0 failed=0


Often, I do want to display the timestamps for logging or troubleshooting purposes. An unusually short (or long) execution may signal something is amiss.

Granted, you can use the Ansible-specific profile_tasks plugin to profile your tasks. But, I propose ts as a quick-and-dirty solution: just pipe Ansible output to ts like the following.


$ ansible-playbook -b -i hostsfile myPlaybook.yml |ts
Mar 11 11:44:05 PLAY [localhost] ****************************************************************
Mar 11 11:44:05
Mar 11 11:44:05 TASK [Gathering Facts] **********************************************************
Mar 11 11:44:05 ok: [localhost]
Mar 11 11:44:06
Mar 11 11:44:06 TASK [Disable Caps Lock] *********************************************************
Mar 11 11:44:06 ok: [localhost]
Mar 11 11:44:06
Mar 11 11:44:06 TASK [Install X apps] ***********************************************************
Mar 11 11:44:10 changed: [localhost] => (item=autokey-gtk)
Mar 11 11:44:14 changed: [localhost] => (item=gnucash)
Mar 11 11:44:14
Mar 11 11:44:14 TASK [Install 64-bit texamker - Debian] *****************************************
Mar 11 11:44:19 changed: [localhost]
...snipped...
Mar 11 11:44:30 PLAY RECAP **********************************************************************
Mar 11 11:44:30 localhost : ok=8 changed=3 unreachable=0 failed=0
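
On Debian-based systems, ts ships in the moreutils package. To see what it does to a stream without running a playbook, or on a machine where it is not installed, the default behavior can be approximated in plain shell. The function name ts_fallback is hypothetical, and the sketch covers only the default timestamp format (%b %d %H:%M:%S), none of the ts options.

```shell
# Hypothetical stand-in for `ts` with no arguments: prefix each line read
# from standard input with the current time in ts's default format.
ts_fallback() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s %s\n' "$(date '+%b %d %H:%M:%S')" "$line"
    done
}

# Usage: two lines produced one second apart get different timestamps.
out=$( { echo "step one"; sleep 1; echo "step two"; } | ts_fallback )
printf '%s\n' "$out"
```

Unlike the real ts, this loop starts a date process for every line, so it is only suitable for low-volume output.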


Optional ts arguments

By default, the ts command inserts the absolute timestamp into each output line. You can use the -s argument to replace the absolute timestamp with the elapsed duration since the start of execution.

$ ansible-playbook -b -i hostsfile myPlaybook.yml |ts -s
00:00:01 PLAY [localhost] ****************************************************************
00:00:01
00:00:01 TASK [Gathering Facts] **********************************************************
00:00:01 ok: [localhost]
00:00:02
00:00:02 TASK [Disable Caps Lock] *********************************************************
00:00:02 ok: [localhost]
00:00:02
00:00:02 TASK [Install X apps] ***********************************************************
00:00:06 changed: [localhost] => (item=autokey-gtk)
00:00:10 changed: [localhost] => (item=gnucash)
00:00:10
00:00:10 TASK [Install 64-bit texamker - Debian] *****************************************
00:00:15 changed: [localhost]
...snipped...
00:00:26 PLAY RECAP **********************************************************************
00:00:26 localhost : ok=8 changed=3 unreachable=0 failed=0



Another useful argument to know is -i. With this argument, each output line displays the elapsed time since the previous output line. You don't need to do the mental math to calculate how long a task took.

$ ansible-playbook -b -i hostsfile myPlaybook.yml |ts -i
00:00:00 PLAY [localhost] ****************************************************************
00:00:00
00:00:00 TASK [Gathering Facts] **********************************************************
00:00:01 ok: [localhost]
00:00:00
00:00:00 TASK [Disable Caps Lock] *********************************************************
00:00:00 ok: [localhost]
00:00:00
00:00:00 TASK [Install X apps] ***********************************************************
00:00:04 changed: [localhost] => (item=autokey-gtk)
00:00:04 changed: [localhost] => (item=gnucash)
00:00:00
00:00:00 TASK [Install 64-bit texamker - Debian] *****************************************
00:00:05 changed: [localhost]
...snipped...
00:00:00 PLAY RECAP **********************************************************************
00:00:00 localhost : ok=8 changed=3 unreachable=0 failed=0


In summary, ts is a fast and easy way to add timestamps to script or command output.

Tuesday, May 1, 2018

Snaps as self-contained, auto-updating, universal software packages

One of Linux's unique selling points is that users can choose from a variety of Linux distributions, each with its own features and advantages. However, a byproduct of the proliferation of distributions is that Linux developers are burdened with extra labor to package and deploy software in multiple incompatible package formats, such as RPM and DEB, using different package managers.

There have been several attempts to address this obstacle of deploying software across multiple Linux distributions. The latest such initiative, spearheaded by none other than the venerable Canonical Ltd, the Ubuntu developer, is Snapcraft. Snapcraft is the developer tool for packaging software in the universal snap format.

The rest of this post explains how to use snaps, from a user's rather than a developer's perspective.

[2018-05-23 update] A snap from the official Snap store was found to contain hidden cryptocurrency mining code, and has since been removed from the store. The news highlights the fact that the mere existence of a snap in the Snap store does not guarantee its integrity. The lesson is to only use snaps from trusted sources featured in the Snap store, such as the original software author, an official maintainer, or a trusted community source.

What are snaps?

Snaps are universal software packages that can be deployed across the major Linux distributions and architectures, including IoT devices. Although snap is a format championed by the creator of Ubuntu, it is supported on all major Linux distributions including Debian, Ubuntu, Mint, Fedora, Gentoo, ArchLinux, Manjaro, and OpenSUSE.

Besides being distribution-agnostic, snaps are also self-contained. Linux users are all too familiar with "dependency hell", i.e., the occasional extreme frustration experienced in software installation due to dependency issues. A snap is self-contained in that it bundles the required runtime libraries inside the package.

Once you install a snap on your system, it will be auto-updated to run the latest release. You can, however, manually roll back to a previous version of the software if you so desire.

In summary, snaps are universal, self-contained, auto-updating Linux software packages.

Install snapd

To be able to run snap packages on your Linux system, you must first install snapd, the service responsible for running and managing snaps. The following command will install snapd on a Debian system.
$ sudo apt install snapd

The Snap store

You can search the relatively small but growing online Snap store for snaps to install. Alternatively, you can search using the command-line interface:
$ snap find firefox
Name     Version   Developer  Notes  Summary
firefox  59.0.2-1  mozilla    -      Mozilla Firefox web browser

Why would you download the firefox snap when firefox is readily available as a DEB, via the Debian standard repository? Similarly, chromium and libreoffice belong to the same category of software. The answer is that, in most cases, snaps offer a much more recent version of the software than that from the native Linux package manager. For instance, Debian 9 ("Stretch") packages Firefox 52 in its standard repository whereas you can get Firefox 59 as a snap.

The Snap store features some snaps that are not available from the standard repositories of major Linux distributions. Examples are ghostwriter and vidcutter. Instead of building such software manually, you can download and deploy their corresponding snaps from the Snap store.

Below is a non-exhaustive list of software available as snaps that I personally find useful (or fun).
  • Chromium
  • Firefox
  • Ghostwriter-casept (a Markdown editor)
  • LibreOffice
  • Minecraft
  • Nextcloud
  • OBS Studio (for screencasting and live video streaming)
  • Opera
  • Skype
  • Slack
  • Solitaire
  • Spotify
  • VidCutter (a video editor)

To find out more about a particular snap, execute the following command:
$ snap info minecraft
name:      minecraft
summary:   Minecraft is a game about placing blocks and going on adventures.
publisher: snapcrafters
license:   Proprietary
description:   A game about placing blocks while running from skeletons
snap-id: aJQRf6WPQq04DH0TB2HdTB6K9rf6I1yX
channels:                
  stable:    latest (11) 148MB -
  candidate: latest (11) 148MB -
  beta:      latest (11) 148MB -
  edge:      latest (11) 148MB -

How to install a snap

Installing a snap is as easy as:
$ sudo snap install solitaire

If you don't have root privileges, you can still install snaps by first signing in to the Ubuntu Snap store. You will need an Ubuntu One account (which is free).
$ sudo snap login <your-email-address>
$ snap install solitaire

To list the snaps already installed on your system:
$ snap list
Name       Version    Rev   Tracking  Developer  Notes
core       16-2.32.1  4327  stable    canonical  core
skype      8.18.0.6   23    stable    skype      classic
solitaire  1.0        2     stable    1bsyl      devmode

To purge a snap from your local system:
$ sudo snap remove solitaire

Note that the above command will delete all snap-specific data and settings.

How to run a snap

Installing a snap in most cases will automatically create a shortcut on your desktop menu system. For instance, you will find an entry for the solitaire snap in the Games sub-menu.

Alternatively, you can always run the snap command explicitly on the command line.
$ snap run <your snap command>

The caveat with the command-line approach is that the snap argument may not always be obvious. For instance, with the solitaire snap, the corresponding name to use for the run command is solitaire.1bsyl, not solitaire. You can find out the specific name to use by examining snap's bin directory:
$ ls /snap/bin/
skype  solitaire.1bsyl
$ snap run solitaire.1bsyl
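
The lookup can also be wrapped in a small helper. This is a sketch only: the function name snap_commands is hypothetical, and it assumes that launchers live in /snap/bin (as on the system above) and are named either after the snap or as the snap name followed by a dot and an application name.

```shell
# Hypothetical helper: print the launcher names that belong to one snap.
# $1 = snap name, $2 = launcher directory (assumed to default to /snap/bin).
snap_commands() {
    snap_name=$1
    bin_dir=${2:-/snap/bin}
    for path in "$bin_dir"/*; do
        [ -e "$path" ] || continue        # directory missing or empty
        name=${path##*/}                  # strip the directory part
        case "$name" in
            "$snap_name"|"$snap_name".*) printf '%s\n' "$name" ;;
        esac
    done
}

# Usage: list the launchers of the solitaire snap, if any are installed.
snap_commands solitaire
```

Whatever name it prints is the one to pass to snap run.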

Tuesday, April 3, 2018

Scanning HTTPS for Mixed Content

Back in 2014, Google raised awareness of HTTPS ("Secure HTTP") by making its use a ranking signal in Google search algorithms. HTTPS essentially establishes secure encrypted connections to the cloud. Google further raised the stakes of not using HTTPS by announcing that, beginning in July 2018 with the release of Chrome 68, the Google Chrome browser will mark all HTTP websites as being insecure. The consequence of not converting to HTTPS is that site visitors will be persuaded by the warning message to bounce from your website.
Even before the impending drop-dead date, Chrome and other popular web browsers such as Firefox and Edge have been warning visitors to HTTP-connected sites with an informational message.


Web administrators have taken heed and converted their websites to HTTPS, many taking advantage of the free SSL certificates issued by Let's Encrypt. However, even if you have successfully converted to HTTPS, your work may not be done. You still need to verify that your website is properly recognized as being secure. You want to see the padlock icon displayed next to the web page's URL in the browser window.

To many administrators' surprise, even a properly converted HTTPS website may still be marked as being insecure. This is most likely due to the website's mixed content. For a web page to be deemed secure, everything loaded by that page must be encrypted by HTTPS. A web page with mixed content loads both encrypted and non-encrypted content such as images, videos, stylesheets and scripts.
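
To make the idea concrete, here is a rough sketch that flags insecure sub-resources in a single saved page using grep. The sample HTML, the example.com URLs, and the list of tags are assumptions for illustration; it is a crude pattern match on double-quoted attributes, not how any particular scanner works.

```shell
# Sample page: a secure script, an insecure stylesheet and image, and an
# ordinary http:// hyperlink (a navigation link is not a loaded resource).
page=$(mktemp)
cat > "$page" <<'EOF'
<html><head>
<link rel="stylesheet" href="http://example.com/style.css">
<script src="https://example.com/app.js"></script>
</head><body>
<img src="http://example.com/photo.jpg">
<a href="http://example.com/about">About</a>
</body></html>
EOF

# Find resource-loading tags whose src/href starts with http://, then keep the URL.
insecure=$(grep -Eo '<(img|script|iframe|video|audio|source|link)[^>]*(src|href)="http://[^"]+"' "$page" \
    | grep -Eo 'http://[^"]+')
printf '%s\n' "$insecure"

rm -f "$page"
```

The stylesheet and the image are reported, while the secure script and the plain hyperlink are not.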

While it is possible to manually spot mixed web content on a web page, checking a non-trivial website requires automation. Mixed Content Scan is a command-line web crawler which scans for mixed content. The rest of this post explains how to install and use the tool.

Installation

Mixed Content Scan is a batch PHP application. To install the tool, use composer, a PHP package dependency manager. For the latest instructions on how to install composer, please refer to this link. Note that the said procedure installs composer in the current directory. Optionally, move the executable to a globally accessible directory using the following command.
$ sudo mv composer.phar /usr/local/bin/composer
To install Mixed Content Scan:
$ composer global require bramus/mixed-content-scan:~2.8
The Mixed Content Scan executable is placed in ~/.config/composer/vendor/bramus/mixed-content-scan/bin.

Scanning for mixed content

To scan a website for mixed content, simply provide its URL as an argument to Mixed Content Scan:
$ cd ~/.config/composer/vendor/bramus/mixed-content-scan/bin
$ ./mixed-content-scan https://shadowofyourwings.com/
By default, the tool outputs the scan report on the terminal ("standard output"). Alternatively, you can specify an output file using the --output parameter as follows:
$ cd ~/.config/composer/vendor/bramus/mixed-content-scan/bin
$ ./mixed-content-scan --output <some/file/path> https://shadowofyourwings.com/
You can also use the --ignore parameter to specify a file which contains URL patterns that the tool will ignore and not scan. The example site I use is a WordPress website. The scanning tool comes with a sample ignore file for WordPress which is located in ~/.config/composer/vendor/bramus/mixed-content-scan/bin/ignorepatterns/wordpress.txt.


$ cd ~/.config/composer/vendor/bramus/mixed-content-scan/bin
$ ./mixed-content-scan --ignore="$HOME/.config/composer/vendor/bramus/mixed-content-scan/bin/ignorepatterns/wordpress.txt" https://shadowofyourwings.com/
[2018-02-16 16:53:18] MCS.NOTICE: Scanning https://shadowofyourwings.com/
[2018-02-16 16:53:18] MCS.ERROR: 00000 - https://shadowofyourwings.com/
[2018-02-16 16:53:18] MCS.WARNING: http://gmpg.org/xfn/11
[2018-02-16 16:53:19] MCS.ERROR: 00001 - https://shadowofyourwings.com/about
[2018-02-16 16:53:19] MCS.WARNING: http://shadowofyourwings.com/wp-content/uploads/2017/05/peterLeung.jpg
[2018-02-16 16:53:19] MCS.WARNING: http://gmpg.org/xfn/11

[2018-02-16 16:53:20] MCS.ERROR: 00002 - https://shadowofyourwings.com/contact
[2018-02-16 16:53:20] MCS.WARNING: http://gmpg.org/xfn/11
... <output snipped> ...
[2018-02-16 16:53:38] MCS.NOTICE: Scanned 26 pages for Mixed Content

Mixed Content Scan numbers each page scanned, starting from 00000. In the above example, the About page (00001) has been flagged as having mixed content. The sources of mixed content as loaded by that page are twofold:
  1. Vulnerable image file.
    The peterLeung.jpg file is being loaded via the insecure HTTP connection. The fix is simple: go to the WordPress administration web page, and change HTTP to HTTPS on the About web page.
  2. Theme header profile
    The header of the default twentyseventeen WordPress theme contains a reference to http://gmpg.org/xfn/11. The code is in <document root>/wp-content/themes/twentyseventeen/header.php.

    Although the scanner reports its occurrence as a violation, browsers generally do not flag this as a mixed content error. This error can be safely ignored.

Friday, March 16, 2018

A review of 3 best-of-breed Markdown editors


As a technology blogger, I write HTML documents that are hosted on different platforms such as WordPress, Drupal, and Blogger. I like to compose HTML using the Markdown markup language. Unfortunately, the HTML editors bundled with the aforementioned platforms do not support Markdown natively. It is true that you can download Markdown plugins for WordPress and Drupal. But in the end, I still find the HTML editors too intrusive for a writer such as myself to stay focused and productive.

Fortunately, there are many good special-purpose Markdown editors out there. My web authoring process involves first composing the document using a Markdown editor, and then copying and pasting the output HTML into the Content Management System (CMS). Below, I evaluate 3 open-source Markdown editors: justmd, Remarkable, and ghostwriter.

I will evaluate each editor from two sometimes conflicting viewpoints, that of a geek and a writer. As a geek, I side with editors that have many bells and whistles. But, as a writer, I prefer editors that help me create, often by filtering out as much distraction as possible, and forcing me to focus on the next word, phrase, sentence to put on the page.

justmd

justmd is a minimalist, bare-bones Markdown editor. When you open justmd, you will see a single window with 2 window panes of equal size, located side-by-side. One pane is where you enter the Markdown text; the other is the HTML preview pane. Although you can change the overall size of the encompassing window, you cannot change the ratio of the 2 panes. The geek in me cannot help but cringe at the discovery. After all, it is common among Markdown editors (including Remarkable and ghostwriter) to have separate input and preview windows that you can independently resize and even hide. Conversely, the writer in me gives justmd a big shout-out for its austere simplicity. You just open the app, and immediately start writing, without having to adjust the size of any window component. Writers will find justmd more conducive to writing than many editors that are much more customizable.

Minimalist as it is designed to be, justmd is not yet feature-complete as a Markdown editor. The following features, which I deem to be very important for writers, are still missing in justmd:

  • Spellchecker.
  • Word count.
  • Auto save.

This post was written entirely using justmd, and the overall experience was very positive. The lack of a spellchecker and word counter did not hamper the writing at all. On the contrary, it enhanced my productivity by breaking the bad habits of constantly checking the word count and looking out for spelling errors in the midst of writing. Most Content Management Systems are capable of spellchecking and word counting. So, those tasks can be deferred until later, after you paste the HTML into the CMS.

Finally, I comment on the ease of installing justmd. None of the 3 editors being reviewed here are pre-packaged in the official repository of a major Linux distribution. Having said that, installing justmd is as easy as 1-2-3.

  1. Download compressed tarball from justmd website.
  2. Uncompress the tarball using command tar -zxvf justmd-linux-x64-v1.1.1.tar.gz.
  3. Create shortcut to justmd binary.

Remarkable

Featurewise, Remarkable is middle-of-the-road, between justmd and ghostwriter. It has word counting, but no spellchecking. Like justmd, both input and preview functions coexist as panes side-by-side in a single window, but you can stack them vertically or horizontally, and you can resize each pane proportionally within the window.

Now, as a writer, I find Remarkable's user interface too colorful, too distracting. Specifically, its overly generous use of color for syntax highlighting and icon design is detrimental to the primary writing task. With color, less is more.

You can download the Remarkable package in .deb or .rpm format from its Linux download page. Users of Debian, Ubuntu, Fedora, SUSE, and Arch systems will find installation straightforward.

ghostwriter

ghostwriter is the most mature and feature-complete of the 3 Markdown editors. It offers spellchecking, word counting, auto save, and much more.

Two unique features are especially noteworthy to writers: Hemingway and Focus modes. In Hemingway mode, two particular keyboard keys are disabled, namely, the delete and the backspace keys. The rationale is to increase productivity by delaying document editing as much as possible. In Focus mode, only the portion of the document you are working on is made prominent, and the rest fades out. You can configure the focus to be the current sentence, the current single or 3 lines, or the current paragraph.

Despite the rich feature set, the ghostwriter user interface is surprisingly clean and uncluttered.

The input and live preview functions reside in separate windows that you can resize and move around individually. Keen observers will notice a lag between actual text input and the update of the live preview. This is not a bug in the program. On the contrary, ghostwriter is programmed to only update the live preview when you stop typing (for a fraction of a second). The technical reason given by the developers is that the delay smooths out the jitter in synchronizing the rendering of large files. I can see many writers supporting this design decision, because attention should be primarily focused on the writing, not the rendering, of the document.

Recall that the overall objective for using a Markdown editor is to generate HTML code to insert into a CMS. With justmd and Remarkable, you need to first export to an HTML file, and then import the file (or copy and paste its contents) into the CMS. On the other hand, ghostwriter provides a shortcut Copy HTML button which is discreetly tucked away at the bottom right of the window. The button is a minor feature in the overall design scheme, but has a disproportionately high value to end users. Clicking the button copies the HTML code in its entirety into the clipboard. Importing the HTML into the CMS simply involves pasting the contents of the clipboard.

ghostwriter provides packages for Ubuntu, Fedora, openSUSE, and Arch Linux AUR. If you run Ubuntu or any of its derivatives such as Linux Mint, ghostwriter can be installed after adding a PPA repository and updating the local cache.

sudo add-apt-repository ppa:wereturtle/ppa
sudo apt update
sudo apt install ghostwriter

If ghostwriter is not pre-packaged for your distro, e.g., Debian, you can follow the on-line instructions to build the executable yourself. Depending on the particular distro and release, be prepared to spend some considerable time as you may run into the proverbial Linux dependency hell.

Feature comparison

| Features | justmd | Remarkable | ghostwriter |
|---|---|---|---|
| Cross-platform | Linux (x64), Windows (x64), macOS | Linux, Windows | Linux, Windows |
| Linux installation | Downloadable executables | Downloadable packages for Debian, Ubuntu, Fedora, openSUSE, Arch | Downloadable packages for Ubuntu, Fedora, openSUSE, Arch |
| Export to HTML, PDF | Yes | Yes | HTML, PDF, Word, ODT |
| Spellchecker | No | No | Yes |
| Auto save | No | No | Yes |
| Word count | No | Character, word, line counts | Character, word, line, sentence, paragraph, page counts |
| Live preview | Fixed window proportion | Hidable, variable proportion | Separate resizable window (no dual panel) |
| GitHub-flavored syntax | Support for tables | Yes (tables, strikethrough, emphasis, etc) | Yes (tables, strikethrough, emphasis, etc) |

Summary & conclusion

A writer's working style is intrinsically idiosyncratic. A writing environment that is distraction-free to one person may not be stimulating enough for another. Yet, ghostwriter is the clear winner of the 3 editors because it strikes a balance between clean design and feature richness. However, if ghostwriter is not pre-packaged for your Linux distro (say Debian), justmd and Remarkable are definitely worthwhile alternatives.