ORCID ID: https://orcid.org/0000-0003-3466-5038


For my scientific publications, go to this page. Below is a reverse-chronological selection of articles I've published online about command-line tricks (especially with AWK), shell scripting and a few favourite FOSS applications. Many of these articles appeared on Andrew Powell's site The Linux Rain, but all the recent ones (from May 2018 onward) are in my BASHing data blog series.


BASHing data 2: The curious world of UUIDs      17 May 2024

BASHing data 2: AWK one-liners to multi-liners      10 May 2024

BASHing data 2: Extract successive pairs from a list, and rapidly grow a list      3 May 2024

BASHing data 2: Post- and pre-incrementing (var++ and ++var) with AWK      26 April 2024

BASHing data 2: DataMatrix codes and data content      19 April 2024

BASHing data 2: CSV to JSON to CSV, awkwardly      12 April 2024

BASHing data 2: Finding near-duplicate spelling variants      5 April 2024

BASHing data 2: Table in a PDF to a TSV, on the command line      29 March 2024

BASHing data 2: Print a character as a variable with BASH printf      22 March 2024

BASHing data 2: Mapping with gnuplot, part 5      15 March 2024

BASHing data 2: Mapping with gnuplot, part 4      8 March 2024

BASHing data 2: Counterfeit spaces: the NBSP menace      1 March 2024

BASHing data 2: Convert Microsoft serial day numbers to YYYY-MM-DD      23 February 2024

BASHing data 2: GNU datamash and months      16 February 2024

BASHing data 2: Mojibake with 2 hearts and 52 bytes      9 February 2024

BASHing data 2: Finding identifier codes with and without extra characters      2 February 2024

BASHing data: People are the best data cleaners      8 April 2022

BASHing data: Mapping with gnuplot, part 3      6 April 2022

BASHing data: Mapping with gnuplot, part 2      30 March 2022

BASHing data: gron the JSON flattener      23 March 2022

BASHing data: How to flatten ("unpivot") a data table      16 March 2022

BASHing data: Search for (exact) strings; report line, column and context      9 March 2022

BASHing data: Auto-incrementing version letters      2 March 2022

BASHing data: Online shopping and a one2many tweak      23 February 2022

BASHing data: DNA-style frameshift cryptography      16 February 2022

BASHing data: Apple + Microsoft = character confusion      9 February 2022

BASHing data: How to use patsplit (GNU AWK)      2 February 2022

BASHing data: Scripting a temperature notifier      26 January 2022

BASHing data: A dog-cat-horse-turtle problem      19 January 2022

BASHing data: Tidy tables for data processing      12 January 2022

BASHing data: Are you 10000 days old yet?      5 January 2022

BASHing data: Building an ODT on the command line      29 December 2021

BASHing data: Making a transect into a point and circle      22 December 2021

BASHing data: Detecting truncations: another sometimes successful method      15 December 2021

BASHing data: Gremlin detection bigly improved and a NUL problem avoided      8 December 2021

BASHing data: Combinations from 2 lists: speed trials      1 December 2021

BASHing data: How to watermark a UTF-8 plain text file      24 November 2021

BASHing data: What's wrong with my footprintWKT?      17 November 2021

BASHing data: A quick cross-file comparison with AWK      10 November 2021

BASHing data: On visual contrast and QR codes      3 November 2021

BASHing data: Duplicate records differing only in unique identifiers      27 October 2021

BASHing data: Some regex tests with grep, sed and AWK      20 October 2021

BASHing data: TSV to CSV on the CLI (if you really have to)      13 October 2021

BASHing data: How to do replacements based on multiple field values      6 October 2021

BASHing data: How to find mixed Latin+Cyrillic words      29 September 2021

BASHing data: An AWK histogram with scaling      22 September 2021

BASHing data: Show Unicode code points for UTF-8 characters      15 September 2021

BASHing data: Put an editable command at the next prompt      8 September 2021

BASHing data: Yet another gremlin: the zero-width space      1 September 2021

BASHing data: zbarimg and blurry QR codes      25 August 2021

BASHing data: CSV viewers for CSV haters      18 August 2021

BASHing data: Two data formatting tweaks      11 August 2021

BASHing data: Visualising data as a PGM image      4 August 2021

BASHing data: Revisiting a command-line translator      28 July 2021

BASHing data: The data worker's guide to psiphiorrhea      21 July 2021

BASHing data: What is +ACY- doing in the data?      14 July 2021

BASHing data: Reverse or shuffle a string in a particular field      7 July 2021

BASHing data: There's data missing - please explain      30 June 2021

BASHing data: "Firstname Lastname" to "Lastname, Firstname", with complications      23 June 2021

BASHing data: The curious world of check digits      16 June 2021

BASHing data: Batch triangulation on the command line      9 June 2021

BASHing data: CSV to table, table to CSV      2 June 2021

BASHing data: The little museum and its data      1 June 2021

BASHing data: The Incrementing Fill-Down Error      26 May 2021

BASHing data: Mojibake madness      19 May 2021

BASHing data: A data checker's checklist      12 May 2021

BASHing data: Building a molar mass calculator      24 March 2021

BASHing data: How to fix "one2many" data issues      17 March 2021

BASHing data: Hunting Excel date twins      9 March 2021

BASHing data: DIY primary/foreign key relationships, again      3 March 2021

BASHing data: Four kinds of data anomalies      24 February 2021

BASHing data: A sunset surprise      17 February 2021

BASHing data: Converting a list to a presence/absence table      10 February 2021

BASHing data: How to find the missing parts of a series      3 February 2021

BASHing data: ASCII score bars and a gorblimey command      27 January 2021

BASHing data: Spreadsheet annoyance no. 3: quotes have priority      20 January 2021

BASHing data: Form text and placeholders      13 January 2021

BASHing data: How to build a multi-file fields concordance      23 December 2020

BASHing data: Mojibake bonanza      16 December 2020

BASHing data: Comparing strings more clearly      9 December 2020

BASHing data: Re-format "blah,YYYYMMDD,blah" as "blah,YYYY,MM,DD,blah"      2 December 2020

BASHing data: How to stack columns      25 November 2020

BASHing data: Check the day of year, given a date      18 November 2020

BASHing data: Updating a file from a lookup table      11 November 2020

BASHing data: How to keep an eye on field numbers      4 November 2020

BASHing data: A short rant about Python, R and UNIX      28 October 2020

BASHing data: How to use flags in AWK (revisited)      21 October 2020

BASHing data: The myth of equinoctial gales      14 October 2020

BASHing data: Building a data table from a sentence      7 October 2020

BASHing data: Three kangaroos in the ocean      30 September 2020

BASHing data: Encoding detection smackdown      23 September 2020

BASHing data: Finding one-to-many entries in a data table      16 September 2020

BASHing data: Spotting spaces, and AWK's view of emptiness      9 September 2020

BASHing data: Checking DIY primary/foreign key relationships      2 September 2020

BASHing data: What's wrong with these records?      26 August 2020

BASHing data: How to do a both/neither/one/other tally      19 August 2020

BASHing data: A data table thousands of years old      12 August 2020

BASHing data: How to number copy/pasted commands      5 August 2020

BASHing data: Sharing data and metadata together      29 July 2020

BASHing data: A grizzle about captive data      22 July 2020

BASHing data: A quick repair job on a dislocated table      15 July 2020

BASHing data: Extra commas in a CSV      8 July 2020

BASHing data: How to find almost-duplicates      1 July 2020

BASHing data: Character equivalence classes 2: the nature of equivalence      24 June 2020

BASHing data: Character equivalence classes 1: search and replace      17 June 2020

BASHing data: How to bookmark directories in the shell      10 June 2020

BASHing data: Join consecutive lines if condition applies      3 June 2020

BASHing data: Printing repeats within repeats, and splitting a list into columns      27 May 2020

BASHing data: Add an issues field to a data table      20 May 2020

BASHing data: How to move selected lines within a file      13 May 2020

BASHing data: Spellchecking scientific names on the command line      6 May 2020

BASHing data: Dealing with an all-CAPS/first-CAP jumble      29 April 2020

BASHing data: Brace expansion with variables and arrays: eval to the rescue      22 April 2020

BASHing data: Checking date components across fields      15 April 2020

BASHing data: Targeted string replacements with sed and AWK      8 April 2020

BASHing data: More mojibake fun      1 April 2020

BASHing data: Second Tuesday of each month and a BASHing data century      25 March 2020

BASHing data: A curious pair of data ops      18 March 2020

BASHing data: Life tables      11 March 2020

BASHing data: Moving averages with AWK      4 March 2020

BASHing data: The easy-going syntax of AWK commands      26 February 2020

BASHing data: Changing TTY prompt, font and colors      19 February 2020

BASHing data: How to be uncertain with dates      12 February 2020

BASHing data: Data quality in iNaturalist downloads      5 February 2020

BASHing data: JSON Lines: record-style JSON      29 January 2020

BASHing data: Hunting gremlins      22 January 2020

BASHing data: Getting around a subshell problem      15 January 2020

BASHing data: Build your own character class inventories      27 December 2019

BASHing data: Msot popele can undreatnsd tihs setennce      20 December 2019

BASHing data: Emphasising text in the terminal      13 December 2019

BASHing data: Another surprising AWK trick      6 December 2019

BASHing data: Data validation on entry with YAD      29 November 2019

BASHing data: Python and shell tools      22 November 2019

BASHing data: Embedded newlines without a clue      15 November 2019

BASHing data: Topping and tailing, and the slowness of GNU sort      8 November 2019

BASHing data: Introducing the replo      1 November 2019

BASHing data: Steady as she goes, in Darwin      25 October 2019

BASHing data: An unexpected character replacement      18 October 2019

BASHing data: VisiData: a table explorer for the terminal      11 October 2019

BASHing data: How to guess the field separator in a table      4 October 2019

BASHing data: A command-line "Countdown" (UK) companion      27 September 2019

BASHing data: A muggle's guide to AWK arrays: 4      20 September 2019

BASHing data: Add leading zeroes that aren't really leading      13 September 2019

BASHing data: Getting data from an Enphase Envoy S      6 September 2019

BASHing data: A shell script for building a new table with reordered fields      30 August 2019

BASHing data: A muggle's guide to AWK arrays: 3      23 August 2019

BASHing data: Long, narrow tables vs short, wide ones      16 August 2019

BASHing data: The lat/lon floating point delusion      9 August 2019

BASHing data: A bulk replacement GUI with YAD      2 August 2019

BASHing data: Renumber a list after inserting a line      26 July 2019

BASHing data: Finding malformed markup      19 July 2019

BASHing data: A muggle's guide to AWK arrays: 2      12 July 2019

BASHing data: Return of the mojibake detective      5 July 2019

BASHing data: Leading and trailing whitespace      28 June 2019

BASHing data: Plotting data in the terminal with gnuplot      21 June 2019

BASHing data: Working around the BASH brace expansion rule      14 June 2019

BASHing data: A muggle's guide to AWK arrays: 1      7 June 2019

BASHing data: Growing the Cookbook's "broken" function      31 May 2019

BASHing data: Transpose, pivot and bin with GNU Datamash 1.4      24 May 2019

BASHing data: The magic of BASH string expansion      19 May 2019

BASHing data: How to delete, insert and replace whole lines      12 May 2019

BASHing data: How to delete, insert and replace whole fields      5 May 2019

BASHing data: Two ugly CSVs      28 April 2019

BASHing data: Spreadsheet annoyance no. 2      21 April 2019

BASHing data: Making pictures with data      14 April 2019

BASHing data: Quotes as characters      7 April 2019

BASHing data: Dog and cat data      31 March 2019

BASHing data: How to choose special characters, revisited      24 March 2019

BASHing data: The trouble with Windows CRLF      17 March 2019

BASHing data: Data with bulges      10 March 2019

BASHing data: Two special data validations      3 March 2019

BASHing data: Data from dingbats: copying down      24 February 2019

BASHing data: Fancy numbering of records      17 February 2019

BASHing data: Getting data out of Excel safely      10 February 2019

BASHing data: Comparing fields across two tables      3 February 2019

BASHing data: Reformatting a list, cleverly      27 January 2019

BASHing data: Parsing scientific names      20 January 2019

BASHing data: Horizontal sorting within a field      13 January 2019

BASHing data: Drugs on the command line      6 January 2019

BASHing data: Changing the month format: a fairly general solution      30 December 2018

BASHing data: Has the rainfall pattern in my hometown changed?      23 December 2018

BASHing data: How many fruits in 5 apples, 3 oranges, 1 pear and 17 lemons?      16 December 2018

BASHing data: Putting information into a table from the table's filename      13 December 2018

BASHing data: Finding changepoints in a list, revisited      6 December 2018

BASHing data: Unwrap your fasta      1 December 2018

BASHing data: Avoiding senior moments with command-line functions      13 November 2018

BASHing data: How to find distances between lat/lons for geochecking      7 November 2018

BASHing data: Mapping with gnuplot      31 October 2018

BASHing data: Repair job: separate the tandem repeats      26 October 2018

BASHing data: Bird watching with AWK and grep      24 October 2018

BASHing data: How to enter nothing in a database      18 October 2018

BASHing data: How to validate ISO 8601 dates without regex      5 October 2018

BASHing data: Fightin' fields      30 September 2018

BASHing data: Fuzzy matching in practice      23 September 2018

BASHing data: Data on clay      20 September 2018

BASHing data: iconv and illegal input sequences      13 September 2018

BASHing data: Displaying data from table fragments      6 September 2018

BASHing data: SCI and 62;c62;c62;c...      25 August 2018

BASHing data: A record pager built with YAD      18 August 2018

BASHing data: 48 sea levels and a trope for your terminal      11 August 2018

BASHing data: Mojibake detective work      6 August 2018

BASHing data: Pseudo-blank ("empty") records and fields      4 August 2018

BASHing data: GUI ways to view and edit big text files      31 July 2018

BASHing data: Question marks that aren't really question marks      27 July 2018

BASHing data: Time series ops      23 July 2018

BASHing data: Curse of the CSV monster      18 July 2018

BASHing data: Partial duplicates      14 July 2018

BASHing data: Fun with BOM data      11 July 2018

BASHing data: Truncated data items      4 July 2018

BASHing data: Too many lat/lon digits      30 June 2018

BASHing data: Embedded newlines      23 June 2018

BASHing data: Combo characters      9 June 2018

BASHing data: Pivoting airlines      3 June 2018

BASHing data: A surprising AWK trick      27 May 2018

BASHing data: Compare parts of strings      22 May 2018

BASHing data: YAD repeat and edit      21 May 2018

Liferea hack: add links to ABC (Australia) news items      21 February 2018

How to deal with NBSPs in a terminal      2 February 2018

BASH drivers, start your engines      11 January 2018

A handy script for translations      1 January 2018

When will this script finish doing its job?      1 December 2017

A simpler track-changes script      13 November 2017

File manager: open a terminal, but not here      1 November 2017

Shifty dates in Microsoft Excel      3 October 2017

It rains pretty regularly, in the shell      14 September 2017

Tweaking uniq -c      28 August 2017

Quick - what time is it in Singapore?      1 August 2017

A script to find empty fields in a table      16 July 2017

Debian 9 on a Dell OptiPlex 9020 Micro      5 July 2017

How to tidy copied PDF text with a CoPa script      30 June 2017

Sorting numbers inside text strings      15 June 2017

Presentations in a browser      29 May 2017

AWK and a rainfall time series - Part 2      16 May 2017

AWK and a rainfall time series - Part 1      15 May 2017

Xfe file manager: an independent marvel      3 May 2017

BASH a block of bytes      24 April 2017

How to build and edit LibreOffice dictionaries      13 April 2017

The buttons of YAD      2 April 2017

grep vs AWK vs Ruby, and a uniq disappointment      26 March 2017

Scripting an arithmeticker      12 March 2017

Tips for tpp and patat      22 February 2017

Scottish Country Dancing stats      1 February 2017

A script to log what my GPS tells me      17 January 2017

Eek! My rounding is biased!      7 December 2016

Scripting a DNA sequence viewer      8 November 2016

Hunting gremlin characters      21 October 2016

Teach your grandmother to write scripts      14 September 2016

Finding unmatched braces (brackets)      13 August 2016

How to use flags in AWK      16 July 2016

Proofreading for illusions with grep and AWK      18 June 2016

Transposing rows and columns: 3 methods      13 May 2016

I think I like backreferences (sometimes)      1 December 2015

Keeping emails as text files: 2 scripts      28 October 2015

How to interleave, alternate and collapse lines of text on the command line      6 October 2015

The joys of ISOdates      17 September 2015

Scripting a fancy chooser for recently used files      17 August 2015

Gnumeric: a filter-and-export script      28 July 2015

How to insert code snippets on the command line without executing them      12 July 2015

How to read a file N lines at a time in BASH: 3 methods      29 June 2015

Split a table and number the pieces: two methods      28 May 2015

Some baby name problems      1 May 2015

A "Track Changes" script for data cleaning      26 March 2015

How to garble      20 February 2015

DMS to DD to KML with AWK and sed      6 February 2015

Grouping things with AWK      22 January 2015

Building sequences of numbers on the command line      13 January 2015

Building a desktop Wikipedia checker      24 November 2014

Software is not data      7 November 2014

How to repeat a script, or not      21 October 2014

On dates and stuffed non-dates      6 October 2014

The header line: how to add, delete and ignore it      21 September 2014

Joining tables on the command line      2 September 2014

Splitting a file elegantly      29 August 2014

Tips on getting (and suggesting, and editing) user input      1 August 2014

How to get nowhere in particular [generating a random lat/lon]      1 August 2014

Top 10! fun on the command line      21 July 2014

ODT to TXT, but keep the line numbering      30 June 2014

Scripting a 4-color multiple grepper      20 June 2014

Tips on tables      4 June 2014

A pivot table in AWK      23 May 2014

Why I (sometimes) love regular expressions      5 May 2014

Scripting a 'Find-and-Replace' for big text files      23 April 2014

Multiple-item data entry with YAD      7 April 2014

Scripting an OCR text archiver for Trove      23 March 2014

How to kill blank lines elegantly      16 March 2014

Building a gazetteer table from KML files      10 March 2014

Scripting a log for a single application      4 March 2014

Scripting a super-fast points plotter for Google Earth      23 February 2014

Scripting a character chooser with dzen2      13 February 2014

Finding changes in a sorted list: a trick      22 June 2013

A very tiny GIS      5 March 2013

CoPa: 2 scripts for LibreOffice Calc and 1 for the kid in you      29 January 2013

CoPa scripting: change text between copy and paste      21 December 2012

Build a scientific names dictionary for LibreOffice      17 September 2012

Color picking made simple      5 September 2012

Compare two images easily with Geeqie      31 July 2012

Checking a website for incorrect links      25 June 2012

A spreadsheet jukebox      13 June 2012

Basics of KML      21 May 2012

Measures on the command line      2 April 2012

The F4 trick in Gnumeric      23 March 2012

Find time zones with the command line      20 February 2012